This is an extension to this accepted answer.
My DataFrame:
import pandas as pd

df = pd.DataFrame(
    {
        'a': [-3, -1, -2, -5, 10, -3, -13, -3, -2, 1, 2, -100],
        'b': [1, 2, 3, 4, 5, 10, 80, 90, 100, 99, 1, 12]
    }
)
Expected output:
     a    b
5   -3   10
6  -13   80
7   -3   90
8   -2  100
Logic:
a) Select the longest streak of negative values in a.
b) If two streaks have the same size, I want the one with the greater sum of b. In df there are two streaks of size 4 (rows 0 to 3 and rows 5 to 8), but I want the second one because its sum of b is greater (280 versus 10).
My Attempt:
import numpy as np
s = np.sign(df['a'])
df['g'] = s.ne(s.shift()).cumsum()
df['size'] = df.groupby('g')['g'].transform('size')
df['b_sum'] = df.groupby('g')['b'].transform('sum')
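For reference, here is a minimal sketch of one way the two rules could be combined: the same sign-change grouping as in the attempt, followed by a sort on streak size and then on the sum of b. The names g, neg, stats, best and result are only illustrative, and the sketch assumes that a zero in a does not count as negative and that a tie on both size and sum of b can be broken arbitrarily.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'a': [-3, -1, -2, -5, 10, -3, -13, -3, -2, 1, 2, -100],
        'b': [1, 2, 3, 4, 5, 10, 80, 90, 100, 99, 1, 12],
    }
)

# Label consecutive runs of equal sign in 'a' (same idea as the attempt).
s = np.sign(df['a'])
g = s.ne(s.shift()).cumsum()

# Keep only the rows that belong to negative runs (assumption: 0 is not negative).
is_neg = s.eq(-1)
neg = df[is_neg]

# One row per negative run: its length and its sum of 'b'.
stats = neg.groupby(g[is_neg])['b'].agg(size='size', b_sum='sum')

# Sort by length first, then by sum of 'b'; the last row is the preferred run.
best = stats.sort_values(['size', 'b_sum']).index[-1]

result = df.loc[g.eq(best), ['a', 'b']]
print(result)
```

On the sample df this selects rows 5 to 8, which matches the expected output. Filtering to negative rows before grouping keeps positive streaks out of the size comparison, which the size column in the attempt does not do by itself.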