I am trying to track values that only change under certain conditions. Here is my data:
Variable = [10,20,30,40,50,22, 60,70,100]
Status = [-1,1,-1,-1,-1,1,1,1,1]
zipped = list(zip(Variable, Status))
df = pd.DataFrame(zipped, columns=['Variable', 'Status'])
Results:
Variable Status
0 10 -1
1 20 1
2 30 -1
3 40 -1
4 50 -1
5 22 1
6 60 1
7 70 1
8 100 1
What I want to end up with is:
Variable Status Carried_Value
0 10 -1 10
1 20 1 20
2 30 -1 30
3 40 -1 30
4 50 -1 30
5 22 1 22
6 60 1 22
7 70 1 22
8 100 1 22
Basically the ‘Carried_Value’ tracks ‘Variable’ and only changes when the ‘Status’ changes.
I have tried
col = 'Status'
conditions = [(df[col] == 1) & (df[col].shift(1) == -1),
(df[col] == -1) & (df[col].shift(1) == 1),
(df[col] == 1) & (df[col].shift(1) == 1),
(df[col] == -1) & (df[col].shift(1) == -1)]
choices = [df['Variable'], df['Variable'], df['Variable'].shift(1), df['Variable'].shift(1)]
df['Carried_Value'] = np.select(conditions, choices, default = 0)
But what i get is:
Variable Status Carried_Value
0 10 -1 0.0
1 20 1 20.0
2 30 -1 30.0
3 40 -1 30.0
4 50 -1 40.0
5 22 1 22.0
6 60 1 22.0
7 70 1 60.0
8 100 1 70.0
Ignoring the first row, the .shift(1) approach only seems to work the first time then fails to keep carrying the value i need…
I thought by using the code below it would go row by row and allow me to reference the prior row value in the same column as it was being created but got the same outcome.
for index, row in data.iterrows():
df['Carried_Value'] = np.select(conditions, choices, default=0)
Thanks for the help