Consider this df:
import pandas as pd
d = {"a":["Residency", "Citizenship", "Citizenship","Residency",
"Citizenship","Citizenship","Citizenship","Residency"],
"b":["UK","UK","UK","UK","UK","UK","UK","UK"]}
df = pd.DataFrame(d)
Output:
a b
0 Residency UK
1 Citizenship UK
2 Citizenship UK
3 Residency UK
4 Citizenship UK
5 Citizenship UK
6 Citizenship UK
7 Residency UK
I would like to replace Citizenship and UK with None in every Citizenship row whose previous value in a is not Residency. Bear in mind that there are many other columns in this df that need to stay the same. The end result would be:
a b
0 Residency UK
1 Citizenship UK
2 None None
3 Residency UK
4 Citizenship UK
5 None None
6 None None
7 Residency UK
Use a boolean mask and propagate it to the next row with shift:
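# True on rows where a is Residency
m = df['a'].eq('Residency')
# Keep a row if it or the previous row is Residency; set every other row to None
# (fill_value=False keeps the shifted mask boolean instead of introducing NaN)
df.loc[~(m | m.shift(fill_value=False))] = None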
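To see why this works, it can help to look at the intermediate masks side by side. The following is only a sketch on the sample data from the question; the names prev, keep and steps are illustrative and not part of the original answer, and fill_value=False is a choice made here so the shifted mask stays boolean.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["Residency", "Citizenship", "Citizenship", "Residency",
          "Citizenship", "Citizenship", "Citizenship", "Residency"],
    "b": ["UK"] * 8,
})

m = df["a"].eq("Residency")        # True on Residency rows
prev = m.shift(fill_value=False)   # True on the row right after a Residency row
keep = m | prev                    # rows that must be left untouched

# Hypothetical helper frame, only for inspecting the steps
steps = pd.DataFrame({"a": df["a"], "m": m, "prev": prev, "keep": keep})
print(steps)
```

Rows where keep is False (2, 5 and 6 here) are the ones that get replaced.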
If you don’t want to modify the input in place, you can create a copy with mask:
m = df['a'].eq('Residency')
# mask replaces the rows where the condition is True and returns a new DataFrame
out = df.mask(~(m | m.shift(fill_value=False)), None, axis=0)
Output:
a b
0 Residency UK
1 Citizenship UK
2 None None
3 Residency UK
4 Citizenship UK
5 None None
6 None None
7 Residency UK
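Note that both variants blank out the whole row. If "the other columns need to stay the same" means that only a and b should be replaced, one possible adaptation is to pass the column labels to loc. This is a sketch under that assumption; column c is a hypothetical stand-in for the other columns, which the question does not show.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["Residency", "Citizenship", "Citizenship", "Residency",
          "Citizenship", "Citizenship", "Citizenship", "Residency"],
    "b": ["UK"] * 8,
    "c": range(8),  # hypothetical extra column that must stay untouched
})

m = df["a"].eq("Residency")
keep = m | m.shift(fill_value=False)  # the row itself or the previous row is Residency

out = df.copy()                       # leave the input unmodified
out.loc[~keep, ["a", "b"]] = None     # replace only a and b on the other rows
print(out)
```

Here c keeps its original values on every row, while a and b become missing on rows 2, 5 and 6.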