Say we have the following example pandas DataFrame:
import pandas as pd
data = {'A': [1, 2, 3, 4], 'B': [None, 'x', 'y', None]}
df = pd.DataFrame(data)
print(df)
# we get the following
#    A     B
# 0  1  None
# 1  2     x
# 2  3     y
# 3  4  None
However, filtering with df['column'] != None does not drop the rows that contain None; every row comes back:
print(df[df['B'] != None])
# we get the following
#    A     B
# 0  1  None
# 1  2     x
# 2  3     y
# 3  4  None
The filter only works as expected when we use df['column'].notna() instead:
print(df[df['B'].notna()])
# we get the following
#    A  B
# 1  2  x
# 2  3  y
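To narrow this down, it may help to look at the boolean masks themselves rather than the filtered frames. The snippet below is only a minimal sketch; the names ne_mask and notna_mask are mine, chosen for illustration. It suggests that the != None comparison is evaluated element by element and comes out True even for the rows that hold None, which would be consistent with all four rows being kept:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [None, 'x', 'y', None]})

# Element-wise comparison against None (illustrative variable name)
ne_mask = df['B'] != None  # noqa: E711
# Explicit missing-value check (illustrative variable name)
notna_mask = df['B'].notna()

print(ne_mask.tolist())     # [True, True, True, True]
print(notna_mask.tolist())  # [False, True, True, False]
```

So the two expressions produce different masks for rows 0 and 3, and that difference is what I am trying to understand.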
Why does the != None comparison keep the rows containing None, while notna() drops them? Can someone explain this behavior?