I would like to fill the NaN values in the Partner_salary column with 0 where Partner_working is 'No', and fill the remaining NaN values with the mean of the Partner_salary column.
import numpy as np
import pandas as pd
pd.DataFrame({
    'Partner_working': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'Partner_salary': [np.nan, np.nan, 1500, 1000, 0]})
I have tried to use the loc indexer to slice the data, but I do not know how to proceed from there:
data.loc[data['Partner_salary'].isnull(), 'Partner_working'].value_counts()
Output:
No     90
Yes    16
@Rishabh_KT's approach of applying a function might be easier to read.
If you want to stay with .loc logic, here is another way:
# create example df
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Partner_working': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'Partner_salary': [np.nan, np.nan, 1500, 1000, 0]
})
# Set missing 'Partner_salary' values to 0 where 'Partner_working' is "No"
df.loc[(df['Partner_working'] == "No") & df['Partner_salary'].isna(), 'Partner_salary'] = 0
# Calculate the mean of non-null 'Partner_salary' (mean() skips NaN by default;
# the zeros set above are included)
mean = df['Partner_salary'].mean()
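# Fill the remaining NaN 'Partner_salary' values with the mean
# (assign the result back instead of using inplace=True on a column selection)
df['Partner_salary'] = df['Partner_salary'].fillna(mean)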
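One thing the question does not state is whether the mean should include the zeros. The snippet above computes the mean after the zero fill, so the zeros pull it down. As a rough sketch of the other reading, the version below takes the mean of the originally observed salaries first and only then fills. The names `observed_mean`, `not_working_missing` and `Partner_salary_filled` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Partner_working': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'Partner_salary': [np.nan, np.nan, 1500, 1000, 0],
})

salary = df['Partner_salary']

# Mean of the salaries that were actually observed, taken BEFORE any filling.
# Assumption: this is the mean that is wanted; mean() skips NaN by default.
observed_mean = salary.mean()

# Rows where the salary is missing AND the partner is not working.
not_working_missing = salary.isna() & df['Partner_working'].eq('No')

# Put 0 into those rows only; every other value is left untouched.
filled = salary.mask(not_working_missing, 0)

# Whatever is still missing gets the observed mean.
df['Partner_salary_filled'] = filled.fillna(observed_mean)

print(df)
```

Writing to a separate column makes it easy to compare the filled values against the original ones before overwriting anything.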
I would like to fill ‘Partner_salary’ as 0, where Partner_working is No.
In that case you can write a function that returns 0 if Partner_working is 'No' and returns the existing salary otherwise, and then apply it row by row with a lambda to update the salary column:
def f(Partner_working, Partner_salary):
    if Partner_working == 'No':
        return 0
    else:
        return Partner_salary
df['Partner_salary'] = df.apply(lambda x: f(x.Partner_working, x.Partner_salary), axis=1)
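For larger frames, the same rule can probably be written without a row-wise apply. The sketch below uses `Series.where`, which keeps a value where the condition is True and substitutes the second argument elsewhere. Like the function above, it sets the salary to 0 on every 'No' row, not only the missing ones, and it leaves NaN on 'Yes' rows for a later fill.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Partner_working': ['Yes', 'No', 'Yes', 'Yes', 'No'],
    'Partner_salary': [np.nan, np.nan, 1500, 1000, 0],
})

# Keep the salary where the partner is working; use 0 on all 'No' rows.
partner_works = df['Partner_working'] != 'No'
df['Partner_salary'] = df['Partner_salary'].where(partner_works, 0)

print(df)
```

The NaN in the first row (a 'Yes' row) is still there afterwards, so a mean fill would be a separate step.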