I am trying to transfer values from one column to another column in a dataframe, with multiple conditions and not in the same row.
Values from Columns ‘BEGUZ_H’ and ‘ENDUZ_H’ to Columns ‘BEGUZ’ and ‘ENDUZ’ ,
if the Values in Columns ‘PERNR_H’ , ‘WORKDATE_H’ , ‘AWART_H’
are equal with the values in columns ‘PERNR’ , ‘WORKDATE’ , ‘AWART’.
data = {'PERNR': [138888, 138432, 138432, 138437, 138437, 126162, 222829, 234740],
'WORKDATE': ['2024-01-25', '2024-01-10', '2024-01-09', '2024-01-26', '2024-01-24', '2024-01-29', '2024-01-15', '2024-01-09'],
'AWART': [2235, 2235, 2235, 2235, 2235, 2235, 2235, 2235],
'BEGUZ': ['00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00'],
'ENDUZ': ['00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00', '00:00:00'],
'PERNR_H': [0, 0, 126162, 222829, 234740, 0, 0, 138888],
'WORKDATE_H': ['NaN', 'NaN', '2024-01-29', '2024-01-15', '2024-01-09', 'NaN', 'NaN', '2024-01-25'],
'AWART_H': [2235, 2235, 2235, 2235, 2235, 2235, 2235, 2235],
'BEGUZ_H': ['NaN', 'NaN', '07:59:00', '07:50:00', '07:03:00', 'NaN', 'NaN', '07:45:00'],
'ENDUZ_H': ['NaN', 'NaN', '17:30:00', '17:35:00', '18:28:00', 'NaN', 'NaN', '18:59:00']}
df = pd.DataFrame(data)
df
Out[11]:
PERNR WORKDATE AWART BEGUZ ENDUZ PERNR_H WORKDATE_H AWART_H BEGUZ_H ENDUZ_H
0 138888 2024-01-25 2235 00:00:00 00:00:00 0 NaN 2235 NaN NaN
1 138432 2024-01-10 2235 00:00:00 00:00:00 0 NaN 2235 NaN NaN
2 138432 2024-01-09 2235 00:00:00 00:00:00 126162 2024-01-29 2235 07:59:00 17:30:00
3 138437 2024-01-26 2235 00:00:00 00:00:00 222829 2024-01-15 2235 07:50:00 17:35:00
4 138437 2024-01-24 2235 00:00:00 00:00:00 234740 2024-01-09 2235 07:03:00 18:28:00
5 126162 2024-01-29 2235 00:00:00 00:00:00 0 NaN 2235 NaN NaN
6 222829 2024-01-15 2235 00:00:00 00:00:00 0 NaN 2235 NaN NaN
7 234740 2024-01-09 2235 00:00:00 00:00:00 138888 2024-01-25 2235 07:45:00 18:59:00
My approach is ‘np.where’ , but that only works on the same rows.
np.where((df['PERNR']==df['PERNR_H'])&(df['WORKDATE']==df['WORKDATE_H'])&(df['AWART']==df['AWART_H']),df['BEGUZ_H'], df['ENDUZ_H'])
I expect:
PERNR WORKDATE AWART BEGUZ ENDUZ PERNR_H WORKDATE_H AWART_H BEGUZ_H ENDUZ_H
0 138888 2024-01-25 2235 07:45:00 18:59:00 0 NaN 2235 NaN NaN
1 138432 2024-01-10 2235 00:00:00 00:00:00 0 NaN 2235 NaN NaN
2 138432 2024-01-09 2235 00:00:00 00:00:00 126162 2024-01-29 2235 07:59:00 17:30:00
3 138437 2024-01-26 2235 00:00:00 00:00:00 222829 2024-01-15 2235 07:50:00 17:35:00
4 138437 2024-01-24 2235 00:00:00 00:00:00 234740 2024-01-09 2235 07:03:00 18:28:00
5 126162 2024-01-29 2235 07:59:00 17:30:00 0 NaN 2235 NaN NaN
6 222829 2024-01-15 2235 07:50:00 17:35:00 0 NaN 2235 NaN NaN
7 234740 2024-01-09 2235 07:03:00 18:28:00 138888 2024-01-25 2235 07:45:00 18:59:00