I have multiple A* columns with corresponding B* columns (A and B have the corresponding numbers at the end of the column names).
When the REFNO value = A# value and the ‘MNGR’ is not BOB,
I need to put the value from corresponding B# column into the ‘AGE’ column and the corresponding # number in the ‘FLAG’ column
The next step is to set the same AGE and FLAG values that occurred first in the group to all employee in that group (group by MNGR, YEAR).
The A* and B* columns can be more than 10 columns.
But I got error:
Expected a 1D array, got an array with shape (8, 2)
df = pd.DataFrame({
'EMPLID': [12, 13, 14, 15, 16, 17, 18, 19],
'MNGR': ['BOB', 'JIM', 'RHONDA', 'RHONDA', 'JIM', 'RHONDA', 'RHONDA', 'BOB'],
'YEAR': [2012, 2013, 2012, 2012, 2012, 2013, 2012, 2012],
'REFNO': [2, 3, 4, 4, 5, 6, 4, 2],
'A1': [1,3,2,4,5,4,3,1],
'A2': [2,4,4,5,7,5,4,2],
'A3': [3,5,8,6,8,6,5,3],
'B1': [21,31,41,44,51,61,71,81],
'B2': [22,32,42,45,52,62,72,82],
'B3': [23,33,43,46,53,63,73,83]
})
for i in(1,3):
df['AGE', 'FLAG'] = df.loc[(df['REFNO'] == df[f'A{i}']) & (df['MNGR'] != 'BOB'),
[f'B{i}','i']]
df2 = df.groupby(['MNGR', 'YEAR']).first()
The next step is to fill every record in the group with the same AGE and FLAG that occurs first (this can be done by regular merging of df and df2)
Expected output:
EMPLID MNGR YEAR REFNO A1 A2 A3 B1 B2 B3 AGE FLAG
0 12 BOB 2012 2 1 2 3 21 22 23 0 0
1 13 JIM 2013 3 3 4 5 31 32 33 31 1
2 14 RHONDA 2012 4 2 4 8 41 42 43 42 2
3 15 RHONDA 2012 4 4 5 6 44 45 46 42 2
4 16 JIM 2012 5 5 7 8 51 52 53 51 1
5 17 RHONDA 2013 6 4 5 6 61 62 63 63 3
6 18 RHONDA 2012 4 3 4 5 71 72 73 42 2
7 19 BOB 2012 2 1 2 3 81 82 83 0 0
bione_g dgrace is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1