So consider I have two dataframes:
Old information:
Name Id Club Number
0 Ronaldo 12414 Al-Nassr 7
1 Messi 4344134 Miami 30
2 Shevchenko 1234435 Milan 7
3 Maradona 37346 Retired None
old = {'Name': {0: 'Ronaldo', 1: 'Messi', 2: 'Shevchenko', 3: 'Maradona'},
'Id': {0: 12414, 1: 4344134, 2: 1234435, 3: 37346},
'Club': {0: 'Al-Nassr', 1: 'Miami', 2: 'Milan', 3: 'Retired'},
'Number': {0: 7, 1: 30, 2: 7, 3: None}}
And new information:
Name Id Club Number
0 Ronaldo 12414 Al-Nassr 7
1 Messi 4344134 Miami 10
2 Shevchenko 1234435 Retired None
3 Neymar 423552 Al Hilal 10
new = {'Name': {0: 'Ronaldo', 1: 'Messi', 2: 'Shevchenko', 3: 'Neymar'},
'Id': {0: 12414, 1: 4344134, 2: 1234435, 3: 423552},
'Club': {0: 'Al-Nassr', 1: 'Miami', 2: 'Retired', 3: 'Al Hilal'},
'Number': {0: 7, 1: 10, 2: None, 3: 10}}
I want to compare new df (based on Name + ID) to old df and return a new df with a column that indicates changes:
Name Id Club Number Changes
0 Ronaldo 12414 Al-Nassr 7 No change
1 Messi 4344134 Miami 10 Number
2 Shevchenko 1234435 Retired None Club, Number
3 Neymar 423552 Al Hilal 10 New entry
I care mostly for new additions and changes, but if it is not too hard include the removed ones, that would be great. If it changes a lot, then no need. So the results with it will be sth like this:
Name Id Club Number Changes
0 Ronaldo 12414 Al-Nassr 7 No change
1 Messi 4344134 Miami 10 Number
2 Shevchenko 1234435 Retired None Club, Number
3 Neymar 423552 Al Hilal 10 New entry
4 Maradona 37346 Retired None Removed