I have two lists of people in two separate systems. One is our data warehouse, and the other is an autodialer. We recently had an issue where some of the data got duplicated in the dialer. We are trying to identify the problem records and I’m trying to create two lists, one that is a priority list with only active people, and the other is a complete list with everybody. I have two DataFrames that I merged into a single DataFrame and then substituted a blank string for the NaN’s so it’s more readable for the end users. However, now I’m trying to remove the blank records for the “priority” list (the ID gets blanked out as part of the process when the person is made inactive). However when I try it’s not working. Here is what I’ve got:
results.loc[(results['ID'] != results['SSID']) & (results['number1'] != '8888888888') & (results['number1'] != '9999999999') & (results['ID'] != '') & (results['SSID'] != '')].to_excel(writer, sheet_name = 'Priority', startrow=1)
However, when the list comes out it still gives me the blanks. It removes the ‘888’ and ‘999’ phone numbers from the list but not the blanks
I have also tried using != 0 and != NaN but neither of those work. Can someone point me in the right direction? Thanks!
4
I found out my problem. I was able to figure it out with the help of @Elison G. Lima above and also with some help from another person. I had a step above that previously read results.replace(np.nan, '')
and I changed that to results.fillna('', inplace=True)
and then used the code above to put it in a different DataFrame and now it’s working. Thanks everyone!
Try this
results_filtered = results.loc[
(results['ID'].str.strip() != results['SSID'].str.strip()) &
(results['number1'] != '8888888888') &
(results['number1'] != '9999999999') &
(results['ID'].str.strip() != '') &
(results['SSID'].str.strip() != '')
]
results_filtered.to_excel(writer, sheet_name=’Priority’, startrow=1)
3