I’m trying to create a new column, “New”, in a DataFrame based on existing columns: if “MinDate” is null, take the value from “Start_Date”; otherwise take the date from “MinDate” as a string, concatenate it with the time from the “Time” column, and convert the result to a datetime.
import datetime
from datetime import date

import numpy as np
import pandas as pd

fffg = pd.DataFrame({'N': [1, 2, 3], 'MinDate': [date(2023,1,2), None, date(2022,1,7)],
                     'Time': [datetime.time(8, 48, 0), datetime.time(8, 48, 0), datetime.time(8, 48, 0)],
                     'Start_Date': [datetime.datetime(2022,4,1,15,10), datetime.datetime(2023,4,1,15,10), datetime.datetime(2022,5,1,15,10)]})
fffg['MinDate'] = pd.to_datetime(fffg['MinDate'])
# The np.where call below is the statement that raises the ValueError
fffg['New'] = np.where(pd.isnull(fffg['MinDate']),
                       pd.to_datetime(fffg['MinDate'].astype(str)+' '+fffg['Time'].astype(str)),
                       fffg['Start_Date']
                       )
fffg
But I get an error:
ValueError: time data "NaT 08:48:00" doesn't match format "%Y-%m-%d %H:%M:%S", at position 1. You might want to try:
    - passing `format` if your strings have a consistent format;
    - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
    - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.
It looks as if the condition inside np.where() is being ignored: the string “NaT 08:48:00” comes from the row where MinDate is null. How can I avoid this error?
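A likely explanation: np.where() is an ordinary function, so Python evaluates both value arguments for every row before the condition is applied. The pd.to_datetime() call therefore also sees the row where MinDate is NaT and fails on “NaT 08:48:00”. Note too that np.where(cond, a, b) picks a where cond is true, so the call as written would take the concatenated value for the null rows, the reverse of the description. The sketch below is one possible workaround, not the only one: it keeps the string concatenation, passes errors='coerce' so the unparseable row becomes NaT, and then fills that gap from Start_Date. The names df and combined are made up for illustration.

```python
import datetime
from datetime import date

import pandas as pd

# Same sample data as in the question (df is an illustrative name)
df = pd.DataFrame({
    'N': [1, 2, 3],
    'MinDate': [date(2023, 1, 2), None, date(2022, 1, 7)],
    'Time': [datetime.time(8, 48, 0)] * 3,
    'Start_Date': [datetime.datetime(2022, 4, 1, 15, 10),
                   datetime.datetime(2023, 4, 1, 15, 10),
                   datetime.datetime(2022, 5, 1, 15, 10)],
})
df['MinDate'] = pd.to_datetime(df['MinDate'])

# Build "date time" strings for every row; the row with a missing MinDate
# yields "NaT 08:48:00", which errors='coerce' converts to NaT instead of raising.
combined = pd.to_datetime(
    df['MinDate'].astype(str) + ' ' + df['Time'].astype(str),
    errors='coerce',
)

# Where MinDate was missing (combined is NaT), fall back to Start_Date.
df['New'] = combined.fillna(df['Start_Date'])
print(df)
```

One caveat with this approach: errors='coerce' would also silently turn any other malformed string into NaT and replace it with Start_Date, so it assumes the only unparseable rows are those with a missing MinDate.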