I have a Pandas dataframe which looks like:
import pandas as pd
df = pd.DataFrame({'datetime':[
'Jan 20, 2000 12:00:00',
'Jan 20, 2000 12:00:00.5',
'Jan 20, 2000 12:00:01'
]})
df
datetime
0 Jan 20, 2000 12:00:00
1 Jan 20, 2000 12:00:00.5
2 Jan 20, 2000 12:00:01
I want to convert the column to datetime format. I tried:
pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S')
…but the .5 fractional second in row 1 causes the conversion to fail.
I next tried:
pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S.%f')
…but this causes rows 0 and 2 to fail due to absence of a fractional second.
How can I convert a Pandas dataframe column to datetime format when some entries contain fractional seconds and some do not?
You can use format='mixed' (available in pandas 2.0 and later) to infer the format of each element individually.
df['datetime'] = pd.to_datetime(df['datetime'], format='mixed')
Note: 'mixed' can be risky because the format is inferred separately for each element, so an ambiguous string may be parsed in a way you did not expect. Best practice may be to explicitly normalize your dataset's datetime string formats before converting:
df.loc[~df['datetime'].str.match(r'.*\.\d+$'), 'datetime'] += '.0'
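For completeness, here is a minimal end-to-end sketch of the normalize-then-convert idea. It assumes every string follows the 'Jan 20, 2000 12:00:00' layout from the question, with or without a trailing fraction; the names has_fraction and normalized are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'datetime': [
    'Jan 20, 2000 12:00:00',
    'Jan 20, 2000 12:00:00.5',
    'Jan 20, 2000 12:00:01',
]})

# True for strings that already end in ".<digits>"
has_fraction = df['datetime'].str.contains(r'\.\d+$')

# Append ".0" only to the strings that lack a fractional part
normalized = df['datetime'].where(has_fraction, df['datetime'] + '.0')

# Every string now matches one explicit format, so nothing is inferred
df['datetime'] = pd.to_datetime(normalized, format='%b %d, %Y %H:%M:%S.%f')
print(df)
```

Because the format is explicit, a string that does not fit the assumed layout raises an error instead of being silently reinterpreted.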
This code converts the 'datetime' column to datetime format while handling the mixed presence of fractional seconds:
import pandas as pd
# Sample DataFrame
df = pd.DataFrame({'datetime':['Jan 20, 2000 12:00:00', 'Jan 20, 2000 12:00:00.5', 'Jan 20, 2000 12:00:01']})
# Parse the rows without fractional seconds; errors='coerce' turns non-matching rows into NaT
parsed = pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S', errors='coerce')
# Fill the NaT rows by parsing the original strings with the fractional-second format
with_fraction = pd.to_datetime(df['datetime'], format='%b %d, %Y %H:%M:%S.%f', errors='coerce')
df['datetime'] = parsed.fillna(with_fraction)
print(df)