I have a pandas DataFrame that I want to forward fill across columns, within each row.
Data looks like this:
index trade_date clean_pub_date day_lag ticker transaction_type asset_type clean_amt d0 d0_1m d0_3m d0_6m d0_12m d0_t d0_1m_t d0_3m_t d0_6m_t d0_12m_t
136 2023-01-13 2023-02-09 27 UL P ST 32500 47.2510986328125 46.9969444274902 51.7693290710449 50.7017021179199 49.7376365661621 48.7582054138184 49.1847496032715 51.7501983642578 49.8611183166504 47.3593482971191
142 2023-01-13 2023-02-06 24 CRM P ST 7500 168.829467773438 182.711318969727 197.641815185547 215.778137207031 285.457092285156 149.31494140625 170.856811523438 193.766891479492 226.983489990234 268.838836669922
169 2023-06-09 2023-06-09 0 PTON P ST 7500 8.3100004196167 8.39000034332275 5.82000017166138 6.07999992370606 NaN 8.3100004196167 8.39000034332275 5.82000017166138 6.07999992370606 NaN
170 2023-06-09 2023-07-06 27 TMUS P ST 7500 138.005340576172 135.516159057617 136.914474487305 161.267501831055 NaN 130.270050048828 137.192138671875 136.14094543457 154.882934570313 NaN
171 2023-06-09 2023-06-12 3 EMR P ST 7500 82.3291549682617 90.2124633789063 98.5805587768555 88.8684310913086 NaN 82.4564666748047 87.5781631469727 97.8716278076172 86.8675918579102 NaN
I want to replace each NaN with the preceding entry in the same row, like this:
t = t.ffill(axis=1)
This fills the values as desired, except that the data types change once ffill is applied:
Before:
t.dtypes
Out[278]:
name object
trade_date datetime64[ns]
clean_pub_date datetime64[ns]
day_lag int64
ticker object
transaction_type object
asset_type object
clean_amt int64
d0 float64
d0_1m float64
d0_3m float64
d0_6m float64
d0_12m float64
d0_t float64
d0_1m_t float64
d0_3m_t float64
d0_6m_t float64
And after:
t = t.ffill(axis=1)
t.dtypes
Out[280]:
name object
trade_date datetime64[ns]
clean_pub_date datetime64[ns]
day_lag object
ticker object
transaction_type object
asset_type object
clean_amt object
d0 object
d0_1m object
d0_3m object
d0_6m object
d0_12m object
d0_t object
d0_1m_t object
d0_3m_t object
d0_6m_t object
I don’t see why this would happen, since every value that replaces a NaN is a float. I also don’t see any option in the documentation that addresses this.
Any ideas on why this is happening and how to avoid it?
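One likely explanation, offered as a sketch rather than a confirmed diagnosis: when the frame mixes dtypes, a row-wise fill has to treat each row as a single sequence of values, and a sequence holding strings, integers and floats together can only be stored as object. A workaround that avoids this is to forward fill only the float columns, so the fill never touches the other dtypes. The small frame below is a hypothetical stand-in for the data in the question, with rounded values and only a few of the columns. It also assumes the first float column in each row is never NaN, since otherwise the original call would have filled it from clean_amt.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the frame in the question (values rounded).
t = pd.DataFrame(
    {
        "ticker": ["UL", "PTON"],
        "clean_amt": [32500, 7500],
        "d0_6m": [50.70, 6.08],
        "d0_12m": [49.74, np.nan],
    }
)

# Pick out the float64 columns only; the other columns are left alone.
float_cols = t.select_dtypes(include="float64").columns

# Forward fill across columns inside the all-float block, then write it back.
t[float_cols] = t[float_cols].ffill(axis=1)

print(t.dtypes)
print(t)
```

Because the selected block is all float64, the filled result stays float64 and clean_amt keeps its int64 dtype.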