Relative Content

Tag Archive for pythonpyspark-pandas

Pandas on Spark API Date Operations

I am using the pandas API on Spark to run a data preprocessing script that was originally written in pandas. The date operations are very slow, and some are not supported at all. For example, I cannot write df[time_col] + pd.Timedelta(1, unit='D'); instead I had to write df[time_col].apply(lambda x: x + timedelta(days=1)).
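To make the two forms concrete, here is a minimal sketch in plain pandas, where both are expected to give the same values. The column name event_time and the sample dates are made up for illustration. Under the pandas API on Spark the DataFrame would come from pyspark.pandas instead, and the first form is the one reported above as not working.

```python
from datetime import timedelta

import pandas as pd

# Hypothetical column name and sample data, for illustration only.
time_col = "event_time"
df = pd.DataFrame({time_col: pd.to_datetime(["2023-01-01", "2023-01-31"])})

# Vectorized form: add one day to every timestamp in a single operation.
vectorized = df[time_col] + pd.Timedelta(1, unit="D")

# Row-wise workaround: call a Python function once per element.
row_wise = df[time_col].apply(lambda x: x + timedelta(days=1))

# In plain pandas the two forms should agree element by element.
same_result = bool((vectorized == row_wise).all())
```

The row-wise form calls a Python lambda for each value instead of using a single column-level operation, which is a likely reason it feels slow on large data.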
