Why does reading a Parquet file create a job in the Spark UI?
I am using this statement to read a Parquet file in PySpark, without calling any display function or show method afterwards. When I go to the Spark UI, I can see that a job was created. How can a job be created when I have not called any action?
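One likely explanation is that Spark has to know the schema before it can return a DataFrame, so for Parquet it typically launches a small job to read the file footers. A minimal sketch of one way to check this, assuming an existing SparkSession is passed in and that supplying the schema up front skips the inference step (the helper name, path, and column names are hypothetical):

```python
def read_parquet_with_schema(spark, path, ddl_schema):
    """Read Parquet with a caller-supplied schema so Spark need not infer one.

    `spark` is an existing SparkSession; `ddl_schema` is a DDL string such as
    "id BIGINT, name STRING" (hypothetical columns, adjust to your data).
    """
    # DataFrameReader.schema() accepts a DDL-formatted string (Spark 2.3+).
    return spark.read.schema(ddl_schema).parquet(path)

# Usage (hypothetical path and columns):
# df = read_parquet_with_schema(spark, "/data/events.parquet", "id BIGINT, name STRING")
```

If the job no longer appears in the Spark UI after this change, it was the schema inference job. Reading from a very large number of paths may still trigger a separate file-listing job.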
Change the default 30-minute timeout in Spark
How do I change the default 30-minute timeout in PySpark sessions? Is there a config file I have to change?
Typecast from StringType to timestamp in PySpark
How can I typecast a StringType column to a timestamp in PySpark?
The string data looks like this: 2024-04-02-19.02.20.000000. I need this column's data as a timestamp.
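A possible approach is to pass an explicit pattern to `to_timestamp`, since the value uses a dash between date and time and dots between the time fields. This is a sketch that assumes Spark 3.x datetime patterns (on Spark 2.x the six-digit fraction may not parse as microseconds); the helper name and column names are hypothetical. The plain-Python `strptime` check only confirms that the sample string matches the intended layout.

```python
from datetime import datetime

# Spark datetime pattern for values like 2024-04-02-19.02.20.000000
SPARK_PATTERN = "yyyy-MM-dd-HH.mm.ss.SSSSSS"
# Equivalent pattern for Python's datetime, used only as a sanity check
PY_PATTERN = "%Y-%m-%d-%H.%M.%S.%f"

def with_timestamp(df, src_col, dst_col):
    """Add dst_col as a TimestampType column parsed from the string src_col."""
    # Imported lazily so the pattern check below runs without a Spark install.
    from pyspark.sql import functions as F
    return df.withColumn(dst_col, F.to_timestamp(F.col(src_col), SPARK_PATTERN))

# Sanity check of the layout using the sample value from the question
sample = "2024-04-02-19.02.20.000000"
parsed = datetime.strptime(sample, PY_PATTERN)

# Usage (hypothetical column names):
# df = with_timestamp(df, "event_time_str", "event_time")
```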
Using spark2-shell, unable to access an S3 path containing an ORC file to create a DataFrame
I have the S3 access_key_id, secret_access_key and endpoint URL.
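One thing worth trying is passing the credentials and endpoint to the shell as Hadoop S3A settings. The sketch below only builds the `--conf` arguments; it assumes the S3A connector (the hadoop-aws jar) is on the classpath and that the endpoint needs path-style access, which is common for non-AWS endpoints but is an assumption here. The helper name, key values, and endpoint are hypothetical placeholders.

```python
def s3a_conf_args(access_key, secret_key, endpoint):
    """Build --conf arguments that forward S3A settings to Hadoop."""
    settings = {
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        # Assumption: path-style access, often required by custom endpoints
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args

# Placeholder values; substitute your own credentials and endpoint
args = s3a_conf_args("MY_ACCESS_KEY", "MY_SECRET_KEY", "https://s3.example.com")
print("spark2-shell " + " ".join(args))
```

With the shell started this way, the ORC file would be read through an `s3a://` path, for example `spark.read.orc("s3a://my-bucket/path/to/file.orc")` (hypothetical bucket and path).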