Spark EOF Error (Parquet Read from S3) - Spark to Pandas Conversion
I am reading close to 1 million rows, stored in S3 as Parquet files, into a Spark DataFrame (about 900 MB of data in the bucket). I filter the DataFrame based on column values and later convert it to a pandas DataFrame. Two UDFs are involved (classify and transformDate). I get an EOF error when running this code snippet. What is wrong with this code? Am I missing a Spark setting, or am I using the UDFs improperly? Code snippet below: