Splitting, Sorting, and Repacking a column name
I am trying to split the fileName and the columnName on underscores and then repack them. Instead of having File_Name_Alpha, I want FileNameAlpha. I’ve been trying this using SQL in the notebook and it looked like it was working, but then some values would repack as AlphaNameFile or FileAlphaName when I want to retain the original order. The same issue affects the columnName: Have_A_Great_Day would randomly end up as ADayGreatHave. To make matters worse, it happens only rarely, but I can’t have it happen at all; the name always needs to be repacked in the correct order. The problem seems to be in the collect_set
where, according to Databricks, is
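A minimal sketch of the order-preserving idea, written in plain Python rather than the notebook's SQL. The helper name `repack` and the `sep` parameter are hypothetical, chosen only for illustration. The point is that splitting yields an ordered list, so joining it back keeps the original order, whereas a set-style aggregation such as collect_set makes no ordering guarantee and also drops duplicate parts. In Spark SQL, the analogous approach would presumably be to apply `concat_ws('', split(columnName, '_'))` or `regexp_replace(columnName, '_', '')` directly to the column, without exploding and re-collecting the parts; that is an assumption about the query, which is not shown in the question.

```python
def repack(name: str, sep: str = "_") -> str:
    """Remove separators from a name, keeping the parts in their original order.

    `repack` and `sep` are hypothetical names used for this illustration.
    """
    # str.split returns a list, so the parts stay in source order.
    # A set would carry no ordering guarantee and would drop repeated parts.
    parts = name.split(sep)
    return "".join(parts)


file_name = repack("File_Name_Alpha")
column_name = repack("Have_A_Great_Day")
print(file_name)    # FileNameAlpha
print(column_name)  # HaveAGreatDay
```

Because nothing is aggregated across rows here, there is no step at which the order of the parts could change.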
How to work with complex data types in PySpark
I am having the issue below. I have one DataFrame with a column named attribute, and the attribute type is