Tag Archive for pyspark, apache-spark-sql, databricks

Splitting, Sorting, and Repacking a Column Name

I’m trying to split the fileName and the columnName on underscores and then repack them without the separators. Instead of File_Name_Alpha, I want FileNameAlpha. I’ve been doing this with SQL in the notebook, and it looked like it was working, but then some values would repack as AlphaNameFile or FileAlphaName when I want to retain the original order. The columnName has the same issue: Have_A_Great_Day can come back scrambled, for example as ADayGreatHave. To make matters worse, it rarely happens, but I can’t have it happen even rarely; it always needs to be repacked in the correct order. The problem seems to be in the collect_set, which according to Databricks is
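For reference, here is a minimal sketch of the behaviour I'm after, written in plain Python rather than Spark so it runs anywhere. The helper name `repack` is made up for illustration. The point is that splitting into a list and joining it back keeps the parts in their original order, with no aggregation step in between that could reorder them.

```python
def repack(name: str, sep: str = "_") -> str:
    """Drop the separator while keeping the parts in their original order.

    `repack` is a hypothetical helper used only for illustration.
    """
    parts = name.split(sep)  # a list, so the original order is preserved
    return "".join(parts)    # join the parts back together in that same order


print(repack("File_Name_Alpha"))   # FileNameAlpha
print(repack("Have_A_Great_Day"))  # HaveAGreatDay
```

In Spark SQL, I'd assume the same idea could be expressed without exploding and re-collecting the parts, for example with `regexp_replace(columnName, '_', '')` or `concat_ws('', split(columnName, '_'))`, but I haven't verified that against my data.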