Relative Content

Tag Archive for pythonpyspark

Cleansing the data

Consider a scenario: I have two files. One is a metadata file and the other holds a list of data frame values. In my metadata file I have the list of column names, with the data types as the values.
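One way to approach this kind of cleansing is to turn the metadata into cast expressions and apply them in a single `selectExpr`. The sketch below is only an illustration under assumptions: the metadata is taken to be already loaded into a dict of column name to Spark SQL type, and the helper name `build_cast_exprs`, the column names `id` and `amount`, and the sample rows are all hypothetical.

```python
from typing import Dict, List


def build_cast_exprs(schema: Dict[str, str]) -> List[str]:
    """Turn {column name: Spark SQL type} metadata into selectExpr strings."""
    # Backticks guard against column names with spaces or reserved words.
    return [f"CAST(`{name}` AS {dtype}) AS `{name}`" for name, dtype in schema.items()]


def demo() -> None:
    # Imported lazily so the helper above can be used without a Spark install.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()

    # Hypothetical raw data: everything arrives as strings.
    raw = spark.createDataFrame([("1", "2.5"), ("2", "3.0")], ["id", "amount"])

    # Hypothetical metadata contents, e.g. parsed from the metadata file.
    metadata = {"id": "int", "amount": "double"}

    cleaned = raw.selectExpr(*build_cast_exprs(metadata))
    cleaned.printSchema()
```

Calling `demo()` needs a working local Spark install. Only the columns listed in the metadata are kept, so any column missing from the metadata is dropped by this sketch.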

Pyspark Error: incorrect call to ‘column’

Getting a PySpark reference error here (full code below): ‘if fi.stat().st_ctime >= today_midnight:’. I am working in Palantir Foundry, using a transform to rename a JSON file. Any idea what is causing this? Thanks!

Remove duplicate characters from a string: PySpark

I want to keep only the unique letters in a PySpark string column. Please suggest a solution that does not use UDFs. I need a PySpark solution, not one of the many pure-Python solutions already on the forum. Thanks
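A possible UDF-free approach is to split the string into single characters, deduplicate the resulting array, and join it back, using the built-in `split`, `array_distinct`, and `array_join` functions (Spark 2.4 or later). This is a sketch, not a definitive answer: it assumes "unique" means keeping the first occurrence of each character, it treats upper and lower case as different, and it does not filter out non-letter characters. The helper name `dedupe_chars_expr` and the column name `word` are hypothetical.

```python
def dedupe_chars_expr(col_name: str) -> str:
    """Build a Spark SQL expression that keeps the first occurrence of each character."""
    # split(..., '') breaks the string into characters,
    # array_distinct drops repeats while preserving order,
    # array_join glues the remaining characters back together.
    return f"array_join(array_distinct(split(`{col_name}`, '')), '')"


def demo() -> None:
    # Imported lazily so the helper above can be used without a Spark install.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.master("local[1]").getOrCreate()

    # Hypothetical sample data.
    df = spark.createDataFrame([("aabbcc",), ("hello",)], ["word"])

    df.withColumn("unique_chars", F.expr(dedupe_chars_expr("word"))).show()
```

Calling `demo()` needs a working local Spark install. The same chain can also be written with the column functions directly, as `F.array_join(F.array_distinct(F.split("word", "")), "")`.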