I’m using Databricks with PySpark and encountering the following error:
Error:
[NOT_COLUMN_OR_STR] Argument `Condition` should be a Column or str, got Column.
Here is my code:
from pyspark.sql.functions import col
df = df.filter(col('foo_column').isNotNull())
This code works perfectly when executed in a Databricks notebook. However, when I run the same code from a .py file, I encounter the above error.
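My guess (unconfirmed) is that the .py file resolves col() to a different Column class than the one the Databricks Connect DataFrame expects. A quick type check like this sketch should show whether the two expressions come from different classes:

from pyspark.sql.functions import col
# Classic PySpark produces pyspark.sql.column.Column, while a Spark Connect /
# Databricks Connect DataFrame expects pyspark.sql.connect.column.Column.
print(type(col('foo_column')))
print(type(df['foo_column']))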
How can I resolve the error in the first example (the col()-based filter)? Am I missing something?
Additional Information:
- Using Databricks Connect
- PySpark version: 3.5.0
- Python version: 3.10.12
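Since databricks-connect and a standalone pyspark package can shadow each other, I also checked which installation my script actually imports (a sketch; the printed path is environment-specific):

import pyspark
print(pyspark.__version__)
print(pyspark.__file__)  # reveals whether a classic PySpark wheel or the
                         # Databricks Connect copy is first on sys.path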
Interestingly, the following code works without any issues:
df = df.filter(df['foo_column'].isNotNull())
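For reference, here is a minimal standalone .py that reproduces the error in my environment. This is a sketch under my assumptions: DatabricksSession is the Databricks Connect entry point, foo_column is a made-up column, and the failure presumably only appears when a separately installed pyspark shadows the one bundled with databricks-connect:

from databricks.connect import DatabricksSession
from pyspark.sql.functions import col

spark = DatabricksSession.builder.getOrCreate()  # Databricks Connect session
df = spark.createDataFrame([(1,), (None,)], ['foo_column'])

# Raises [NOT_COLUMN_OR_STR] when col() comes from a classic PySpark install,
# but runs fine when only the Databricks Connect pyspark is importable.
df = df.filter(col('foo_column').isNotNull())
df.show()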