I have this SQL query, which I am running in a Jupyter notebook on a Spark cluster:
val df1 = spark.read.format("bigquery").option("query", """
select *
from table1
where col1 >= '2024-04-01'
and col2 = 'pp'
limit 5000
""").load()
df1.createOrReplaceTempView("fact1")
spark.sql("""
select *
from fact1
""").show(true)
But the result is not a pandas DataFrame, so I cannot apply pandas DataFrame operations to it or save it as a CSV file.
I also tried this:
import org.apache.spark.sql.DataFrame // needed for the explicit type annotation below
val result_df: DataFrame = spark.sql("""
select *
from fact1
""").toDF()
result_df.show(true)
But the result is still not a pandas DataFrame.
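For context, the snippets above are Scala, where `spark.sql(...)` always returns a Spark DataFrame; pandas is a Python library, and the `toPandas()` conversion exists only in PySpark. If the end goal is just a CSV file, a minimal sketch of writing it straight from the Spark DataFrame might look like the following. It assumes `spark` is the notebook's SparkSession and that the `fact1` view has already been registered; `outputPath` is a hypothetical location, not something from my setup.

```scala
import org.apache.spark.sql.DataFrame

// Assumes `spark` (the notebook's SparkSession) and the temp view "fact1" already exist.
val resultDf: DataFrame = spark.sql("select * from fact1")

// Hypothetical output location; replace with a path the cluster can write to.
val outputPath = "/tmp/fact1_csv"

resultDf
  .coalesce(1)               // single partition, so a single part file (reasonable for ~5000 rows)
  .write
  .mode("overwrite")         // replace any previous output at this path
  .option("header", "true")  // write column names as the first row
  .csv(outputPath)
```

Note that Spark writes a directory at `outputPath` containing a `part-*.csv` file rather than one file with that exact name.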