I have two Spark dataset views:
dept dataset: ds1_view has 371815 records.
The dept dataset has multiple rows with the same userid.
user dataset: ds2_view has 27217 records, each with a unique userid.
I need to combine all columns from both views,
but I am not getting the expected result.
Can anybody tell me what I am doing wrong?
I tried this SQL query:
select * from ds1_view v1, ds2_view v2 where v1.userid=v2.userid
It returns 35565075 rows,
while the expected result is
371815 records (one row per dept record).
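To make the row counts easier to reason about, here is a minimal sketch (not my actual Spark job) that runs the same join on tiny made-up tables using Python's built-in sqlite3. The table contents and the helper name `join_count` are hypothetical and only for illustration. It suggests that the join returns one row per dept row only when userid really is unique on the user side; a repeated userid there multiplies the matching rows.

```python
import sqlite3

def join_count(dept_rows, user_rows):
    """Run the same inner join on small in-memory tables and return its row count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE ds1_view (userid INTEGER, dept TEXT)")
    con.execute("CREATE TABLE ds2_view (userid INTEGER, name TEXT)")
    con.executemany("INSERT INTO ds1_view VALUES (?, ?)", dept_rows)
    con.executemany("INSERT INTO ds2_view VALUES (?, ?)", user_rows)
    (n,) = con.execute(
        "SELECT COUNT(*) FROM ds1_view v1, ds2_view v2 "
        "WHERE v1.userid = v2.userid"
    ).fetchone()
    con.close()
    return n

# Hypothetical data: userid 1 appears twice on the dept side.
dept = [(1, "a"), (1, "b"), (2, "c")]

# Case 1: userid is unique on the user side.
unique_users = [(1, "x"), (2, "y")]

# Case 2: userid 1 is duplicated on the user side.
dup_users = [(1, "x"), (1, "x2"), (2, "y")]

count_unique = join_count(dept, unique_users)  # equals the number of dept rows
count_dup = join_count(dept, dup_users)        # larger: userid 1 matches 2 x 2 rows
```

If the same pattern applies to my data, a check such as `select userid, count(*) from ds2_view group by userid having count(*) > 1` should show whether userid is actually repeated in ds2_view.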