I’m trying to recreate an Excel pivot table in Hive by selecting the rows where two variables meet a condition and then summing over the grouping variable.
I’m very new to Hive/SQL, so I would be grateful for advice on how to combine these two queries:
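Select where the condition is met:
-- var1 and var2 are listed in GROUP BY because Hive requires every non-aggregated column there
SELECT `group`, var1, var2, COUNT(*)
FROM data_set
WHERE var1 = 0
AND var2 = 0
GROUP BY `group`, var1, var2;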
Sum over the grouping variable:
SELECT `group`, var1, var2, SUM(emp) OVER (PARTITION BY `group`) FROM data_set;
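One possible way to combine the two, offered as a rough sketch rather than a confirmed solution: apply the filter first, then aggregate once per group. This assumes emp is a numeric column in data_set and that one output row per group is wanted, as in a pivot table. The aliases n_rows and total_emp are made up for illustration.

```sql
-- Sketch only: assumes emp is numeric and one row per group is the goal.
-- `group` is backtick-quoted because GROUP is a reserved word in Hive.
SELECT `group`,
       COUNT(*) AS n_rows,     -- rows in the group that satisfy the condition
       SUM(emp) AS total_emp   -- emp summed over those same rows
FROM data_set
WHERE var1 = 0
  AND var2 = 0
GROUP BY `group`;
```

A plain GROUP BY is used here instead of the window function because SUM(emp) OVER (PARTITION BY ...) keeps every input row and repeats the group total on each one, whereas a pivot table shows a single total per group.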