Suppose I have a table like the one below:
ID | Threshold | Value |
---|---|---|
1 | 2 | 1 |
1 | 2 | 2 |
1 | 2 | 3 |
1 | 2 | 4 |
2 | 4 | 1 |
2 | 4 | 3 |
2 | 4 | 5 |
How could I use Spark to sum, for each ID, only the values that are above that ID's threshold, to obtain the following?
ID | Threshold | total_above_threshold |
---|---|---|
1 | 2 | 7 |
2 | 4 | 5 |