Given the following structure of data:
X | Y |
---|---|
1 | 10 |
1 | 15 |
1 | 12 |
2 | 32 |
2 | 11 |
2 | 45 |
2 | 55 |
3 | 23 |
3 | 34 |
3 | 56 |
3 | 23 |
3 | 45 |
4 | 32 |
4 | 84 |
4 | 24 |
4 | 23 |
I like to correlate X with Y. For the entire data this trivial, of course:
SELECT CORR(X,Y) FROM some_table;
But I like to correlate X with Y in a rolling form. So the result may look like this:
X | CORR(X,Y) |
---|---|
1 | 0.40 |
2 | 0.43 |
For row X=1, CORR(X,Y) reflects the correlation of X with Y but only for X being 1,2 or 3
Accordingly, for row X=2, the correlation is only for X being 2,3 or 4.
But to be honest, I have no clue how to cover this. First, I thought of using a kind of combination of GROUP BY, LAG and PARTITION, but quite clueless how to combine them in a senseful manner.