I'm trying to calculate the time between two events where condition X holds. There is one extra rule: if two events happen on the same day, the second event should use the last date before that day, not the earlier event on that same day. (Does that make sense?)
I found a way to do it with a correlated subquery, but it is very slow, so I'd like to know whether there is another way to do it.
(
SELECT MAX(ld2.date)
FROM table ld2
WHERE ld2.condition = 'X'
AND ld2.id = ld.id
AND ld2.date < ld.date
) AS last_date,
So I tried a windowed MAX instead, and I want to apply the same condition inside the window function:
MAX(CASE WHEN condition = 'X' AND date < LEAD(date) OVER (PARTITION BY id ORDER BY date) THEN date END) OVER (PARTITION BY id ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS last_date
I tried a LEAD function inside it, but I get the error "may not be nested inside another window function".
I'm working with Snowflake. Does anyone have an idea of how I can do this?
-- using row_num to order the rows, but date could equally have a time component.
with data(row_num, id, date, condition) as (
select * from values
(1, 1, '2024-09-25', 'X'),
(2, 2, '2024-09-25', 'X'),
(3, 2, '2024-09-26', 'X'),
(4, 3, '2024-09-25', 'X'),
(5, 3, '2024-09-26', 'X'),
(6, 3, '2024-09-26', 'X') -- these both want to use row 4
)
select d.*
,lead(d.date) over (partition by d.id order by d.date, d.row_num) as next_date
,iff(next_date = d.date, null, d.date) as not_today
from data as d
where d.condition = 'X'
order by d.row_num;
This gives us an edge: NOT_TODAY is non-null only for the last row of each day within an id.
Now we can run a LAG that skips nulls over that edge:
select d.*
,lead(d.date) over (partition by d.id order by d.date, d.row_num) as next_date
,iff(next_date = d.date, null, d.date) as not_today
,lag(not_today) ignore nulls over (partition by d.id order by d.date, d.row_num) as prior_date
from data as d
where d.condition = 'X'
order by d.row_num;
But this raises the error:
Window function [LEAD(D.DATE) OVER (PARTITION BY D.ID ORDER BY D.DATE ASC NULLS LAST, D.ROW_NUM ASC NULLS LAST)]
may not be nested inside another window function.
So we layer the selects, computing the LEAD in an inner select and the LAG in the outer one:
with data(row_num, id, date, condition) as (
select * from values
(1, 1, '2024-09-25', 'X'),
(2, 2, '2024-09-25', 'X'),
(3, 2, '2024-09-26', 'X'),
(4, 3, '2024-09-25', 'X'),
(5, 3, '2024-09-26', 'X'),
(6, 3, '2024-09-26', 'X') -- these both want to use row 4
)
select *
,lag(not_today) ignore nulls over (partition by id order by date, row_num) as prior_date
,datediff('days', prior_date, date) as days_gap
from (
select d.*
,lead(d.date) over (partition by d.id order by d.date, d.row_num) as next_date
,iff(next_date = d.date, null, d.date) as not_today
from data as d
where d.condition = 'X'
)
order by row_num;
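If you want to sanity-check the logic outside Snowflake, here is a minimal Python sketch, not part of the query itself. It assumes the same six sample rows as above and compares the brute-force rule from the question's subquery (latest 'X' date strictly before the current date) with the two-pass LEAD then LAG IGNORE NULLS idea. The function names `prior_dates_subquery` and `prior_dates_layered` are made up for illustration.

```python
from datetime import date

# Sample rows mirroring the VALUES list above: (row_num, id, date, condition).
ROWS = [
    (1, 1, date(2024, 9, 25), "X"),
    (2, 2, date(2024, 9, 25), "X"),
    (3, 2, date(2024, 9, 26), "X"),
    (4, 3, date(2024, 9, 25), "X"),
    (5, 3, date(2024, 9, 26), "X"),
    (6, 3, date(2024, 9, 26), "X"),
]


def prior_dates_subquery(rows):
    """Brute force: latest 'X' date for the same id strictly before the row's date."""
    result = {}
    for row_num, id_, day, cond in rows:
        if cond != "X":
            continue
        earlier = [d for _, i, d, c in rows if c == "X" and i == id_ and d < day]
        result[row_num] = max(earlier) if earlier else None
    return result


def prior_dates_layered(rows):
    """Two-step idea: mark the last row of each day, then look back past the gaps."""
    # Same filter and ordering as the SQL: condition = 'X', order by id, date, row_num.
    xs = sorted((r for r in rows if r[3] == "X"), key=lambda r: (r[1], r[2], r[0]))
    result = {}
    last_not_today = {}  # id -> most recent non-null NOT_TODAY seen so far
    for pos, (row_num, id_, day, _) in enumerate(xs):
        # LAG ... IGNORE NULLS: take the latest non-null value from earlier rows.
        result[row_num] = last_not_today.get(id_)
        # LEAD: NOT_TODAY is set only when the next row is not the same id and day.
        nxt = xs[pos + 1] if pos + 1 < len(xs) else None
        same_day_follows = nxt is not None and nxt[1] == id_ and nxt[2] == day
        if not same_day_follows:
            last_not_today[id_] = day
    return result


if __name__ == "__main__":
    for row_num, prior in sorted(prior_dates_layered(ROWS).items()):
        print(row_num, prior)
```

On this sample both functions agree, and rows 5 and 6 both pick up the date of row 4. It only covers rows where condition is 'X', the same as the `where` filter in the query.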