I have a table in which I am only interested in two columns. Based on them, each row gets one of two actions (“Ignore” or “Good”), and I need to keep only the “Good” rows. I have tried several SQL window functions but have not been able to achieve what I am looking for.
Here is the data:
Each row gets one of the two actions. A “cat” or “dog” row is “Good” only when a row with the value “animal” comes before it on the same date; otherwise the action is “Ignore”. Similar to this:
Final result should be like this:
Data with all edge cases
Looks like you can just use a conditional windowed COUNT.
SELECT
  CASE WHEN (COUNT(*) FILTER (WHERE value = 'animal') OVER
               (PARTITION BY date_trunc('day', Date) ORDER BY Date ROWS UNBOUNDED PRECEDING) > 0
             AND value <> 'animal')  -- non-animal row with an 'animal' earlier that day
         OR (value = 'animal' AND LEAD(value) OVER
               (PARTITION BY date_trunc('day', Date) ORDER BY Date) <> 'animal')  -- 'animal' followed by a non-animal row
       THEN 'Good'
       ELSE 'Ignore'
  END AS action
FROM ...
Then simply put that into a subquery and filter on it.
SELECT
  *
FROM (
  SELECT
    t.*,  -- keep the original columns alongside the computed action
    CASE WHEN (COUNT(*) FILTER (WHERE value = 'animal') OVER
                 (PARTITION BY date_trunc('day', Date) ORDER BY Date ROWS UNBOUNDED PRECEDING) > 0
               AND value <> 'animal')
           OR (value = 'animal' AND LEAD(value) OVER
                 (PARTITION BY date_trunc('day', Date) ORDER BY Date) <> 'animal')
         THEN 'Good'
         ELSE 'Ignore'
    END AS action
  FROM yourTable t
) t
WHERE action = 'Good';  -- string comparison is case-sensitive
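To see what the windowed COUNT produces before any CASE logic is applied, here is a minimal sketch on a few made-up rows. The CTE name demo, the column names ts and value, and the timestamps are assumptions for illustration only, not the asker's data.

```sql
-- Hypothetical sample rows (not the asker's data).
WITH demo (ts, value) AS (
    VALUES
        (TIMESTAMP '2024-01-03 08:00', 'dog'),
        (TIMESTAMP '2024-01-03 09:00', 'animal'),
        (TIMESTAMP '2024-01-03 10:00', 'cat'),
        (TIMESTAMP '2024-01-03 11:00', 'dog'),
        (TIMESTAMP '2024-01-04 08:00', 'cat'),
        (TIMESTAMP '2024-01-05 08:00', 'animal')
)
SELECT
    ts,
    value,
    -- Running number of 'animal' rows seen so far on the same day,
    -- counting the current row as well.
    COUNT(*) FILTER (WHERE value = 'animal') OVER (
        PARTITION BY date_trunc('day', ts)
        ORDER BY ts
        ROWS UNBOUNDED PRECEDING
    ) AS animals_so_far
FROM demo
ORDER BY ts;
```

On these rows animals_so_far is 0, 1, 1, 1 for 2024-01-03, then 0 for 2024-01-04 and 1 for 2024-01-05. For a non-animal row, comparing this running count against zero tells you whether an 'animal' row has already appeared earlier that day.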
One option is to self-join the table, as in the subquery below, and use the LAG() OVER() analytic function inside a CASE expression to find the “Good” rows.
The inner self-join eliminates the rows with val = 'animal' that have no cat, dog, or other non-animal row on the same date. The CASE expression checks whether the previous row with the same date has val = 'animal'. The outer query is filtered with a WHERE clause and grouped to get rid of duplicates.
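As a small illustration of the LAG() part on its own, the sketch below shows what "previous row on the same date" returns. The CTE name demo, the timestamp column a_ts, and the rows themselves are hypothetical; timestamps are used here only so that the row order within a day is well defined.

```sql
-- Hypothetical rows; a_ts is an assumed timestamp column.
WITH demo (a_ts, val) AS (
    VALUES
        (TIMESTAMP '2024-01-06 08:00', 'animal'),
        (TIMESTAMP '2024-01-06 09:00', 'cat'),
        (TIMESTAMP '2024-01-06 10:00', 'cat'),
        (TIMESTAMP '2024-01-07 08:00', 'dog')
)
SELECT
    a_ts,
    val,
    -- Value of the row just before this one on the same day;
    -- NULL for the first row of each day.
    LAG(val) OVER (
        PARTITION BY date_trunc('day', a_ts)
        ORDER BY a_ts
    ) AS prev_val
FROM demo
ORDER BY a_ts;
```

Here prev_val is NULL, 'animal', 'cat', NULL. LAG() looks back exactly one row, so only the first 'cat' sees 'animal' directly before it.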
-- Sample data:
Create Table tbl (a_date Date, Val Varchar(32));
Insert Into tbl
Select To_Date('2023-12-30', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2023-12-31', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2024-01-01', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-01', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-02', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-03', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-04', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-04', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-05', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-06', 'yyyy-mm-dd'), 'cat' Union All
Select To_Date('2024-01-07', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'dog' Union All
Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'animal' Union All
Select To_Date('2024-01-08', 'yyyy-mm-dd'), 'cat';
-- SQL:
SELECT a_date, val, status
FROM ( -- pair every row with the non-'animal' rows of the same day
       Select Date_Trunc('day', t0.a_date) as a_date, t0.val,
              Case When t0.val = 'animal'
                     OR LAG(t0.val) Over(Partition By Date_Trunc('day', t1.a_date), t1.val
                                         Order By Date_Trunc('day', t0.a_date)) = 'animal'
                   Then 'Good'
              End as status
       From tbl t0
       Inner Join tbl t1 ON(Date_Trunc('day', t1.a_date) = Date_Trunc('day', t0.a_date) And
                            t1.val != 'animal')
     ) s  -- a derived table needs an alias before PostgreSQL 16
WHERE status = 'Good'
GROUP BY a_date, val, status  -- status is selected, so it must be grouped as well
ORDER BY a_date, val;
/*
a_date                  val     status
----------------------  ------  ------
2024-01-03 00:00:00+00  animal  Good
2024-01-03 00:00:00+00  cat     Good
2024-01-03 00:00:00+00  dog     Good
2024-01-06 00:00:00+00  animal  Good
2024-01-06 00:00:00+00  cat     Good
2024-01-08 00:00:00+00  animal  Good
2024-01-08 00:00:00+00  cat     Good
2024-01-08 00:00:00+00  dog     Good
*/
See the fiddle here.
Or, for a version with timestamps, see the fiddle here.