There is one scenario where I am struggling to implement a logic in Teradata, I have a table as shown below.
My key columns are COLA
and COLB
, now the logic which I need to implement is that within group of COLA
and COLB
, if my count(distinct COLC) > 1
, then I need to look into COLD
within that group (COLA
and COLB
), and then do a different grouping by including COLC
.
Now I need to find the group (now here when I say group we need to consider COLA
, COLB
and COLC
) which have value for COLD
with R004. In this scenario we can see that the group 12345,asdf,1
have COLD
with value R004, so basically I need to ignore that complete group. So here the records with COLA
value 12345, COLB
value asdf and COLC
value 1 will be eliminated in the output.
The key point here is this condition needs to be checked only when this count(distinct COLC) > 1
within the group of COL A
and COL B
, otherwise rest of the records needs to be in the output as it is.
I tried to use count(distinct) over by partition
, but it looks like Teradata doesn’t support this function. I’m wondering if there is any way I can achieve this with a single query functions?
COLA | COLB | COLC | COLD | CO | COLF | COLG | COLH |
---|---|---|---|---|---|---|---|
12345 | asdf | 1 | R004 | r | tr | ||
12345 | asdf | 1 | R009 | r | tr | ||
12345 | asdf | 1 | R0101 | r | tr | ||
12345 | asdf | 1 | R453 | r | tr | ||
12345 | asdf | 1 | R5678 | r | tr | ||
12345 | asdf | 4 | R009 | r | tr | ||
12345 | asdf | 4 | R0101 | r | tr | ||
12345 | asdf | 4 | R453 | r | tr | ||
12345 | asdf | 4 | R5678 | r | tr | ||
12345 | asdf | 4 | R5434 | r | tr | ||
32145 | frte | 1 | R004 | r | tr | ||
32145 | frte | 1 | R453 | r | tr | ||
32145 | frte | 1 | R5678 | r | tr | ||
32145 | frte | 1 | R5434 | r | tr | ||
45678 | erwq | 4 | R4879 | r | tr | ||
45678 | erwq | 4 | R5654 | r | tr | ||
45678 | erwq | 4 | R5323 | r | tr |
The expected output should be like this:
COLA | COLB | COLC | COLD | CO | COLF | COLG | COLH |
---|---|---|---|---|---|---|---|
12345 | asdf | 4 | R009 | r | tr | ||
12345 | asdf | 4 | R0101 | r | tr | ||
12345 | asdf | 4 | R453 | r | tr | ||
12345 | asdf | 4 | R5678 | r | tr | ||
12345 | asdf | 4 | R5434 | r | tr | ||
32145 | frte | 1 | R004 | r | tr | ||
32145 | frte | 1 | R453 | r | tr | ||
32145 | frte | 1 | R5678 | r | tr | ||
32145 | frte | 1 | R5434 | r | tr | ||
45678 | erwq | 4 | R4879 | r | tr | ||
45678 | erwq | 4 | R5654 | r | tr | ||
45678 | erwq | 4 | R5323 | r | tr |
1
I manage to write the query and its working as expected. You will be able to see couple of tables extra, but the table which I given is a join of those two tables in SUBQUERY
SELECT
COLA,
COLB,
COLC
FROM (
SELECT
COLA,
COLB,
COLC,
COLD,
max(COLC) over (partition by COLA, COLB) max_index,
min(COLC) over (partition by COLA, COLB) min_index,
SUM(CASE WHEN COLD = 'R004' THEN 1 ELSE 0 END) OVER (PARTITION BY COLA, COLB, COLC) AS R004Flag
FROM
(sel COLA, COLB,COLC,COLD from
TAB1 INNER JOIN TAB2
ON TAB1.ID = TAB2.ID
AND TAB1.GId = 'R60' )aa
) AS subquery
WHERE
case when (max_index <> min_index AND R004Flag =0) then 1
When (max_index = min_index AND (R004Flag =1 or R004Flag =0)) then 1
else 0 end=1
1