New to SQL and trying to translate a QlikSense calculation into Snowflake SQL
I have a dataset of calls in which I am trying to group calls whose dates fall within 3 days of each other into “chains”. The dataset is ordered by Brand, CustomerID_Brand, and call timestamp, all ascending.
I would like to work out the position of each call within its chain (see the ‘Chain_Length’ column in the image and table), and then create an additional call code field that takes the first call code in the chain and applies it to every call in that chain (see the ‘Call_Code_for_Chain_Calc’ column in the image and table).
Excel table showing the desired result: I need to calculate the two right-hand columns (Chain_Length and Call_Code_for_Chain_Calc). The other columns are already calculated.
CustomerID_Brand | Customer ID | Brand | Call_Code | Call_Date | Call_Timestamp | Date_diff_after | Date_Diff_before | Chain | Chain_Length | Call_Code_for_Chain_Calc |
---|---|---|---|---|---|---|---|---|---|---|
1A | 1 | A | C01 | 28/01/2022 | 28/01/2022 15:15 | 28 | First Call | No Chain | 1 | C01 |
1A | 1 | A | C02 | 25/02/2022 | 25/02/2022 09:00 | 0 | -28 | Chain | 1 | C02 |
1A | 1 | A | C03 | 25/02/2022 | 25/02/2022 10:30 | 28 | 0 | Chain | 2 | C02 |
1A | 1 | A | C04 | 25/03/2022 | 25/03/2022 15:00 | 0 | -28 | Chain | 1 | C04 |
1A | 1 | A | C05 | 25/03/2022 | 25/03/2022 15:00 | 3 | 0 | Chain | 2 | C04 |
1A | 1 | A | C06 | 28/03/2022 | 28/03/2022 09:00 | 2 | -3 | Chain | 3 | C04 |
1A | 1 | A | C07 | 30/03/2022 | 30/03/2022 15:00 | 0 | -2 | Chain | 4 | C04 |
I have easily been able to do these calculations in MS Excel and in QlikSense (using Qlik's Peek() function to reference the previously calculated value within the calculation itself). I am new to SQL, and it appears that the same thing is not possible with the LAG function: it seems you can't reference the calculated value from the row above within the calculation itself (I get an "invalid identifier" error). The Qlik expressions are:

    if("Chain" = 'No Chain', 1,
      if(fAbs([Date Difference Before]) > 3, 1,
        IF("Chain" = PREVIOUS("Chain"),
          rangesum(Peek("Chain Length"), 1),
          Peek("Chain Length")))) AS "Chain Length",

    IF("Chain" = 'No Chain', [Call Code],
      IF("Chain" = 'Chain' and "Chain" <> previous("Chain"), [Call Code],
        IF("Chain" = 'Chain' and "Chain" = previous("Chain") and fAbs([Date Difference Before]) <= 3,
          Peek([Call Code For Chain Calc]),
          [Call_Code]))) as [Call_Code_For_Chain_Calc]

Below is my attempt to translate the Call_Code_For_Chain_Calc expression into Snowflake SQL:

    iff(Chain = 'No Chain', Call_Code,
      iff(Chain = 'Chain' and Chain <> lag(Chain), Call_Code,
        iff(Chain = 'Chain' and Chain = lag(Chain) and abs(Date_Diff_before) <= 3,
          lag(Call_code_for_chain_calc), Call_code)))
      over (partition by Customer_ID || Brand
            order by Brand asc,
                     Customer_ID || Brand,
                     Call_Timestamp asc,
                     Call_code asc)
      as Call_Code_For_Chain_Calc,

Error: invalid identifier 'CALL_CODE_FOR_CHAIN_CALC'
I imagine the error results from referencing the calculated column within its own calculation. (I am attempting to take the calculated value from the row above; this works with Peek in Qlik and in Excel, but not in SQL.)
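
One pattern that avoids referencing a calculated column inside its own calculation is the "gaps and islands" approach: flag each row that starts a new chain, take a running sum of those flags to give every chain its own number, and then number the rows within each chain. The following is only a minimal sketch of that idea, not a tested solution. It assumes a hypothetical table named `calls` with the columns `Customer_ID`, `Brand`, `Call_Code`, `Call_Date` (a DATE) and `Call_Timestamp` as shown above, and it assumes a chain breaks when the gap to the previous call is more than 3 calendar days. The helper columns `prev_call_date`, `new_chain_flag` and `chain_id` are illustrative names.

```sql
-- Sketch only: `calls` is a hypothetical table name; column names follow the question.
with with_prev as (
    select
        c.*,
        -- date of the previous call for the same customer and brand (NULL on the first call)
        lag(Call_Date) over (partition by Customer_ID, Brand
                             order by Call_Timestamp, Call_Code) as prev_call_date
    from calls c
),
flagged as (
    select
        p.*,
        -- 1 when this call starts a new chain: first call, or more than 3 days after the previous one
        iff(prev_call_date is null
            or datediff('day', prev_call_date, Call_Date) > 3, 1, 0) as new_chain_flag
    from with_prev p
),
numbered as (
    select
        f.*,
        -- running total of the flags: every call in the same chain gets the same chain_id
        sum(new_chain_flag) over (partition by Customer_ID, Brand
                                  order by Call_Timestamp, Call_Code
                                  rows between unbounded preceding and current row) as chain_id
    from flagged f
)
select
    n.*,
    -- position of the call within its chain
    row_number() over (partition by Customer_ID, Brand, chain_id
                       order by Call_Timestamp, Call_Code) as Chain_Length,
    -- first call code of the chain, repeated on every call in that chain
    first_value(Call_Code) over (partition by Customer_ID, Brand, chain_id
                                 order by Call_Timestamp, Call_Code) as Call_Code_For_Chain_Calc
from numbered n
order by Brand, Customer_ID, Call_Timestamp, Call_Code;
```

Because each step only looks at columns produced by an earlier CTE, nothing refers to itself, which is the situation that raises the invalid identifier error in the attempt above. `Call_Code` is included in every `order by` as a tie-breaker, since C04 and C05 share the same timestamp. Traced by hand against the seven sample rows, this logic should give chain positions 1, 1, 2, 1, 2, 3, 4 and chain call codes C01, C02, C02, C04, C04, C04, C04.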