I am using SQL in databricks notebook.I have a table called change_table
. This table keeps track of all changes that occurs to an id, along with other columns. The result of this change may or may not result in change of the id itself.
If the id changes, the old id is shown in the column “prev_id”.
If the id doesn’t change the same id is shown in “prev_id”. Also, a row in this table is considered a change.
My goal is to keep track of the number of changes that has occurred to an id irrespective of the change to an id.
For example, If I query id = ‘B’ order by latest date, I should know that this id has had 1 change already.
If I query id = ‘2’ order by latest date, I should know that this id had 2 changes.
Below is the change_table and the new column called “number_of_changes” to be added to keep a track of the changes.
Is it possible to attain this in SQL?
If not possible by SQL, are there other options such as using pandas iterations to do so?
I had a requirement where the id was not changing and I was able to attain the result by using a windowing function on the id. But now the change in id requirement has made it complicated.
Thanks for your time!
The expected values in the new column “number_of_changes”