I need some insight into how to do this in Spark.
My DataFrame looks like this:
ID DATE State
X 20-01-2023 N
X 21-01-2023 S
X 22-01-2023 S
X 23-01-2023 N
X 24-01-2023 E
X 25-01-2023 E
Y 20-01-2023 S
Y 23-01-2023 S
The state is one of: N (neutral), S (start), FS (false start), E (end), or FE (false end).
What I need is, for each ID (X, Y, …), to order the rows by date and change each state based on the state in the previous row: a start becomes a false start if it was preceded by a start, and an end becomes a false end if it was preceded by an end. Neutral doesn't change anything.
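To make the rule concrete, here is a minimal plain-Python sketch (not Spark) for the states of a single ID once they are sorted by date. The function name `relabel` is made up for illustration. It assumes each state is compared with the original state of the row directly before it, so in S, N, S the second S stays S.

```python
def relabel(states):
    """Relabel one ID's states, which must already be sorted by date."""
    result = []
    previous = None  # original state of the preceding row; None for the first row
    for state in states:
        if state == "S" and previous == "S":
            result.append("FS")  # a start directly after a start is a false start
        elif state == "E" and previous == "E":
            result.append("FE")  # an end directly after an end is a false end
        else:
            result.append(state)  # N, a first S, or a first E stays as it is
        previous = state  # assumption: compare against the original state
    return result


print(relabel(["N", "S", "S", "N", "E", "E"]))  # ['N', 'S', 'FS', 'N', 'E', 'FE']
print(relabel(["S", "S"]))                      # ['S', 'FS']
```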
The output should be something like this:
ID DATE State
X 20-01-2023 N
X 21-01-2023 S
X 22-01-2023 FS
X 23-01-2023 N
X 24-01-2023 E
X 25-01-2023 FE
Y 20-01-2023 S
Y 23-01-2023 FS
Any help is appreciated!