[Sample data] (https://i.sstatic.net/6HQpFlLB.png)
I’ve a sample data where the data is ordered by Date of operation, vehicle number and timestamp of event column. Next_ts is a column derived by windows function (lead) to get the subsequent row timestamp. Difference (mins) column shows the time difference between the 2 time stamp columns.The ask here is to identify the start of day (SOD) and end of day for (EOD) any vehicle.
Criteria for SOD is the time difference between an Idle event and the next “non Idle event following an Idle event” is > 550 mins. In the sample data screenshot, row 18 is the first occurence of Idle event. Summing column F from F18 through F21 yeilds 737 which meets the > 550mins criteria. So D22 is the timestamp for start of day for following day since it is the first non idle event followed by the idle event and meets the 550min criteria.
In the above example, Criteria for EOD is the opposite of SOD logic. The timestamp associated with the “idle event which meets the 550min criteria from the non idle event”. To clarify further, cell D22 is the start of day for 4/10/2024. so the first idle event that meets 550min criteria before D22 is D19. so D19 is the EOD for 4/09/2024.
The actual data includes multiple vehicles and multiple dates. so this needs to be dynamically processed for every vehicle, date combination.
How do I achieve the above ask?