Hello I have a specific Stock movement scenario here, where I have 2 DataFrames. The first with the product enter process (df_enter) and another one with the cancel of a product in stock.
I need to find the respective enter movement for each cancel based on columns MOVEMENT, PRODUCT and DATE_TIME. Sometimes there are multiple rows for same MOVEMENT (see Movement 1 and 4). In this case, I need to find the previous Enter Movement based on Cancel Date_time.
I was able to do that searching inside the df_enter for each df_cancel row, but it’s terrible slow.
Any idea? I’m getting crazy
df_enter (The product 2 is just to show that I could have different PRODUCTS and it’s part of the key to search)
MOVEMENT | PRODUCT | MOVEMENT_TYPE | DATE_TIME | QUANTITY |
---|---|---|---|---|
1 | 1 | ENTER | 2024-05-01 11:00 | 20 |
1 | 1 | ENTER | 2024-05-01 12:00 | 10 |
1 | 2 | ENTER | 2024-05-01 13:00 | 50 |
3 | 1 | ENTER | 2024-05-01 13:00 | 30 |
4 | 1 | ENTER | 2024-05-01 13:10 | 10 |
4 | 1 | ENTER | 2024-05-01 13:50 | 30 |
df_cancel
MOVEMENT | PRODUCT | MOVEMENT_TYPE | DATE_TIME | QUANTITY |
---|---|---|---|---|
1 | 1 | CANCEL | 2024-05-01 11:40 | 5 |
1 | 1 | CANCEL | 2024-05-01 12:10 | 10 |
3 | 1 | CANCEL | 2024-05-01 13:40 | 30 |
4 | 1 | CANCEL | 2024-05-01 13:20 | 40 |
expected result:
MOVEMENT | PRODUCT | MOVEMENT_TYPE | DATE_TIME | QUANTITY | QTY_CANCELLED |
---|---|---|---|---|---|
1 | 1 | ENTER | 2024-05-01 11:00 | 20 | 5 |
1 | 1 | ENTER | 2024-05-01 12:00 | 10 | 10 |
1 | 2 | ENTER | 2024-05-01 13:00 | 50 | |
3 | 1 | ENTER | 2024-05-01 13:00 | 30 | 30 |
4 | 1 | ENTER | 2024-05-01 13:10 | 10 | 10 |
4 | 1 | ENTER | 2024-05-01 13:50 | 30 | 10 |