The Code
import polars as pl
...
# Sort by date, then pick the first row for each UID (earliest date)
sample_frame = sample_frame.sort(by=DATE_COL).unique(subset=UID_COL, keep='first')
Question
I expected the frame resulting from the above operation to still be sorted by date, but that does not seem to be the case.
So does the deduplication step also change the order of the remaining rows? Does the Polars documentation, or do its maintainers, provide any guarantee about row ordering after calling unique?