I am trying to filter a dataframe to find the first occurrence of the maximum value within each group of a category column. There is no guarantee that the maximum is unique within a group; several rows may share it, but I only need the first occurrence.
However, I can’t find a way to limit the max part of the filter to a single row. Currently I add a second filter on another column, generally a time-based one, and keep the row with the minimum value:
    (
        df
        .filter(pl.col(max_column) == pl.col(max_column).max().over(category_column))
        .filter(pl.col(min_column) == pl.col(min_column).min().over(category_column))
    )
However, I’d prefer to simplify the above to only require passing in references to the max and category columns.
Am I missing something obvious here?