I have two DataFrames, graph and search, with the same schema apart from the column names.
Schema for graph:
SCHEMA = {
    "START_RANGE": pl.Int64,
    "END_RANGE": pl.Int64,
}
Schema for search:
SCHEMA = {
    "START": pl.Int64,
    "END": pl.Int64,
}
I want to search the graph DataFrame.
For every row in the search DataFrame, I want to find all the rows in the graph DataFrame where the START_RANGE and END_RANGE values are strictly within the range of the search DataFrame.
How can I achieve this using polars only?
Example:
graph = pl.DataFrame(
    {
        "START_RANGE": [1, 10, 20, 30],
        "END_RANGE": [5, 15, 25, 35],
    },
)
search = pl.DataFrame(
    {
        "START": [2, 6, 7],
        "END": [5, 9, 12],
    },
)
# Expected output
# [2,5] in range [1,5]
# [6,9] in range [10,15]
# [7,12] is not in any range
output = pl.DataFrame(
    {
        "START_RANGE": [1, 10],
        "END_RANGE": [5, 15],
    },
)