My time series table T has the columns location, sensor_id, timestamp and value. It holds thousands of sensor_ids, billions of values per year and about 100 locations, spread across 6 countries.
There are about 10 user groups for table T, and each user group is only allowed to see locations from particular countries. For example:
- user group 1 can only see locations in Spain and France
- user group 2 can only see locations in Spain and Germany
To ensure data compliance, we want to use row level security. One approach is to hard-code the locations for each user group:
.create-or-alter function My_RLS_function(TableName:string) {
table(TableName)
| where (current_principal_is_member_of('aadgroup=user_group_1') and location in ('Location1', 'Location3', ..., 'Location50'))
or ...
or ...
}
We think this is the most query-efficient approach, but typing in every location is messy and error-prone, and adding or removing a location would be cumbersome.
Therefore, our approach is to create a new table called location_and_countries. This is a table of 100 rows where the first column is country and the second column is location. Assuming user group 1 can see locations in Spain and France, the RLS function becomes:
.create-or-alter function My_RLS_function(TableName:string) {
table(TableName)
| where (current_principal_is_member_of('aadgroup=user_group_1') and location in (toscalar(location_and_countries | where country in ('Spain', 'France') | summarize make_list(location))))
or ...
or ...
}
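For context, here is a minimal sketch of the setup this assumes. The column names follow the description above (country, location); the two sample rows are made-up placeholders rather than our real mapping, and each command would be run on its own.

```kusto
// Lookup table: one row per location, roughly 100 rows in total.

.create table location_and_countries (country: string, location: string)

// Placeholder rows for illustration only (hypothetical country/location pairs).

.ingest inline into table location_and_countries <|
Spain,Location1
France,Location3

// Attach the RLS function to the time series table T.

.alter table T policy row_level_security enable "My_RLS_function('T')"
```

With this in place, adding or removing a location only means changing a row in location_and_countries; the RLS function itself stays untouched.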
What we are wondering about now is the following:
- Does anyone know a more query-efficient way to store and retrieve metadata for use in KQL queries?
- Is there a way to hide the table location_and_countries so that users cannot query it directly?