I am wanting to determine the number of distinct user_ids
that have the same 3 animals as any user (where order doesn’t matter).
For example below, user 11 and 13 both have dog, cat, bird so they would both be counted.
user_id | Col1 | Col2 | Col3 |
---|---|---|---|
11 | dog | cat | bird |
12 | cow | dog | bird |
13 | cat | bird | dog |
The desired output here would be
distinct_users | distinct_users_with_a_match |
---|---|
3 | 2 |
Consider below approach (BigQuery)
select
count(distinct user_id) as distinct_users,
count(distinct (select string_agg(col order by col) from unnest([Col1, Col2, Col3]) col)) as distinct_users_with_a_match
from your_table
if applied to sample data in your question – output is
1