I’m trying to get a single line item per id with the largest count, like below:
Primary | Name | Count |
---|---|---|
1a | AXA | 11 |
1a | FLO | 1 |
1a | FLO | 60 |
1a | AXA | 14 |
2e | AXA | 1 |
2e | ROT | 1 |
2e | ROT | 6 |
2e | ROT | 4 |
tb: ([] Primary:`1a`1a`1a`1a`2e`2e`2e`2e; Name:`AXA`FLO`FLO`AXA`AXA`ROT`ROT`ROT; Count: 11 1 60 14 1 1 6 4)
My first step would be to order the table, descending by count but only within Primary, like this:
Primary | Name | Count |
---|---|---|
1a | FLO | 60 |
1a | AXA | 14 |
1a | AXA | 11 |
1a | FLO | 1 |
2e | ROT | 6 |
2e | ROT | 4 |
2e | AXA | 1 |
2e | ROT | 1 |
I’m not sure how to do this, I was thinking ““Count xdesc by Primary from tb“`, but this gives me an error:
q))a: `Count xdesc by Primary from tb;
'from
[4] a: `Count xdesc by Primary from tb;
^
My ultimate goal is to get one line per Primary id with the highest number of Count:
Primary | Name | Count |
---|---|---|
1a | FLO | 60 |
2e | ROT | 6 |
I’ve spent some time on this and I’m not sure where to go from the xdesc step, does anyone know how I can do it better?
Filter-by fby
is what you want for this:
https://code.kx.com/q/ref/fby/
q)select from tb where Count=(max;Count) fby Primary
Primary Name Count
------------------
1a FLO 60
2e ROT 6
But if the max happens more than once you would get multiple rows:
q)tb,:tb //Duplicate the table in itself
q)select from tb where Count=(max;Count) fby Primary
Primary Name Count
------------------
1a FLO 60
2e ROT 6
1a FLO 60
2e ROT 6
Nested fby
can further resolve this: (i
being the virtual index column)
q)select from
(select from tb where Count=(max;Count) fby Primary)
where i=(first;i) fby Primary
Primary Name Count
------------------
1a FLO 60
2e ROT 6
To neaten it up fby
can be fed sub-tables for each Primary:
q)select from tb
where i=({exec first ind from x where Count=max Count};
([]Count;ind:i)) fby Primary
Primary Name Count
------------------
1a FLO 60
2e ROT 6
A different method without fby
:
q)tb value exec first i where Count=max Count by Primary from tb
Primary Name Count
------------------
1a FLO 60
2e ROT 6
Another trick possible is that select by X from Y
queries with nothing between select
and by
will return only the last row per X
. So you can sort the table and use this:
q)select by Primary from `Count xasc tb
Primary| Name Count
-------| ----------
1a | FLO 60
2e | ROT 6
However note this can only get you the last
max row whereas in the above queries you can easily choose between first
and last
. Also better to avoid sorting the whole table if you don’t need to so this query is not suggested as the preferred method.