Is there a more elegant way to perform calculations of total number of orders per person?
My dataset contains a list of people , date when order was placed, order method, product description. I don’t have an unique order id column (which would make my life easier). I have 1 order per row.
In my sql code i assign a row number to each order in 1rst nested query, then I count max number of orders per person in the 2nd nested query.
At the end, I perform calculations to see how many people placed 1 order only, 2 orders only, etc.
select distinct
max_nbr_of_devices_ordered
, count(distinct prsn_id) as numer
from ( select distinct
prsn_id
, max(nbr_of_devices_ordered) as max_nbr_of_devices_ordered
from ( select distinct
prsn_id
, Row_Number() Over (PARTITION BY prsn_id ORDER BY order_date, product_desc ASC) AS nbr_of_devices_ordered
from hsd_orders
) as a
group by 1
) as b
group by 1
2
this answer provided by Andrew worked:
select
order_count
, count(distinct prsn_id) as numer
from (select distinct prsn_id, count (*) as order_count from hsd_orders group by prsn_id) T
group by 1
2
As noted in comments the distincts can be dropped as they are not doing any work here:
select
order_count
,count(prsn_id) as numer
from (select prsn_id, count (*) as order_count from hsd_orders group by 1)
group by 1
works thus for some fake data:
with hsd_orders(ord_id, prsn_id, date) as (
select * from values
(1, 1, '2024-09-25'),
(2, 2, '2024-09-25'),
(3, 2, '2024-09-26'),
(4, 3, '2024-09-25'),
(5, 3, '2024-09-26'),
(6, 3, '2024-09-26'),
(7, 4, '2024-09-26'),
(8, 4, '2024-09-25')
)
select
order_count
,count(prsn_id) as numer
from (select prsn_id, count (*) as order_count from hsd_orders group by 1)
group by 1
we get:
ORDER_COUNT | NUMER |
---|---|
1 | 1 |
2 | 2 |
3 | 1 |
But it can be more simpler, because we are not using the person id’s we do not need to carry them, just to throw them away.
select
order_count
,count(1) as numer
from (
select count (*) as order_count
from hsd_orders
group by prsn_id
)
group by 1
If you really want to avoid a subquery
select distinct
count(*) as num_orders,
count(prsn_id) over (partition by num_orders) as num_people
from hsd_orders
group by prsn_id
order by num_orders;