I always need to do something like this to get the i.e. First customer by country from a customers table. Always wondered if there was a shorter way.
SELECT
country,
MAX(first_customer) AS first_customer
FROM (
SELECT
country
,FIRST_VALUE(customer) OVER(PARTITION BY country ORDER BY inserted_date) AS first_customer
FROM customers
) x
GROUP BY 1;
I’ve been playing around with the recently added ANY_VALUE function and it seems that it can give the same result?
SELECT
country
,ANY_VALUE(customer ORDER BY inserted_date) AS first_customer
FROM customers
GROUP BY 1;
I’m not well educated on how ORDER BY within agg functions works, so I’m not sure if the output is reliable, some tests I did were positive though.
This would be huge as I could combine this with a i.e. FILTER(WHERE age > 30)
without that much of code.
I’m not considering any null/not-null values as ANY_VALUE ignores nulls. (Thanks for the clarification @Adrian Klaver)
Jairo Pérez is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
I’ve run multiple experiments using ANY_VALUE(... ORDER BY ...)
and it seems to consistently return the first value based on the ORDER BY
expression; however, I would be hesitant to rely on this behavior, especially since it runs counter to the function’s description. In the PostgreSQL documentation, section 9.21. Aggregate Functions, the description of ANY_VALUE
is:
any_value ( anyelement ) → same as input type Returns an arbitrary value from the non-null input values.
The concept of order is logically inconsistent with the concept of selecting an arbitrary value. I suspect that the observed behavior is an incidental result of using common code to handle aggregate functions and not a deliberate feature. The lack of regression tests in the PostgreSQL code base for ANY_VALUE
using ORDER BY
suggests that the behavior is indeed unintentional and should not be relied upon (the code base includes such regression tests for other aggregate functions; e.g., ARRAY_AGG
).