I have a table where I store data of some parameters which are read step-wise. The number of parameters to be read can vary from use-case to use-case. The dataset of one step can be identified by its step number, the use-case by a uuid (dataset_uuid). The parameter_names “NAMExy” are not known at design time and need to be determined dynamically.
dataset_uuid | other_uuid | type | parameter_name | timestamp (varchar) | step | value |
---|---|---|---|---|---|---|
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:22.000 | 0 | 1147.0 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:22.000 | 0 | 17.569149 |
uuid1 | uuid_c | TYPE2 | NAME3 | 2024-09-05T15:04:22.000 | 0 | 14.86613 |
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:24.000 | 1 | 1147.0 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:24.000 | 1 | 17.569149 |
uuid1 | uuid_c | TYPE2 | NAME3 | 2024-09-05T15:04:24.000 | 1 | 14.896163 |
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:25.000 | 2 | 1148.0 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:25.000 | 2 | 17.569149 |
… | ||||||
uuid2 | uuid_k | TYPE1 | NAME3 | … | 0 | 1147.0 |
uuid2 | uuid_l | TYPE2 | NAME4 | … | 0 | 1147.0 |
uuid2 | uuid_m | TYPE2 | NAME5 | … | 0 | 1147.0 |
uuid2 | uuid_n | TYPE2 | NAME6 | … | 0 | 1147.0 |
uuid2 | uuid_o | TYPE2 | NAME7 | … | 0 | 1147.0 |
uuid2 | uuid_k | TYPE1 | NAME3 | … | 1 | 1147.0 |
uuid2 | uuid_l | TYPE2 | NAME4 | … | 1 | 1147.0 |
uuid2 | uuid_m | TYPE2 | NAME5 | … | 1 | 1147.0 |
uuid2 | uuid_n | TYPE2 | NAME6 | … | 1 | 1147.0 |
uuid2 | uuid_o | TYPE2 | NAME7 | … | 1 | 1147.0 |
I would like to have an SQL-statement or function which transforms the data into something like this – rows contain parameter values and time stamps:
dataset_uuid | step | NAME1 | NAME2 | NAME3 | NAME4 | NAME5 | NAME6 | NAME7 | timestamp |
---|---|---|---|---|---|---|---|---|---|
uuid1 | 0 | 1147.0 | 17.569149 | 14.86613 | NULL | NULL | NULL | NULL | 2024-09-05T15:04:22.000 |
uuid1 | 1 | 1147.0 | 17.569149 | 14.896163 | NULL | NULL | NULL | NULL | 2024-09-05T15:04:24.000 |
uuid1 | 2 | 1147.0 | 17.569149 | … | NULL | NULL | NULL | NULL | 2024-09-05T15:04:25.000 |
… | |||||||||
uuid2 | 0 | NULL | NULL | NULL | val | val | val | val | timestamp |
uuid2 | 1 | NULL | NULL | NULL | val | val | val | val | timestamp |
uuid2 | 2 | NULL | NULL | NULL | val | val | val | val | timestamp |
… |
How can I achieve that in Postgresql ond Oracle?
SELECT
dataset_uuid,
step,
MAX(CASE WHEN parameter_name = 'NAME1' THEN value END) AS NAME1,
MAX(CASE WHEN parameter_name = 'NAME2' THEN value END) AS NAME2,
MAX(CASE WHEN parameter_name = 'NAME3' THEN value END) AS NAME3,
MAX(CASE WHEN parameter_name = 'NAME4' THEN value END) AS NAME4,
MAX(CASE WHEN parameter_name = 'NAME5' THEN value END) AS NAME5,
MAX(CASE WHEN parameter_name = 'NAME6' THEN value END) AS NAME6,
MAX(CASE WHEN parameter_name = 'NAME7' THEN value END) AS NAME7,
MAX(timestamp) AS timestamp
FROM table1
GROUP BY dataset_uuid, step
ORDER BY dataset_uuid, step;
https://dbfiddle.uk/W0qHVgFZ
1
See example with CROSSTAB for PostgreSql.
crosstab(source_sql, category_sql)
CROSSTAB source query will return rows with 3 column
row_name
– row idetifier ,category
– column name for transposed table, value
– value in result table.
Create identifier rn
for tuple(dataset_uuid,step,timestamp) as rank()over(dataset_uuid,step). Timestamp ignored:)
We use this identifier in CROSSTAB instead tuple.
with testN as(
select *,rank()over(order by dataset_uuid,step) rn
from test
)
select rn,parameter_name,value
from testN
rn | parameter_name | value |
---|---|---|
1 | NAME1 | 1147 |
1 | NAME2 | 17.569149 |
1 | NAME3 | 14.86613 |
4 | NAME1 | 1147 |
4 | NAME2 | 17.569149 |
4 | NAME3 | 14.896163 |
7 | NAME1 | 1148 |
7 | NAME2 | 17.569149 |
7 | NAME4 | 18.1 |
10 | NAME1 | 27.569149 |
10 | NAME3 | 24.86613 |
12 | NAME8 | 2147 |
Main query with crosstab.
category_sql as list of values
$$values(‘NAME1’),(‘NAME2’),(‘NAME3’),(‘NAME4’),(‘NAME5’),(‘NAME6’),(‘NAME7’),(‘NAME8’)$$
SELECT a.dataset_uuid,a.step,a.timestamp,ct.* FROM
crosstab(
$$with testN as(
select *,rank()over(order by dataset_uuid,step) rn
from test
)
select rn,parameter_name,value
from testN
$$
,$$values('NAME1'),('NAME2'),('NAME3'),('NAME4')
,('NAME5'),('NAME6'),('NAME7'),('NAME8')$$
) AS ct(rn bigint, "NAME1" float, "NAME2" float, "NAME3" float, "NAME4" float
, "NAME5" float, "NAME6" float, "NAME7" float
, "NAME8" float)
left join (
select distinct dataset_uuid,step,timestamp,rank()over(order by dataset_uuid,step) rn
from test
)a on ct.rn=a.rn
ORDER BY rn;
dataset_uuid | step | timestamp | rn | NAME1 | NAME2 | NAME3 | NAME4 | NAME5 | NAME6 | NAME7 | NAME8 |
---|---|---|---|---|---|---|---|---|---|---|---|
uuid1 | 0 | 2024-09-05T15:04:22.000 | 1 | 1147 | 17.569149 | 14.86613 | null | null | null | null | null |
uuid1 | 1 | 2024-09-05T15:04:24.000 | 4 | 1147 | 17.569149 | 14.896163 | null | null | null | null | null |
uuid1 | 2 | 2024-09-05T15:04:25.000 | 7 | 1148 | 17.569149 | null | 18.1 | null | null | null | null |
uuid2 | 0 | 2024-09-05T15:04:22.000 | 10 | 27.569149 | null | 24.86613 | null | null | null | null | null |
uuid2 | 1 | 2024-09-05T15:04:24.000 | 12 | null | null | null | null | null | null | null | 2147 |
Test data
dataset_uuid | other_uuid | type | parameter_name | timestamp | step | value |
---|---|---|---|---|---|---|
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:22.000 | 0 | 1147 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:22.000 | 0 | 17.569149 |
uuid1 | uuid_c | TYPE2 | NAME3 | 2024-09-05T15:04:22.000 | 0 | 14.86613 |
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:24.000 | 1 | 1147 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:24.000 | 1 | 17.569149 |
uuid1 | uuid_c | TYPE2 | NAME3 | 2024-09-05T15:04:24.000 | 1 | 14.896163 |
uuid1 | uuid_a | TYPE1 | NAME1 | 2024-09-05T15:04:25.000 | 2 | 1148 |
uuid1 | uuid_b | TYPE2 | NAME2 | 2024-09-05T15:04:25.000 | 2 | 17.569149 |
uuid1 | uuid_b | TYPE2 | NAME4 | 2024-09-05T15:04:25.000 | 2 | 18.1 |
uuid2 | uuid_b | TYPE2 | NAME1 | 2024-09-05T15:04:22.000 | 0 | 27.569149 |
uuid2 | uuid_c | TYPE2 | NAME3 | 2024-09-05T15:04:22.000 | 0 | 24.86613 |
uuid2 | uuid_a | TYPE1 | NAME8 | 2024-09-05T15:04:24.000 | 1 | 2147 |
with testN as(
select *,rank()over(order by dataset_uuid,step) rn
from test
)
select dataset_uuid,step,rn,parameter_name,value
from testN
dataset_uuid | step | rn | parameter_name | value |
---|---|---|---|---|
uuid1 | 0 | 1 | NAME1 | 1147 |
uuid1 | 0 | 1 | NAME2 | 17.569149 |
uuid1 | 0 | 1 | NAME3 | 14.86613 |
uuid1 | 1 | 4 | NAME1 | 1147 |
uuid1 | 1 | 4 | NAME2 | 17.569149 |
uuid1 | 1 | 4 | NAME3 | 14.896163 |
uuid1 | 2 | 7 | NAME1 | 1148 |
uuid1 | 2 | 7 | NAME2 | 17.569149 |
uuid1 | 2 | 7 | NAME4 | 18.1 |
uuid2 | 0 | 10 | NAME1 | 27.569149 |
uuid2 | 0 | 10 | NAME3 | 24.86613 |
uuid2 | 1 | 12 | NAME8 | 2147 |
If necessary, add extension tablefunc
CREATE EXTENSION IF NOT EXISTS tablefunc;
fiddle
1