entityId | parentId | percentContribution | emission | emission_locationbased | emission_marketbased | cumsum_locationbased | cumsum_marketbased |
---|---|---|---|---|---|---|---|
E1 | 60 | 30 | 60 | 70 | |||
E2 | E1 | 80 | 20 | 30 | 40 | ||
E3 | E2 | 80 | 10 | 20 | 10 | 20 |
This is a sample of a tree structure in a single table that I’m working on. The table is a hierarchy of entities, each generating a certain amount of emissions. There are three fields relating to the amount of emissions generated by each entity: emission, emission_locationbased, and emission_marketbased. If an entity has a non-null emission value, then its emission_locationbased and emissions_marketbased values are null. If an entity has a null emission value, then it has both an emission_locationbased and emissions_marketbased value.
The issue is taking percentContribution into account. It indicate how many percentage of that entity’s cumulative emission (both _locationbased and _marketbased) that node contribute to its parent’s cumulative sum.
For example:
entityId | parentId | percentContribution | emission | emission_locationbased | cumsum_locationbased_percent |
---|---|---|---|---|---|
E1 | 60 | 30 | (28*80/100) + 30 = 52.4 | ||
E2 | E1 | 80 | 20 | (10 * 80/100) + 20 = 28 | |
E2 | E2 | 80 | 10 | 10 |
This is how I calculated cumsum_locationbased and cumsum_marketbased:
WITH base_data AS (
SELECT
entityId,
parentId,
emission,
emission_locationbased ,
emissions_marketbased ,
array_agg(entityId) OVER (ORDER BY parentId) AS arr
FROM entity
),
cumulative_sums AS (
SELECT
entityId,
parentId,
emission,
emission_locationbased ,
emissions_marketbased ,
SUM(COALESCE(emission, emission_locationbased , 0))
OVER (ORDER BY arr) AS cumsum_locationbased,
SUM(COALESCE(emission, emissions_marketbased , 0))
OVER (ORDER BY arr) AS cumsum_marketbased
FROM base_data
GROUP BY entityId, parentId, emission, emission_locationbased, emissions_marketbased, arr
)
SELECT
entityId,
parentId,
emission,
CASE WHEN emission IS NOT NULL THEN NULL ELSE emission_locationbased END AS emission_locationbased ,
CASE WHEN emission IS NOT NULL THEN NULL ELSE emissions_marketbased END AS emissions_marketbased,
cumsum_locationbased,
cumsum_marketbased
FROM cumulative_sums
I has been a while since I last touch SQL so I can’t figure out what to do.
Ngo Chi Binh is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.