My team is trying to create a Data Lineage SQL query using the metadata tables from the ODI-12c work repository. The idea is to input a specific column into our query and return all the stages/transformations that column has undergone until the final output, or vice versa. At the moment, we have managed to build a basic query that considers every mapping within an ODI package, but this query fails when a ‘ReusableMapping’ component is present in our mappings. We have not been able to find a workaround for extracting the transformations performed in that reusable mapping.
The sql query we have is:
-- Define the user's input for the initial output column
DECLARE @UserOutputColumn NVARCHAR(255) = '<datastore.output_column>';
DECLARE @UserPackage NVARCHAR(255) = '<your_ODI_package>';
-- Create a CTE to precompute the necessary columns
WITH PrecomputedData AS (
SELECT
map.I_MAPPING,
STEP.NNO as MAPPING_ORDER,
MAP.[NAME] as Mapping,
CASE
WHEN ME.TXT IS NOT NULL THEN ME.TXT
ELSE CONCAT(MR.QUALIFIED_NAME, '.', MA.NAME)
END as origin,
mc.[NAME] as Component,
mc.[TYPE_NAME] as Comp_Type,
MA.[NAME] as OUTPUT_ATTRIBUTE,
CASE
WHEN mc.TYPE_NAME = 'DATASTORE' AND ME.TXT IS NOT NULL AND MR.QUALIFIED_NAME IS NOT NULL THEN CONCAT(MR.QUALIFIED_NAME, '.', MA.NAME)
ELSE CONCAT(mc.[NAME], '.', MA.[NAME])
END as table_output
FROM
[db_odiwork].[dbo].SNP_MAP_COMP mc
JOIN
[db_odiwork].[dbo].SNP_MAPPING MAP ON mc.I_OWNER_MAPPING = MAP.I_MAPPING
JOIN
[db_odiwork].[dbo].SNP_MAP_CP CP ON CP.I_OWNER_MAP_COMP = mc.I_MAP_COMP
JOIN
[db_odiwork].[dbo].SNP_MAP_ATTR MA ON CP.I_MAP_CP = MA.I_OWNER_MAP_CP -- only 'Output' direction
LEFT JOIN
[db_odiwork].[dbo].SNP_MAP_EXPR ME ON MA.I_MAP_ATTR = ME.I_OWNER_MAP_ATTR
LEFT JOIN
[db_odiwork].dbo.SNP_MAP_REF MR ON mc.I_MAP_REF = MR.I_MAP_REF
JOIN
[db_odiwork].[dbo].[SNP_FOLDER] folder ON map.I_FOLDER = folder.I_FOLDER
JOIN
[db_odiwork].[dbo].SNP_STEP STEP ON STEP.I_MAPPING = MAP.I_MAPPING
JOIN
[db_odiwork].[dbo].SNP_PACKAGE package ON STEP.I_PACKAGE = package.I_PACKAGE
WHERE
1=1
-- AND I.NAME = '<COMP_NAME>'
-- AND MAP.NAME LIKE '%142P031%'
AND package.PACK_NAME LIKE @UserPackage
)
-- Recursive CTE to trace lineage
, LineageCTE AS (
-- Anchor member: select the initial row based on the user-specified final column
SELECT
I_MAPPING,
MAPPING_ORDER,
Mapping,
origin,
Component,
Comp_Type,
OUTPUT_ATTRIBUTE,
table_output,
0 as LEVEL -- initial level
FROM
PrecomputedData
WHERE
table_output = @UserOutputColumn
UNION ALL
-- Recursive member: select the preceding rows based on the current origin
SELECT
PC.I_MAPPING,
pc.MAPPING_ORDER,
pc.Mapping,
pc.origin,
pc.Component,
pc.Comp_Type,
pc.OUTPUT_ATTRIBUTE,
pc.table_output,
cte.LEVEL + 1 -- increment level
FROM
PrecomputedData pc
INNER JOIN
LineageCTE cte ON cte.origin = pc.table_output AND cte.MAPPING_ORDER >= pc.MAPPING_ORDER
)
SELECT DISTINCT
Mapping,
origin,
CONCAT(Component, '(', Comp_Type, ')') AS COMPONENT,
table_output,
MAPPING_ORDER as MAPPING_STEP,
LEVEL
FROM LineageCTE lineage
JOIN [db_odiwork].[dbo].SNP_STEP step ON step.I_MAPPING = lineage.I_MAPPING
ORDER BY MAPPING_STEP, LEVEL DESC, Mapping, Component;
This query works well for multiple mappings inside a package, considering the order of each mapping and of each mapping’s component, and returning the complete lineage for a given output column (datastore.column) when:
- The expression of each transformation is like <table.column>.
- We consider simple transformations that manipulate columns, like AGGREGATE & DATASTORE.
So right now the problem is when we have a REUSABLEMAPPING component. We would want to ‘enter’ to that mapping, and continue with our lineage. For example:
enter image description here
Taking the ‘SAL’ column as our target column, we would want to get a lineage like this:
Mapping | origin | Component | Comp_Type | OUTPUT_ATTRIBUTE | table_output |
---|---|---|---|---|---|
Mapping1 | DATASTORE.EXAMPLE.EMP.SAL | EMP | DATASTORE | SAL | EMP.SAL |
REUSABLEMAPPING | EMP.SAL | FIRST_TRANSFORMATION | XXXXXXXX | XXXXXX | XXXXXXXXXXXX |
… | … | … | … | … | … |
REUSABLEMAPPING | XXXXXXX | LAST_TRANSFORMATION | XXXXXXXX | COMM_SAL | DATASTORE.REUSABLEMAPPING.LAST_TRANSFORMATION.COMM_SAL |
Mapping1 | DATASTORE.REUSABLEMAPPING.LAST_TRANSFORMATION.COMM_SAL | EMPLOEE_1 | DATASTORE | SAL | DATASTORE. EXAMPLE.EMPLOEE_1.SAL |
Easier, we would like to achieve something with this logic:
EMP.SAL ---> {SAL in REUSABLEMAPPING} ---> EMPLOEE_1.SAL
Any guidance or advice you can give us will be very much appreciate. Thank you all.
NachoPerez is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.