TNM Staging
I am doing my first project. I have some data t_stage is from t0 to t4, n_stage n0 to n3 and m_stage is m0 and m1.
Different combos of these give different results. So t1,n2,m0 is Stage3a, t4,n1,m1 would be stage 4.
What I’m looking to do is for a query to have these specific patterns already in the syntax so when the query is run there’s an extra column specifying the Stage dependent upon the t, n and m numbers that are in the table for each record.
So as above if t is t1, n is n2 and m is 0 there would be an extra column titled Stage with stage 3a for that record and so on for the variations. (See image for combinations).
I just can’t seem to figure out how to create that extra ‘Stage’ column and populate with the answer when the query is run.
I’m probably missing something very obvious. As I say it’s my first project with mySQL. I’m used to MSAccess I keep thinking a lot of the automation will still be there.
I HAVE to keep the separate T,N and M scores for each record within the schema for potential interrogation later.
I’ve looked at DERIVED, UNION, WHERE and HAVING and I just don’t know if I’m on the right track.
CREATE TABLE pathology (
pathology_id VARCHAR(100) NOT NULL PRIMARY KEY,
medical_data_id VARCHAR(100),
grade INT
CONSTRAINT grade CHECK (grade IN (1, 2, 3, 4)),
tumour_weight DECIMAL (3,3) -- in grams
tumour_circumference DECIMAL (2,2) -- in cms
tumour_stage ENUM('t0','t1', 't2', 't3', 't4'),
node_stage ENUM('n0', 'n1', 'n2', 'n3'),
met_stage ENUM('m0', 'm1')
);
7
If you have a table similar to this:
-- S a m p l e D a t a :
Create Table your_table ( id Int, T_stage Text, N_stage Text, M_stage Text );
Insert Into your_table
Select 1, 'T0', 'N0', 'M0' Union All
Select 2, 'T0', 'N1', 'M0' Union All
Select 3, 'T0', 'N2', 'M0' Union All
Select 4, 'T0', 'N2', 'M1' Union All
Select 5, 'T1', 'N0', 'M0' Union All
Select 6, 'T1', 'N1', 'M0' Union All
Select 7, 'T1', 'N2', 'M0' Union All
Select 8, 'T3', 'N1', 'M0' Union All
Select 9, 'T3', 'N1', 'M1' Union All
Select 10, 'T3', 'N3', 'M0' Union All
Select 11, 'T4', 'N2', 'M0' Union All
Select 12, 'T4', 'N3', 'M0';
… you can create a cte (or a table) that defines different combinations of T, N and M stages giving them overall stage (as in your link above). I commented out the last 3 rows because they are distinctive and will be processed by Case expression in the Main SQL code below …
WITH
stages ( overall_stage, T_category, N_category, M_category ) AS
( Select 'Stage 0', 'T0', 'N0', 'M0' Union All
Select 'Stage 1', 'T1', 'N0', 'M0' Union All
Select 'Stage 2A', 'T0', 'N1', 'M0' Union All
Select 'Stage 2A', 'T1', 'N1', 'M0' Union All
Select 'Stage 2A', 'T2', 'N0', 'M0' Union All
Select 'Stage 2B', 'T2', ' N1', 'M0' Union All
Select 'Stage 2B', 'T3', 'N0', 'M0' Union All
Select 'Stage 3A', 'T0', 'N2', 'M0' Union All
Select 'Stage 3A', 'T1', 'N2', 'M0' Union All
Select 'Stage 3A', 'T2', 'N2', 'M0' Union All
Select 'Stage 3A', 'T3', 'N1', 'M0' Union All
Select 'Stage 3A', 'T3', 'N2', 'M0'
-- Union All
-- Select 'Stage 3B', 'T4', 'Any N', 'M0' Union All
-- Select 'Stage 3C', 'Any T', 'N3', 'M0' Union All
-- Select 'Stage 4', 'Any T', 'Any N', 'M1'
)
… this cte, when left joined to your_table, should give you the expected overall_stage for every row leaving your individual stages intact …
-- M a i n S Q L :
Select t.id, t.T_stage, t.N_stage, t.M_stage,
Case When t.M_stage = 'M1' Then 'Stage 4'
When t.N_stage = 'N3' And t.M_stage = 'M0' Then 'Stage 3C'
When t.T_stage = 'T4' And t.M_stage = 'M0' Then 'Stage 3B'
When s.overall_stage Is Not Null Then s.overall_stage
Else 'N/A'
End as overall_stage
From your_table t
Left Join stages s ON( s.T_category = t.T_stage And
s.N_category = t.N_stage And
s.M_category = t.M_stage )
Order By t.id
/* R e s u l t :
id T_stage N_stage M_stage overall_stage
-- --------- ---------- ---------- -------------
1 T0 N0 M0 Stage 0
2 T0 N1 M0 Stage 2A
3 T0 N2 M0 Stage 3A
4 T0 N2 M1 Stage 4
5 T1 N0 M0 Stage 1
6 T1 N1 M0 Stage 2A
7 T1 N2 M0 Stage 3A
8 T3 N1 M0 Stage 3A
9 T3 N1 M1 Stage 4
10 T3 N3 M0 Stage 3C
11 T4 N2 M0 Stage 3B
12 T4 N3 M0 Stage 3C */
See the fiddle here.
NOTE:
Case expression is sequential – that is – the first When condition satisfied will return it’s Then value and exit Case. Only issue could be which overall_stage has to be selected if you have a combination of T4-N3-M0 is it Stage 3C or 3B – I put 3C if that is opposite just switch those two rows within the Case expression….
P.S. The best solution would be to create a permanent table with all explicite combinations of the three stages with apropriate overall_stage for each. Then you would’t need Case expression at all – just the join selecting overall_stage from that new table for every joined row in your_table.
4
You should have the table every possible combination written in your DB. Then, you can join the cases table with the matching rows in the combinations table, using fields T, N and M, which should give you the result you are looking for.
If your cases table name was X and you combinations table was Y, the predicate in you SQL should read somthing similar to this:
SELECT (...), Y.STAGE
FROM X, Y
WHERE X.T=Y.T AND X.N=Y.N AND X.M=Y.M
According to the table you posted, there are some cases which allow any T or any N. For such cases, you should create some additional OR groups. Let me know if you need more elaboration on this.
4
SECOND ANSWER (after a lot of comments):
First create the tnm_staging table with all 40 possible explicite combination of the three column values defining the overall_stage column for each:
-- Combination generator for T 0-4 / N 0-3 / M 0-1
CREATE TABLE tnm_staging
WITH
t_stages ( T_cat ) AS
( Select 't0' Union All
Select 't1' Union All
Select 't2' Union All
Select 't3' Union All
Select 't4'
),
n_stages ( N_cat ) AS
( Select 'n0' Union All
Select 'n1' Union All
Select 'n2' Union All
Select 'n3'
),
m_stages ( M_cat ) AS
( Select 'm0' Union All
Select 'm1'
),
stages as
( Select Row_Number() Over(Order By tn.T_cat, tn.N_cat, m.M_cat) as id,
'Stage x' as overall_stage, tn.T_cat, tn.N_cat, m.M_cat
From ( Select t.T_cat, n.N_cat
From t_stages t
CROSS JOIN n_stages n
) tn
CROSS JOIN m_stages m
),
stage_names ( overall_stage, T_category, N_category, M_category ) AS
( Select 'Stage 0', 't0', 'n0', 'm0' Union All
Select 'Stage 1', 't1', 'n0', 'm0' Union All
Select 'Stage 2A', 't0', 'n1', 'm0' Union All
Select 'Stage 2A', 't1', 'n1', 'm0' Union All
Select 'Stage 2A', 't2', 'n0', 'm0' Union All
Select 'Stage 2B', 't2', 'n1', 'm0' Union All
Select 'Stage 2B', 't3', 'n0', 'm0' Union All
Select 'Stage 3A', 't0', 'n2', 'm0' Union All
Select 'Stage 3A', 't1', 'n2', 'm0' Union All
Select 'Stage 3A', 't2', 'n2', 'm0' Union All
Select 'Stage 3A', 't3', 'n1', 'm0' Union All
Select 'Stage 3A', 't3', 'n2', 'm0'
)
Select t.id, t.T_cat as tumour_stage, t.N_cat as node_stage, t.M_cat as met_stage,
Case When t.M_cat = 'm1' Then 'Stage 4'
When t.N_cat = 'n3' And t.M_cat = 'm0' Then 'Stage 3C'
When t.T_cat = 't4' And t.M_cat = 'm0' Then 'Stage 3B'
When s.overall_stage Is Not Null Then s.overall_stage
Else 'N/A'
End as overall_stage
From stages t
Left Join stage_names s ON( s.T_category = t.T_cat And
s.N_category = t.N_cat And
s.M_category = t.M_cat )
Order By t.id;
SELECT * FROM tnm_staging; -- this should resut with 40 different combinations each with it's own overall_stage following your rules
… and with your pathology table and data as :
CREATE TABLE pathology ( pathology_id VARCHAR(100) NOT NULL PRIMARY KEY,
tumour_stage ENUM('t0','t1', 't2', 't3', 't4'),
node_stage ENUM('n0', 'n1', 'n2', 'n3'),
met_stage ENUM('m0', 'm1')
);
-- S a m p l e D a t a :
pathology_id tumour_stage node_stage met_stage
------------ ------------ ---------- ---------
01 t0 n0 m0
02 t0 n1 m0
03 t0 n2 m0
04 t0 n2 m1
05 t1 n0 m0
06 t1 n1 m0
07 t1 n2 m0
08 t3 n1 m0
09 t3 n1 m1
10 t3 n3 m0
11 t4 n2 m0
12 t4 n3 m0
13 t2 n1 m0
… join your pathology and tnm_staging tables to get the overal_stage column for every row in pathology table:
-- S Q L :
Select t.pathology_id, t.tumour_stage, t.node_stage, t.met_stage,
s.overall_stage
From pathology t
Left Join tnm_staging s ON( s.tumour_stage = t.tumour_stage And
s.node_stage = t.node_stage And
s.met_stage = t.met_stage )
Order By t.pathology_id
See the fiddle here (with the results).
1