Background: I have a unique setup with multiple tables, each with circa 1,600 columns, which I am trying to clean up (it creates benefits upstream). I know that around 80% of these columns will always be [NULL]
, but I need to know precisely which ones (millions of rows).
I could do the following 1,600 times, but it seems very inefficient:
select
max(column1) as 'column1',
max(column2) as 'column2'
from mytable;
If it makes a difference, it’s Redshift (but I don’t have access to AWS).
So far, I have tried the above as a proof of concept, but it doesn’t scale well. So, my only other option is to copy and paste the headers into Excel using a transverse paste and then write a concat based on the numbers.
I appreciate any help you can provide otherwise – I will accept my fate 🙂
If you wanted to generate a query like yours:
SELECT MAX(column1) AS column1, MAX(column2) AS column2, ... FROM mytable;
Try this:
SELECT 'SELECT ' ||
LISTAGG('MAX(' || column_name || ') AS ' || column_name, ', ')
WITHIN GROUP (ORDER BY ordinal_position) ||
' FROM your_schema.mytable;' AS generated_query
FROM information_schema.columns
WHERE table_schema = 'your_schema'
AND table_name = 'mytable';