I want to write a postgres function that makes a column by looking up the values of other columns in a different table.
As an example if I have the tables:
create schema if not exists example;
CREATE TABLE example.price_table (
symbol text NOT NULL,
time_of timestamptz NOT NULL,
price float8 null
volume float8 null
)
INSERT INTO example.price_table VALUES
('GME', '2016-01-01 00:00:00', 10.0, 10),
('GME', '2016-01-02 00:00:00', 11.0, 3),
('GME', '2016-01-03 00:00:00', 9.0 , 7),
('GME', '2016-01-03 00:40:00', 10.5, 76),
('GME', '2016-01-04 00:00:00', 11.0, 10),
('BBBY', '2016-01-01 00:00:00', 20.0, 5),
('BBBY', '2016-01-02 02:30:00', 2123.0, 1),
('BBBY', '2016-01-02 03:30:00', 2133.0, 1),
('BBBY', '2016-01-02 04:30:00', 2144.0, 100),
('BBBY', '2016-01-03 00:00:00', 29.0, 999),
('BBBY', '2016-01-03 03:40:00', 20.5, 1),
('BBBY', '2016-01-03 04:40:00', 20.6, 54),
('BBBY', '2016-01-04 00:00:00', 21.0, 34),
('BBBY', '2016-01-06 00:00:00', 666.0, 1);
CREATE TABLE example.interesting_times (
symbol text NOT NULL,
time_of timestamptz NOT NULL
)
INSERT INTO example.interesting_times VALUES
('GME', '2016-01-01 00:00:00'),
('GME', '2016-01-02 00:30:00'),
('GME', '2016-01-03 05:00:00'),
('GME', '2016-01-03 00:40:00'),
('GME', '2016-01-04 00:00:00'),
('BBBY', '2016-01-01 01:00:00'),
('BBBY', '2016-01-02 00:00:00'),
('BBBY', '2016-01-03 00:00:00'),
('BBBY', '2016-01-03 07:40:00'),
('BBBY', '2016-01-04 00:00:00');
Then I want to be able to query the interesting_times times table while adding a column that gives the earliest market price at least 3 hours (but not more than 3 hours + 24 hours) into the future. Ideally I would like to do this in a simple way via a new markout_price_table
function like:
select symbol, time_of, markout_price_table(symbol, time_of, 'price', '3 hours', '24 hours') as price_3_hours_later
from example.interesting_times order by symbol, time_of;
and I could similarly get a different markout by adding something like markout_price_table(symbol, time_of, 'price', '6 hours', '24 hours') as price_6_hours_later
or markout_price_table(symbol, time_of, 'volume', '0 hours', '24 hours') as volume_at_time
in the query.
Is it possible to encapsulate the logic in a function like this and if so would it be as fast as getting all of the data downloaded and doing the merges locally?
I can get the desired result of the above using a few merges but I don’t like this as a solution as I would like to encapsulate good logic for the merges rather than copy pasting the below lots of times into each query.
with tabA as (
select a.symbol as symbol, a.time_of as time_of, b.price as price_3_hours_later, b.time_of - (a.time_of + '3 hours') as timeliness
from example.interesting_times a left join example.price_table b on (a.symbol = b.symbol) and (a.time_of + '3 hours' < b.time_of) and (a.time_of + '3 hours' + '24 hours' > b.time_of) order by b.time_of - (a.time_of + '3 hours')
), tabB as (
select symbol, time_of, price_3_hours_later, timeliness, min(timeliness) over (partition by symbol, time_of order by timeliness) as min_tim from tabA
) select symbol, time_of, price_3_hours_later from tabB where (timeliness = min_tim) or (timeliness is null) order by symbol, time_of;