We have a method that runs a for loop iterating 24 times (once for every hour of a given day) and executes 4 queries per iteration (so 24 * 4 in total) against a Postgres database; the per-hour statements are sketched after this list:
- delete the partition for that hour if it exists
- create a partition for that hour
- attach that partition to the parent table
- execute an “INSERT … SELECT …” into the newly created partition, through the parent table
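Roughly, the per-hour statements look like this (a sketch with placeholder partition names and bounds, not our exact DDL):

-- 1) drop the hour's partition if it already exists (placeholder name)
DROP TABLE IF EXISTS table_insert_h09;
-- 2) create a fresh table with the parent's structure
CREATE TABLE table_insert_h09 (LIKE table_insert INCLUDING ALL);
-- 3) attach it as the partition for that hour
ALTER TABLE table_insert
    ATTACH PARTITION table_insert_h09
    FOR VALUES FROM ('2024-05-30 09:00:00') TO ('2024-05-30 10:00:00');
-- 4) fill it through the parent table
INSERT INTO table_insert (title, time, count, position)
SELECT ...;  -- the full query is shown further below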
Every time the method is called, the first 3 queries execute very fast (<<1 s), and the 4th query is quite fast (<5 s) for 23 of the 24 hours but extremely slow for exactly one. That one execution currently takes 16+ minutes, and the duration also seems to grow the more often data is inserted. In the following example (durations in ms) it is the 9th hour:
finished hour: 0, insertDuration: 1376, createPartitionDuration: 2943
finished hour: 1, insertDuration: 747, createPartitionDuration: 112
finished hour: 2, insertDuration: 653, createPartitionDuration: 103
finished hour: 3, insertDuration: 943, createPartitionDuration: 146
finished hour: 4, insertDuration: 1175, createPartitionDuration: 223
finished hour: 5, insertDuration: 1716, createPartitionDuration: 119
finished hour: 6, insertDuration: 2864, createPartitionDuration: 168
finished hour: 7, insertDuration: 2720, createPartitionDuration: 133
finished hour: 8, insertDuration: 3074, createPartitionDuration: 226
finished hour: 9, insertDuration: 1146266, createPartitionDuration: 120
finished hour: 10, insertDuration: 2246, createPartitionDuration: 434
finished hour: 11, insertDuration: 2952, createPartitionDuration: 118
finished hour: 12, insertDuration: 2231, createPartitionDuration: 149
finished hour: 13, insertDuration: 2594, createPartitionDuration: 130
finished hour: 14, insertDuration: 2313, createPartitionDuration: 221
finished hour: 15, insertDuration: 2167, createPartitionDuration: 117
finished hour: 16, insertDuration: 2217, createPartitionDuration: 119
finished hour: 17, insertDuration: 2397, createPartitionDuration: 113
finished hour: 18, insertDuration: 2456, createPartitionDuration: 85
finished hour: 19, insertDuration: 2565, createPartitionDuration: 133
finished hour: 20, insertDuration: 2389, createPartitionDuration: 108
finished hour: 21, insertDuration: 2574, createPartitionDuration: 131
finished hour: 22, insertDuration: 1787, createPartitionDuration: 144
finished hour: 23, insertDuration: 1739, createPartitionDuration: 191
The query that gets executed in step 4 looks like this:
INSERT INTO table_insert (title, time, count, position)
SELECT *, row_number() OVER () AS rn
FROM (
    SELECT title,
           p1.time AS time,
           sum(count) AS count
    FROM table_select_a pt
    JOIN table_select_b p1
        ON pt.title = p1.title
        AND p1.time = ?
    GROUP BY title, p1.time
    ORDER BY count DESC
) a
table_select_a is very plain and contains only a single column of type varchar.
table_select_b is partitioned on two levels: first by foo (there are only 3 foo partitions) and below that by time; all the data lands in those time leaf partitions, of which there are many more than shown here.
table_insert looks like table_select_b except that it has no foo column, so its partitions by time sit directly under the parent table.
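A hedged sketch of that layout (the column types, the list partitioning on foo, and the partition bounds are assumptions for illustration, not our real DDL):

-- two-level partitioning: by foo, then by time
CREATE TABLE table_select_b (
    title varchar NOT NULL,
    time  timestamptz NOT NULL,
    count bigint,
    foo   varchar NOT NULL
) PARTITION BY LIST (foo);

CREATE TABLE table_select_b_foo1 PARTITION OF table_select_b
    FOR VALUES IN ('foo1')
    PARTITION BY RANGE (time);
-- ... the hourly leaf partitions hang off the 3 foo partitions

-- table_insert: like table_select_b but without foo (plus position), partitioned by time only
CREATE TABLE table_insert (
    title    varchar NOT NULL,
    time     timestamptz NOT NULL,
    count    bigint,
    position bigint
) PARTITION BY RANGE (time);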
The Postgres db is running inside a Docker environment on Ubuntu 22.04.4. My docker compose configuration looks like this:
postgres:
  networks:
    - my_network
  image: postgres:16.1
  container_name: postgres
  hostname: postgres
  shm_size: 4gb
  ports:
    - "5432:5432"
  volumes:
    - ./data/postgres:/var/lib/postgresql/data
networks:
  my_network:
    attachable: true
The Docker logs of Postgres covering the same time window as the logs above look like this (the first and last checkpoints occurred before and after the query executions above and are only included for completeness):
postgres | 2024-05-30 06:25:00.453 UTC [27] LOG: checkpoint starting: wal
postgres | 2024-05-30 06:27:53.246 UTC [27] LOG: checkpoint complete: wrote 3854 buffers (23.5%); 0 WAL file(s) added, 58 removed, 7 recycled; write=68.415 s, sync=98.809 s, total=172.793 s; sync files=298, longest=8.815 s, average=0.332 s; distance=1069040 kB, estimate=1171170 kB; lsn=2AE/74048F40, redo lsn=2AE/3CDFDAE8
postgres | 2024-05-30 06:32:53.348 UTC [27] LOG: checkpoint starting: time
postgres | 2024-05-30 06:37:24.730 UTC [27] LOG: checkpoint complete: wrote 470 buffers (2.9%); 0 WAL file(s) added, 10 removed, 46 recycled; write=269.654 s, sync=0.065 s, total=271.382 s; sync files=292, longest=0.017 s, average=0.001 s; distance=903515 kB, estimate=1144404 kB; lsn=2AE/852C04C0, redo lsn=2AE/74054A88
postgres | 2024-05-30 06:52:40.410 UTC [27] LOG: checkpoint starting: wal
postgres | 2024-05-30 06:53:13.560 UTC [27] LOG: checkpoint complete: wrote 262 buffers (1.6%); 0 WAL file(s) added, 1 removed, 32 recycled; write=23.175 s, sync=7.554 s, total=33.151 s; sync files=91, longest=0.708 s, average=0.083 s; distance=542911 kB, estimate=1084255 kB; lsn=2AE/B4BF43D0, redo lsn=2AE/95284A50
postgres | 2024-05-30 06:53:18.196 UTC [27] LOG: checkpoint starting: wal
postgres | 2024-05-30 06:54:17.006 UTC [27] LOG: checkpoint complete: wrote 143 buffers (0.9%); 0 WAL file(s) added, 0 removed, 33 recycled; write=45.805 s, sync=8.251 s, total=58.811 s; sync files=94, longest=2.774 s, average=0.088 s; distance=539581 kB, estimate=1029788 kB; lsn=2AE/D4FAFAC0, redo lsn=2AE/B6174068
postgres | 2024-05-30 06:54:31.207 UTC [27] LOG: checkpoint starting: wal
postgres | 2024-05-30 06:59:01.768 UTC [27] LOG: checkpoint complete: wrote 6 buffers (0.0%); 0 WAL file(s) added, 0 removed, 33 recycled; write=269.701 s, sync=0.010 s, total=270.562 s; sync files=10, longest=0.008 s, average=0.001 s; distance=539369 kB, estimate=980746 kB; lsn=2AE/DD0D70F0, redo lsn=2AE/D702E6C8
postgres | 2024-05-30 07:19:32.187 UTC [27] LOG: checkpoint starting: time
postgres | 2024-05-30 07:19:33.471 UTC [27] LOG: checkpoint complete: wrote 8 buffers (0.0%); 0 WAL file(s) added, 0 removed, 6 recycled; write=0.978 s, sync=0.040 s, total=1.284 s; sync files=19, longest=0.019 s, average=0.003 s; distance=99658 kB, estimate=892637 kB; lsn=2AE/DD180F70, redo lsn=2AE/DD180F38
The query from above is also the only thing running on this db, so no other query could be slowing it down. This was confirmed by checking pg_stat_activity, which shows only that query plus a few other connections in state idle.
It also does not look like autovacuum is running in the meantime: a query against the vacuum statistics (found somewhere here on SO, but I no longer have it) shows no activity in that timeframe.
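For reference, the checks were roughly along these lines (not the exact queries we ran; the SO query for autovacuum is lost, so this is reconstructed from memory):

-- activity on the db while the slow insert runs
SELECT pid, state, query_start, left(query, 80)
FROM pg_stat_activity
WHERE datname = current_database();

-- last (auto)vacuum / (auto)analyze per table
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autovacuum DESC NULLS LAST;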
The server itself also seems mostly idle: of its 8 threads, one is at 100% utilization (htop shows 90-100%) and is fully occupied by Postgres, while the rest are idle. There are also still about 50 GB of the 64 GB of memory free.
The data in the tables is also fairly evenly distributed; there is a bit more in the hours around noon and less in the night hours. When the queries are done, each partition of table_insert contains around 350k rows.
So what is happening here, and why does a single execution take so much longer? We are somewhat out of ideas about what else we could even check and would be grateful for further things to try.