I have table containing 3.2TB data in partition named default_partition, & all writes/read goes to this partition, as we have defined partitions on table like below :
PARTITION BY RANGE (to_days(created_at))
PARTITION 2024_03_19 VALUES LESS THAN (739330) ENGINE = InnoDB,
PARTITION default_partition VALUES LESS THAN MAXVALUE ENGINE = InnoDB)
We want to remove data older than 7 days from default_partition and want to save new data coming into separate daily partitions based on created_at column, like “2024_06_06”, “2024_06_07”, “2024_06_08”.. etc. The purpose of this is we only want to keep last 7 days data, & we will keep dropping partitions older than 7 days via application scheduler if they exist.
But it seems to even achieve that, I have to re-organize default_partition using:
Query 1: ALTER TABLE mytable REORGANIZE PARTITION default_partition INTO (
PARTITION past VALUES LESS THAN (TO_DAYS('2022-04-05')),
PARTITION default_partition VALUES LESS THAN MAXVALUE
)
Once above query get executed, default_partition should become empty now, then running everyday scheduler we can run keep creating future partitions as given mention in below 2 queries, so that all new data for 6th/7th/8th…etc will go in their own partitions.
ALTER TABLE mytable REORGANIZE PARTITION default_partition INTO (
PARTITION '2022-04-05' VALUES LESS THAN (TO_DAYS('2022-04-06')),
PARTITION default_partition VALUES LESS THAN MAXVALUE
)
ALTER TABLE mytable REORGANIZE PARTITION default_partition INTO (
PARTITION '2022-04-06' VALUES LESS THAN (TO_DAYS('2022-04-07')),
PARTITION default_partition VALUES LESS THAN MAXVALUE
)... etc.
Issue :-
Running Query 1 above will copy 3.2TB of data to partition: ‘2022-04-05’, which I am not sure if can be done in production, as it will cause downtime to mytable while it’s getting created.
-
Is there a way I can just create future partitions like ‘2024-04-07’, ‘2024-04-08’, etc without touching data in default_partition for now & future date data will be saved into these partitions & once we have enough data in existing partitions, say last 7 days. We will just delete the old data in default_partition by taking some downtime.
-
I was thinking if we should create a new table & start read/write on that table for next 7 days then delete old table, this could save me from any downtime as well, but this required code changes at many places.
What would be best way to remove data older than 7 days in default_partition with zero or no downtime at all ?