I have a table where I want to purge some old entries matched by the same “history_id” (so you could have 100 rows with the same history_id).
However, I want to always leave at least 2 for each history_id, and purge all the rest as long as they’re over a year old.
Example table:
CREATE TABLE `post_history` (
`id` int(11) UNSIGNED NOT NULL,
`history_id` int(11) UNSIGNED NOT NULL,
`user_id` int(11) UNSIGNED NOT NULL,
`date` int(11) UNSIGNED NOT NULL,
`text` text DEFAULT NULL
) ENGINE=InnoDB DEFAULT;
I’m unsure how to do this, a plain delete is easy of course:
DELETE FROM `post_history` WHERE `date` < UNIX_TIMESTAMP(DATE_SUB(NOW(),INTERVAL 1 YEAR));
But tweaking it to match the history_id rows for at least 2, and then do it over a year old is killing my brain.
10.6.18-MariaDB
2
You can assign a row identifier using a windowing function, like ROW_NUMBER()
:
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) rowid
FROM post_history
This partitions by id
and orders by the date from newest to oldest. This means the newest two records for each ID will have rowid
equal to 1 or 2. You can then use this to delete the records, like so:
-- WARNING: Untested
WITH Numbered AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY date DESC) rowid
FROM post_history
)
DELETE R
FROM post_history R
JOIN Numbered N
ON
R.id = N.id
AND R.date = N.date
WHERE
N.rowid > 2
AND R.date < UNIX_TIMESTAMP(DATE_SUB(NOW(),INTERVAL 1 YEAR));
EDIT: misread the part about keeping 2 and getting rid of the rest only if they’re > 1 year old. This update only discards records if they’re outside the first 2 and they’re more than 1 year old.
Try this:
WITH numbered AS (
select id, date,
row_number() over (partition by history_id order by date desc) rn
from post_history
)
DELETE p
FROM post_history p
INNER JOIN numbered n on n.id=p.id
AND n.rn() > 2
AND n.date < UNIX_TIMESTAMP(DATE_SUB(NOW(),INTERVAL 1 YEAR))
And of course, for complicated deletes, you should run it as a SELECT first to be sure it targets the right records:
WITH numbered AS (
select id, date,
row_number() over (partition by history_id order by date desc) rn
from post_history
)
SELECT p.*
FROM post_history p
INNER JOIN numbered n on n.id=p.id
AND n.rn > 2
AND n.date < UNIX_TIMESTAMP(DATE_SUB(NOW(),INTERVAL 1 YEAR))
1