I have a table that contains 15 duplicate rows (for a total of 30 including originals + duplicates). I have been tasked with writing a script to grab the duplicate records and delete them. My problem is, the table does not contain an ID column like “shipment_item_id”. So I found a solution to use rowid to add my own id column and then I can distinguish the dupes from the originals.
I wrote a SELECT statement that gets the duplicates, but when I try to include it in a DELETE statement, I get an error :
ORA-00913 – too many values.
I find it hard to believe that 15 records is too many.
Here is the script. I’m sure it’s an easy fix but I have not been able to figure it out:
DELETE shipment_item
WHERE rowid NOT IN (
SELECT MAX(rowid),
po_number,
item_number
FROM shipment_item
GROUP BY po_number, item_number
HAVING COUNT(*) > 1
)
1
The “two many values” does not talk about the rows, but about the columns:
you do a:
where <1 column> not in (select <3 columns>)
Do keep only the max(rowid)
in your select and you’re good to go.
That said…
You shall remove your having count(*) > 1
to avoid deleting non duplicated entries.
Let’s say your table has:
rowid | po_number | item_number |
---|---|---|
1 | A | B |
2 | C | D |
3 | C | D |
Your subselect will return 3 for C,D (as intended),
but the having count(*) > 1
will filter out 1 for A,B because as you have only one A,B ,its count(*)
= 1.
Thus the subselect returns only 3 as an entry to preserve, and thus… pooof with A,B!
And so…
You’ll end up with:
DELETE
FROM SHIPMENT_ITEM
WHERE rowid not in (
SELECT max(rowid)
FROM shipment_item
GROUP BY
po_number,
item_number
)
As others have pointed out you have 3 columns in your subquery which should be one.
I would like to suggest an alternative query with EXISTS instead of NOT IN. Depending on table size this could lead to faster execution (YMMV, hard to judge without knowing specifics).
DELETE FROM shipment_item si
WHERE EXISTS (
SELECT 1
FROM shipment_item
WHERE po_number = si.po_number
AND item_number = si.item_number
AND rowid < si.rowid
);
2
It is not about number of rows, but number of columns (values you select
).
DELETE
FROM SHIPMENT_ITEM
WHERE rowid not in ( --> you are comparing ROWID with ...
SELECT max(rowid), --> ... max(rowid), but also ...
po_number, --> ... po_number and ...
item_number --> ... item_number
FROM shipment_item
GROUP BY
po_number,
item_number
HAVING COUNT(*) > 1
)
Remove superfluous columns from the subquery!
DELETE
FROM SHIPMENT_ITEM
WHERE rowid not in (
SELECT max(rowid)
FROM shipment_item
GROUP BY
po_number,
item_number
HAVING COUNT(*) > 1
)