Many modern relational database management systems enforce referential integrity automatically: when you try to delete a tuple that is still referenced (by a foreign key, for example), the DBMS refuses the operation and raises an error.
Consider a database where every table has an attribute that indicates whether a tuple is deleted. No data is ever physically deleted from the database; it is only marked as deleted. If a tuple is marked as deleted, all tuples that reference it need to be marked as deleted too, or an error should occur. How can this be supported?
Is performing additional checks (programmatically or with triggers) before deleting a tuple the only way to have referential integrity? Are there any accepted practices or algorithms?
Edit: This flag is mostly used for statistics, and partially for data recovery after a long period of time. It is a filter with a special meaning, and right now referential integrity is checked inside each query, which is extremely error-prone and not reliable at all.
If you want to do this with just DRI, you can’t just use a flag. Or, you can, but you won’t like it.
Let’s define your “deleted flag” as a status column, Status CHECK (Status IN ('OK', 'Deleted')), because negative flags are confusing. Add Status to every table, and make it part of every FK constraint. That way, every row has a status that must match the status of anything it’s related to. But you won’t be able to “delete” rows on either end of the relationship while the FK constraint is in force!
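To make the dead end concrete, here is a minimal sketch using Python's built-in sqlite3 module. The parent and child tables and the try_soft_delete helper are hypothetical names chosen for illustration, and SQLite stands in for whatever DBMS is actually in use. The status column is part of the foreign key, so flipping it on either side alone violates the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE parent (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL CHECK (status IN ('OK', 'Deleted')),
    UNIQUE (id, status)              -- target of the composite FK
);
CREATE TABLE child (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL,
    status    TEXT NOT NULL CHECK (status IN ('OK', 'Deleted')),
    FOREIGN KEY (parent_id, status) REFERENCES parent (id, status)
);
INSERT INTO parent VALUES (1, 'OK');
INSERT INTO child  VALUES (10, 1, 'OK');
""")

def try_soft_delete(table):
    """Attempt to mark every row of `table` as deleted; report success."""
    try:
        conn.execute(f"UPDATE {table} SET status = 'Deleted'")
        return True
    except sqlite3.IntegrityError:
        return False

parent_ok = try_soft_delete("parent")  # child still points at (1, 'OK')
child_ok = try_soft_delete("child")    # no parent row (1, 'Deleted') exists
print(parent_ok, child_ok)
```

Both updates are rejected, which is the "you won't like it" part: with an immediate composite constraint and no cascading action, neither side can change its status first.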
You can get around that using triggers instead of DRI. It’s hard work, though, and tricky to get exactly right.
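A rough sketch of the trigger approach, again with Python's sqlite3 module and hypothetical parent and child tables. It covers only one rule (a parent cannot be marked deleted while it still has active children) and assumes SQLite's trigger syntax; a real schema would also need triggers for inserts, for re-activating rows, and for every other relationship, which is where the hard work comes in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parent (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'OK' CHECK (status IN ('OK', 'Deleted'))
);
CREATE TABLE child (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL REFERENCES parent (id),
    status    TEXT NOT NULL DEFAULT 'OK' CHECK (status IN ('OK', 'Deleted'))
);
-- Reject a soft delete of a parent that still has active children.
CREATE TRIGGER parent_soft_delete
BEFORE UPDATE OF status ON parent
WHEN NEW.status = 'Deleted'
BEGIN
    SELECT RAISE(ABORT, 'parent still has active child rows')
    WHERE EXISTS (
        SELECT 1 FROM child WHERE parent_id = OLD.id AND status = 'OK'
    );
END;
INSERT INTO parent (id) VALUES (1);
INSERT INTO child (id, parent_id) VALUES (10, 1);
""")

def mark_deleted(table, row_id):
    """Set status to 'Deleted' for one row; return False if a trigger objects."""
    try:
        conn.execute(f"UPDATE {table} SET status = 'Deleted' WHERE id = ?", (row_id,))
        return True
    except sqlite3.DatabaseError:
        return False

blocked = not mark_deleted("parent", 1)  # child 10 is still 'OK'
child_done = mark_deleted("child", 10)   # mark the dependent row first
parent_done = mark_deleted("parent", 1)  # now the trigger lets it through
print(blocked, child_done, parent_done)
```

The same trigger could instead cascade the status to the child rows; raising an error was chosen here only because it mirrors what DRI does on a real delete.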
Instead of a flag column, define pairs of tables, one for active rows and one for inactive rows. When a referenced row is to be deleted, first copy it and its related rows to the inactive tables, then delete them as normal. That also gives you a chance to capture other aspects of the deletion, such as when, why, and by whom.
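The paired-table idea might look like the sketch below, written with Python's sqlite3 module. All table names, columns, and the archive_parent function are assumptions made for illustration. The copy and the delete run in one transaction, so the active tables keep ordinary foreign keys and the archive records who deleted the row and why.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE child (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL REFERENCES parent (id),
    label     TEXT NOT NULL
);
-- Inactive counterparts, with extra columns describing the deletion.
CREATE TABLE parent_deleted (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    deleted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_by TEXT NOT NULL,
    reason     TEXT NOT NULL
);
CREATE TABLE child_deleted (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL REFERENCES parent_deleted (id),
    label     TEXT NOT NULL
);
INSERT INTO parent VALUES (1, 'order');
INSERT INTO child VALUES (10, 1, 'line A'), (11, 1, 'line B');
""")

def archive_parent(parent_id, who, why):
    """Copy a parent and its children to the inactive tables, then delete them."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(
            "INSERT INTO parent_deleted (id, name, deleted_by, reason) "
            "SELECT id, name, ?, ? FROM parent WHERE id = ?",
            (who, why, parent_id),
        )
        conn.execute(
            "INSERT INTO child_deleted (id, parent_id, label) "
            "SELECT id, parent_id, label FROM child WHERE parent_id = ?",
            (parent_id,),
        )
        # Children go first so the ordinary FK on the active tables is satisfied.
        conn.execute("DELETE FROM child WHERE parent_id = ?", (parent_id,))
        conn.execute("DELETE FROM parent WHERE id = ?", (parent_id,))

def count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

archive_parent(1, "alice", "order cancelled")
print(count("parent"), count("child"), count("parent_deleted"), count("child_deleted"))
```

Queries against the active tables need no extra filter, and the archive keeps its own foreign key, so referential integrity holds on both sides without triggers.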
Setting a “deleted” flag on a tuple doesn’t actually delete the tuple, nor does it necessarily mark the tuple for actual deletion. The “deleted flag” merely provides a way for applications to filter out the records that have been flagged as deleted.
So deleting a record from a table having records related by referential integrity still works the same way. All of the related records must still be deleted (including the ones “marked as deleted”).
Why would one do this? Because the “deleted flag” allows trivial recovery of the deleted record, by simply setting its “deleted flag” to false. Some systems allow “retirement” of such records by permanently removing them from the database after a certain period of time.
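As a small illustration of the filtering and recovery described above, the following sketch uses Python's sqlite3 module with a hypothetical item table and an active_item view. Applications read through the view, and restoring a record is just resetting the flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    is_deleted INTEGER NOT NULL DEFAULT 0 CHECK (is_deleted IN (0, 1))
);
-- Applications query the view, so flagged rows are filtered in one place.
CREATE VIEW active_item AS
    SELECT id, name FROM item WHERE is_deleted = 0;
INSERT INTO item (id, name) VALUES (1, 'widget');
""")

def visible_count():
    return conn.execute("SELECT COUNT(*) FROM active_item").fetchone()[0]

conn.execute("UPDATE item SET is_deleted = 1 WHERE id = 1")  # "delete"
after_delete = visible_count()

conn.execute("UPDATE item SET is_deleted = 0 WHERE id = 1")  # recover
after_restore = visible_count()

stored_rows = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
print(after_delete, after_restore, stored_rows)
```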