SQLAlchemy provides a few options for handling “cascaded” deletions – from what I can gather, either the ORM tries to handle cascading itself (i.e., where passive_deletes=False
[default] and cascade="{...}, delete"
is set in a relationship()
definition), or it may be “informed” of a database-side ON DELETE CASCADE
configuration by specifying passive_deletes=True
, in which the ORM may defer some deletes to the DB.
From my understanding of the documentation, the former case is the default because it’s safest not to assume anything about the backend’s cascade configuration (and some backends have limited/unusual/no support for this feature). The latter allows for optimisation and/or more complex scenarios, by (for instance) offloading “far-reaching” (both in volume of affected rows, and depth of cascading/recursion) cascades to the database side. In this case, when deleting a “parent” row, the ORM only issues explicit DELETE
statements for that row and any objects it has “loaded”, but relies on the DB to handle any other cascades.
There is a third option: passive_deletes="all"
. I’m not entirely clear what the intended use case is for this option, though from what I could gather (and verify through testing) it seems to cause the ORM not to issue DELETE
s for children (loaded or not) when deleting a parent. The resulting SQL issued is the same as if passive_deletes
was left unspecified (default=False) AND cascade="{...}, delete"
is also omitted (i.e., the ORM is “naive” to the DB config) AND {some other conditions are met – see example [A]}. In a scenario where ON DELETE CASCADE
is (known to be) specified in the backend DB, the resulting DB state (post-deletions) is therefore also identical in these cases, though my instinct is that the explicitly declared cascade behaviour is required so the ORM “knows about” the cascaded deletions – though I couldn’t construct a test scenario where this “mattered” (i.e., I couldn’t force an exception nor generate some inconsistent result).
I constructed several test cases, with various combinations of paramaters, and observed both the SQL issued and the DB achieved. I won’t include all of these, since there’s some redundancy in results, but the key examples are as follows (some SQLAlchemy boilerplate is omitted as per their docs)
[Init the DB with raw SQL – I’m using PostgreSQL]
CREATE TABLE IF NOT EXISTS parent (
id int PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS child (
id int PRIMARY KEY,
parent_id int REFERENCES parent ON DELETE CASCADE
);
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (1, 1);
INSERT INTO child VALUES (2, 1);
INSERT INTO child VALUES (3, 1);
Case A (control) – no cascade information, no relationship(), only a ForeignKey() declared
class Parent(Base):
__tablename__ = "parent"
id: Mapped[int] = mapped_column(primary_key=True)
class Child(Base):
__tablename__ = "child"
id: Mapped[int] = mapped_column(primary_key=True)
parent_id: Mapped[int] = mapped_column(ForeignKey("parent.id"))
p1 = session.get(Parent, 1)
session.delete(p1)
session.commit()
This results in a cascaded delete of the parent and all children – the emitted SQL only DELETE
s Parent, with the DB handling the cascade.
Case B – add a relationship()
class Parent(Base):
__tablename__ = "parent"
id: Mapped[int] = mapped_column(primary_key=True)
children: Mapped[list["Child"]] = relationship()
# (... as above)
This time parent_id
is (explicitly) NULL
d for each child – as expected for default ORM behaviour.
Case C – add cascade="all, delete"
to relationship()
# (...)
children: Mapped[list["Child"]] = relationship(cascade="all, delete")
# (...)
Now the ORM issues DELETE on the children, via WHERE parent.id=1
.
Case D – add passive_deletes=True
# (...)
children: Mapped[list["Child"]] = relationship(
cascade="all, delete", passive_deletes=True)
# (...)
Now the ORM issues the same SQL as in Case A, though presumably this time it is “aware” of the child deletions (how/where/why does this matter?)
Case E – “Load” a child before deleting
# (...)
p1 = session.get(Parent, 1)
c2 = p1.children[1]
session.delete(p1)
# (...)
This result is similar to Case C, though instead of using WHERE parent.id=1
, the ORM uses WHERE child.id = %(id)s::INTEGER
; [{'id': 1}, {'id': 2}, {'id': 3}]
.
Case F – Change passive_deletes=True
to passive_deletes="all"
This raises an exception at the line p1 = session.get(Parent, 1)
, reading “can’t set passive_deletes=’all’ in conjunction with ‘delete’ or ‘delete-orphan’ cascade”; Does this indicate using passive_deletes='all'
is intended for cases where the DB (and only the DB) handles cascading, and thus it mustn’t be configured in relationship()?
Case G – remove cascade="all, delete"
from relationship()
Now the emitted SQL is per A (and D), though I don’t know how to test the ORM’s “awareness” in each of these three cases to determine the differences (and therefore the use case for/best practices around passive_deletes="all"
).
Some notes/caveats:
I did find this comment, which may suggest it’s depreciated/obsolete: https://github.com/sqlalchemy/sqlalchemy/issues/3844#issuecomment-441935264.
The docs here state it “disable[s] the ‘nulling out’ of the child foreign keys”, and that “this is a very special use-case setting”, though I can’t actually think of one.