I have a table in RDS (aurora postgres) that looks like this:
students:
student_id (pk) | student_name | number_of_classes_taken | one_violation | multiple_violation |
---|---|---|---|---|
1234 | joe_smith | 1 | False | False |
5678 | jane_smith | 9 | True | True |
I have a list of dictionaries that looks like this:
students = [
{"student_id": "1234", "name": "joe_smith", "classes_added": 1},
{"student_id": "5678", "name": "jane_smith", "classes_added": 3},
{"student_id": "9012", "name": "jenny_smith", "classes_added": 5}
]
I want to insert the students from the list of dictionaries that aren’t already in the table while updating the ones that are. If a student exists, I want to update the column for number_of_classes_taken
to be the sum of the existing integer in the row and the integer for classes_added
in the dictionary. So for joe_smith
, it would be 2
(1 existing + 1 added). For jane_smith
it would be 12
(9 existing + 3 added).
For students that aren’t in the table, I just want to insert them and set number_of_classes_taken
to the value for classes_added
.
Ideally, if this is possible, I would also like to update one_violation
to True
for anyone taking more than 6 classes and multiple_violation
to True
for anyone taking more than 7 classes. Otherwise set them both to False
. On inserts of new students, I’d like both of these to be set to False
by default. If this is not possible in the same query/call, It can be done separately.
So far I’ve tried:
from sqlalchemy.dialects.postgressql import insert
# Iterate the list of students to be added
for student in students:
insert_stmt = insert(students).values(
student_id={student['student_id']},
student_name={student['name']},
number_of_classes_taken={student['classes_added']},
one_violation={False},
multiple_violation={False})
# If a student exists based off the primary key (the student_id) update the row with the sum(classes_added + number_of_classes_taken) and update the one_violation and multiple_violation column if the student has more than 6 or more than 7 classes
do_update_stmt = insert_stmt.on_conflict_do_update(
constraint="student_id_pk",
set_=dict(number_of_classes_taken=student["classes_added"], one_violation=??, multiple_violation=??),
)
This attempt came from here: https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#insert-on-conflict-upsert
I think I am close but I can’t figure it out. I also do not know how to do the sum() for classes taken or how to update the one_violation
and multiple_violation
off that value. Ideally, I’d like to be able to do this in one call to the DB, otherwise it’s fine to do the insert/update in one call and the updating of one_violation
and multiple_violation
in another.
I am also open to other approaches if there’s a better way to do this. My connection to RDS is done through SQLAlchemy create_engine
and this works.
I’m pretty new to SQLAlchemy so any help is appreciated.