Here is the problem: I have a bag (table A) of vegetables (table B) and fruits (table C). Multiple vegetables and multiple fruits can be in the bag. Each vegetable and fruit has a price.
I need my query to return the total number of items (int) in the bag and the total price (Decimal). The bag can have 0 vegetables but 3 fruits and vice versa, it can have items but be free, and it can also be empty and still have a price (the price of the bag itself). So we only filter out the bags with 0 items and a total_price of 0 at the end.
The best I could do is four subqueries like this:
from sqlalchemy import func, select

# Bag, Vegetable and Fruit are the mapped model classes (defined elsewhere).
query = select(Bag.id, Bag.purchase_datetime)
vegetable_price = select(func.sum(Vegetable.amount)).where(Vegetable.bag_id == Bag.id).scalar_subquery().label("vegetable_price_total")
vegetable_count = select(func.count(Vegetable.id)).where(Vegetable.bag_id == Bag.id).scalar_subquery().label("vegetable_count_total")
# Sum the fruit prices here (the original counted ids, duplicating fruit_count).
fruit_price = select(func.sum(Fruit.amount)).where(Fruit.bag_id == Bag.id).scalar_subquery().label("fruit_price_total")
fruit_count = select(func.count(Fruit.id)).where(Fruit.bag_id == Bag.id).scalar_subquery().label("fruit_count_total")
query = query.add_columns(vegetable_price, vegetable_count, fruit_price, fruit_count)
And then depending on the parameter of the route (ask only for fruits, only for vegetables, or all types), I have a query.where(fruit_price + vegetable_price > 0).
And finally I add the columns together in the object returned by the route, like this:
result = session.execute(query.distinct())
for b in result:
    yield BagObject(
        id=b.id,
        item_count=(getattr(b, "vegetable_count_total", 0) or 0) + (getattr(b, "fruit_count_total", 0) or 0),
        total_price=(getattr(b, "vegetable_price_total", 0) or 0) + (getattr(b, "fruit_price_total", 0) or 0),
    )
Even as a beginner in backend development, I know this is horrible. There’s no way I can’t do this with a way more optimized solution with less code. May I get some help from someone with more experience in SQL?
I tried other stuff like join, outerjoin. But all of them resulted in getting duplicated rows, and the price/item_count being multiplied in the final result.
You say you tried joins, but got duplicates. This is probably because you joined vegetables and fruits to their bag. In SQL this gets you a table with rows containing a vegetable and a fruit each, so with a bag of 3 vegetables and 4 fruits you’d produce 3 x 4 = 12 rows.
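To make the multiplication concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (a bag table plus vegetable and fruit tables with bag_id and price columns) and the sample prices are assumptions for illustration only, not your actual schema.

```python
import sqlite3

# Hypothetical, minimal schema: one bag with 3 vegetable rows and 4 fruit rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table bag (id integer primary key);
    create table vegetable (id integer primary key, bag_id integer, price numeric);
    create table fruit (id integer primary key, bag_id integer, price numeric);
    insert into bag (id) values (1);
    insert into vegetable (bag_id, price) values (1, 1), (1, 1), (1, 1);
    insert into fruit (bag_id, price) values (1, 2), (1, 2), (1, 2), (1, 2);
""")

# Joining both child tables directly to the bag pairs every vegetable row
# with every fruit row of the same bag.
row_count, vegetable_total, fruit_total = conn.execute("""
    select count(*), sum(v.price), sum(f.price)
    from bag b
    join vegetable v on v.bag_id = b.id
    join fruit f on f.bag_id = b.id
""").fetchone()

print(row_count, vegetable_total, fruit_total)
```

With this data the join yields 12 rows, and the sums come out as 12 and 24 instead of the true totals 3 and 8: each vegetable price is counted once per fruit, and each fruit price once per vegetable.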
What you want to do instead is aggregate first: compute the vegetable totals per bag and the fruit totals per bag, and then join those two one-row-per-bag results to their bag.
It is not clear to me which columns are in your tables. From your code, I assume something like the following:
fruit table
(The table name “fruit” seems inappropriate, by the way. This table does not contain one row per fruit, but one per fruit purchase, so it should be called fruit_purchase or the like.)
| column_name | content |
|---|---|
| bag_id | the purchase bag |
| fruit_name | the name of the fruit |
| amount | the total amount of this fruit bought in this purchase |
| price | the total price for this fruit bought in this purchase |
vegetable table
(Again, this should be called vegetable_purchase or the like.)
| column_name | content |
|---|---|
| bag_id | the purchase bag |
| vegetable_name | the name of the vegetable |
| amount | the total amount of this vegetable bought in this purchase |
| price | the total price for this vegetable bought in this purchase |
Here is one way of writing this in SQL. (I don’t know SQLAlchemy, so I can’t help you on that.)
with
  v as
  (
    select
      bag_id,
      sum(amount) as total_amount,
      sum(price) as total_price
    from vegetable
    where amount > 0 or price > 0
    group by bag_id
  ),
  f as
  (
    select
      bag_id,
      sum(amount) as total_amount,
      sum(price) as total_price
    from fruit
    where amount > 0 or price > 0
    group by bag_id
  )
select
  bag_id,
  coalesce(v.total_amount, 0) as vegetable_amount,
  coalesce(f.total_amount, 0) as fruit_amount,
  coalesce(v.total_amount, 0) + coalesce(f.total_amount, 0) as total_amount,
  coalesce(v.total_price, 0) as vegetable_price,
  coalesce(f.total_price, 0) as fruit_price,
  coalesce(v.total_price, 0) + coalesce(f.total_price, 0) as total_price
from v full outer join f using (bag_id);
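If you want to try the aggregate-then-join idea from Python, the following is a rough sketch with the built-in sqlite3 module. It is a variation on the query above, not a translation of it: it starts from an assumed bag table and LEFT JOINs the two aggregates (older SQLite versions have no FULL OUTER JOIN), and it moves the "0 items and 0 price" filter to the end. The schema and sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical schema and data, following the column guesses above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table bag (id integer primary key);
    create table vegetable (bag_id integer, amount integer, price numeric);
    create table fruit (bag_id integer, amount integer, price numeric);
    insert into bag (id) values (1), (2), (3);
    -- bag 1: 3 vegetable rows and 4 fruit rows
    insert into vegetable values (1, 1, 1.5), (1, 1, 1.5), (1, 1, 1.5);
    insert into fruit values (1, 1, 2), (1, 1, 2), (1, 1, 2), (1, 1, 2);
    -- bag 2: fruit only; bag 3: nothing at all
    insert into fruit values (2, 2, 0.5);
""")

# Each CTE collapses its table to one row per bag, so the joins below
# can no longer multiply rows.
rows = conn.execute("""
    with
      v as (select bag_id, sum(amount) as total_amount, sum(price) as total_price
            from vegetable group by bag_id),
      f as (select bag_id, sum(amount) as total_amount, sum(price) as total_price
            from fruit group by bag_id)
    select
      b.id,
      coalesce(v.total_amount, 0) + coalesce(f.total_amount, 0) as total_amount,
      coalesce(v.total_price, 0) + coalesce(f.total_price, 0) as total_price
    from bag b
    left join v on v.bag_id = b.id
    left join f on f.bag_id = b.id
    where coalesce(v.total_amount, 0) + coalesce(f.total_amount, 0) > 0
       or coalesce(v.total_price, 0) + coalesce(f.total_price, 0) > 0
    order by b.id
""").fetchall()

for bag_id, total_amount, total_price in rows:
    print(bag_id, total_amount, total_price)
```

With this sample data, bag 1 comes back once with 7 items and a total of 12.5, bag 2 with 2 items and 0.5, and the empty bag 3 is filtered out. The same shape (two grouped subqueries joined to the bag) should carry over to an ORM, but I have not tried that.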