I’ve been building a schema for a hobby project. One part of the project is a product aggregator, and I think I’m having trouble of thinking how to reference these two things inside the database: my product listing and its specifications.
Here is my progress so far:
- Primary goal: I plan to scrape from different retailer sites then compare their prices
Each site has different variations in product names but I am highly certain they refer to the same thing (i.e. different sites have listings for an RX 580)- Product listing (that has the retailer information (referenced), product information (referenced?), its price and its date of when the price snapshot was taken.
- Secondary goal: I would also like to gather technical specifications for each product, to which I think I need to get from another site such as the official manufacturers’ website since some retailers display the technical specifications of its products, some don’t.
- To avoid redundancy of the columns for technical specifications, I opted to place them in a separate table.
- Though the products have different attributes, which calls for different separate tables according to product type.
- Now, the question is, how do I link the product specifications to the product listing? (With my limited knowledge, using CTE or triggers? Or if there is any other more efficient and/or concise approach to this.)
Photo attached in this post the design progress I’ve made so far, and I would appreciate any tips/helps and kind criticisms to learn from. Thank you.
product table schema
I tried designing with EAV (Entity-Attribute-Value) Model before, but I was advised not to use it since it has drawbacks on querying performance. Thus, I reverted back to Class Table Inheritance for handling product and product attributes through different tables.
I’ve done a sort of a prototype last year but I was not able to gather the product attributes, only the prices due to this problem. I’ve gotten only a free time right now to get back into this hobby project and came up still stuck.
What I have another thing in mind is:
- Scrape the product specifications first
- Clean the data (especially the product name, into something easily referrable such as
RTX 4060Ti
- Use the clean product model name to reference any product listing to it during the product listing scrapping process.
yura91 is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.