Thiết kế website giá rẻ

Question

Imagine we have a CRUD system with products. Each Product can have Attributes that represent properties of a product, like size, color, weight, etc. It’s a typical EAV database case (Entity, Attribute, Value).
Keeping it simple, the database scheme is as follows:

product

id | name | description

attributes

id | name | data_type

eav_relations

id | entity_id | attribute_id

decimal_values

id | eav_relation_id | value (column type decimal)

varchar_values

id | eav_relation_id | value (column type varchar)

…_values

Similar as the other values tables but for different data types, like bool_values, file_values, etc.

To give context:

The table eav_relations keeps a connection between a product and its attributes.
The …_values tables contain an eav_relation_id, that points to a row in eav_relations. An attribute can have multiple values (like a collection of ‘tags/keywords’ or multiple files).

When loading a product, we can build the whole product aggregate from the database structure which is obvious.
But when creating/updating a product, the client sends the product representation as follows:

{
    "name": "The product name",
    "description": "Example product.",
    "attribute_groups": [
        {
            "group_name": "Product propeties",
            "attributes": [
                {
                    "attribute_id": 3,
                    "values": [433.9]
                },
                {
                    "attribute_id": 9,
                    "values": ["wifi", "LAN"]
                }
            ]
        },
        {
            "name": "Shipping",
            "attributes": [
                {
                    "attribute_id": 19,
                    "values": [12.5]
                },
                {
                    "attribute_id": 24,
                    "values": ["ECommercePost Ltd."]
                }
            ]
        }
    ]
}

When creating a product, it’s relatively simple to create the relational data in the database.
But when updating it’s more complex. Attributes could have been added or removed, values of attributes could be changed/added/removed. The easiest way is to just delete and re-create all rows in eav_relations that correspond to the product, and delete and re-create all rows in the …_values tables. When re-creating, auto increment will generate new id’s. The disadvantage is that row ids change on every update of the product while this is technically not very important as long as id<->id couplings stay true.

Otherwise, you have to provide row id’s of all eav_relations and …_values in the JSON of an update request. Then check if the rows with those id’s exist and if so, update them. When there are “null” id’s, it means the attribute is newly added to the product, and if an attribute value has a “null” id, it means the value is new. Also, before processing the request, you have to load the whole product aggregate, and check if there are rows in the database with an id that no more occurs in the request JSON. In that case, the row should be deleted (because an attribute or an attribute value is dropped from the product).

The JSON for an update request would then become something as follows:

{
    "name": "The product name",
    "description": "Example product.",
    "attribute_groups": [
        {
            "group_name": "Product propeties",
            "attributes": [
                {
                    "eav_relation_id": 2, // Existing eav_relation id provided in request, the attribute is preserved.
                    "attribute_id": 3, // attribute_id is the same. If the attribute id would have been changed, the column will be updated in eav_relations. It's like the attribute would be swapped with some other...
                    "values": [
                        { 
                            "id": 291, // Existing decimal_values row id is provided, the row will be kept but the value column will be updated.
                            "value": 123.9
                        }
                    ]
                },
                {
                    "eav_relation_id": null, // A null id means the attribute is added.
                    "attribute_id": 98,
                    "values": [
                        {
                            "id": null,
                            "value": "orange"
                        },
                        {
                            "id": null,
                            "value": "blue"
                        }
                    ]
                }
                
                // There was an attribute with id 9, (and values "wifi", "LAN") but this is no more present in the
                // request. It's considered as dropped and thus the corresponding rows will be deleted.
            ]
        },
        
        // The group "Shipping" is no more present in the request, and so the eav_relation_ids do not occur in the request.
        // All eav_relations and ..._values rows for the product, that have a row id that is no more present in the request
        // will be deleted.
    ]
}

It’s obvious that this brings a lot of complexity and it’s even confusing to understand.
My idea is to always keep the product row in products table and thus the product id never changes, because this is the actually important identifier that could be used in an URL for example. The other ‘deeper’ id’s are actually just for keeing relations between data internally.

My question is, is it common practise to just delete and re-create the whole aggregate, or should we update and “recycle” database rows to avoid auto-increment and keep the original row id’s that are internally used to connect rows.

Thiết kế website giá rẻ

Danh mục

Is it common practice to re-create database coupling rows when updating an aggregate?