I’m not asking if EAV tables are good or bad. I’m wondering if they are considered “normalized”, and if not, why? If they aren’t normalized, which normal form are they violating and why?
An EAV (aka Key-Question-Answer) table is technically in 3NF:
1NF
- Rows are order-independent – True, the rows can be stored in any order.
- Columns are order-independent – True, the column ordering has no effect on the state of the contained data, and columns can be referenced independently of any other.
- No duplicate rows – True, there exists a unique candidate key Entity+Attribute which cannot be duplicated.
- Every field (every column of every row) contains one and only one value – True, provided that “value” does not contain a list of values or a compound “structure” in serialized format.
- Nothing is hidden – True, the database contains all information necessary to follow the above four rules and does not hide any of it (it may hide implementation details not necessary to follow the above four rules).
2NF
- In 1NF – True
- All columns of the table that aren’t the candidate key must have a relationship (direct or indirect) with the entire candidate key – True, the only column that isn’t part of the key is related to the full key.
3NF
- In 2NF – True
- All columns of the table that aren’t the candidate key are directly related to the entire candidate key – True, the only column that isn’t part of the key is directly related to both parts of the key.
Thus, EAV adheres to third normal form. It does not meet 4NF, because the structure has multivalued dependencies (in fact, it is designed to have them): adding a new Entity ID to the table requires creating a row for each Attribute, while adding a newly supported Attribute requires adding a row for each Entity that has that Attribute. A more traditional “flattened” table structure may meet 4NF: adding a new Entity requires adding only one row, and that is the only reason to add a row to the table.
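To make the row-count difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (eav, person_flat), the column names, and the sample person are all hypothetical, chosen only for illustration; the point is that one new entity costs one row per attribute in the EAV table but a single row in the flattened table, and that the Entity+Attribute key rejects duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eav (
    entity    INTEGER NOT NULL,
    attribute TEXT    NOT NULL,
    value     TEXT,
    PRIMARY KEY (entity, attribute)  -- the Entity+Attribute candidate key
);
CREATE TABLE person_flat (           -- hypothetical "flattened" equivalent
    id    INTEGER PRIMARY KEY,
    name  TEXT,
    age   TEXT,
    email TEXT
);
""")


def add_entity_eav(entity_id, attrs):
    """Insert one row per attribute; return the number of rows written."""
    rows = [(entity_id, name, value) for name, value in attrs.items()]
    conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", rows)
    return len(rows)


def add_entity_flat(entity_id, attrs):
    """Insert a single row holding every attribute; return rows written."""
    conn.execute(
        "INSERT INTO person_flat (id, name, age, email) VALUES (?, ?, ?, ?)",
        (entity_id, attrs["name"], attrs["age"], attrs["email"]),
    )
    return 1


person = {"name": "Ada", "age": "36", "email": "ada@example.com"}
eav_rows = add_entity_eav(1, person)    # three rows, one per attribute
flat_rows = add_entity_flat(1, person)  # one row

# The composite key prevents a second (entity, attribute) pair.
try:
    conn.execute("INSERT INTO eav VALUES (1, 'name', 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```
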
If there is an argument that it’s not 3NF, it would be that it violates 1NF’s “nothing is hidden” rule, in that the structure of the data is not known to the DB. The “Value” field of an EAV table must be able to hold anything that may be used as a value, so it is usually a string field of some type, holding string representations of data such as numbers. That data must be parsed into the proper types, and the knowledge needed to do so is not stored in an accessible place in the DB (unless it is incorporated into Value itself, violating 1NF’s “one and only one value” rule, or included as an extra column, violating 3NF’s “all fields directly related to candidate key” rule). That knowledge is therefore “hidden”.
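The “hidden” type knowledge can be illustrated with a short sketch, again using Python's sqlite3 module. The ATTRIBUTE_TYPES mapping, the load_entity helper, and the sample attributes are assumptions made up for this example; what matters is that the database only ever returns strings, and the rules for turning them back into numbers live in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eav ("
    " entity INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"  # must hold anything, so it is stored as text
    " PRIMARY KEY (entity, attribute))"
)
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [(1, "name", "Ada"), (1, "age", "36"), (1, "height_m", "1.7")],
)

# Hypothetical type map: this knowledge lives in the application,
# not in the schema, which is the "hidden information" in question.
ATTRIBUTE_TYPES = {"name": str, "age": int, "height_m": float}


def load_entity(entity_id):
    """Read all attributes of one entity and parse each value to its type."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity_id,)
    )
    return {attr: ATTRIBUTE_TYPES[attr](raw) for attr, raw in rows}


# What the database itself hands back for "age" is just a string.
raw_age = conn.execute(
    "SELECT value FROM eav WHERE entity = 1 AND attribute = 'age'"
).fetchone()[0]

entity = load_entity(1)  # parsed using the out-of-band type map
```

Without ATTRIBUTE_TYPES, nothing in the schema says that “age” should be read as an integer rather than as text.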
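It’s not normalized.
One of the rules of normalization is that a field should have the same meaning in each row of the table. In an EAV table, the meaning of the value column changes from row to row, depending on the attribute.
Also, the table doesn’t represent any specific entity.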
SQL Database Normalization Rules
Asking whether it’s normalized or not is largely beside the point.
Instead, you should ask “is it based on a relational model of the subject matter”? The answer is a resounding NO. This is by intent. When new features of the subject matter are discovered, like new entities or attributes, the EAV can be updated with no changes to data definitions.
That means that the EAV is independent of the information requirements on the subject matter. That in turn means we can build an EAV without doing any data analysis of the subject matter. That’s both the great attraction and the terrible pitfall of EAV.
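A rough sketch of that trade-off, using Python's sqlite3 module: the tables, the columns helper, and the favorite_color attribute are hypothetical and exist only for this illustration. A new attribute reaches the EAV table as plain data, while the conventional table needs a change to its definition first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE eav (
    entity    INTEGER NOT NULL,
    attribute TEXT    NOT NULL,
    value     TEXT,
    PRIMARY KEY (entity, attribute)
);
CREATE TABLE person_flat (
    id   INTEGER PRIMARY KEY,
    name TEXT
);
""")


def columns(table):
    """Return the column names of a table as currently defined."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# EAV: a brand-new attribute is just another row; the definition is untouched.
eav_before = columns("eav")
conn.execute("INSERT INTO eav VALUES (1, 'favorite_color', 'green')")
eav_after = columns("eav")

# Conventional table: the same attribute requires a data-definition change.
flat_before = columns("person_flat")
conn.execute("ALTER TABLE person_flat ADD COLUMN favorite_color TEXT")
conn.execute(
    "INSERT INTO person_flat (id, name, favorite_color) VALUES (1, 'Ada', 'green')"
)
flat_after = columns("person_flat")
```

The same property is the pitfall: the EAV table would accept a misspelled or meaningless attribute name just as readily, because no analysis of the subject matter is encoded in its definition.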