Suppose that I have the following logical information:
(tag_name, id). For example: (food, 1),(food, 2),(food, 3),(drink, 1),(drink, 2)
I’m trying to decide what is the best way to store it in a database. I can simply use a (STRING, INTEGER) table like in the example, or a more compact (STRING, INTEGER_LIST), like so: (food, [1,2,3]),(drink, [1,2]). This method is definitely more compact, but the other method seems more trivially flexible (for example, I can do a join by tag_name or id with another table, and I don’t need to parse the results of my SQL queries).
Which method is better, and why? Is the redundancy of the first method overwhelmingly wasteful, or are SQL databases smart enough to figure out efficient storage for that? Is the post-query processing that the second method needs (like checking if an id is already associated with a tag before adding it) more expensive than putting this burden on the database in the first method?
1
Never, ever try to stuff set-typed values into one attribute because it looks easier than creating another relation. Data bases live for handling relational data efficiently. Unless you think you could write a better storage method than e.g. the Postgres or Oracle maintainers, it is a bad idea to try to manage relations by serializing and processing them yourself.
1