I am working on developing a relational database that tracks transactions that occur on a device I’m working on for my company. There are different types of transactions that could occur on the device, so we have a “trans_type” field in one of our main record tables. My group has decided to make the type of this field an integer and treating it as an enumerated type. My intuition tells me that it would be a better idea to make this field a string so that our database data would be more readable and usable. My co-workers seem to be worried that this would cause more trouble than it is worth. That string comparisons are too costly and the possibility of typos is too great of a barrier.
So, in your opinion, when dealing with a field in a relational database that is essentially an enumerated value, is it a better design decision to make this field an integer or a string? Or is there some other alternative I’ve overlooked?
Note: explicit enumerated types are not supported by the database we are using. And the software we are developing that will interface with this database is written in C++.
1
Enumerated types should be a separate table in your database that have an id number and a string name and any other columns you might find useful. Then each type exists as a row in this table. Then in your table you are recording the transactions the “trans_Type” field should be a foreign key to the key of that reference table. This is a standard practice in database normalization.
This way you have stored the one official name string, get to use number comparisons for performance, and have referential integrity that every transaction has a valid type.
4
An common practice is to create a trans_types
table, and then have your main table reference it with foreign key named trans_type_id
. This ensures that your records will only reference valid enumerated types.
Example:
trans_type ---------- id name transactions ------------ id trans_date details trans_type_id (FK to trans_type.id)
Example Data:
trans_type ID | NAME ---------- 1 | SUBMIT 2 | CANCEL transactions ID | trans_date | trans_type_id --------------------------------- 1 | 2012-12-31 | 1 2 | 2013-01-09 | 2
If the values are coming into the database as integers, store them that way. There’s no need to put the over-head of converting to strings while writing to the database. You can always relate to a lookup table with the string/text values (More Normalized).
This has the added advantage of updating the string value in a single location instead of running some sort of update routine. Instead of 1 = ‘Red’ it could equal ‘Really Red’
This is not ideal for reporting performance compared to just needing one table with string values (Denormalized). An index on this field would make performance good enough.
Most RDBMS will allow enough horsepower. Although your idea of being able to ‘read’ the table in its plain data form, joining a table is no big deal. Just get in the habit of using a view or some similar object.
I have to disagree with the other answers to this question advocating the separate enumeration table approach.
However, I certainly am in favor of not repeating what has already been said, so I’ll simply refer to the accepted answer to (more or less) the same question on Stack Overflow: https://stackoverflow.com/a/229919/114626
1