In a recent project I was asked to implement an events system. An Event
had to have a Location
which was originally specced out as simply a physical location with some optional extra notes. Then the spec changed (as they have a habit of doing) and we needed to have “online” events too. These would not have a physical address but would still need notes on how to attend (e.g. the URL, joining instructions).
We decided to adapt the existing Location table by adding an IsOnline
field and it wound up looking like this:
+-----------------+ +---------------+
| Event | | Location |
+-----------------+ +---------------+
| Id | .--+ Id |
| Name | | | Name |
| Summary | | | Address |
| Date | | | Postcode |
| Capacity | | | IsOnline |
| LocationId +--' | Notes |
+-----------------+ +---------------+
An example of a physical and an online entry in the Location table look like this:
+----+----------------+--------------------------------------+----------+----------+---------------------------------------+
| Id | Name | Address | Postcode | IsOnline | Notes |
+----+----------------+--------------------------------------+----------+----------+---------------------------------------+
| 1 | Physical Event | 10 Downing Street, London | SW1A 2AA | 0 | Ask the policeman to let you in |
| 2 | Online Event | http://programmers.stackexchange.com | null | 1 | You will need a stackexchange account |
+----+----------------+--------------------------------------+----------+----------+---------------------------------------+
It works (currently) for our simple use case but it is clearly a bit of a hack and it got me thinking – what would be the correct, normalized way to model this kind of relationship (Where an entity must have an A or a B but not both)?
You can keep the fields common to all locations in that table, but have multiple tables handling different location types. You could add a third location-type table like GeoSpacialAddress which would have Lat/Longs instead of physical addresses.
This prevents having a lot of null fields. You just have to be aware in your querying and may need to use a UNION to get a complete list of all types addresses. “IsOnline” can be determined by a location having one or more OnlineAddress records.
Location
- ID
- Name
- Notes
OnlineAddress
- ID
- LocationID
- URL
PhysicalAddress
- ID
- LocationID
- Address
- PostCode
GeoSpacialAddress
- ID
- LocationID
- Lat
- Long
You shouldn’t burden your relational model with distinctions it can’t make.
By all means exploit all the power it brings you to ensure good design and good data; foreign keys, NULL constraints etc. are valid and useful components to build reliable systems on. However, a NULL
constraint is a relatively crude language element; it can be on or off, and that’s pretty much it. It is almost guaranteed that sooner or later your model will have to reflect more complicated conditions, and this is the point that you have just reached. It follows that you must enforce some conditions on a higher implementation level, and never mind that the data model cannot follow suit.
(That doesn’t mean that you should abandon the possibilities that SQL gives you, even when they end up being double-checked by business logic. Defense in depth adds a very real value to many designs – use whatever is worthwhile to defend yourself!)
2