In an ordinary application, the same information may be retrieved from the database in several ways. If all those ways implement caching, you need to invalidate all those cache entries when updating an entry in the database.
Some pairing between updates and selects is easy to do. Let’s take an example of a product 123 in the category 9 with a price of $12.50. If you select a product by ID:
select [...] from [Shop].[Product] where [ProductId] = 123
caching it with a key Product<Id:123>
, it’s not so difficult to invalidate this cache entry after updating a price of a product identified by its ID:
update [Shop].[Product] set [Price] = 14.99 where [ProductId] = 123
Things become more difficult when you have also to remember to invalidate the cache with the key MostExpensive<CategoryId:9>
corresponding to the query:
select max([Price]) from [Shop].[Product] where [CategoryId] = 9
What is an effective way to keep track of cache entries? How to find, during development or code review, what entries must be invalidated when one or another piece of data is changed?
MySQL invalidates its query caches based on the underlying table. If the table used in the query is edited with any INSERT, UPDATE, or DELETE, any query cached with references to that table is invalidated. (I know MySQL isn’t the shining example of best practices, but in this case it works very well.)
So, in your example, if you ever update [Shop].[Product] you could invalidate all caches with references to it. You would need a second set of meta-data that relates every table to related cache entry keys, e.g. [Shop].[Product] => (Product<Id:123>, MostExpensive<CategoryId:9>)
.
A slightly more efficient route would be to NOT invalidate those keys that reference the table’s primary key when those records are not updated. So if you update record with ProductId 345, do not invalidate key Product<Id:123>
.
If you pass all of your queries through a caching layer that quickly grabs table names out of each query and handles all cache entries it’s manageable. I feel like there’s probably a completely different solution that’s more direct, but I can’t think of one at the moment.