To give a bit of background, let's say it's a generic results page, paginated so there are X results per page.
Generally to do this, I have two queries on the page:
- to get the total number of results
- to get the results, limiting by the correct page’s resultset
However, recently I've been trying to cut down on the number of queries the site makes, and I thought one way to do this would be to run the query only when any of the page's parameters have changed (except, of course, the page number).
I would then cache all the result IDs in the session, and slice that array whenever I need the resultset for a given page.
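To make the idea concrete, here's a minimal sketch of what I have in mind (the function name and session key are placeholders of my own, not anything established):

```php
<?php
// Return the IDs for one page out of the full cached list.
// $ids is the array of all result IDs stored in the session.
function page_slice(array $ids, int $page, int $perPage): array
{
    $offset = ($page - 1) * $perPage;
    return array_slice($ids, $offset, $perPage);
}

// On a normal request the IDs would come from the session, e.g.:
//   $_SESSION['result_ids'] = $allIdsFromQuery; // only when params change
//   $pageIds = page_slice($_SESSION['result_ids'], $page, 20);
```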
I've looked around the net for downsides of this approach, but found very little information about it.
Has anyone done this before?
Is it a good idea?
It’s a bad idea.
Assuming the default setup for sessions, you'd be replacing calls to the database with calls to the filesystem; that is, moving from storage that scales to storage that generally doesn't. Sessions are extremely convenient, but they should be used as sparingly as possible, especially the (default) file-based flavour. If you use them for caching, you'd be setting yourself up for a world of trouble:
By default, PHP writes its session data to a file. When a request reaches a script that starts the session (session_start()), that session file is locked. This means that if your web page makes numerous requests to PHP scripts, for instance to load content via Ajax, each request can hold the lock and prevent the other requests from completing.
The other requests will block on session_start() until the session file is unlocked. This is especially bad if one of your Ajax requests is relatively long-running.
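A common mitigation, if you do have to touch the session in concurrent requests, is to release the lock as soon as you have finished writing to it:

```php
<?php
session_start();                  // acquires the lock on the session file

$_SESSION['last_seen'] = time();  // do any session writes up front

session_write_close();            // releases the lock; other requests proceed

// Long-running work (DB queries, API calls) happens after the lock is gone.
// $_SESSION is still readable here, but further writes won't be persisted.
```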
And why would you want to cut down on the queries the site is making in the first place? If you have identified the queries as a bottleneck, you should first try to optimize them. Caching your resultsets in addition to optimizing your queries would be great, but do it properly:
- MySQL query cache, and/or
- memcached
There are various other cache backends that you could try (for example DynamoDB), but memcached should be enough for most common scenarios, and it would certainly be more fitting for pagination results than sessions.
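For illustration, here's a sketch of caching a page of results in memcached; the key scheme, TTL, and helper name are my own assumptions, not a fixed API:

```php
<?php
// Build a cache key from the query parameters, ignoring their order.
function results_cache_key(array $params): string
{
    ksort($params);
    return 'results:' . md5(serialize($params));
}

// Guarded so the sketch degrades gracefully without the extension.
if (class_exists('Memcached')) {
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $key = results_cache_key(['q' => 'foo', 'page' => 2]);
    $ids = $mc->get($key);
    if ($ids === false) {
        $ids = [];                 // ...run the real query here instead...
        $mc->set($key, $ids, 300); // cache for 5 minutes
    }
}
```

Keying on the sorted parameters means two requests that differ only in parameter order hit the same cache entry.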
If the resultset can be pretty large and the user will most likely never scan through all the pages, reading and storing all the results just in case is probably a big waste of resources.
A strategy that has worked pretty well for me in the past is to cache some results and to double the cache size the further the user goes into the results. When a user goes past page 2, the system reads pages 3 and 4; past page 4, it reads pages 5 through 8; and so on. This way the number of queries stays limited, but I won't have to keep a cache with millions of results.
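The doubling scheme above can be sketched as a small helper that, given the requested page, returns the block of pages to fetch and cache in one go (the boundaries match the example: pages 1 and 2 first, then 3 and 4, then 5 through 8, and so on):

```php
<?php
// Return [$first, $last]: the block of pages that should be fetched
// together when the user lands on $page. Block sizes double: 2, 2, 4, 8, ...
function pages_to_fetch(int $page): array
{
    if ($page <= 2) {
        return [1, 2];
    }
    $end = 4;
    while ($page > $end) {
        $end *= 2;
    }
    return [intdiv($end, 2) + 1, $end];
}
```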