I have an action that fetches a model from the db to check access.
Then I have a method that in some cases requires the same model.
function checkAccess(req, res, next) {
    var data = Data.fetch(req.params._id);
    if (data.userId !== req.user._id) {
        return res.sendStatus(401);
    }
    return next();
}
Router()
    .get('/:_id',
        checkAccess,
        function (req, res, next) {
            var data = Data.fetch(req.params._id);
            update(data);
            return res.json(data);
        });
The question isn’t about how to reuse the data in exactly this case (the simplest solution I found is to use res.locals).
My confusion is about fetching the same data from the db twice when it theoretically could be reused.
It isn’t a heavy request, so a caching layer is unlikely to improve access speed compared with going straight to the db. But using a cache can introduce many headaches with invalidation.
I found that a request-scoped (res.locals) cache can partially solve the problem, but it makes my code more verbose and complex. I have to check every piece of data for whether it was already fetched, and also decide whether it could be outdated. I also pass arguments through several methods to reuse data, which pollutes my internal API (e.g. when I send an email I reuse the data needed for rendering the email template, so I can’t just call mailService.sendInvite(userId), but mailService.sendInvite(userId, {user: user, event: event})).
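To make the res.locals approach concrete, here is a minimal runnable sketch of the middleware caching the fetched model on res.locals so the route handler reuses it. The Data and update stand-ins below are hypothetical stubs (the question's real db layer is not shown), and the req/res objects are simulated rather than coming from Express:

```javascript
// Minimal stand-ins so the sketch runs without Express or a real db.
var fetchCount = 0;
var Data = {
  fetch: function (id) {
    fetchCount++; // count db hits to show the model is fetched only once
    return { _id: id, userId: 'u1', value: 42 };
  }
};
function update(data) { data.updated = true; }

function checkAccess(req, res, next) {
  var data = Data.fetch(req.params._id);
  if (data.userId !== req.user._id) {
    return res.sendStatus(401);
  }
  res.locals.data = data; // cache for later handlers in this request
  return next();
}

function showData(req, res) {
  // Reuse the cached model; fall back to a fetch if the middleware didn't run.
  var data = res.locals.data || Data.fetch(req.params._id);
  update(data);
  return res.json(data);
}

// Simulated request/response objects standing in for Express's.
var req = { params: { _id: 'd1' }, user: { _id: 'u1' } };
var res = {
  locals: {},
  sendStatus: function (code) { this.status = code; },
  json: function (body) { this.body = body; }
};
checkAccess(req, res, function () { showData(req, res); });
console.log(fetchCount); // 1 — the model was fetched once, not twice
```

The cache lives only on this request's res object, so there is nothing to invalidate across requests.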
Should I worry about fetching small pieces of the same data from db multiple times per request?
I’d suggest making certain that your database is being leveraged to its full extent before conjuring up any kind of caching mechanism.
Just because a database is “big” doesn’t necessarily mean querying a bit of it will be “slow”. Slow is subjective, and if the query can be done within the timeframe required for an acceptable application response time, then I’d say it isn’t “slow”. If the table being queried is indexed on the id column you are querying, then checking access will not take noticeably longer whether the table holds 10 rows or 10 million.
Also, the database engine caches internally for you. It tracks whether the row correlating with id X has changed since it was last queried, and if it hasn’t, it retrieves the row from its own cache.
Because of these points, I do not see a problem with checking access on every request. If something external can change the access privileges to the data in the system, then doing your own caching runs the risk of someone’s access being revoked while your app still allows it. The database layer is the best place for caching, so let it do the caching.
It looks like you could integrate checkAccess into a higher-level data-fetching function.
var getSomeData = function (req) {
    var data = Data.fetch(req.params._id);
    if (data.userId !== req.user._id) {
        return {error: 401};
    }
    return {success: data};
};

// ...
// in client code
var reply = getSomeData(req);
if (reply.error) {
    return res.sendStatus(reply.error);
} else {
    return res.json(reply.success);
}
I see how attractive reusable methods like checkAccess() could be. But they need certain information to actually check access. You can either haul this security context around as an argument, or pass it into a constructor / factory method. Consider:
var securityContext = getSecurityContext(request); // fetches security data
var mailService = new MailService(securityContext);
var orderProcessor = new OrderProcessor(securityContext);
try {
var orderId = orderProcessor.createOrder(request);
mailService.sendConfirmationEmail(orderId, ...);
} catch (error) {
// report the details of the error
}
With securityContext passed in the constructor, any of your services can reuse it without re-fetching. getSecurityContext() should fetch enough security-related data to avoid further round-trips for it, or at least make them improbable.
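A minimal sketch of this constructor-injection idea, assuming illustrative names (getSecurityContext, MailService, and sendInvite here are stand-ins, not a real API):

```javascript
// Hypothetical: in a real app this would hit the database once per request.
function getSecurityContext(request) {
  return { userId: request.user._id, roles: ['member'] };
}

// The service keeps the context it was built with, so callers no longer
// pass user data through every method signature.
function MailService(securityContext) {
  this.ctx = securityContext;
}
MailService.prototype.sendInvite = function (eventId) {
  // Uses the cached context instead of re-fetching the user.
  return 'invite:' + eventId + ':to:' + this.ctx.userId;
};

var request = { user: { _id: 'u1' } };
var ctx = getSecurityContext(request);
var mailService = new MailService(ctx);
console.log(mailService.sendInvite('e1')); // 'invite:e1:to:u1'
```

The fetch happens once, at construction time, and the service's public API stays clean: sendInvite(eventId) rather than sendInvite(eventId, {user: user, event: event}).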
Database access can be slow if the database is big. So if you have challenging performance needs, you are right to think about this.
The first thing to consider is whether you can provide a single point where checkAccess is called. If you can’t, your application has a security risk: somewhere that access should be checked, it isn’t.
Secondly, if you need caching, remember you aren’t the first person to face this question. Here is a general fix for PostgreSQL users:
https://www.postgresql.org/about/news/1296/
If you aren’t using PostgreSQL, there might be an equivalent for your database.
If the database is on another machine, then each access involves a round-trip over the TCP stack, which is expensive. A query cache inside the database won’t fix that.
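One way to cut those network round-trips is an in-process, per-request memoized fetcher. Below is a sketch under the assumption of a hypothetical fetchById call standing in for the database query; because each request builds its own cache, invalidation worries are limited to that request's lifetime:

```javascript
// Hypothetical db call; the counter stands in for a TCP round-trip.
var roundTrips = 0;
function fetchById(id) {
  roundTrips++;
  return { _id: id, userId: 'u1' };
}

// Wraps a fetch function in a cache scoped to a single request.
function makeRequestCache(fetch) {
  var seen = {};
  return function (id) {
    if (!(id in seen)) {
      seen[id] = fetch(id);
    }
    return seen[id];
  };
}

// Each request constructs its own cachedFetch, so nothing leaks across requests.
var cachedFetch = makeRequestCache(fetchById);
cachedFetch('d1');
cachedFetch('d1');
console.log(roundTrips); // 1 — the second call was served from memory
```

This trades a small amount of wrapper code for eliminating duplicate round-trips, without the cross-request invalidation problems of a shared cache.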