I’ve read an article about InnoDB performance optimization, and in that post the author repeatedly mentioned something called a “battery backed cache”. It is not clear to me what he was talking about, and Google did not help either. My guess is that this is some kind of backup storage in case there is a power outage. Am I right?
1
RAID controllers use a battery-backed cache so they can acknowledge writes faster than the disks can actually complete them. Without the battery, they couldn’t do that kind of caching without risking data loss during a power failure.
That’s the only instance I can think of where you’d need a cache that is specifically battery backed. If the cache weren’t volatile memory, the battery wouldn’t matter.
I suppose I should note that this is a feature of a professional-grade hardware RAID setup, not a software or consumer-grade “on board” RAID. The batteries are substantial, around the size of an Altoids tin. They tend to go bad every 4–5 years, which disables the write cache and deals a performance hit to your backing store.
2
Magnetic storage is slow, so it pays to put a cache in front of it. It’s easy to see how that accelerates reads, but can it accelerate writes too?
-
a write-through cache doesn’t, because although it keeps a copy of the written data in the cache (for subsequent reads), it doesn’t acknowledge the operation as successful until it’s on real storage (“hits iron oxide”).
-
a write-back cache does, because it signals the host that the operation is finished as soon as the data is in the cache. It writes to permanent (magnetic) storage somewhat later, in the background.
At first, write-back caches sound better, but they introduce a vulnerability window: if there’s a power loss, any data acknowledged but not yet written is lost. Filesystems and databases can’t protect against this, because all their journalling, scheduler barriers, operation ordering, etc. depend on a write acknowledgement meaning that the data is already safely stored.
The solution is to add a small battery to the cache, to allow it to survive a power loss. As soon as power is restored, any pending writes will be completed (even before the host is booted up).
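The difference between the two policies, and why the acknowledgement matters, can be shown with a toy model. This is a hypothetical sketch in Python (the class names and structure are my own, not any real controller’s firmware): the write-through cache puts data on “disk” before acknowledging, while the write-back cache acknowledges from RAM, so a power cut loses whatever is still pending.

```python
# Toy model of the two cache policies described above (hypothetical,
# not a real controller's design).

class Disk:
    def __init__(self):
        self.blocks = {}         # block number -> data, the "iron oxide"

class WriteThroughCache:
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}          # copy kept only to speed up later reads

    def write(self, block, data):
        self.disk.blocks[block] = data   # data hits the disk first...
        self.cache[block] = data
        return True                      # ...and only then is it acknowledged

class WriteBackCache:
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}
        self.dirty = set()       # acknowledged but not yet on disk

    def write(self, block, data):
        self.cache[block] = data
        self.dirty.add(block)
        return True              # acknowledged immediately: the risk window

    def flush(self):             # background writer, or battery-driven replay
        for block in self.dirty:
            self.disk.blocks[block] = self.cache[block]
        self.dirty.clear()

    def power_loss(self):        # without a battery, dirty blocks vanish
        lost = len(self.dirty)
        self.cache.clear()
        self.dirty.clear()
        return lost

wt = WriteThroughCache(Disk())
wt.write(1, "safe")              # on disk before the acknowledgement

disk = Disk()
wb = WriteBackCache(disk)
wb.write(1, "journal entry")
wb.write(2, "data page")
wb.flush()                       # these made it to disk in time
wb.write(3, "unflushed change")
lost = wb.power_loss()          # no battery: one acknowledged block is gone
print(lost, sorted(disk.blocks))
```

The battery-backed variant is essentially the same `WriteBackCache`, except that `flush()` is guaranteed to run eventually, even across the outage.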
Nowadays there are also two other alternatives:
-
replace the RAM cache with non-volatile storage. SSDs can be much bigger for the same cost, but they’re not as fast as RAM. Still, in many cases they’re faster than the data link, so they can be fast enough.
-
add a small non-volatile memory to the cache. In the event of power loss, a very small battery (or a supercapacitor) gives it just enough extra time to save the pending cache entries to flash. There’s no need to keep RAM alive for an indeterminate number of hours.
Caching is used to speed up operations. A typical usage pattern for a user is to access the same record a couple of times. Read it, review it, get some details on it, etc. If the underlying system keeps whatever record in memory, even briefly, the next time the user tries to interact with the record, the access is thousands of times faster than loading it from disk again.
The problem comes when engineers get the bright idea that, to improve system performance, they will keep the changes the user has made to a record in memory for a while. The data is not written to disk until the cache is flushed, which could be a few seconds or maybe ten. But because the cache is small and lots of changes won’t fit, some changes may be written to disk while others remain briefly in the write cache. If the power goes out while data is in this limbo, the results can be devastating to the integrity of a file or database system.
So some systems implement a logical protection scheme in software, by timestamping changes, logging cached changes, or all kinds of slightly wacky things. But if you have a battery-backed write cache, you can dispense with those schemes and safely know that the write cache will always be written through.
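The “logging cached changes” protection mentioned above is essentially write-ahead logging. Here is a minimal Python sketch (a hypothetical format of my own, not any real database’s): every change is appended and `fsync`’d to a log before being applied in memory, so after a crash the log can be replayed to recover anything a write cache never flushed.

```python
# Minimal write-ahead-log sketch of the software protection described above
# (hypothetical format, not a real database's implementation).
import json
import os
import tempfile

class WalStore:
    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        # Crash recovery: replay whatever made it into the log.
        if os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    rec = json.loads(line)
                    self.data[rec["key"]] = rec["value"]

    def put(self, key, value):
        # Append the change to the log and force it to stable storage
        # BEFORE updating the in-memory copy.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())   # the expensive step a battery-backed
                                     # controller cache can absorb safely
        self.data[key] = value

path = os.path.join(tempfile.mkdtemp(), "wal.log")
store = WalStore(path)
store.put("a", 1)
store.put("b", 2)
recovered = WalStore(path)           # simulate a restart after a crash
print(recovered.data)
```

The `os.fsync` call is the cost of doing this in software: every change waits for the disk. With a battery-backed write cache, the controller can acknowledge that sync instantly and still guarantee the log survives a power loss.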