I am trying to figure out how to get Ignite into a stable state after a graceful shutdown when persistence is enabled for a data region and WAL_MODE is NONE.
I have built a small reproducer at https://github.com/hostettler/ignite-consistency. The README should be enough to quickly rebuild and retest it.
Basically, my understanding is that when WAL_MODE is NONE, the node should remain consistent and clean as long as the shutdown is graceful (via SIGTERM and not SIGKILL).
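To make clear what I mean by "graceful", here is a minimal sketch of the mechanism, not the reproducer's actual code. It relies only on the JVM: a shutdown hook runs on SIGTERM (and on normal exit) but never on SIGKILL. The class name `GracefulStop` and the `stopAction` parameter are hypothetical; in an Ignite application the action would presumably be a node stop such as `Ignition.stop(false)`, which is an assumption on my side.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, for illustration only.
final class GracefulStop {

    // Registers a JVM shutdown hook that runs stopAction at most once.
    // The hook fires on SIGTERM or normal exit; SIGKILL bypasses it entirely.
    static Thread register(Runnable stopAction) {
        AtomicBoolean done = new AtomicBoolean(false);
        Thread hook = new Thread(() -> {
            if (done.compareAndSet(false, true)) {
                stopAction.run();
            }
        }, "graceful-stop");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Assumption: replace the println with the real node stop call.
        register(() -> System.out.println("stopping node gracefully"));
    }
}
```

With this in place, `kill -TERM <pid>` gives the stop action a chance to run, while `kill -KILL <pid>` does not.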
The reproducer leads to the following error, which reports a WAL inconsistency even with WAL_MODE NONE. The first run finishes cleanly with a graceful shutdown, but the next run fails with a corruption error:
Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=WALPointer [idx=0, fileOff=0, len=0], walPath=db/wal, walArchive=db/wal/archive]]]
class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to read checkpoint record from WAL, persistence consistency cannot be guaranteed. Make sure configuration points to correct WAL folders and WAL folder is properly mounted [ptr=WALPointer [idx=0, fileOff=0, len=0], walPath=db/wal, walArchive=db/wal/archive]
My questions:
- How can I have an inconsistent WAL if WAL is … disabled?
- What am I missing in the graceful shutdown to avoid this? I suspect I will have similar problems when I use LOG_ONLY.