Is it possible, in theory, to recover after a process is mistakenly directed to read from a wrong memory address, rather than terminating it?
Let's say an error while working with registers leads the processor to read from a random place in memory, which then raises an illegal instruction exception. At this point, is there any way to recover to a stable state rather than terminating the process?
Are there any processor architectures (especially for embedded systems) with extra features to deal with this issue?
Also, are there any research papers that try to determine which return addresses are valid for a function? For example, if my function is trying to return to an address that is (virtually) invalid for my program to do its job, have there been any efforts to detect such a violation, either at the programming language level or in the operating system and memory management?
Note: Illegal Instruction is an exception thrown by the processor.
Update
Would manually saving the last known valid instruction pointer somewhere help? Maybe in an unused or reserved register, but it would have to be guaranteed to stay untouched by the rest of the program.
When the processor raises an illegal instruction error, there are usually so many unknowns about the program state that the easiest way to get back into a known-good state is to let the process crash and let the fail-safe mechanisms restart it, or to let a fall-back system take over. This might go as far as restarting the embedded device.
All processors that I know of signal errors like illegal instruction through interrupts. This is how an OS like Windows can inform the user that an application did something terribly wrong. In the firmware for embedded devices, you have enough control over the OS to hook up a different handler to that interrupt. This alternative handler might try to recover the situation, but in my experience, the only recovery used in practice is “terminate and restart”.
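To illustrate what hooking up an alternative handler can look like, here is a minimal sketch in C. It assumes a hosted POSIX system, where the illegal-instruction trap reaches the process as the `SIGILL` signal; on bare-metal firmware you would instead install the handler in the interrupt vector table. The names `run_guarded`, `risky_work`, and `harmless_work` are made up for the example, and `raise(SIGILL)` only simulates the fault. Jumping back like this does not repair whatever state the faulty code corrupted, which is exactly why "terminate and restart" is usually preferred.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Recovery point recorded just before the risky code runs. */
static sigjmp_buf recovery_point;

/* Runs when SIGILL is delivered; abandons the faulty code path by
 * jumping back to the recovery point. */
static void on_illegal_instruction(int signo)
{
    (void)signo;
    siglongjmp(recovery_point, 1);
}

/* Hypothetical stand-in for faulty code: raise() delivers SIGILL to the
 * process, much as the kernel would after a real illegal-instruction trap. */
void risky_work(void)
{
    raise(SIGILL);
}

/* Hypothetical stand-in for code that behaves. */
void harmless_work(void)
{
}

/* Returns 0 if work() completed, 1 if it was abandoned after SIGILL,
 * and -1 if the handler could not be installed. */
int run_guarded(void (*work)(void))
{
    struct sigaction sa, old;
    int faulted;

    assert(work != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_illegal_instruction;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* The second argument (1) saves the signal mask, so SIGILL is
     * unblocked again after the jump out of the handler. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        work();
        faulted = 0;
    } else {
        faulted = 1;
    }

    sigaction(SIGILL, &old, NULL); /* put the previous handler back */
    return faulted;
}
```

Calling `run_guarded(risky_work)` returns 1 and the program keeps running, while `run_guarded(harmless_work)` returns 0. The recovery point is set per call on purpose, so the jump never lands in a stack frame that has already returned.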
On embedded systems, where you can’t always just exit the process and let the OS clean up (what if there’s no OS?), an alternative method is to jump to a reinit() handler, which reinitializes RAM and jumps to the program’s entry point.
This is a reasonable technique if the CPU doesn’t need complex reinitialization after such an error. But you should probably have a fallback technique to reset the actual CPU if the simple reinit() fails. Sometimes a full clean restart may be needed.
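As a rough sketch of that escalation policy, the C below decides between a soft reinit and a full reset by counting consecutive soft attempts. Everything in it is an assumption made for illustration: the names `choose_recovery`, `mark_healthy`, and `app_state`, and the limit of three attempts. On real hardware the counter would have to live in memory that reinit() does not clear, and the caller would act on the result by jumping to the entry point or triggering a platform-specific CPU reset (for example by letting a watchdog expire).

```c
#include <string.h>

/* Assumed limit on consecutive soft reinits before escalating. */
#define MAX_SOFT_REINITS 3u

/* Hypothetical application state; stands in for the RAM that reinit() clears. */
struct app_state {
    unsigned counter;
    unsigned char buffer[32];
};
static struct app_state state;

/* Counts consecutive soft reinits. On a real target this must sit in
 * memory that survives reinit(); here it is simply kept separate. */
static unsigned soft_reinit_count;

enum recovery {
    RECOVERY_SOFT_REINIT, /* RAM was cleared; caller jumps to the entry point */
    RECOVERY_HARD_RESET   /* soft reinit keeps failing; caller resets the CPU */
};

/* Clears the application state, as reinit() would clear RAM. */
static void reinit_ram(void)
{
    memset(&state, 0, sizeof state);
}

/* Called from the fault handler: tries the cheap recovery first and
 * escalates once it has been used too many times in a row. */
enum recovery choose_recovery(void)
{
    if (soft_reinit_count >= MAX_SOFT_REINITS)
        return RECOVERY_HARD_RESET;

    soft_reinit_count++;
    reinit_ram();
    return RECOVERY_SOFT_REINIT;
}

/* Called once the program has run normally for a while, so that old
 * faults do not count against future ones. */
void mark_healthy(void)
{
    soft_reinit_count = 0;
}
```

With these assumptions, three faults in a row each yield `RECOVERY_SOFT_REINIT`, the fourth yields `RECOVERY_HARD_RESET`, and a call to `mark_healthy()` resets the tally.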