Say that I have the following situation:
void myFunc()
{
    int x;
    // Do something with x
}
“x” is placed on the stack, which is no doubt fast.
Now, “myFunc” is called very frequently, let's say 10 times per second.
Is it plausible to do something like this instead:
int x;
void myFunc()
{
    // Do something with x
}
so that x gets allocated in the application's data segment, but only once? Since “myFunc” is called so frequently, does the second approach deliver any performance benefit?
Allocating a variable on the stack and deallocating it amounts to a simple subtraction from and addition to the stack pointer. Since that adjustment happens anyway when entering and leaving a function, local variables are so cheap that trying to optimize them into anything else will generally cost more than it saves.
Putting the variable in the data segment can also incur a cache cost, whereas the top of the stack will usually already be in cache.
The biggest disadvantage is that you lose reentrancy: recursion won’t work correctly and the function is no longer thread-safe.
The other big disadvantage of letting the variable escape the function's scope is that the optimizer can no longer make it a register-only variable, one that never reaches RAM to begin with.
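To make the reentrancy point concrete, here is a minimal sketch. The function names and the variable `g_n` are made up for illustration. Both functions are meant to sum the integers from 1 to n recursively, but the first keeps its working value in a file-scope variable, so every level of the recursion shares and overwrites it.

```c
/* Hypothetical file-scope variable shared by every call level. */
static int g_n;

/* Intended to return 1 + 2 + ... + n, but the recursive call
   overwrites g_n before this level reads it again. */
int sum_to_global(int n)
{
    g_n = n;
    if (g_n == 0)
        return 0;
    int rest = sum_to_global(g_n - 1); /* inner calls leave g_n == 0 */
    return g_n + rest;                 /* g_n is no longer n here */
}

/* Same logic with a local: each call level gets its own x. */
int sum_to_local(int n)
{
    int x = n;
    if (x == 0)
        return 0;
    int rest = sum_to_local(x - 1);
    return x + rest;
}
```

With n = 4 the local version returns 10, while the global version returns 0, because by the time each level adds its own value the innermost call has already reset `g_n` to 0. Two threads calling the global version at the same time would interfere in a similar way.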
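A rough sketch of the register point, using made-up names (`g_sum`, `sum_local`, `sum_global`). Both functions compute the same result. What the compiler actually emits depends on the compiler, target and optimization level, so treat the comments as the typical case rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical file-scope accumulator, visible to the whole program. */
int g_sum;

/* Local accumulator: nothing outside the function can observe `sum`,
   so an optimizing compiler is free to keep it in a register. */
int sum_local(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Global accumulator: `a` might point at g_sum, and other code can
   read it, so the compiler may have to go through memory in the loop. */
int sum_global(const int *a, size_t n)
{
    g_sum = 0;
    for (size_t i = 0; i < n; i++)
        g_sum += a[i];
    return g_sum;
}
```

Comparing the generated assembly of the two functions at a high optimization level is the easiest way to see whether your compiler treats them differently.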
It also depends on the architecture. On ARM, the first few locals are typically kept in registers only. As long as you do not call any other functions, they will never go onto the stack.
You need to go in the opposite direction. Rather than using globals, you should start by limiting the scope of the variables as much as possible. Not only is this good coding practice, it also helps the compiler optimize your code better. You should compile with maximum speed optimization and then look at the generated assembly code. Even better is to step through the code if your IDE will show you the assembly alongside the C source. You will be surprised at what the compiler can do.
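As a small illustration of limiting scope, here is a sketch with a hypothetical function, `sum_positive_squares`. Every variable is declared in the narrowest block that needs it. Assuming GCC or Clang, something like `gcc -O2 -S file.c` writes the generated assembly to `file.s` so you can inspect what the compiler did with these locals.

```c
#include <stddef.h>

/* Hypothetical example: sum of the squares of the positive entries. */
long sum_positive_squares(const int *v, size_t n)
{
    long total = 0;                      /* needed for the whole function */
    for (size_t i = 0; i < n; i++) {     /* i exists only inside the loop */
        if (v[i] > 0) {
            long sq = (long)v[i] * v[i]; /* sq exists only where it is used */
            total += sq;
        }
    }
    return total;
}
```

Because `i` and `sq` cannot be observed outside their blocks, the compiler has full freedom over where they live, and a reader can see at a glance how long each value matters.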
See here for a good discussion: https://stackoverflow.com/questions/15180309/local-variable-location-in-memory
Only after you have good, readable code that has been optimized by the compiler should you start looking at custom hand-coded optimizations. Normally this is only needed for special real-time applications with very strict timing constraints.