This question is about performance-critical software. Yes, really. I’m dealing with files that are between 150 MB and 1 TB, averaging around 100 GB (and yes, that’s 1 terabyte), so you can imagine that waiting 3 hours instead of 6 hours (as opposed to 3 minutes instead of 6 minutes) is worthwhile.
TL;DR
Declaring all your variables at the start of your function and then reusing some of those variables seems to be how people traditionally coded, but these days it seems like your C code will be faster if you only declare and use variables in the scope where they’re needed. (And, AFAICT, it also helps avoid bugs?)
Is this right?
What I learned in University
In university, “classically”/traditionally, I was taught to write C as follows:
    void foo(int b, int c) {
        int i, fa, fb;
        for (i = 0; i < b; ++i) {
            // do stuff!
            fa = ..
            fb = ..
            // etc..
        }
        for (i = 0; i < c; ++i) {
            // do more stuff
            fa = ..
            fb = ..
            // etc..
        }
    }
In other words:
- Declare your variables at the top of the function
- Below your “variable declarations”, write your code
- Do not mix the 2 sections..
..vs default stack sizes and optimization
But, then I realized 2 things:
- The stack (which is sized per thread, not per function) is, AFAIK, at least 128 KB, and on Linux the default is something like 4 MB or 8 MB.
- Reusing a variable is (probably?) confusing to the compiler (optimization module).
(Re: confusing the compiler and how difficult optimization is, see for example this paper.)
Given those two things, it actually makes more sense to write your code like this:
    void foo(int b, int c) {
        for (int i = 0; i < b; ++i) {
            // do stuff!
            int fa = ..
            int fb = ..
            // etc..
        }
        for (int i = 0; i < c; ++i) {
            // do more stuff
            int fa = ..
            int fb = ..
            // etc..
        }
    }
From what I understand, it doesn’t really matter in terms of stack space that I have 2 int i declarations, because the compiler reuses that space on the stack.
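To make the comparison concrete, here is a minimal sketch of both styles doing the same work. The function names and the arithmetic are made up purely for illustration. I would expect an optimizing compiler to treat the two versions very similarly, but that is an assumption worth checking by compiling with something like gcc -O2 -S and comparing the assembly.

```c
#include <stddef.h>

/* Style 1: every variable declared at the top and reused across both loops. */
long sum_top_declared(const int *a, size_t na, const int *b, size_t nb)
{
    size_t i;
    long term, total;

    total = 0;
    for (i = 0; i < na; ++i) {
        term = (long)a[i] * 2;
        total += term;
    }
    for (i = 0; i < nb; ++i) {
        term = (long)b[i] * 3;
        total += term;
    }
    return total;
}

/* Style 2: each variable declared in the narrowest scope that needs it (C99 or later). */
long sum_scoped(const int *a, size_t na, const int *b, size_t nb)
{
    long total = 0;

    for (size_t i = 0; i < na; ++i) {
        const long term = (long)a[i] * 2;
        total += term;
    }
    for (size_t i = 0; i < nb; ++i) {
        const long term = (long)b[i] * 3;
        total += term;
    }
    return total;
}
```

Both functions return the same value for the same inputs. The scoped version also makes it impossible to accidentally read a stale term or i left over from the first loop, which is the bug-avoidance angle.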
And even if this second piece of code does have a larger stack footprint, it still doesn’t matter: the stack is something like 128 KB to 4 MB, which means 2 things:
- ints are usually 4 bytes (and all other non-struct variables are at most 8 bytes), so I would have to have a lot of variables to reach 128 KB.
- The L2 cache on all modern machines (that I work on) is at least 1 MiB, so I’m also not in danger of taking a performance hit from hitting that limit.
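The arithmetic behind that first point can be sketched as follows. This assumes a POSIX system, where getrlimit with RLIMIT_STACK reports the stack limit. The two helper names are hypothetical.

```c
#include <stddef.h>
#include <sys/resource.h>

/* How many int-sized locals fit in a stack budget of `bytes` bytes. */
size_t ints_that_fit(size_t bytes)
{
    return bytes / sizeof(int);
}

/*
 * Soft stack limit for the current process in bytes (POSIX only).
 * Returns 0 if the limit is unlimited or cannot be queried.
 */
size_t stack_soft_limit_bytes(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return 0;
    return (size_t)rl.rlim_cur;
}
```

With a 4-byte int, ints_that_fit(128 * 1024) comes to 32768, so a handful of extra scalar locals is nowhere near even the smallest stack size mentioned above. Calling stack_soft_limit_bytes() shows what the actual limit is on a given machine rather than relying on remembered defaults.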
Caveat: sure, if I’m using alloca and have large local data structures, I might hit those limits. But I’m not.. so. 🙂
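For completeness, here is a rough sketch of the usual way around that caveat: keep small scratch space on the stack and fall back to malloc once a buffer gets large. The 4096-byte cut-off and the function name are arbitrary assumptions for illustration, not recommendations.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-off: anything larger goes on the heap instead of the stack. */
#define SMALL_BUF_BYTES 4096

/*
 * Sums the bytes of `data` by way of a scratch copy.
 * `data` must point to at least `n` readable bytes.
 * Returns -1 if a heap allocation was needed and failed.
 */
long sum_via_scratch(const unsigned char *data, size_t n)
{
    unsigned char small[SMALL_BUF_BYTES];
    unsigned char *scratch = small;
    long total = 0;

    if (n > sizeof small) {
        scratch = malloc(n);   /* large request: heap rather than alloca */
        if (scratch == NULL)
            return -1;
    }
    memcpy(scratch, data, n);
    for (size_t i = 0; i < n; ++i)
        total += scratch[i];
    if (scratch != small)
        free(scratch);
    return total;
}
```

The stack cost of this function stays fixed at roughly SMALL_BUF_BYTES no matter how large n is, whereas an alloca(n) call would grow the frame with the input.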