In many languages (a wide list, from C to JavaScript):
- commas (,) separate arguments (e.g. func(a, b, c)), while
- semicolons (;) separate sequential instructions (e.g. instruction1; instruction2; instruction3).
So why is this mapping reversed in the same languages for for loops:
for ( init1, init2; condition; inc1, inc2 )
{
instruction1;
instruction2;
}
instead of (what seems more natural to me)
for ( init1; init2, condition, inc1; inc2 )
{
instruction1;
instruction2;
}
?
Sure, for is (usually) not a function, but its arguments (i.e. init, condition, increment) behave more like arguments of a function than a sequence of instructions.
Is it due to historical reasons / a convention, or is there a good rationale for the interchange of , and ; in loops?
So why in the same languages such mapping is reversed for for loops.
Technically, the mapping is not “reversed”.
- The things separated by commas are not parameters. In (at least) C++ and Java, they can be declarations, so they are not even expressions.
- The things separated by semicolons are not (single) statements either.
In reality what we have here is a different syntactic context where the same symbols are being used differently. We are not comparing like with like, so there is no mapping, and no strong argument for a consistent mapping based on semantic consistency.
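As a minimal C99 sketch of the first point (the function name sum_pairs is hypothetical, chosen only for illustration): in the init clause below the comma separates two declarators inside a single declaration, whereas the comma in the third clause is the comma operator.

```c
/* Hypothetical helper: i walks upward and j walks downward until they
   meet; on every pass i + j is added to the running total. */
int sum_pairs(int n)
{
    int total = 0;
    /* "int i = 0, j = n" is ONE declaration with two declarators.
       This comma is declaration syntax, not the comma operator. */
    for (int i = 0, j = n; i < j; i++, j--) {
        total += i + j;   /* i + j stays equal to n on every pass */
    }
    return total;
}
```

Calling sum_pairs(10) runs the body for the pairs (0,10), (1,9), (2,8), (3,7) and (4,6).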
So why not do it the other way around?
Well, I think the reasons come from the “natural” meaning of , and ;. In written English, a semicolon is a “stronger” break than a comma, and the glyph for a semicolon is more visible than the one for a comma. Those two things combine to make the current arrangement seem (to me!) more natural.
But the only way to know for sure why the syntax choice was made would be if the C designers could tell us what they were thinking back in ~1970. I doubt that they have a clear memory of technical decisions made that far back in time.
Is it due to historical reasons / a convention
I’m not aware of any language before C that used a C-like syntax for “for” loops:
- Donal Fellows notes that BCPL and B didn’t have an equivalent construct.
- The FORTRAN, COBOL and Algol-60 (and Pascal) equivalents were less expressive, and had syntaxes that did not resemble C “for” syntax.
But languages like C++ and Java that came after C all clearly borrow their “for” syntax from C.
We write loops like:
for(x = 0; x < 10; x++)
The language could have been defined so that loops looked like:
for(x = 0, x < 10, x++)
However, think of the same loop implemented using a while loop:
x = 0;
while(x < 10)
{
x++;
}
Notice that the x=0 and x++ are statements, ended by semicolons. They aren’t expressions like you would have in a function call. Semicolons are used to separate statements, and since two of the three elements in a for loop are statements, that’s what is used there. A for loop is just a shortcut for such a while loop.
Additionally, the arguments don’t really act like arguments to a function. The second and third are repeatedly evaluated. It’s true they aren’t a sequence, but they also aren’t function arguments.
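A small C sketch of that difference, assuming hypothetical helpers (below, identity and the two count_ functions are invented for illustration): an expression passed as a function argument is evaluated once, while the same expression used as a loop condition is evaluated before every pass.

```c
int cond_calls = 0;   /* how many times below() has been evaluated */

/* Hypothetical helper that records each evaluation. */
int below(int x, int limit)
{
    cond_calls++;
    return x < limit;
}

int identity(int v) { return v; }

/* As a function argument, below(...) is evaluated exactly once. */
int count_as_argument(void)
{
    cond_calls = 0;
    (void)identity(below(0, 3));
    return cond_calls;
}

/* As a loop condition, below(...) is evaluated before every pass:
   three times yielding true, then once yielding false. */
int count_as_loop_condition(void)
{
    cond_calls = 0;
    for (int x = 0; below(x, 3); x++) {
        /* empty body */
    }
    return cond_calls;
}
```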
Also, using commas to combine several expressions into one statement is not limited to the for loop; you can do it outside the for loop too.
x = 0, y= 3;
is a perfectly valid statement even outside of a for loop. I don’t know of any practical use outside the for loop though. But the point is that commas always subdivide statements; it’s not a special feature of the for loop.
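A short sketch in C of the comma operator at statement level (the function name comma_value is hypothetical): the operands are evaluated left to right, and the value of the whole expression is the value of the right-hand operand.

```c
/* Hypothetical demo of the comma operator outside a for loop. */
int comma_value(void)
{
    int x, y;
    x = 0, y = 3;            /* one expression statement, two assignments */
    int z = (x = 5, x + y);  /* x becomes 5 first, then x + y is the result */
    return z;
}
```

The parentheses in the initializer of z matter: without them the comma would be read as separating two declarators rather than as the operator.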
In C and C++ this is the comma operator, not just a comma.
The grammar for a for loop is something like
for ([pre-expression]; [terminate-condition]; [increment-expression]) body-statement
In the case of your question:
pre-expression -> init1, init2
terminate-condition -> condition
increment-expression -> inc1, inc2
Note that the comma-operator allows you to perform multiple actions in one statement (as the compiler sees it). If your suggestion was implemented there would be an ambiguity in the grammar as to when the programmer intended to write a comma-operator statement or a separator.
In short, ; signifies the end of a statement. A for loop is a keyword followed by a list of optional statements surrounded by (). The comma-operator statement allows the use of , in a single statement.
There is no conceptual reversal.
Semicolons in C represent more major divisions than commas. They separate statements and declarations.
The major divisions in the for loop are the three expressions (or a declaration and two expressions) and the body.
The commas you see in C for loops are not part of the syntax of the for loop specifically. They are just manifestations of the comma operator.
Commas are major separators between arguments in function calls and between parameters in function declarations, but semicolons are not used. The for loop is special syntax; it has nothing to do with functions or function calls.
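To make the distinction concrete, here is a small C sketch with hypothetical helper names: inside a call's parentheses a comma separates arguments, and it only acts as the comma operator if the expression is wrapped in its own parentheses.

```c
/* Hypothetical helpers taking two arguments and one argument. */
int first_of_two(int a, int b) { (void)b; return a; }
int only_one(int a) { return a; }

/* The comma separates two arguments; the first one is returned. */
int demo_separator(void)
{
    return first_of_two(1, 2);
}

/* The inner parentheses make the comma an operator, so a single
   argument is passed: x is set to 1, then x + 1 is the value. */
int demo_operator(void)
{
    int x;
    return only_one((x = 1, x + 1));
}
```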
Maybe this is something specific to C/C++, but I am posting this answer because the syntax of the languages you described is mostly influenced by C syntax.
Besides the points made in the previous answers, there is also a technical reason: in C and C++ the comma is actually an operator, and in C++ you can even overload it. A semicolon operator (operator;()) would possibly make it harder to write compilers, since the semicolon is the axiomatic statement terminator.
What makes this interesting is the fact that the comma is widely used as a separator all over the language. It seems like the comma operator is an exception that is mainly used to get for-loops with multiple conditions working, so what’s the deal?
In fact, operator, is built to do the same thing as the comma in definitions, argument lists, and so on: it has been built to separate expressions, something the syntactic construct , cannot do. It can only separate what has been defined in the standard.
However, the semicolon does not separate; it terminates. And this is also what leads us back to the original question:
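int a;    /* a single declaration cannot mix int and float, */
float b;  /* so both variables are declared before the loop */
for (a = 0, b = 0.0f; a < 100 && b < 100.0f; a++, b += 1.0f)
    printf("%d: %f\n", a, b);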
The comma separates the expressions in the three loop parts, whereas the semicolon terminates a part (initialization, condition or afterthought) of the loop definition.
Newer programming languages (like C#) may not allow overloading the comma-operator, but they most likely kept the syntax, because changing it feels somehow unnatural.
To me, they are used with more or less the same meaning as in their linguistic sense. Commas are used with lists, and semicolons with more separate parts.
In func(a, b, c) we have a list of arguments.
instruction1; instruction2; instruction3 is maybe a list, but a list of separate and independent instructions.
While in for ( init1, init2; condition; inc1, inc2 ) we have three separate parts: a list of initializations, a condition and a list of increment expressions.
The easiest way to see it is the following:
for(x = 0; x < 10; x++)
is:
for(
x = 0;
x < 10;
x++
)
In other words, the x = 0 part is actually a statement/instruction rather than a parameter. You insert a statement there. Hence they are separated by semicolons.
In fact, there is no way they could be separated by commas. When was the last time you inserted something like x<10 as a parameter? You do that if you want to compute x<10 once and pass the result of that operation as a parameter. So in comma world you would write x<10 if you wanted to pass the value of x<10 to a function.
Here you specify that the program should check x<10 every time the loop is passed. So that’s an instruction.
x++ is definitely another instruction.
Those are all instructions. So they are separated by semicolons.