The definition of “C-Style language” can practically be simplified down to “uses curly braces ({}).” Why do we use that particular character (and why not something more reasonable, like [], which doesn’t require the shift key, at least on US keyboards)?
Is there any actual benefit to programmer productivity that comes from these braces, or should new language designers look for alternatives (as the designers of Python did)?
Wikipedia tells us that C uses said braces, but not why. A statement in the Wikipedia article List of C-based programming languages suggests that this syntax element is somewhat special:
Broadly speaking, C-family languages are those that use C-like block syntax (including curly braces to begin and end the block)…
5
Two of the major influences on C were the Algol family of languages (Algol 60 and Algol 68) and BCPL (from which C takes its name).
BCPL was the first curly bracket programming language, and the curly
brackets survived the syntactical changes and have become a common
means of denoting program source code statements. In practice, on
limited keyboards of the day, source programs often used the sequences
$( and $) in place of the symbols { and }. The single-line ‘//’
comments of BCPL, which were not taken up in C, reappeared in C++, and
later in C99.
From http://www.princeton.edu/~achaney/tmve/wiki100k/docs/BCPL.html
BCPL introduced and implemented several innovations which became quite
common elements in the design of later languages. Thus, it was the
first curly bracket programming language (one using { } as block
delimiters), and it was the first language to use // to mark inline
comments.
From http://progopedia.com/language/bcpl/
Within BCPL, one often sees curly braces, but not always. This was a limitation of the keyboards at the time. The characters $( and $) were lexically equivalent to { and }. Digraphs and trigraphs were maintained in C (though with a different set for curly brace replacement: ??< and ??>).
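To make the replacement spellings concrete, here is a minimal sketch in C written with digraphs only. It assumes a compiler that accepts the digraphs added to C in the 1995 amendment (<% and %> for braces, <: and :> for square brackets). The function name sum_digraph is made up for illustration. Trigraphs such as ??< are deliberately left out, since many compilers ignore them unless asked.

```c
#include <assert.h>

/* Illustrative only: sum_digraph is a hypothetical name.
   <% and %> stand in for the curly braces,
   <: and :> stand in for the square brackets. */
int sum_digraph(const int values<::>, int count)
<%
    int total = 0;
    int i;

    assert(count >= 0);   /* a negative length makes no sense here */
    for (i = 0; i < count; i++) <%
        total += values<:i:>;
    %>
    return total;
%>
```

The compiler treats this exactly like the same function written with { } and [ ]; the digraphs are just alternative spellings of the same tokens.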
The use of curly braces was further refined in B (which preceded C).
From Users’ Reference to B by Ken Thompson:
/* The following function will print a non-negative number, n, to
   the base b, where 2<=b<=10, This routine uses the fact that
   in the ASCII character set, the digits 0 to 9 have sequential
   code values. */
printn(n,b) {
    extern putchar;
    auto a;
    if(a=n/b) /* assignment, not test for equality */
        printn(a, b); /* recursive */
    putchar(n%b + '0');
}
There are indications that curly braces were used as shorthand for begin and end within Algol.
I remember that you also included them in the 256-character card code
that you published in CACM, because I found it interesting that you
proposed that they could be used in place of the Algol ‘begin’ and
‘end’ keywords, which is exactly how they were later used in the C
language.
From http://www.bobbemer.com/BRACES.HTM
The use of square brackets (as a suggested replacement in the question) goes back even further. As mentioned, the Algol family influenced C. Within Algol 60 and 68 (C was written in 1972 and BCPL in 1966), the square bracket was used to designate an index into an array or matrix.
BEGIN
    FILE F(KIND=REMOTE);
    EBCDIC ARRAY E[0:11];
    REPLACE E BY "HELLO WORLD!";
    WRITE(F, *, E);
END.
As programmers were already familiar with square brackets for arrays in Algol and BCPL, and curly braces for blocks in BCPL, there was little need or desire to change this when making another language.
The updated question adds a point about the productivity of curly brace usage and mentions Python. Some other resources study this, though the answer boils down to “It’s anecdotal, and what you are used to is what you are most productive with.” Because programming skill and familiarity with different languages vary so widely, these factors are difficult to account for.
See also: Stack Overflow Are there statistical studies that indicates that Python is “more productive”?
Much of the gain would depend on the IDE (or lack of one) that is used. In vi-based editors, putting the cursor over one of a matching open/close pair and pressing % moves the cursor to the other matching character. This was very efficient with C-based languages back in the old days, though it matters less now.
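The matching that % relies on is easy to do with single-character delimiters. Here is a rough sketch in C of that kind of lookup. It is not vi’s actual code: the name find_matching_bracket is hypothetical, and the sketch ignores brackets that sit inside strings or comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not vi's real code): given the index of a bracket in
   text, return the index of its partner, or -1 if pos is not on a bracket
   or the bracket is unbalanced. Strings and comments are not special-cased. */
long find_matching_bracket(const char *text, size_t pos)
{
    static const char openers[] = "({[";
    static const char closers[] = ")}]";
    const char *hit;
    char here, partner;
    long step, i, len;
    int depth = 0;

    assert(text != NULL);
    len = (long)strlen(text);
    if ((long)pos >= len)
        return -1;

    here = text[pos];
    if ((hit = strchr(openers, here)) != NULL) {
        partner = closers[hit - openers];   /* scan forward for the closer */
        step = 1;
    } else if ((hit = strchr(closers, here)) != NULL) {
        partner = openers[hit - closers];   /* scan backward for the opener */
        step = -1;
    } else {
        return -1;                          /* cursor is not on a bracket */
    }

    /* Count nesting of the same bracket kind until it balances out. */
    for (i = (long)pos; i >= 0 && i < len; i += step) {
        if (text[i] == here)
            depth++;
        else if (text[i] == partner && --depth == 0)
            return i;
    }
    return -1;
}
```

Doing the same for begin/end would mean tokenizing words and telling keywords apart from identifiers, which is part of why one-character block delimiters suited simple editors.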
A better comparison would be between {} and begin/end, which were the options of the day (horizontal space was precious). Many Wirth languages were based on a begin and end style (Algol (mentioned above), Pascal (which many are familiar with), and the Modula family).
I have difficulty finding any studies that isolate this specific language feature. The best I can do is show that the curly brace languages are much more popular than begin/end languages and that the brace is a common construct. As mentioned in the Bob Bemer link above, the curly brace was proposed as shorthand to make programming easier.
From Why Pascal is Not My Favorite Programming Language
C and Ratfor programmers find ‘begin’ and ‘end’ bulky compared to { and }.
Which is about all that can be said: it’s familiarity and preference.
6
Square braces [] are easier to type, and have been ever since the IBM 2741 terminal, which was “widely used on Multics”, the OS that in turn had Dennis Ritchie, one of the creators of C, as a dev team member.
Note the absence of curly braces in the IBM 2741 layout!
In C, square braces are “taken”, as they are used for arrays and pointers. If the language designers expected arrays and pointers to be more important or more frequently used than code blocks (which sounds like a reasonable assumption on their part; more on the historic context of coding style below), then curly braces would go to the “less important” syntax.
The importance of arrays is pretty apparent in the article The Development of the C Language by Ritchie. There’s even an explicitly stated assumption of the “prevalence of pointers in C programs”.
…new language retained a coherent and workable (if unusual) explanation of the semantics of arrays… Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers… The other characteristic feature of C, its treatment of arrays… has real virtues. Although the relationship between pointers and arrays is unusual, it can be learned. Moreover, the language shows considerable power to describe important concepts, for example, vectors whose length varies at run time, with only a few basic rules and conventions…
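The array/pointer relationship Ritchie describes is why square brackets show up so often in C. As a small sketch (the function names are made up for illustration), indexing with [] and explicit pointer arithmetic are two spellings of the same access:

```c
#include <assert.h>
#include <stddef.h>

/* Sum n ints using square-bracket indexing. */
int sum_indexed(const int v[], size_t n)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* The same traversal written with explicit pointer arithmetic. */
int sum_pointer(const int *v, size_t n)
{
    int total = 0;
    const int *p;

    for (p = v; p != v + n; p++)
        total += *p;
    return total;
}

/* v[i] is defined as *(v + i), so both spellings name the same object. */
void check_equivalence(const int *v, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        assert(&v[i] == v + i);
        assert(v[i] == *(v + i));
    }
}
```

Note that the parameter declared as const int v[] is really a pointer, which is the “unusual” part of the relationship mentioned in the quote.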
For further understanding of historical context and coding style of the time when C language was created, one needs to take into account that “origin of C is closely tied to the development of the Unix” and, specifically, that porting OS to a PDP-11 “led to the development of an early version of C” (quotes source). According to Wikipedia, “in 1972, Unix was rewritten in the C programming language”.
Source code of various old versions of Unix is available online, e.g. at The Unix Tree site. Of the versions presented there, the most relevant seems to be Second Edition Unix, dated 1972-06:
The second edition of Unix was developed for the PDP-11 at Bell Labs by Ken Thompson, Dennis Ritchie and others. It extended the First Edition with more system calls and more commands. This edition also saw the beginning of the C language, which was used to write some of the commands…
You can browse and study the C source code from the Second Edition Unix (V2) page to get an idea of the typical coding style of the time.
A prominent example supporting the idea that it was rather important back then for a programmer to be able to type square brackets with ease can be found in the V2/c/ncc.c source code:
/* C command */
main(argc, argv)
char argv[][]; {
    extern callsys, printf, unlink, link, nodup;
    extern getsuf, setsuf, copy;
    extern tsp;
    extern tmp0, tmp1, tmp2, tmp3;
    char tmp0[], tmp1[], tmp2[], tmp3[];
    char glotch[100][], clist[50][], llist[50][], ts[500];
    char tsp[], av[50][], t[];
    auto nc, nl, cflag, i, j, c;
    tmp0 = tmp1 = tmp2 = tmp3 = "//";
    tsp = ts;
    i = nc = nl = cflag = 0;
    while(++i < argc) {
        if(*argv[i] == '-' & argv[i][1]=='c')
            cflag++;
        else {
            t = copy(argv[i]);
            if((c=getsuf(t))=='c') {
                clist[nc++] = t;
                llist[nl++] = setsuf(copy(t));
            } else {
                if (nodup(llist, t))
                    llist[nl++] = t;
            }
        }
    }
    if(nc==0)
        goto nocom;
    tmp0 = copy("/tmp/ctm0a");
    while((c=open(tmp0, 0))>=0) {
        close(c);
        tmp0[9]++;
    }
    while((creat(tmp0, 012))<0)
        tmp0[9]++;
    intr(delfil);
    (tmp1 = copy(tmp0))[8] = '1';
    (tmp2 = copy(tmp0))[8] = '2';
    (tmp3 = copy(tmp0))[8] = '3';
    i = 0;
    while(i<nc) {
        if (nc>1)
            printf("%s:\n", clist[i]);
        av[0] = "c0";
        av[1] = clist[i];
        av[2] = tmp1;
        av[3] = tmp2;
        av[4] = 0;
        if (callsys("/usr/lib/c0", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "c1";
        av[1] = tmp1;
        av[2] = tmp2;
        av[3] = tmp3;
        av[4] = 0;
        if(callsys("/usr/lib/c1", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "as";
        av[1] = "-";
        av[2] = tmp3;
        av[3] = 0;
        callsys("/bin/as", av);
        t = setsuf(clist[i]);
        unlink(t);
        if(link("a.out", t) | unlink("a.out")) {
            printf("move failed: %s\n", t);
            cflag++;
        }
loop:;
        i++;
    }
nocom:
    if (cflag==0 & nl!=0) {
        i = 0;
        av[0] = "ld";
        av[1] = "/usr/lib/crt0.o";
        j = 2;
        while(i<nl)
            av[j++] = llist[i++];
        av[j++] = "-lc";
        av[j++] = "-l";
        av[j++] = 0;
        callsys("/bin/ld", av);
    }
delfil:
    dexit();
}
dexit()
{
    extern tmp0, tmp1, tmp2, tmp3;
    unlink(tmp1);
    unlink(tmp2);
    unlink(tmp3);
    unlink(tmp0);
    exit();
}
getsuf(s)
char s[];
{
    extern exit, printf;
    auto c;
    char t, os[];
    c = 0;
    os = s;
    while(t = *s++)
        if (t=='/')
            c = 0;
        else
            c++;
    s =- 3;
    if (c<=8 & c>2 & *s++=='.' & *s=='c')
        return('c');
    return(0);
}
setsuf(s)
char s[];
{
    char os[];
    os = s;
    while(*s++);
    s[-2] = 'o';
    return(os);
}
callsys(f, v)
char f[], v[][]; {
    extern fork, execv, wait, printf;
    auto t, status;
    if ((t=fork())==0) {
        execv(f, v);
        printf("Can't find %s\n", f);
        exit(1);
    } else
        if (t == -1) {
            printf("Try again\n");
            return(1);
        }
    while(t!=wait(&status));
    if ((t=(status&0377)) != 0) {
        if (t!=9) /* interrupt */
            printf("Fatal error in %s\n", f);
        dexit();
    }
    return((status>>8) & 0377);
}
copy(s)
char s[]; {
    extern tsp;
    char tsp[], otsp[];
    otsp = tsp;
    while(*tsp++ = *s++);
    return(otsp);
}
nodup(l, s)
char l[][], s[]; {
    char t[], os[], c;
    os = s;
    while(t = *l++) {
        s = os;
        while(c = *s++)
            if (c != *t++) goto ll;
        if (*t++ == '\0') return (0);
ll:;
    }
    return(1);
}
tsp;
tmp0;
tmp1;
tmp2;
tmp3;
It is interesting to note how the pragmatic motivation of picking characters for language syntax elements, based on how often they are used in the targeted practical applications, resembles Zipf’s Law as explained in this terrific answer…
observed relationship between frequency and length is called Zipf’s Law
…with the only difference that length in the above statement is replaced by (or generalized to) speed of typing.
23
C (and subsequently C++ and C#) inherited its bracing style from its predecessor B, which was written by Ken Thompson (with contributions from Dennis Ritchie) in 1969.
This example is from the Users’ Reference to B by Ken Thompson (via Wikipedia):
/* The following function will print a non-negative number, n, to
   the base b, where 2<=b<=10, This routine uses the fact that
   in the ASCII character set, the digits 0 to 9 have sequential
   code values. */
printn(n,b) {
    extern putchar;
    auto a;
    if(a=n/b) /* assignment, not test for equality */
        printn(a, b); /* recursive */
    putchar(n%b + '0');
}
B itself was again based on BCPL, a language written by Martin Richards in 1966 for the Multics Operating system. BCPL’s bracing system used only round braces, modified by additional characters (Print factorials example by Martin Richards, via Wikipedia):
GET "LIBHDR"
LET START() = VALOF $(
    FOR I = 1 TO 5 DO
        WRITEF("%N! = %I4*N", I, FACT(I))
    RESULTIS 0
$)
AND FACT(N) = N = 0 -> 1, N * FACT(N - 1)
The curly braces used in B and subsequent languages, “{…}”, are an improvement Ken Thompson made over the original compound brace style in BCPL, “$(…$)”.
2