Why does 0 evaluate to false and any other integer value to true in most programming languages?
String comparison
It may seem evident to any programmer, but why isn’t there a programming language (there may actually be one, just none I have used) where 0 evaluates to true and all the other integer values to false? That remark may seem random, but I have a few examples where it might have been a good idea. First, take three-way string comparison, with C’s strcmp as the example: any programmer trying C as a first language may be tempted to write the following code:
if (strcmp(str1, str2)) {
    // Do something...
}
Since strcmp returns 0, which evaluates to false, when the strings are equal, what the beginning programmer tried to do fails miserably, and he generally does not understand why at first. Had 0 evaluated to true instead, this function could have been used in its simplest form (the one above) when comparing for equality, and the proper checks for -1 and 1 (more precisely, any negative or positive value) would have been done only when needed. We would have considered the return type as bool (in our minds, I mean) most of the time.
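To make the pitfall concrete, here is a minimal C sketch. The helper names strings_equal and naive_test are made up for illustration; only strcmp comes from the standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: equality expressed as an explicit comparison with 0. */
bool strings_equal(const char *a, const char *b)
{
    /* strcmp returns 0 when the strings match. */
    return strcmp(a, b) == 0;
}

/* Hypothetical helper mimicking the beginner's test: the raw strcmp
 * result is used directly as a condition. */
bool naive_test(const char *a, const char *b)
{
    if (strcmp(a, b)) {
        return true;  /* reached when the strings DIFFER */
    }
    return false;     /* reached when the strings are equal */
}
```

With these definitions, naive_test("abc", "abc") is false even though the strings are equal, which is exactly the surprise described above.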
Moreover, let’s introduce a new type, sign, that just takes the values -1, 0 and 1. That can be pretty handy. Imagine there is a spaceship operator in C++ and we want it for std::string (well, there already is the compare function, but a spaceship operator is more fun). The declaration would currently be the following:
sign operator<=>(const std::string& lhs, const std::string& rhs);
Had 0 evaluated to true, the spaceship operator wouldn’t even exist, and we could have declared operator== this way:
sign operator==(const std::string& lhs, const std::string& rhs);
This operator== would have handled three-way comparison at once. It could still be used for the following check, while still letting us find out which string is lexicographically greater when needed:
if (str1 == str2) {
    // Do something...
}
Old error handling
We now have exceptions, so this part only applies to older languages where no such thing exists (C, for example). If we look at C’s standard library (and the POSIX one too), we can see that many functions return 0 when successful and some other integer otherwise. I have sadly seen some people do this kind of thing:
#define TRUE 0
// ...
if (some_function() == TRUE)
{
    // Here, TRUE would mean success...
    // Do something
}
If we think about how we reason in programming, we often follow this pattern:
Do something
Did it work?
    Yes -> That's ok, one case to handle
    No -> Why? Many cases to handle
If we think about it again, it would have made sense to assign the only neutral value, 0, to yes (and that’s how C’s functions work), while all the other values are there to cover the many cases of no. However, in all the programming languages I know (except maybe some experimental esoteric languages), that yes evaluates to false in an if condition, while all the no cases evaluate to true. There are many situations where “it works” represents one case while “it does not work” represents many probable causes. If we think about it that way, having 0 evaluate to true and the rest to false would have made much more sense.
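As a rough illustration of the "one success, many failures" convention, here is a small C sketch. The function parse_digit and its error codes are invented for this example and do not come from any real library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes: 0 is the single success value,
 * every nonzero value names one specific failure. */
enum parse_status {
    PARSE_OK = 0,
    PARSE_NULL = 1,
    PARSE_EMPTY = 2,
    PARSE_NOT_DIGIT = 3
};

/* Hypothetical function: parses a one-character decimal digit string. */
int parse_digit(const char *s, int *out)
{
    if (s == NULL) return PARSE_NULL;
    if (s[0] == '\0') return PARSE_EMPTY;
    if (s[0] < '0' || s[0] > '9' || s[1] != '\0') return PARSE_NOT_DIGIT;
    *out = s[0] - '0';
    return PARSE_OK;
}
```

A caller that writes if (parse_digit(text, &value)) ends up in the error branch, because success is 0 and therefore false in a C condition.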
Conclusion
My conclusion is essentially my original question: why did we design languages where 0 is false and the other values are true, taking into account my few examples above and maybe some more I did not think of?
Follow-up: It’s nice to see there are many answers with many ideas and as many possible reasons for it to be like that. I love how passionate you seem to be about it. I originally asked this question out of boredom, but since you seem so passionate, I decided to go a little further and ask about the rationale behind the Boolean choice for 0 and 1 on Math.SE 🙂
0 is false because they’re both zero elements in common semirings. Even though they are distinct data types, it makes intuitive sense to convert between them because they belong to isomorphic algebraic structures.
- 0 is the identity for addition and zero for multiplication. This is true for integers and rationals, but not IEEE-754 floating-point numbers: 0.0 * NaN = NaN and 0.0 * Infinity = NaN.
- false is the identity for Boolean xor (⊻) and zero for Boolean and (∧). If Booleans are represented as {0, 1}, the set of integers modulo 2, you can think of ⊻ as addition without carry and ∧ as multiplication.
- "" and [] are identity for concatenation, but there are several operations for which they make sense as zero. Repetition is one, but repetition and concatenation do not distribute, so these operations don’t form a semiring.
Such implicit conversions are helpful in small programs, but in the large can make programs more difficult to reason about. Just one of the many tradeoffs in language design.
Because the math works.
FALSE OR TRUE is TRUE, because 0 | 1 is 1.
... insert many other examples here.
Traditionally, C programs have conditions like
if (someFunctionReturningANumber())
rather than
if (someFunctionReturningANumber() != 0)
because the concept of zero being equivalent to false is well-understood.
As others have said, the math came first. This is why 0 is false and 1 is true.
Which math are we talking about? Boolean algebras which date from the mid 1800s, long before digital computers came along.
You could also say that the convention came out of propositional logic, which is even older than boolean algebras. This is the formalization of a lot of the logical results that programmers know and love (false || x equals x, true && x equals x and so on).
Basically we’re talking about arithmetic on a set with two elements. Think about counting in binary. Boolean algebras are the origin of this concept and its theoretical underpinning. The conventions of languages like C are just a straightforward application.
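The identities mentioned above can be checked exhaustively, since there are only two truth values. A minimal C sketch (the function name identities_hold is made up for this example):

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the four classic identities hold for the given x. */
bool identities_hold(bool x)
{
    return ((false || x) == x)      /* false is the identity for OR */
        && ((true && x) == x)       /* true is the identity for AND */
        && ((true || x) == true)    /* true absorbs under OR        */
        && ((false && x) == false); /* false absorbs under AND      */
}
```

Calling it with both false and true covers every case.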
I thought this had to do with the “inheritance” from electronics, and also boolean algebra, where
0 = off, negative, no, false
1 = on, positive, yes, true
That strcmp returns 0 when strings are equal has to do with its implementation, since what it actually does is calculate the “distance” between the two strings. That 0 also happens to be considered false is just a coincidence.
Returning 0 on success makes sense because 0 in this case means no error, and any other number would be an error code. Using any other number for success would make less sense, since you only have a single success code while you can have several error codes. You use “Did it work?” as the if statement expression and say 0=yes would make more sense, but the expression is more correctly “Did anything go wrong?”, and then you see that 0=no makes a lot of sense. Thinking of false/true doesn’t really make sense here, as it’s actually no error code/error code.
As explained in this article, the values false and true should not be confused with the integers 0 and 1, but may be identified with the elements of the Galois field (finite field) of two elements (see here).
A field is a set with two operations that satisfy certain axioms.
The symbols 0 and 1 are conventionally used to denote the additive and multiplicative identities of a field because the real numbers are also a field (but not a finite one) whose identities are the numbers 0 and 1.
The additive identity is the element 0 of the field, such that for all x:
x + 0 = 0 + x = x
and the multiplicative identity is the element 1 of the field, such that for all x:
x * 1 = 1 * x = x
The finite field of two elements has only these two elements, namely the additive identity 0 (or false), and the multiplicative identity 1 (or true).
The two operations of this field are the logical XOR (+) and the logical AND (*).
Note. If you flip the operations (XOR is the multiplication and AND is the addition) then the multiplication is not distributive over addition and you do not have a field any more. In such a case you have no reason to call the two elements 0 and 1 (in any order).
Note also that you cannot choose the operation OR instead of XOR: no matter how you interpret OR / AND as addition / multiplication, the resulting structure is not a field (not all inverse elements exist as required by the field axioms).
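The two notes above can be verified by brute force over the two elements. The following C sketch uses invented function names and represents the field elements as the ints 0 and 1, with ^ for XOR, & for AND and | for OR:

```c
#include <assert.h>
#include <stdbool.h>

/* a*(b+c) == a*b + a*c with + as XOR and * as AND, for all bits. */
bool and_distributes_over_xor(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++)
                if ((a & (b ^ c)) != ((a & b) ^ (a & c)))
                    return false;
    return true;
}

/* Same law with the roles flipped: + as AND and * as XOR. */
bool xor_distributes_over_and(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            for (int c = 0; c <= 1; c++)
                if ((a ^ (b & c)) != ((a ^ b) & (a ^ c)))
                    return false;
    return true;
}

/* With OR as addition, does 1 have an additive inverse,
 * i.e. is there an x with (1 | x) == 0? */
bool or_has_inverse_for_one(void)
{
    for (int x = 0; x <= 1; x++)
        if ((1 | x) == 0)
            return true;
    return false;
}
```

The first check succeeds, the flipped one fails (for instance at a=1, b=1, c=0), and 1 has no inverse under OR, which matches the claims above.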
Regarding the C functions:
- Many functions return an integer that is an error code. 0 means NO ERROR.
- Intuitively, the function strcmp computes the difference between two strings. 0 means that there is no difference between two strings, i.e. that two strings are equal.
The above intuitive explanations can help to remember the interpretation of the return values, but it is even easier to just check the library documentation.
You should consider that alternative systems can also be acceptable design decisions.
Shells: 0 exit status is true, non-zero is false
The example of shells treating a 0 exit status as true has already been mentioned.
$ ( exit 0 ) && echo "0 is true" || echo "0 is false"
0 is true
$ ( exit 1 ) && echo "1 is true" || echo "1 is false"
1 is false
The rationale there is that there is one way to succeed, but many ways to fail, so using 0 as the special value meaning “no errors” is pragmatic.
Ruby: 0 is just like any other number
Among “normal” programming languages, there are some outliers, such as Ruby, that treat 0 as a true value.
$ irb
irb(main):001:0> 0 ? '0 is true' : '0 is false'
=> "0 is true"
The rationale is that only false and nil should be false. For many Ruby novices, it’s a gotcha. However, in some cases, it’s nice that 0 is treated just like any other number.
irb(main):002:0> (pos = 'axe' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "Found x at position 1"
irb(main):003:0> (pos = 'xyz' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "Found x at position 0"
irb(main):004:0> (pos = 'abc' =~ /x/) ? "Found x at position #{pos}" : "x not found"
=> "x not found"
However, such a system only works in a language that is able to distinguish booleans as a separate type from numbers. In the earlier days of computing, programmers working with assembly language or raw machine language had no such luxuries. It is probably just natural to treat 0 as the “blank” state, and set a bit to 1 as a flag when the code detected that something happened. By extension, the convention developed that zero was treated as false, and non-zero values came to be treated as true. However, it doesn’t have to be that way.
Java: Numbers cannot be treated as booleans at all
In Java, true and false are the only boolean values. Numbers are not booleans, and cannot even be cast into booleans (Java Language Specification, Sec 4.2.2):
There are no casts between integral types and the type boolean.
That rule just avoids the question altogether — all boolean expressions have to be explicitly written in the code.
Before addressing the general case, we can discuss your counter examples.
String comparisons
The same holds for many sorts of comparisons, actually. Such comparisons compute a distance between two objects. When the objects are equal, the distance is minimal. So when the “comparison succeeds”, the value is 0. But really, the return value of strcmp is not a boolean, it is a distance, and that is what traps unaware programmers doing if (strcmp(...)) do_when_equal() else do_when_not_equal().
In C++ we could redesign strcmp to return a Distance object that overloads operator bool() to return true when 0 (but you would then be bitten by a different set of problems).
Or in plain C just have a streq function that returns 1 when strings are equal, and 0 otherwise.
API calls/program exit code
Here you care about the reason something went wrong, because this will drive the decisions taken on error. When things succeed, you don’t want to know anything in particular: your intent is realized. The return value must therefore convey this information. It is not a boolean, it is an error code. The special error value 0 means “no error”. The rest of the range represents locally meaningful errors you have to deal with (including 1, which often means “unspecified error”).
General case
This leaves us with the question: why are boolean values True and False commonly represented with 1 and 0, respectively?
Well, besides the subjective “it feels better this way” argument, here are a few reasons (subjective as well) I can think of:
- Electrical circuit analogy. The current is ON for 1s, and OFF for 0s. I like having (1, Yes, True, On) together, and (0, No, False, Off), rather than another mix.
- Memory initializations. When I memset(0) a bunch of variables (be they ints, floats, bools) I want their value to match the most conservative assumptions. E.g. my sum is initially 0, the predicate is False, etc.
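The memory-initialization point can be sketched in C as follows. The struct stats and the function fresh_stats are hypothetical names chosen for this example; it relies on the fact that an all-zero-bytes int is 0 and an all-zero-bytes bool is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical record with a running sum and a predicate. */
struct stats {
    int sum;
    bool done;
};

/* Zero-fills the whole struct, so every field starts in its
 * "nothing happened yet" state. */
struct stats fresh_stats(void)
{
    struct stats s;
    memset(&s, 0, sizeof s);
    return s;
}
```

After fresh_stats(), the sum is 0 and the predicate is false without setting either field individually.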
Maybe all these reasons are tied to my education – if I had been taught to associate 0 with True from the beginning, I would go for the other way around.
From a high-level perspective, you’re talking about three quite different data types:
- A boolean. The mathematical convention in Boolean algebra is to use 0 for false and 1 for true, so it makes sense to follow that convention. I think this way also makes more sense intuitively.
- The result of comparison. This has three values: <, = and > (notice that none of them is true). For them it makes sense to use the values of -1, 0 and 1, respectively (or, more generally, a negative value, zero and a positive value).
  If you want to check for equality and you only have a function that performs general comparison, I think you should make it explicit by using something like strcmp(str1, str2) == 0. I find using ! in this situation confusing, because it treats a non-boolean value as if it was a boolean.
  Also, keep in mind that comparison and equality don’t have to be the same thing. For example, if you order people by their date of birth, Compare(me, myTwin) should return 0, but Equals(me, myTwin) should return false.
- The success or failure of a function, possibly also with details about that success or failure. If you’re talking about Windows, then this type is called HRESULT and a non-zero value doesn’t necessarily indicate failure. In fact, a negative value indicates failure and non-negative success. The success value is very often S_OK = 0, but it can also be for example S_FALSE = 1, or other values.
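The difference between comparison and equality from the second item can be sketched in C. The struct person and both functions are hypothetical, and a birth year stands in for a full date of birth to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical record; birth_year simplifies "date of birth". */
struct person {
    const char *name;
    int birth_year;
};

/* Three-way comparison by birth year: negative, zero or positive. */
int compare_by_birth(const struct person *a, const struct person *b)
{
    return (a->birth_year > b->birth_year) - (a->birth_year < b->birth_year);
}

/* Equality looks at identity-relevant fields, not just the ordering key. */
bool same_person(const struct person *a, const struct person *b)
{
    return a->birth_year == b->birth_year && strcmp(a->name, b->name) == 0;
}
```

Two twins compare as 0 under compare_by_birth, yet same_person reports false for them.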
The confusion comes from the fact that three logically quite different data types are actually represented as a single data type (an integer) in C and some other languages, and that you can use an integer in a condition. But I don’t think it would make sense to redefine boolean to make using some non-boolean types in conditions simpler.
Also, consider another type that’s often used in a condition in C: a pointer. There, it’s natural to treat a NULL pointer (which is represented as 0) as false. So following your suggestion would also make working with pointers more difficult. (Though, personally, I prefer explicitly comparing pointers with NULL, instead of treating them as booleans.)
There are a lot of answers that suggest that the correspondence between 1 and true is necessitated by some mathematical property. I can’t find any such property and suggest it is purely historical convention.
Given a field with two elements, we have two operations: addition and multiplication. We can map Boolean operations on this field in two ways:
Traditionally, we identify True with 1 and False with 0. We identify AND with * and XOR with +. Thus OR is saturating addition.
However, we could just as easily identify True with 0 and False with 1. Then we identify OR with * and XNOR with +. Thus AND is saturating addition.
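The dual mapping can also be checked by brute force. In this C sketch the encoding function enc and the checker dual_mapping_holds are names invented for illustration; enc maps True to 0 and False to 1, as the alternative convention proposes.

```c
#include <assert.h>
#include <stdbool.h>

/* Dual encoding (an assumption for this example): True is 0, False is 1. */
int enc(bool truth)
{
    return truth ? 0 : 1;
}

/* Verifies, for every pair of truth values, that under the dual encoding
 * OR is multiplication, XNOR is addition modulo 2, and AND is
 * saturating addition. */
bool dual_mapping_holds(void)
{
    for (int i = 0; i <= 1; i++) {
        for (int j = 0; j <= 1; j++) {
            bool p = (i != 0), q = (j != 0);
            int a = enc(p), b = enc(q);
            int saturated = (a + b > 1) ? 1 : a + b;

            if (a * b != enc(p || q)) return false;
            if ((a + b) % 2 != enc(p == q)) return false;
            if (saturated != enc(p && q)) return false;
        }
    }
    return true;
}
```

The check passes for all four combinations, which supports the point that the arithmetic works equally well in either direction.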
Strangely, zero is not always false.
In particular, the Unix and Posix convention is to define EXIT_SUCCESS as 0 (and EXIT_FAILURE as 1). Actually it is even a standard C convention!
So for Posix shells and exit(2) syscalls, 0 means “successful” which intuitively is more true than false.
In particular, the shell’s if wants a process to return EXIT_SUCCESS (that is 0) to follow its “then” branch!
In Scheme (but not in Common Lisp or in MELT) 0 and nil (i.e. () in Scheme) are true, since the only false value is #f.
I agree, I am nitpicking!
Zero can be false because most CPU’s have a ZERO flag that can be used to branch.
It saves a compare operation.
Let’s see why.
Some pseudocode, as the audience probably doesn’t read assembly.
C source
A simple loop that calls wibble 10 times:
for (int foo = 10; foo > 0; foo--) /* down-count loop is shorter */
{
    wibble();
}
Some pretend assembly for that:
0x1000 ld a 0x0a 'foo=10
0x1002 call 0x1234 'call wibble()
0x1005 dec a 'foo--
0x1006 jrnz -0x06 'jump back to 0x1002 if not zero
0x1008
C source
Another simple loop that calls wibble 10 times:
for (int foo = 0; foo < 10; foo++) /* up-count loop is longer */
{
    wibble();
}
Some pretend assembly for this case:
0x1000 ld a 0x00 'foo=0
0x1002 call 0x1234 'call wibble()
0x1005 inc a 'foo++
0x1006 cmp 0x0a 'compare foo to 10 (like a subtract but we throw the result away)
0x1008 jrns -0x08 'jump back to 0x1002 if compare was negative
0x100a
Some more C source:
int foo = 10;
if (foo) wibble();
And the assembly:
0x1000 ld a 0x0a
0x1002 jz 0x3
0x1004 call 0x1234
0x1007
See how short that is?
Some more C source:
int foo = 10;
if (foo == 0) wibble();
And the assembly (let’s assume a marginally smart compiler that can replace ==0 with no compare):
0x1000 ld a 0x0a
0x1002 jnz 0x3
0x1004 call 0x1234
0x1007
Now let’s try a convention of true=1.
Some more C source:
#define TRUE 1
int foo = TRUE;
if (foo == TRUE) wibble();
And the assembly:
0x1000 ld a 0x1
0x1002 cmp a 0x01
0x1004 jnz 0x3
0x1006 call 0x1234
0x1009
See how much shorter the case where any nonzero value is true was?
Really early CPUs had small sets of flags attached to the Accumulator.
To check if a>b or a=b generally takes a compare instruction.
- Unless B is ZERO, in which case the ZERO flag is set. (It is implemented as a simple logical NOR of all bits in the Accumulator.)
- Or NEGATIVE, in which case you just use the “sign bit”, i.e. the most significant bit of the Accumulator if you are using two’s complement arithmetic. (Mostly we do.)
Let’s restate this. On some older CPUs you did not have to use a compare instruction for accumulator equal to ZERO, or accumulator less than zero.
Now do you see why zero might be false?
Please note this is pseudo-code and no real instruction set looks quite like this. If you know assembly you know I’m simplifying things a lot here. If you know anything about compiler design, you didn’t need to read this answer. Anyone who knows anything about loop unrolling or branch prediction, the advanced class is down the hall in room 203.
C is used for low-level programming close to hardware, an area in which you sometimes need to shift between bitwise and logical operations, on the same data. Being required to convert a numeric expression to boolean just to perform a test would clutter up the code.
You can write things like:
if (modemctrl & MCTRL_CD) {
    /* carrier detect is on */
}
rather than
if ((modemctrl & MCTRL_CD) != 0) {
    /* carrier detect is on */
}
In one isolated example it’s not so bad, but having to do that will get irksome.
Likewise, converse operations. It’s useful for the result of a boolean operation, like a comparison, to just produce a 0 or 1: Suppose we want to set the third bit of some word based on whether modemctrl has the carrier detect bit:
flags |= ((modemctrl & MCTRL_CD) != 0) << 2;
Here we have to have the != 0, to reduce the result of the bitwise & expression to 0 or 1, but because the result is just an integer, we are spared from having to add some annoying cast to further convert boolean to integer.
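A small C sketch of the point about != 0. The mask value 0x08 for MCTRL_CD is an assumption made for this example (the real value depends on the hardware), and both function names are invented.

```c
#include <assert.h>

/* Hypothetical mask value; the real MCTRL_CD depends on the hardware. */
#define MCTRL_CD 0x08u

/* Sets bit 2 of flags when the carrier detect bit is on.
 * The != 0 collapses the masked value to exactly 0 or 1 before shifting. */
unsigned set_bit2_from_cd(unsigned flags, unsigned modemctrl)
{
    flags |= (unsigned)((modemctrl & MCTRL_CD) != 0) << 2;
    return flags;
}

/* Without != 0 the masked bit keeps its original position, so the
 * shift moves it somewhere other than bit 2. */
unsigned shifted_without_test(unsigned flags, unsigned modemctrl)
{
    flags |= (modemctrl & MCTRL_CD) << 2;
    return flags;
}
```

With the assumed mask, set_bit2_from_cd(0, 0x08) yields 0x04, while the version without the test yields 0x20.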
Even though modern C now has a bool type, it still preserves the validity of code like this, both because it’s a good thing, and because of the massive breakage with backward compatibility that would be caused otherwise.
Another example where C is slick: testing two boolean conditions as a four-way switch:
switch (foo << 1 | bar) { /* foo and bar booleans are 0 or 1 */
case 0: /* !foo && !bar */
    break;
case 1: /* !foo && bar */
    break;
case 2: /* foo && !bar */
    break;
case 3: /* foo && bar */
    break;
}
You could not take this away from the C programmer without a fight!
Lastly, C sometimes serves as a kind of high level assembly language. In assembly languages, we also do not have boolean types. A boolean value is just a bit or a zero versus nonzero value in a memory location or register. An integer zero, boolean zero and the address zero are all tested the same way in assembly language instruction sets (and perhaps even floating point zero). Resemblance between C and assembly language is useful, for instance when C is used as the target language for compiling another language (even one which has strongly typed booleans!)
I think the real answer, which others have alluded to, is simple, pragmatic, and very old:
Because that’s how you do it in assembly language.
Testing for 0 vs non-0 is done for you by almost all computer hardware (either by way of direct flag bits that track accumulator status or by condition-code registers that remember the result of a previous operation), and when combined with branches conditional on these bits results in smaller/faster programs. (Critically important back when memory and disks were small, and clock rates were low.) Counting and convergence loops both need this kind of decision for termination, and programs are usually filled with these. They need to be fast and efficient to be effective against the competition, so that’s how general-purpose CPU’s were built. By everybody.
Languages designed for systems programming tend to be lower level, less abstract (or capable of that, anyway), and have constructs that map fairly directly to their underlying assembly-language implementations. This encourages adoption of the language by those who might just as well have chosen to write in assembly language, but who are enticed by the numerous advantages of a (slightly?) higher-level language:
- Code portability;
- Storage allocation managed for you;
- Register allocation and lifetime, if applicable, managed for you;
- Branching and labels coded and managed for you;
- Slightly higher abstraction, but still recognizable as ‘the machine’.
Code written in these languages (BCPL, B, and early C, for example) is very ‘friendly’ for experienced assembly-language programmers. They’re comfortable with the code that they know will be generated for them, and thankful that they didn’t have to do it themselves. (And debug the inevitable mistakes they’d have made doing it.) Early adopters of said languages would have been poring over the code generated by the prospective compiler, until they became more comfortable just trusting it to do what they would otherwise have had to do the hard way. They would never have adopted the language if it did too many stupid things they didn’t expect, during the language’s probationary period with them. Basic decision making would have been high on their list of things that needed to be ‘done right’ if they were going to adopt the language.
All of BCPL, B, and C use the:
if (non-zero) then-its-True
construct, however it’s spelled. This results in a single conditional-branch instruction after the evaluation of the condition expression; you really can’t do it in less, so it would have had programmer approval. It’s an unlikely target machine that would not have a BZ (or equivalent) instruction.
The next crop of programmers, those for whom assembly-language was not the rock upon which all else was built, were just using the languages that had effectively been chosen for them by their predecessors, and perhaps did not understand and appreciate all the reasons their languages had the features they did.
I will submit that the rest of the languages (probably developed by these programmers) that treat (only) zero as false simply took it from C.
A boolean or truth value has only two values: true and false.
These should not be represented as integers, but as bits (0 and 1).
Saying that any integer besides 0 or 1 is not false is a confusing statement. Truth tables deal with truth values, not integers.
From a truth-value perspective, -1 or 2 would break all truth tables and any boolean logic associated with them.
- 0 AND -1 == ?!
- 0 OR 2 == ?!
Most languages have a boolean type which, when cast to a number type such as integer, reveals false to be the integer value 0.
0 is not false. 0 is an integer (except in Swift where it is an integer literal, that is something that hasn’t quite decided yet whether it should be converted to some integer or floating point type).
Now in older programming languages there is some unfortunate tendency, probably by inheritance from C, that integers with a value of zero, floating point numbers with a value zero, and null pointers, can be used in contexts where Boolean values are needed, and are treated as if they were false. In Java and Swift, that nonsense is stopped. If you want true or false, you use a Boolean value.
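Try this in C++ (with a pre-C++17 compiler: ++ on bool was deprecated and then removed in C++17, and -- on bool was never allowed, hence the explicit subtraction below):
bool x = false; ++x; ++x; x = x - 1;
printf(x ? "true" : "false");
int y = false; ++y; ++y; --y;
printf(y ? "true" : "false");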
Do you find the result surprising?
Ultimately, you are talking about breaking the core language because some APIs are crappy. Crappy APIs are not new, and you can’t fix them by breaking the language. It is a mathematical fact that 0 is false and 1 is true, and any language which does not respect this is fundamentally broken. The three-way comparison is niche and has no business having its result implicitly convert to bool since it returns three possible results. The old C APIs simply have terrible error handling, and are also hamstrung because C does not have the necessary language features to not have terrible interfaces.
Note that I am not saying that for languages which do not have implicit integer->boolean conversion.