Programming languages like Scheme (R5RS) and Python (see this Question) round to the nearest even integer when a value is exactly halfway between the two surrounding integers.
What is the reasoning behind this?
Is there a mathematical idea behind it that makes subsequent calculations easier to reason about?
(R5RS references the IEEE floating point standard as source of this behaviour.)
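For concreteness, Python 3's built-in round behaves like this (ties go to the even neighbour):

    >>> round(0.5), round(1.5), round(2.5), round(3.5)
    (0, 2, 2, 4)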
A while ago I constructed a test program for successive rounding, because it’s basically a worst-case stress test for a rounding algorithm.
For each number from 0 to 9,999 it first rounds to the nearest 10, then to the nearest 100, then to the nearest 1000. (You could also think of this as 10,000 points in [0,1) being rounded to 3 places, then to 2, then to 1.) This set of numbers has a mean value of 4999.5.
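The original program isn't reproduced here, but the experiment is easy to reconstruct. Here is a rough Python sketch of the same test (the helper names and mode strings are mine, not from the original program):

    from collections import Counter

    def round_half(x, unit, mode):
        """Round non-negative integer x to the nearest multiple of unit, breaking ties per mode."""
        q, r = divmod(x, unit)
        if 2 * r < unit:
            return q * unit
        if 2 * r > unit:
            return (q + 1) * unit
        # Exactly halfway between q*unit and (q+1)*unit: apply the tie-breaking rule.
        if mode == "half up":
            return (q + 1) * unit
        if mode == "half down":
            return q * unit
        if mode == "half even":
            return (q if q % 2 == 0 else q + 1) * unit
        if mode == "half odd":
            return (q if q % 2 == 1 else q + 1) * unit
        raise ValueError(mode)

    def successive(mode):
        """Round each of 0..9999 to the nearest 10, then 100, then 1000."""
        results = []
        for n in range(10_000):
            v = n
            for unit in (10, 100, 1000):
                v = round_half(v, unit, mode)
            results.append(v)
        return results

    for mode in ("half up", "half down", "half odd", "half even"):
        rounded = successive(mode)
        print(mode, sum(rounded) / len(rounded), sorted(Counter(rounded).items()))

Each output line gives the mode, the mean of the 10,000 rounded values, and the histogram tabulated below.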
If all three roundings are done using the method “round half up”, then the results are as follows (first column is the rounding result, second column is how many numbers rounded to that result — i.e. it’s a histogram).
0 445
1000 1000
2000 1000
3000 1000
4000 1000
5000 1000
6000 1000
7000 1000
8000 1000
9000 1000
10000 555
The result differs from a single “round half up” to the nearest thousand 550 times out of 10,000 and the average rounded value is 5055 (higher than the original average by 55.5).
If all three roundings are done by “round half down”, then the results are:
0 556
1000 1000
2000 1000
3000 1000
4000 1000
5000 1000
6000 1000
7000 1000
8000 1000
9000 1000
10000 444
The result differs from a single “round half down” to the nearest thousand 550 times out of 10,000 and the average rounded value is 4944 (lower than the original average by 55.5).
If all three roundings are done using “round half odd”, the result is:
0 445
1000 1111
2000 889
3000 1111
4000 889
5000 1111
6000 889
7000 1111
8000 889
9000 1111
10000 444
The result differs from a single “round half odd” to the nearest thousand 550 times out of 10,000 and the average rounded value is 4999.5 (correct).
Finally, if all three roundings are done using “round half even”, the results are:
0 546
1000 909
2000 1091
3000 909
4000 1091
5000 909
6000 1091
7000 909
8000 1091
9000 909
10000 545
The result differs from a single “round half even” to the nearest thousand 450 times out of 10,000 and the average rounded value is 4999.5 (correct).
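The “differs from a single rounding” counts quoted above can be checked with a small addition to the Python sketch given earlier (it reuses round_half and successive from that sketch):

    def compare(mode):
        """Compare successive rounding with a single rounding straight to the nearest 1000."""
        single = [round_half(n, 1000, mode) for n in range(10_000)]
        multiple = successive(mode)
        differs = sum(s != m for s, m in zip(single, multiple))
        return differs, sum(multiple) / len(multiple)

For example, compare("half even") should report the 450 disagreements and the mean of 4999.5 quoted above.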
I think it’s obvious that round half up and round half down bias the rounded values, so that the average of rounded values no longer has the same expectation as the average of the original values, and that “round half even” and “round half odd” remove the bias by treating 5 one way half the time and the other way the other half. Successive rounding multiplies the bias.
Round half even and round half odd introduce their own kind of bias to the distribution: a bias towards even and odd digits, respectively. In both cases, again, this bias is multiplied by successive rounding, but it’s worse for round half odd. I think that the explanation in this case is simple: 5 is an odd number, so round half odd has more results ending in 5 than round half even — and therefore, more results that will have to be handled specially by the next rounding.
So anyway, of the four choices, only two are unbiased, and of the two unbiased choices, round half even gives the best-behaved distribution when subject to repeated rounding.
It’s called banker’s rounding. The idea is to minimize the cumulative error from many rounding operations.
Let's say you always rounded .5 down. Think of all those little interest payments, the bank pocketing half a cent each time…
Let's say you always rounded .5 up. Accounting is going to scream because you're paying out more interest than you should.
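A toy illustration of the cumulative effect (the ten half-cent amounts are made up for the example; Python's decimal module is used only so the tie-breaking rule can be chosen explicitly):

    from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

    # Ten payments that each land exactly on a tie: 0.5, 1.5, ..., 9.5 (true total: 50).
    amounts = [Decimal(n) + Decimal("0.5") for n in range(10)]

    def rounded_total(rounding):
        return sum(a.quantize(Decimal("1"), rounding=rounding) for a in amounts)

    print(rounded_total(ROUND_HALF_UP))    # 55: every tie pushed up, total overstated by 5
    print(rounded_total(ROUND_HALF_EVEN))  # 50: ties split up and down, total preserved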