Problem:
The Oscar Committee wants to decide which of N given actors should get the best actor award. For that they decided to use a random function random_bit(), which returns either 0 or 1 with equal probability. For the results to be fair, the committee was asked to generate a random number between 1 and N (inclusive), such that all actors have equal probability of being chosen (which in this case should always be equal to 1/N).
First of all, the committee wants to know the expected number of times they will need to call random_bit() to generate a random number between 1 and N. While calling the function, they should follow the optimal strategy (i.e. the strategy that minimizes the expected number of function calls).
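To make the task concrete, here is a minimal sketch (my own, not part of the problem statement) of one straightforward, though not necessarily optimal, way to turn random_bit() into a uniform number between 1 and N, using plain rejection sampling; the random_bit() stub below merely stands in for the one the problem assumes:

#include <cstdlib>
#include <iostream>
using namespace std;

// Stand-in for the random_bit() from the problem: returns 0 or 1 with equal probability.
int random_bit() { return rand() & 1; }

// Plain rejection sampling: read enough bits to cover N outcomes and retry
// whenever the value falls outside [0, N). Uniform over 1..N, but not
// necessarily the call-minimizing strategy the problem asks about.
int uniform_1_to_N(int N) {
    int bits = 0, range = 1;
    while (range < N) { range *= 2; ++bits; }   // smallest power of two >= N
    while (true) {
        int v = 0;
        for (int i = 0; i < bits; ++i)
            v = 2 * v + random_bit();           // v is uniform over [0, range)
        if (v < N) return v + 1;                // accept and shift to 1..N
    }
}

int main() {
    for (int i = 0; i < 5; ++i)
        cout << uniform_1_to_N(3) << " ";       // five uniform draws from {1, 2, 3}
    cout << "\n";
}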
This is a problem from a past contest on codechef.com, and here’s the link to it.
The site lets you see other users’ solutions, and so far only one solution has been accepted. The solution is in C++:
#include <iostream>
using namespace std;

int main() {
    int T;
    cin >> T;
    for (int ii = 0; ii < T; ++ii) {
        int N;
        cin >> N;
        double p = 1, ans = 0;   // p holds 1/2^i
        int x = 1 % N;           // x holds 2^i mod N
        for (int i = 0; i < 100000; ++i) {
            ans += x * p;        // add (2^i mod N) / 2^i to the running sum
            x = (x * 2) % N;
            p /= 2;
        }
        cout << ans << "\n";     // newline added so test cases don't run together
    }
    return 0;
}
I know how the code runs and what it does, but I can’t figure out the logic behind it. Any help or clue?
It would also be nice if someone could post another algorithm.
It looks like the code interprets the string of random bits as a binary fraction, converts this to an N-ary fraction, and looks at the first digit of that N-ary fraction to determine the winner.
The question then becomes: How many random bits do we need until the first digit of the N-ary fraction is stable?
For all examples, I’ll use N=3. The 3-ary fraction will start with
- a 0 when the fraction is between 0 and 0.333333
- a 1 when between 0.333333 and 0.666667
- a 2 when between 0.666667 and 1.000000
E.g. 0.11… (binary) = at least 0.75 (decimal) = 0.2… (3-ary), as it lies between 0.666667 and 1.
In this case, we could stop after 2 bits of input. However, in some cases, we can’t decide yet because the number is still too close to a boundary.
E.g. 0.01… (binary) = between 0.25 and 0.5, so it could be 0.0… (3-ary) or 0.1… (3-ary). So we need to look at more bits:
- If that next bit is 1: 0.011… (binary) = between 0.375 and 0.5, so will always start with 0.1… (3-ary).
- If that next bit is zero: 0.010… (binary) = between 0.25 and 0.375, so still we can’t decide whether it is 0.0… (3-ary) or 0.1… (3-ary). So we ask for another bit.
So we always need at least two bits, sometimes a third, in even fewer cases a fourth, and so on.
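To make the interval argument concrete, here is a small sketch (my own illustration; firstDigitIfStable is a hypothetical helper name) that checks whether the first N-ary digit is already fixed by a given bit prefix. It reproduces the N=3 cases just discussed:

#include <iostream>
#include <string>
using namespace std;

// Given the bits read so far, the true binary fraction lies in
// [v/2^k, (v+1)/2^k). The first N-ary digit is stable once both ends
// of that interval fall into the same N-ary bucket.
// Returns the digit if stable, or -1 if more bits are needed.
int firstDigitIfStable(const string& bits, int N) {
    long long v = 0, denom = 1;
    for (char b : bits) { v = 2 * v + (b - '0'); denom *= 2; }
    long long lo = (v * N) / denom;              // digit of the lower end
    long long hi = ((v + 1) * N - 1) / denom;    // digit just below the upper end
    return lo == hi ? (int)lo : -1;
}

int main() {
    // The N=3 cases from the text above; -1 means the digit is still undecided.
    for (string bits : {"11", "01", "011", "010"})
        cout << "0." << bits << " (binary): " << firstDigitIfStable(bits, 3) << "\n";
    return 0;
}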
Apparently (but I’m not sure) x*p tracks the chance of needing yet another bit, and ans is the sum of those chances. I can’t fully explain this part of the algorithm; maybe somebody else can?
x*p gets smaller over time. At first it is stable, as each iteration p gets halved and x gets doubled, but sometimes x is reduced by the % N part. At some point, the current value of ans will dwarf x*p and the answer is as close as you can get. (However, 100000 iterations seems an awful lot; see below.)
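As a quick numerical check of that shrinking (my own addition, not part of the original answer), the snippet below prints the term x*p and the running sum for N=3 at a few checkpoints; the sum settles at about 2.666667 after a few dozen iterations, nowhere near 100000:

#include <iostream>
#include <iomanip>
using namespace std;

int main() {
    const int N = 3;
    cout << setprecision(16);
    double p = 1, ans = 0;
    int x = 1 % N;
    for (int i = 0; i < 200; ++i) {
        ans += x * p;          // same update as in the accepted solution
        if (i == 0 || i == 1 || i == 10 || i == 60 || i == 199)
            cout << "i=" << i << "  term x*p=" << x * p << "  ans=" << ans << "\n";
        x = (x * 2) % N;
        p /= 2;
    }
    return 0;
}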
I hope this helps to understand what’s going on.
Some additional observations about the code:
- Once x becomes equal to zero, it will stay zero, and ans will not change anymore. The for loop could break at that point. This only happens when N is a power of 2, so it is pretty rare (e.g. for N=3, x cycles 1 → 2 → 4 mod 3 = 1 and never reaches 0).
- 100000 looks like an insanely high upper limit. p is a double and starts at 1.0, so it can be halved at most 1024 times before becoming denormalized; some 53 halvings later, p will be so small that it is rounded to zero.
- Even stronger: as x and p both start at 1, ans will be set to 1 during the first iteration. At some point, x*p will become less than 2^-53, and when added to ans it will be too small to matter and will be ignored due to rounding. As p is halved each iteration, this takes between 53 and, say, 100 iterations at most, depending on N. A sketch applying these shortcuts is shown below.
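Here is that sketch (my own rewrite, not from any accepted solution): the same loop, but stopping as soon as x becomes zero or the term x*p is too small to change ans:

#include <iostream>
using namespace std;

// Same sum as the accepted solution, but stopping as soon as
// further terms cannot change the result.
double expectedCalls(int N) {
    double p = 1, ans = 0;
    int x = 1 % N;
    while (true) {
        double term = x * p;
        if (x == 0 || ans + term == ans)  // x stays 0 forever / term lost to rounding
            break;
        ans += term;
        x = (x * 2) % N;
        p /= 2;
    }
    return ans;
}

int main() {
    cout << expectedCalls(2) << " " << expectedCalls(3) << " " << expectedCalls(4) << "\n";
    // Prints: 1 2.66667 2 -- the same values the 100000-iteration loop produces.
}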