I have just completed an exercise from my textbook which wanted me to write a program to check if a number is prime or not.
I have tested it and seems to work fine, but how can I be certain that it will work for every prime number?
public boolean isPrime(int n)
{
int divisor = 2;
int limit = n-1 ;
if (n == 2)
{
return true;
}
else
{
int mod = 0;
while (divisor <= limit)
{
mod = n % divisor;
if (mod == 0)
{
return false;
}
divisor++;
}
if (mod > 0)
{
return true;
}
}
return false;
}
Note that this question is not a duplicate of Theoretically Bug Free Programs because that question asks about whether one can write bug free programs in the face of the the limitative results such as Turing’s proof of the incomputability of halting, Rice’s theorem and Godel’s incompleteness theorems. This question asks how a program can be shown to be bug free.
5
You cannot establish correctness via testing. All that tests tell you is that the system responds correctly to the cases you exercised. (Even if there is a finite number of possible inputs, that is not enough, because strictly speaking, the test only tells you that the system responded correctly in this instance – it might be secretly non-deterministic, and you’d never know.)
To establish that your code is indeed correct for all possible inputs, verification is required. It amounts to a mathematical proof that the transformation that your code embodies is indeed the correct one for the specified problem.
Verification is perfectly possible, particularly for mathematical tasks, but usually not worth the added effort. Particularly, making sure that your requirement does indeed state what it seems to state and your code does indeed define what it seems to define is surprisingly difficult and hairy when all details of a real programming language, platform, standard library etc. are properly considered. That is why verification is not encountered in practice as much as copious amounts of regression tests.
2
There are a number of frameworks for proving that code is correct. You can read about some of them in books such as
- A practical theory of programming, Hehner, 1993-2014.
- The science of programming, Gries, 1981.
- A method of programming, Dijkstra and Feijn, 1988.
- Programming from specifications, Morgan, 1990, 1994, 1998.
For your program, I’ll assume that you don’t care about negative n
and that you can handle the n==0
, n==1
, and n==2
cases without any help; I’ll only look at the case where n>2
.
The key is to note that (in cases where n > 2
) every time the loop’s guard expression divisor <= limit
is evaluated, the following 5 facts are true of the program’s state:
n % i != 0, for all i from 2 up to, but not including divisor
limit == n-1
2 <= divisor && divisor <= n
mod != 0 || divisor == 2
n is the same as the argument // i.e. n hasn't been changed
These 5 facts are called the loop’s invariant. To trust that the loop’s invariant holds where I said it does, you can check
- that it holds the first time the guard is evaluated and
- that, if the invariant is true and the guard is true, then one execution of the loop’s body will leave the program in a state where the invariant is true again.
Therefore, when and if the loop terminates normally*, all the above are true and so is divisor > limit
. This tells you that divisor == n
and so you have
n % i != 0, for all i from 2 up to, but not including n
mod != 0
n is the same as the argument // i.e. n hasn't been changed
Therefor, if the loop terminate normally, the return true
after the loop will be executed and the argument is prime.
The only cases left to worry about are those where the loop does not terminate normally. There are two reasons a loop might not terminate normally: it is infinite or there is a jump out of it, such as a break, return, or exception. In your code, the loop can be shown not to be infinite and the only jump out is when n % divisor == 0
. Since, at that point, we also have
2 <= divisor // from the invariant
divisor <= limit // from the guard
limit == n-1 // from the invariant
we have 2<=divisor && divisor < n
. So if the return false
in the loop is executed, it is only when n % divisor == 0
and divisor
is greater than 1
and less than n
, and so the n
(and hence the argument) is not prime.
In summary
- for nonprimes (greater than 2), the loop can not terminate normally and so the
return false
must be executed, and - for primes (greater than 2) the loop must terminate normally and in a state where
mod != 0
and so thereturn true
will be executed.
The proof suggests a few ways you could make the code simpler. The books I mentioned provide ways to make the analysis more formal, although I think only Hehner’s provides a way to deal with early exits from loops.
[*] By “terminates normally”, I mean that it terminates because its guard evaluates to false.
Does this analysis give you absolute certainty that you code is flawless? I hope not. But it should give you reassurance. One can use tools such as ESC/Java2 or Spec# so that a computer can check that one hasn’t made any mistakes. That would raise the level of trust that the code is right to a very high level. Of course there could be bugs in the program one uses to check the verification or in the hardware it runs on. For practical purposes code such as yours can be verified and the verification can be checked by carefully written and trustworthy tools.
3
For something like prime numbers, you will never be sure that it will work for all primes, because the set of all prime numbers is infinitely long, and therefore you’d never stop coding your unit test. That being said, there’s many different ways to judge the quality of code, and passing a unit test is only one of them. For example, if you’re building a piece of code to find all the primes in the world, your solution will probably work fine for lower numbers, but as the size of the numbers gets larger, you will exceed the size of int
, which means you’ve got one small limitation in your program. Also, performance will degrade as the numbers get bigger and your loop takes longer.
One other small point, negative numbers can’t be prime, so instead of int, you could use unsigned int, or unsigned long to extend the range of numbers you can verify.
In your case though, what you’re doing is focusing on the wrong thing. No code is ever flawless. There will always be bugs. Does it meet the requirements you have set before you? Does it perform adequately well for your needs?
Then you’re good.
4