I’m looking for general testing strategies for writing tests to expose bugs that only appear for very large inputs. As an example I will look at the “binary search midpoint bug”.
In binary search, one of the steps in the algorithm is to compute the array index in the middle of the current bounds. The naive way is to add both bounds and divide by two. This works most of the time but fails when the sum overflows before dividing by two. Writing a test that detects this bug requires a very large array: with 32-bit signed indices the sum can only overflow once the array has more than 2^30 elements, which is within a factor of two of the maximum allowed size.
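To make the arithmetic concrete, here is a minimal Java sketch of the two midpoint computations. The class and method names (`MidpointOverflow`, `naiveMid`, `safeMid`) are made up for illustration and are not taken from the JDK.

```java
class MidpointOverflow {
    // Naive midpoint: the sum can exceed Integer.MAX_VALUE and wrap to a negative value.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    // Overflow-safe midpoint: for 0 <= low <= high the difference always fits in an int.
    static int safeMid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        int low = 1 << 30;                // 1073741824
        int high = Integer.MAX_VALUE - 1; // 2147483646, the last index of a maximal array
        System.out.println(naiveMid(low, high)); // prints -536870913
        System.out.println(safeMid(low, high));  // prints 1610612735
    }
}
```

The bounds in `main` are the ones a search reaches after a single step to the right on an array of `Integer.MAX_VALUE` elements, which is why the bug only shows up for inputs of that size.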
In Java, the bug was present up to Java 5. It has since been fixed and tests have been added to detect the bug. The ‘issue’ I have with these tests is:
- I can only run them if I have enough memory on my machine. The maximum size of an array (`Integer.MAX_VALUE`) happens to be of the same order of magnitude as the memory in a current computer. Even if I could not fit the inputs on my machine, I would like to give peace of mind to the people I release my software to, who might experience the problem.
- These tests would be impossible to run if Java allowed a maximum array size of `Long.MAX_VALUE`, although you can make the case that the tests would not be needed for a very long time.
I have looked into how other languages test for this bug. Go and Rust have `struct{}` and `()` respectively, which both take up zero bytes. It is then possible to create an array of maximal size that does not take up any space and pass in a custom-made comparison function to trigger the bug. The closest I could come in Java is to implement `List` in such a way that `size()` returns the maximal value and `get()` always returns the same element. But this does not extend to arrays.
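The `List` idea can be sketched roughly as follows. This is only an illustration: `constantList`, `buggySearch` and `exposesBug` are hypothetical names, and `buggySearch` is a deliberately broken search written for this example, not the JDK implementation.

```java
import java.util.AbstractList;
import java.util.List;

class VirtualListDemo {
    // A read-only list that claims Integer.MAX_VALUE elements but stores a single value.
    static List<Integer> constantList(int value) {
        return new AbstractList<Integer>() {
            @Override public Integer get(int index) {
                if (index < 0) throw new IndexOutOfBoundsException("index " + index);
                return value;
            }
            @Override public int size() { return Integer.MAX_VALUE; }
        };
    }

    // Deliberately buggy binary search using the naive midpoint (illustration only).
    static int buggySearch(List<Integer> list, int key) {
        int low = 0, high = list.size() - 1;
        while (low <= high) {
            int mid = (low + high) / 2; // wraps negative once low + high > Integer.MAX_VALUE
            int cmp = list.get(mid).compareTo(key);
            if (cmp < 0) low = mid + 1;
            else if (cmp > 0) high = mid - 1;
            else return mid;
        }
        return -(low + 1);
    }

    // Returns true if the search fails with an out-of-range index on the virtual list.
    static boolean exposesBug() {
        try {
            // Every element (0) is smaller than the key (1), so the search keeps moving right.
            buggySearch(constantList(0), 1);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("bug exposed: " + exposesBug());
    }
}
```

Searching for a key larger than every element forces the lower bound past 2^30 on the second iteration, so the negative midpoint appears after two calls to `get()` and no large allocation is needed.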
My question is as follows. Are there (possibly language-agnostic) testing strategies known that enable you to prove the presence of bugs for inputs so large they might not fit on your machine?
As usual, the key to making impracticable set-ups practicable for testing is mocking.
In this case, you want to separate the sorting logic from the data store, because you don’t want to pay the price for a real data store just to test the logic. All the algorithm really needs to do its job is a collaborator that can respond to indexed ‘read’ and ‘write’ commands. Therefore, all you have to do is to create a pseudo-store that doesn’t store anything, but knows enough to give the correct answers when asked for the values that you know will come up in your test case. Ultimately, all you need to store are those index-value pairs (or perhaps step-index-value tuples to model the fact that things change over time).
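One way this separation might look in Java is sketched below. Everything here is an assumption made for illustration: the `IndexedReader` interface, the `computed` factory and the pluggable `midpoint` parameter are invented names, and the store is read-only because binary search never writes. Indices are `long` to show that the test still runs when the claimed size is `Long.MAX_VALUE`.

```java
import java.util.function.LongBinaryOperator;
import java.util.function.LongUnaryOperator;

class PseudoStoreDemo {
    // Hypothetical collaborator: the only thing the search needs from its data store.
    interface IndexedReader {
        long size();
        long read(long index);
    }

    // A pseudo-store that stores nothing and computes each value from its index.
    static IndexedReader computed(long size, LongUnaryOperator valueAt) {
        return new IndexedReader() {
            public long size() { return size; }
            public long read(long index) {
                if (index < 0 || index >= size) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                return valueAt.applyAsLong(index);
            }
        };
    }

    // Binary search over the store; the midpoint rule is passed in so a test can swap it.
    static long search(IndexedReader store, long key, LongBinaryOperator midpoint) {
        long low = 0, high = store.size() - 1;
        while (low <= high) {
            long mid = midpoint.applyAsLong(low, high);
            long value = store.read(mid);
            if (value < key) low = mid + 1;
            else if (value > key) high = mid - 1;
            else return mid;
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        // A "sorted array" of Long.MAX_VALUE elements where element i has value i.
        IndexedReader store = computed(Long.MAX_VALUE, i -> i);
        long key = Long.MAX_VALUE - 1;
        System.out.println(search(store, key, (lo, hi) -> lo + (hi - lo) / 2) == key);
        try {
            search(store, key, (lo, hi) -> (lo + hi) / 2);
            System.out.println("naive midpoint survived");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("naive midpoint failed: " + e.getMessage());
        }
    }
}
```

The trade-off is the one described next: the test exercises the algorithm through an interface, so it says nothing about a separate copy of the same logic that is hard-coded against real arrays.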
You’re probably not going to get such a separation of concerns into the standard library of a programming language; many sorting algorithms are hard-coded to work on arrays for good reasons of efficiency. But at least you can prove that the algorithm is sound and then also verify that the tested algorithm is identical to the implemented one (perhaps the actual code could even be auto-generated…).