For example, I would like to validate that a name contains letters only and is between 4 and 14 letters long. I have the following code in the model:
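# \A and \z anchor the whole string; ^ and $ only anchor a line, so multi-line input could slip through
validates :name, :format => { :with => /\A[a-zA-Z]+\z/,
                              :message => 'Name should be letters only.' },
                 :length => { :minimum => 4, :maximum => 14 }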
So it clearly lets me do what I want.
But when it comes to unit tests, I am a bit too much of a perfectionist, so I set up something like
invalid_names = ['1234',
                 'qwe',
                 '1%#$#$',
                 'Sam1',
                 '%',
                 random_string(15)] # I also have a test helper that creates a random string of a given length
# A plain array literal: %w[...] would keep the quotes and commas and never call random_string
valid_names = ['test',
               'Travis',
               'John',
               random_string(5),
               random_string(14),
               random_string(4)]
and test each of them in a loop with asserts, like
invalid_names.each do |name|
  user = User.new(:name => name)
  user.save
  assert user.errors[:name].any?, "#{name} is valid."
end
So it definitely works great. But it is too verbose (because of the valid/invalid name arrays and the added random_string method). Also, I can’t be sure my test actually covers all possible symbols, their combinations, and all lengths, even though I am quite sure the validation works as expected.
So what is an acceptable way to test my validation without being too much of a perfectionist, while still leaving most of the logic tested?
Am I just caught in the mind trick of trying to write perfect code for its own sake and forgetting the main end goal: a working product?
Some suggestions:
- Avoid using a random generator directly inside unit tests unless there is a compelling reason for it. It carries a high risk of producing non-reproducible cases. If you want to use a random string, choose the string randomly once and hardcode that chosen string into your test.
- It is old testing wisdom that you cannot test every possible combination of input data when the range of allowed input values is too large. Choosing good test cases is brainwork; the idea is to use as few test cases as needed to cover the largest possible number of failure cases.
For example, classify your input domain into categories:
- strings with length 0
- strings with length between 0 and 4 (exclusive)
- strings with length 4
- strings with length between 4 and 14
- strings with length greater than 14
And as a second classification
- strings with no legal characters
- strings with some legal characters
- strings with only legal characters
Now, make sure you have at least one test case for each category. Of course, this example only shows how to find tests by using “equivalence classes”. Other techniques include designing your tests for edge cases (like the ones in this answer), or aiming for full code coverage and branch coverage.
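To make the categories above concrete, here is a minimal sketch in plain Ruby with one representative input per class. The valid_name? method is a hypothetical stand-in for the model validation (letters only, 4 to 14 characters) so that the example runs without Rails; the case descriptions and inputs are illustrative choices, not the only sensible ones.

```ruby
# Hypothetical stand-in for the model validation: letters only, 4 to 14 characters.
NAME_FORMAT = /\A[a-zA-Z]+\z/

def valid_name?(name)
  name.is_a?(String) && name.match?(NAME_FORMAT) && name.length.between?(4, 14)
end

# One representative per equivalence class: [description, input, expected result]
CASES = [
  ['empty string',            '',       false],
  ['too short, letters only', 'abc',    false],
  ['minimum length',          'abcd',   true],
  ['mid-range length',        'Travis', true],
  ['maximum length',          'a' * 14, true],
  ['too long, letters only',  'a' * 15, false],
  ['no legal characters',     '1234',   false],
  ['some legal characters',   'Sam1',   false]
].freeze

# Collect every case whose actual result differs from the expected one.
failures = CASES.reject { |_, input, expected| valid_name?(input) == expected }
failures.each { |desc, input, _| puts "FAILED: #{desc} (#{input.inspect})" }
puts "#{CASES.size - failures.size} of #{CASES.size} cases passed"
```

Keeping the cases in a table like this makes it easy to see which class each input stands for, and to notice when two inputs cover the same class.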
What you should test
- General, ordinary case. Just to be sure it works. This is the simplest case:
  - A name which contains only letters and has a length between 4 and 14.
- A name which contains only letters and has a length between 1 and 3.
- A name which contains only letters and has a length between 15 and the maximum authorized length.
- A name which contains only letters and has exactly 4 characters.
- A name which contains only letters and has exactly 14 characters.
- A name which contains letters and something else and has a length between 4 and 14.
- Edge cases:
  - An empty string.
  - A null (if your language distinguishes between an empty string and null).
  - A bunch of Unicode strings with different normalization forms and different characters. Those tests should match your documentation explaining what you mean by “letters only”. Is щ a letter? Is ᴴ a letter? What about Ẁ and its two possible Unicode representations (precomposed and decomposed)?
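The Unicode question can be explored with a short plain-Ruby sketch. It assumes two candidate definitions of “letters only”: the ASCII character class from the question and the Unicode property \p{L}. Neither is claimed to be the right one for your application; the point is that the two give different answers, and that the two representations of Ẁ behave differently as well.

```ruby
ascii_letters   = /\A[a-zA-Z]+\z/  # the character class used in the question
unicode_letters = /\A\p{L}+\z/     # any character with the Unicode "Letter" property

composed   = "\u1E80"                          # Ẁ as a single precomposed code point
decomposed = composed.unicode_normalize(:nfd)  # W followed by a combining grave accent
cyrillic   = "\u0449"                          # щ

puts composed.length                      # 1
puts decomposed.length                    # 2
puts composed == decomposed               # false
puts composed.match?(ascii_letters)       # false
puts composed.match?(unicode_letters)     # true
puts decomposed.match?(unicode_letters)   # false: the combining accent is a mark, not a letter
puts cyrillic.match?(unicode_letters)     # true
```

If names may contain such characters, one option is to normalize input to a single form before validating; whether that is appropriate depends on what the documentation promises.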
What you shouldn’t test
You shouldn’t test repetitive things. If you tested both “test” and “Travis”, testing “John” as well brings nothing useful, since it contains only lowercase and uppercase letters, just like “Travis”, and is 4 characters long, just like “test”.
In the same way, % and Sam1 are redundant.
As for random strings, you shouldn’t use them, since if your code is not working as expected, they might in some cases give you random results: sometimes the test will pass, the next time it will fail for the exact same code.
Your code should meet a use case, and it should do so in a way that is proportionate to the risk created by missing something. Is it illegal in the context of your application to allow numbers or other characters in a name (maybe this is a database for the Names with Letters Only Society)? Then test it as much as possible.
Not that you want to invite someone to micro-manage you, but take into consideration how much of your time the people paying for this software want you to spend. They may not share your level of perfectionism and won’t want to pay for it either.
So what is an acceptable way to test my validation without being too much of a perfectionist, while still leaving most of the logic tested?
That depends. Different people consider different things to be “enough testing”. In some cases, some types of tests are imposed by the nature or boundaries of the project itself.
Examples:
- We had one project where we had to offer clients a performance guarantee (min. X messages processed per minute). This was tested throughout development, in nightly tests. Performance bottlenecks were identified and we designed/implemented around them.
- We had one project that did XML parsing and had to provide security guarantees (won’t crash on invalid inputs). We ended up implementing a fuzzing library (that took each API, corrupted one of its parameters in a predefined way and called the API in a loop, for each corruption).
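The fuzzing approach from the second example can be sketched roughly as follows. This is not the library from that project: the fuzz method, the CORRUPTIONS list, and the parse_tag lambda are all hypothetical names invented for illustration, and a real fuzzer would use many more corruptions.

```ruby
# Predefined corruptions, each applied to one argument at a time (illustrative only).
CORRUPTIONS = [
  ->(_) { nil },               # replace the value with nil
  ->(_) { '' },                # replace the value with an empty string
  ->(v) { v.to_s * 1000 },     # blow the value up to an unusual size
  ->(v) { v.to_s + "\x00" }    # append a NUL byte
].freeze

# Calls target once per (argument position, corruption) pair and returns
# every exception that is not in the list of allowed error classes.
def fuzz(target, valid_args, allowed_errors: [ArgumentError])
  unexpected = []
  valid_args.each_index do |i|
    CORRUPTIONS.each do |corrupt|
      args = valid_args.dup
      args[i] = corrupt.call(args[i])
      begin
        target.call(*args)
      rescue *allowed_errors
        # Rejecting bad input with a documented error is acceptable.
      rescue StandardError => e
        unexpected << [i, args[i], e.class]
      end
    end
  end
  unexpected
end

# Hypothetical API under test.
parse_tag = lambda do |name, body|
  raise ArgumentError, 'name must be a non-empty string' unless name.is_a?(String) && !name.empty?
  "<#{name}>#{body}</#{name}>"
end

p fuzz(parse_tag, ['title', 'hello'])  # prints []
```

Because the corruptions are predefined rather than random, every run exercises exactly the same inputs, so a failure can be reproduced.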
Normally, I want to see that the default trivial cases are implemented, corner cases are implemented, generic cases are implemented and maybe a negative case or two (check that correct data is in the generated exception, when the inputs are wrong). In my current project though, such a level of testing is not realistic.
We also had a project where (by agreement) solving a defect also implied adding a new unit test. When running the test suite, full regression would run automatically (i.e. we were able to guarantee that we didn’t re-create any of the old issues found in the application).
Am I just caught in the mind trick of trying to write perfect code for its own sake and forgetting the main end goal: a working product?
Not from what I could tell.
Edit: a few other points:
- If you need randomness in your tests, make sure it is repeatable (for example, initialize your random number generator with a constant seed).
- At the very least, test for the most common case.
- Adding unit tests for most corner cases is beneficial when the code changes (in your code, that would be lengths 4 and 14; not sure about the formats). When updating an algorithm, we naturally tend to think about the general case. These tests check things that you (the developer) tend not to think about at every edit.
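As a sketch of the constant-seed idea from the first point, the helper below takes the random number generator as a parameter. The name random_letters and the seed 42 are arbitrary choices for illustration; it is not the random_string helper from the question, whose implementation is not shown.

```ruby
LETTERS = (('a'..'z').to_a + ('A'..'Z').to_a).freeze

# Builds a string of the given length from letters only, drawing from the
# supplied generator so that the caller controls the seed.
def random_letters(length, rng)
  Array.new(length) { LETTERS.sample(random: rng) }.join
end

first  = random_letters(14, Random.new(42))
second = random_letters(14, Random.new(42))

puts first.length     # 14
puts first == second  # true: same seed, same sequence
```

With a fixed seed, a failing test fails the same way on every run, which keeps the failure debuggable.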