1. Introduction.
I've been reading Fred Hebert's wonderful PropEr Testing online book about property-based testing. So to play with it a bit, I did a small exercise. This is its description:1. 1. The kata.
We'll implement a function that can tell if two sequences are equal regardless of the order of their elements. The elements can be of any type.
We'll use property-based testing (PBT). Use the PBT library of your language (bring it already installed).
Follow these constraints:
- You can't use or compute frequencies of elements.
- Work test first: write a test, then write the code to make that test pass.
- If you get stuck, you can use example-based tests to drive the implementation on. However, at the end of the exercise, only property-based tests can remain.
Use mutation testing to check if you tests are good enough (we'll do it manually injecting failures in the implementation code (by commenting or changing parts of it) and checking if the test are able to detect the failure to avoid using more libraries).
2. Driving a solution using both example-based and property-based tests.
I used Clojure and its test.check library (an implementation of QuickCheck) to do the exercise. I also used my favorite Clojure's test framework: Brian Marick's Midje which has a macro, forall, which makes it very easy to integrate property-based tests with Midje.So I started to drive a solution using an example-based test (thanks to Clojure's dynamic nature, I could use vectors of integers to write the tests. ):
which I made pass using the following implementation:
Then I wrote a property-based test that failed:
This is how the failure looked in Midje (test.check returns more output when a property fails, but Midje extracts and shows only the information it considers more useful):
the most useful piece of information for us in this failure message is the quick-check shrunken failing values. When a property-based testing library finds a counter-example for a property, it applies a shrinking algorithm which tries to reduce it to find a minimal counter-example that produces the same test failure. In this case, the
[1 0]
vector is the minimal counter-example found by the shrinking algorithm that makes this test fails.
Next I made the property-based test pass by refining the implementation a bit:
I didn't know which property to write next, so I wrote a failing example-based test involving duplicate elements instead:
and refined the implementation to make it pass:
With this, the implementation was done (I chose a function that was easy to implement, so I could focus on thinking about properties).
3. Getting rid of example-based tests.
Then the next step was finding properties that could make the example-based tests redundant. I started by trying to remove the first example-based test. Since I didn't know test.check's generators and combinators library, I started exploring it on the REPL with the help of its API documentation and cheat sheet.My sessions on the REPL to build generators bit by bit were a process of shallowly reading bits of documentation followed by trial and error. This tinkering sometimes lead to quick successes and most of the times to failures which lead to more deep and careful reading of the documentation, and more trial and error. In the end I managed to build the generators I wanted. The sample function was very useful during all the process to check what each part of the generator would generate.
For the sake of brevity I will show only summarized versions of my REPL sessions where everything seems easy and linear...
3. 1. First attempt: a partial success.
First, I wanted to create a generator that generated two different vectors of integers so that I could replace the example-based tests that were checking two different vectors. I used the list-distinct combinator to create it and the sample function to be able to see what the generator would generate:I used this generator to write a new property which made it possible to remove the first example-based test but not the second one:
In principle, we might think that the new property should have been enough to also allow removing the last example-based test involving duplicate elements. A quick manual mutation test, after removing that example-based test, showed that it wasn't enough: I commented the line
(= (count s1) (count s2))
in the implementation and the property-based tests weren't able to detect the regression.
This was due to the low probability of generating a pair of random vectors that were different because of having duplicate elements, which was what the commented line,
(= (count s1) (count s2))
, was in the implementation for. If we'd run the tests more times, we'd have finally won the lottery of generating a counter-example that would detect the regression. So we had to improve the generator in order to increase the probabilities, or, even better, make sure it'd be able to detect the regression.
In practice, we'd combine example-based and property-based tests. However, my goal was learning more about property-based testing, so I went on and tried to improve the generators (that's why this exercise has the constraint of using only property-based tests).
3. 2. Second attempt: success!
So, I worked a bit more on the REPL to create a generator that would always generate vectors with duplicate elements. For that I used test.check's let macro, the tuple, such-that and not-empty combinators, and Clojure's core library repeat function to build it.The following snippet shows a summary of the work I did on the REPL to create the generator using again the sample function at each tiny step to see what inputs the growing generator would generate:
Next I used this new generator to write properties that this time did detect the regression mentioned above. Notice how there are separate properties for sequences with and without duplicates:
After tinkering a bit more with some other generators like return and vector-distinct, I managed to remove a redundant property-based test getting to this final version:
4. Conclusion.
All in all, this exercise was very useful to think about properties and to explore test.check's generators and combinators. Using the REPL made this exploration very interactive and a lot of fun. You can find the code of this exercise on this GitHub repository.A couple of days later I proposed to solve this exercise at the last Clojure Developers Barcelona meetup. I received very positive feedback, so I'll probably propose it for a Barcelona Software Craftsmanship meetup event soon.
No comments:
Post a Comment