Saturday, May 19, 2018

Improving legacy Om code (II): Using effects and coeffects to isolate effectful code from pure code


In the previous post, we applied the humble object pattern idea to avoid having to write end-to-end tests for the interesting logic of a hard to test legacy Om view, and managed to write cheaper unit tests instead. Then, we saw how those unit tests were far from ideal because they were highly coupled to implementation details, and how these problems were caused by a lack of separation of concerns in the code design.

In this post we’ll show a solution to those design problems using effects and coeffects that will make the interesting logic pure and, as such, really easy to test and reason about.

Refactoring to isolate side-effects and side-causes using effects and coeffects.

We refactored the code to isolate side-effects and side-causes from pure logic. This not only made testing the logic much easier (the logic now lives in pure functions), but also made the tests less coupled to implementation details. To achieve this we introduced the concepts of coeffects and effects.

The basic idea of the new design was:

  1. Extracting all the needed data from globals (using coeffects for getting application state, getting component state, getting DOM state, etc).
  2. Using pure functions to compute the description of the side effects to be performed (returning effects for updating application state, sending messages, etc) given what was extracted in the previous step (the coeffects).
  3. Performing the side effects described by the effects returned by the called pure functions.
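The three steps above can be sketched in a few lines of plain Clojure. This is only an illustration under assumptions: the names app-state, extract-coeffects, select-node, process-effects! and on-node-clicked! are hypothetical and do not come from the real code base.

```clojure
;; Hypothetical names throughout; a stand-in for the global application state.
(def app-state (atom {:selected-node nil}))

;; 1. Extract the needed data from the "world" (the coeffects).
(defn extract-coeffects []
  {:app-state @app-state})

;; 2. Pure function: coeffects in, descriptions of effects out.
(defn select-node [coeffects node-id]
  (if (= node-id (get-in coeffects [:app-state :selected-node]))
    {}                                            ; nothing to do
    {:update-app-state {:selected-node node-id}}))

;; 3. Perform the side effects described by the returned effects.
(defn process-effects! [effects]
  (when-let [changes (:update-app-state effects)]
    (swap! app-state merge changes)))

;; An event handler only wires the three steps together.
(defn on-node-clicked! [node-id]
  (-> (extract-coeffects)
      (select-node node-id)
      process-effects!))
```

Only select-node contains decisions, and it can be exercised with plain maps; the other functions are thin and contain no branching worth testing in isolation.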

The main difference in the code of horizon.controls.widgets.tree.hierarchy after this refactoring was that the event handler functions moved back into it, and that they now used two functions: extract-all!, which extracts the values of the side-causes tracked by coeffects, and process-all!, which performs the side-effects described by effects. The event handler functions are shown in the next snippet (to see the whole code click here):

Now all the logic in the companion namespace was comprised of pure functions, with neither asynchronous nor mutating code:

Thus, its tests became much simpler:

Notice how the pure functions receive a map of coeffects already containing all the extracted values they need from the “world” and return a map with descriptions of the effects. This makes testing much easier than before, and removes the need to use test doubles.
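A test for this style of function needs nothing but data. The following sketch assumes a hypothetical pure function, toggle-expansion, and hypothetical keys (:component-state, :expanded-nodes, :set-component-state); it is meant to show the shape of such a test, not the project's real tests.

```clojure
(require '[clojure.test :refer [deftest is]])

;; Hypothetical pure function: receives already-extracted coeffects
;; and returns a description of the effect to perform.
(defn toggle-expansion [coeffects node-id]
  (let [expanded (get-in coeffects [:component-state :expanded-nodes] #{})]
    {:set-component-state
     {:expanded-nodes (if (contains? expanded node-id)
                        (disj expanded node-id)
                        (conj expanded node-id))}}))

;; No test doubles and no async helpers: maps in, maps out.
(deftest toggling-node-expansion
  (is (= {:set-component-state {:expanded-nodes #{1 2}}}
         (toggle-expansion {:component-state {:expanded-nodes #{1}}} 2)))
  (is (= {:set-component-state {:expanded-nodes #{}}}
         (toggle-expansion {:component-state {:expanded-nodes #{1}}} 1))))
```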

Notice also how the test code is now around 100 lines shorter. The main reason for this is that the new tests know much less about how the production code is implemented than the previous ones did. This made it possible to remove some tests that, in the previous version of the code, were testing branches that we considered reachable when testing implementation details, but that are actually unreachable when considering the whole behaviour.

Now let’s see the code that is extracting the values tracked by the coeffects:

which is using several implementations of the Coeffect protocol:

All the coeffects were created using factories to localize the “shape” of each type of coeffect in only one place. This indirection proved very useful when we decided to refactor the code that extracts the value of each coeffect, replacing its initial implementation, a conditional, with its current implementation, which uses polymorphism through a protocol.

These are the coeffects factories:
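As a rough sketch of how a protocol, its implementations, the factories and extract-all! can fit together: the record names, the extract method, and the shape of the map passed to extract-all! are assumptions made for illustration (only the names Coeffect and extract-all! appear in the text above).

```clojure
(defprotocol Coeffect
  (extract [this] "Reads the value of the side-cause tracked by this coeffect."))

;; Hypothetical coeffect: a path inside an application-state atom.
(defrecord AppState [state-atom path]
  Coeffect
  (extract [_] (get-in @state-atom path)))

;; Hypothetical coeffect: component state, read through a function so that
;; the way of reading it (e.g. from an Om owner) stays outside this sketch.
(defrecord ComponentState [read-state-fn ks]
  Coeffect
  (extract [_] (get-in (read-state-fn) ks)))

;; Factories: the only place that knows the "shape" of each coeffect.
(defn app-state [state-atom path] (->AppState state-atom path))
(defn component-state [read-state-fn ks] (->ComponentState read-state-fn ks))

;; Assumed shape: a map of keyword -> coeffect becomes a map of keyword -> value.
(defn extract-all! [coeffects]
  (reduce-kv (fn [acc k coeffect] (assoc acc k (extract coeffect)))
             {}
             coeffects))
```

Because callers only use the factories, the records could be replaced by another representation without touching the event handlers.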

Now there was only one place where we needed to test side causes (using test doubles for some of them). These are the tests for extracting the coeffects values:

Very similar code processes the side-effects described by effects:

which uses different effects implementing the Effect protocol:

that are created with the following factories:
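The effects side can be sketched in the same way. Again, this is an assumption-laden illustration: the record names, the process! method and the use of an atom as an "outbox" (standing in for sending messages) are hypothetical; only Effect and process-all! are named in the text.

```clojure
(defprotocol Effect
  (process! [this] "Performs the side effect this value describes."))

;; Hypothetical effect: set a value at a path of an application-state atom.
(defrecord UpdateAppState [state-atom path value]
  Effect
  (process! [_] (swap! state-atom assoc-in path value)))

;; Hypothetical effect: "send" a message by appending it to an outbox atom.
(defrecord SendMessage [outbox message]
  Effect
  (process! [_] (swap! outbox conj message)))

;; Factories for the effects.
(defn update-app-state [state-atom path value] (->UpdateAppState state-atom path value))
(defn send-message [outbox message] (->SendMessage outbox message))

;; Performs every described effect, in order.
(defn process-all! [effects]
  (doseq [effect effects]
    (process! effect)))
```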

Finally, these are the tests for processing the effects:


We have seen how by using the concept of effects and coeffects, we were able to refactor our code to get a new design that isolates the effectful code from the pure code. This made testing our most interesting logic really easy because it became comprised of only pure functions.

The basic idea of the new design was:

  1. Extracting all the needed data from globals (using coeffects for getting application state, getting component state, getting DOM state, etc).
  2. Computing in pure functions the description of the side effects to be performed (returning effects for updating application state, sending messages, etc) given what was extracted in the previous step (the coeffects).
  3. Performing the side effects described by the effects returned by the called pure functions.

Since the time we did this refactoring, we have decided to go deeper in this way of designing code and we’re implementing a full effects & coeffects system inspired by re-frame.


Many thanks to Francesc Guillén, Daniel Ojeda, André Stylianos Ramos, Ricard Osorio, Ángel Rojo, Antonio de la Torre, Fran Reyes, Miguel Ángel Viera and Manuel Tordesillas for giving me great feedback to improve this post and for all the interesting conversations.

Improving legacy Om code (I): Adding a test harness


I’m working at GreenPowerMonitor as part of a team developing a challenging SPA to monitor and manage renewable energy portfolios using ClojureScript. It’s a two-year-old Om application which contains a lot of legacy code. When I say legacy, I’m using Michael Feathers’ definition of legacy code as code without tests. This definition views legacy code from the perspective of code being difficult to evolve because of a lack of automated regression tests.

The legacy (untested) Om code.

Recently I had to face one of these legacy parts when I had to fix some bugs in the user interface that was presenting all the devices of a given energy facility in a hierarchy tree (devices might be comprised of other devices). This is the original legacy view code:

This code contains not only the layout of several components but also the logic to both conditionally render some parts of them and to respond to user interactions. This interesting logic is full of asynchronous and effectful code that is reading and updating the state of the components, extracting information from the DOM itself and reading and updating the global application state. All this makes this code very hard to test.

Humble Object pattern.

It’s very difficult to make component tests for non-component code like the one in this namespace, which makes writing end-to-end tests look like the only option.

However, following the idea of the humble object pattern, we might reduce the untested code to just the layout of the view. The humble object pattern can be used when code is too closely coupled to its environment to be testable. To apply it, the interesting logic is extracted into a separate, easy-to-test component that is decoupled from its environment.

In this case we extracted the interesting logic to a separate namespace, where we thoroughly tested it. With this we avoided writing the slower and more fragile end-to-end tests.

We wrote the tests using the test-doubles library (I’ve talked about it in a recent post) and some home-made tools that help testing asynchronous code based on core.async.

This is the logic we extracted:

and these are the tests we wrote for it:

See here how the view looks after this extraction. Using the humble object pattern, we managed to test the most important bits of logic with fast unit tests instead of end-to-end tests.

The real problem was the design.

We could have left the code as it was (in fact we did for a while) but its tests were highly coupled to implementation details and hard to write because its design was far from ideal.

Even though applying the humble object pattern had separated the important logic from the view, which let us focus on writing tests with a higher ROI than end-to-end tests, the extracted logic still mixed many concerns. It was not only deciding how to interact with the user and what to render, but also mutating and reading state, getting data from global variables and from the DOM, and making asynchronous calls. Its effectful parts were not isolated from its pure parts.

This lack of separation of concerns made the code hard to test and hard to reason about, forcing us to use heavy tools: the test-doubles library and our async-test-tools assertion functions to be able to test the code.


First, we applied the humble object pattern idea to manage to write unit tests for the interesting logic of a hard to test legacy Om view, instead of having to write more expensive end-to-end tests.

Then, we saw how those unit tests were far from ideal because they were highly coupled to implementation details, and how these problems were caused by a lack of separation of concerns in the code design.


In the next post we’ll solve the lack of separation of concerns by using effects and coeffects to isolate the logic that decides how to interact with the user from all the effectful code. This new design will make the interesting logic pure and, as such, really easy to test and reason about.

Monday, April 9, 2018

test-doubles: A small spying and stubbing library for Clojure and ClojureScript

As you may know from a previous post I’m working for GreenPowerMonitor as part of a team that is developing a challenging SPA to monitor and manage renewable energy portfolios using ClojureScript.

We were dealing with some legacy code that was effectful and needed to be tested using test doubles, so we explored some existing ClojureScript libraries but we didn't feel comfortable with them. On the one hand, we found that some of them had different macros for different types of test doubles, and this made tests that needed both spies and stubs become very nested. We wanted to produce tests with as little nesting as possible. On the other hand, being used to Gerard Meszaros’ vocabulary for test doubles, we found the naming used for different types of test doubles in some of the existing libraries a bit confusing. We wanted to stick to Gerard Meszaros’ vocabulary.

So we decided we'd write our own stubs and spies library.

We started by creating our own spies and stubs manually for some time so that we could identify the different ways in which we were going to use them. After a while, my colleague André Stylianos Ramos and I wrote our own small DSL to create stubs and spies using macros to remove all that duplication and boilerplate. The result was a small library that we've been using in our ClojureScript project for nearly a year and that we've recently adapted to make it work in Clojure as well.
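To give an idea of what hand-rolled spies and stubs look like before any DSL exists, here is a minimal sketch using only with-redefs and an atom. It does not use the test-doubles API; fetch-devices, notify!, count-devices! and notifications-sent are hypothetical names invented for the example.

```clojure
;; Hypothetical effectful collaborators.
(defn fetch-devices [facility-id]
  (throw (ex-info "would hit the network" {:facility-id facility-id})))

(defn notify! [message]
  (println "sending" message))

;; Hypothetical code under test.
(defn count-devices! [facility-id]
  (let [n (count (fetch-devices facility-id))]
    (notify! (str n " devices"))
    n))

;; A hand-rolled stub (canned return value) and spy (records its calls).
(defn notifications-sent []
  (let [calls (atom [])]
    (with-redefs [fetch-devices (fn [_] [:inverter :meter])             ; stub
                  notify!       (fn [& args] (swap! calls conj (vec args)))] ; spy
      (count-devices! 1)
      @calls)))
```

Every test written this way repeats the atom, the recording function and the with-redefs block, which is the kind of duplication a small DSL can remove.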

I’m really glad to announce that GreenPowerMonitor has open-sourced our small spying and stubbing library for Clojure and ClojureScript: test-doubles.

In the following example written in ClojureScript, we show how we are using test-doubles to create two stubs (one with the :maps option and another with the :returns option) and a spy:

We could show you more examples here of how test-doubles can be used and the different options it provides, but we’ve already included a lot of explained examples in its documentation.

Please do have a look and try our library. You can get its latest version from Clojars. We hope it might be as useful to you as it has been for us.

Friday, March 30, 2018

Kata: A small kata to explore and play with property-based testing

1. Introduction.

I've been reading Fred Hebert's wonderful PropEr Testing online book about property-based testing. So to play with it a bit, I did a small exercise. This is its description:

1. 1. The kata.

We'll implement a function that can tell if two sequences are equal regardless of the order of their elements. The elements can be of any type.

We'll use property-based testing (PBT). Use the PBT library of your language (bring it already installed).

Follow these constraints:

  1. You can't use or compute frequencies of elements.
  2. Work test first: write a test, then write the code to make that test pass.
  3. If you get stuck, you can use example-based tests to drive the implementation on. However, at the end of the exercise, only property-based tests can remain.

Use mutation testing to check whether your tests are good enough. To avoid using more libraries, we'll do it manually: inject failures into the implementation code (by commenting out or changing parts of it) and check whether the tests are able to detect them.

2. Driving a solution using both example-based and property-based tests.

I used Clojure and its test.check library (an implementation of QuickCheck) to do the exercise. I also used my favorite Clojure test framework, Brian Marick's Midje, which has a macro, forall, that makes it very easy to integrate property-based tests with Midje.

So I started to drive a solution using an example-based test (thanks to Clojure's dynamic nature, I could use vectors of integers to write the tests):

which I made pass using the following implementation:

Then I wrote a property-based test that failed:

This is how the failure looked in Midje (test.check returns more output when a property fails, but Midje extracts and shows only the information it considers more useful):

The most useful piece of information for us in this failure message is the quick-check shrunken failing values. When a property-based testing library finds a counter-example for a property, it applies a shrinking algorithm which tries to reduce it to a minimal counter-example that produces the same test failure. In this case, the [1 0] vector is the minimal counter-example found by the shrinking algorithm that makes this test fail.

Next I made the property-based test pass by refining the implementation a bit:

I didn't know which property to write next, so I wrote a failing example-based test involving duplicate elements instead:

and refined the implementation to make it pass:

With this, the implementation was done (I chose a function that was easy to implement, so I could focus on thinking about properties).
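For reference, here is one way such a function could be written without computing frequencies. It is a minimal sketch, not necessarily the implementation driven in this kata; same-elements? and remove-first are assumed names. It keeps a count comparison like the (= (count s1) (count s2)) check discussed later, and then removes each element of the first sequence once from the second.

```clojure
;; Removes the first occurrence of x from coll (if there is one).
(defn- remove-first [x coll]
  (let [[before after] (split-with #(not= x %) coll)]
    (concat before (rest after))))

;; True when both sequences hold the same elements, regardless of order.
;; Works for elements of any type because it relies only on equality.
(defn same-elements? [s1 s2]
  (and (= (count s1) (count s2))
       (empty? (reduce (fn [remaining x] (remove-first x remaining))
                       s2
                       s1))))
```

With equal counts, the second sequence can only end up empty if every element of the first one removed exactly one matching element, so duplicates are handled without counting them.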

3. Getting rid of example-based tests.

Then the next step was finding properties that could make the example-based tests redundant. I started by trying to remove the first example-based test. Since I didn't know test.check's generators and combinators library, I started exploring it on the REPL with the help of its API documentation and cheat sheet.

My sessions on the REPL to build generators bit by bit were a process of shallowly reading bits of documentation followed by trial and error. This tinkering sometimes led to quick successes and most of the time to failures, which led to deeper and more careful reading of the documentation, and more trial and error. In the end I managed to build the generators I wanted. The sample function was very useful throughout the process to check what each part of the generator would generate.

For the sake of brevity I will show only summarized versions of my REPL sessions where everything seems easy and linear...

3. 1. First attempt: a partial success.

First, I wanted to create a generator that generated two different vectors of integers so that I could replace the example-based tests that were checking two different vectors. I used the list-distinct combinator to create it and the sample function to be able to see what the generator would generate:

I used this generator to write a new property which made it possible to remove the first example-based test but not the second one:

In principle, we might think that the new property should have been enough to also allow removing the last example-based test involving duplicate elements. A quick manual mutation test, after removing that example-based test, showed that it wasn't enough: I commented out the line (= (count s1) (count s2)) in the implementation and the property-based tests weren't able to detect the regression.

This was due to the low probability of generating a pair of random vectors that differed because of duplicate elements, which was what the commented-out line, (= (count s1) (count s2)), was in the implementation for. If we'd run the tests more times, we'd eventually have won the lottery of generating a counter-example that would detect the regression. So we had to improve the generator to increase the probability of detecting the regression or, even better, to guarantee it.

In practice, we'd combine example-based and property-based tests. However, my goal was learning more about property-based testing, so I went on and tried to improve the generators (that's why this exercise has the constraint of using only property-based tests).

3. 2. Second attempt: success!

So, I worked a bit more on the REPL to create a generator that would always generate vectors with duplicate elements. For that I used test.check's let macro, the tuple, such-that and not-empty combinators, and Clojure's core library repeat function to build it.

The following snippet shows a summary of the work I did on the REPL to create the generator using again the sample function at each tiny step to see what inputs the growing generator would generate:
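A minimal sketch of a generator that always produces vectors containing duplicates, assuming test.check 0.10.0 or later is on the classpath (for gen/small-integer). It is not necessarily the generator built in those REPL sessions: it uses gen/let, gen/tuple and repeat as described, but also gen/choose and gen/shuffle, and vector-with-duplicates is an assumed name.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Picks an element x, a repetition count n >= 2 and some other elements,
;; then shuffles them together so x appears at least twice.
(def vector-with-duplicates
  (gen/let [[x n others] (gen/tuple gen/small-integer
                                    (gen/choose 2 5)
                                    (gen/vector gen/small-integer))]
    (gen/shuffle (into (vec others) (repeat n x)))))

;; Inspecting a few generated values, as done at each step on the REPL.
(gen/sample vector-with-duplicates 5)
```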

Next I used this new generator to write properties that this time did detect the regression mentioned above. Notice how there are separate properties for sequences with and without duplicates:

After tinkering a bit more with some other generators like return and vector-distinct, I managed to remove a redundant property-based test getting to this final version:

4. Conclusion.

All in all, this exercise was very useful to think about properties and to explore test.check's generators and combinators. Using the REPL made this exploration very interactive and a lot of fun. You can find the code of this exercise on this GitHub repository.

A couple of days later I proposed to solve this exercise at the last Clojure Developers Barcelona meetup. I received very positive feedback, so I'll probably propose it for a Barcelona Software Craftsmanship meetup event soon.

Wednesday, March 28, 2018

Examples lists in TDD

1. Introduction.

During coding dojos and some mentoring sessions I've noticed that most people just start test-driving code without having thought a bit about the problem first. Unfortunately, writing a list of examples before starting to do TDD is a practice that is neglected most of the time.

Writing a list of examples is very useful because having to find a list of concrete examples forces you to think about the problem at hand. In order to write each concrete example in the list, you need to understand what you are trying to do and how you will know when it is working. This exploration of the problem space improves your knowledge of the domain, which will later be very useful while doing TDD to design a solution. However, just generating a list of examples is not enough.

2. Orthogonal examples.

A frequent problem I've seen in beginners' lists is that many of the examples are redundant because they would drive the same piece of behavior. When two examples drive the same behavior, we say that they overlap with each other, they are overlapping examples.

To explore the problem space effectively, we need to find examples that drive different pieces of behavior, i.e. that do not overlap. From now on, I will refer to those non-overlapping examples as orthogonal examples[1].

Keeping this idea of orthogonal examples in mind while exploring a problem space, will help us prune examples that don't add value, and keep just the ones that will force us to drive new behavior.

How can we get those orthogonal examples?
  1. Start by writing all the examples that come to your mind.
  2. As you gather more examples ask yourself which behavior they would drive. Will they drive one clear behavior or will they drive several behaviors?
  3. Try to group them by the piece of behavior they'd drive and see which ones overlap so you can prune them.
  4. Identify also which behaviors of the problem are not addressed by any example yet. This will help you find a list of orthogonal examples.
With time and experience you'll start seeing these behavior partitions in your mind and spend less time to find orthogonal examples.

3. A concrete application.

Next, we'll explore a concrete application using a subset of the Mars Rover kata:
  • The rover is located on a grid at some point with coordinates (x,y) and facing a direction encoded with a character.
  • The meaning of each direction character is:
    • "N" -> North
    • "S" -> South
    • "E" -> East
    • "W" -> West
  • The rover receives a sequence of commands (a string of characters) which are codified in the following way:
    • When it receives an "f", it moves forward one position in the direction it is facing.
    • When it receives a "b", it moves backward one position, i.e. in the direction opposite to the one it is facing.
    • When it receives an "l", it turns 90º left changing its direction.
    • When it receives an "r", it turns 90º right changing its direction.


Let's start writing a list of examples that explores this problem. But how?

Since the rover is receiving a sequence of commands, we can apply a useful heuristic for sequences to get us started: J. B. Rainsberger's "0, 1, many, oops" heuristic [2].

In this case, it means generating examples for: no commands, one command, several commands and unknown commands.

I will use the following notation for examples:
(x, y, d), commands_sequence -> (x’, y’, d’)
Meaning that, given the rover is in an initial location with x and y coordinates, and facing a direction d, after receiving a given sequence of commands (which is represented by a string), the rover will be located at x’ and y’ coordinates and facing the d’ direction.

Then our first example corresponding to no commands might be any of:
(0, 0, "N"), "" -> (0, 0, "N")
(1, 4, "S"), "" -> (1, 4, "S")
(2, 5, "E"), "" -> (2, 5, "E")
(3, 2, "E"), "" -> (3, 2, "E")
Notice that in these examples, we don't care about the specific positions or directions of the rover. The only important thing here is that the position and direction of the rover do not change. They will all drive the same behavior so we might express this fact using a more generic example:
(any_x, any_y, any_direction), "" -> (any_x, any_y, any_direction)
where we have used any_x, any_y and any_direction to make explicit that the specific values that any_x, any_y and any_direction take are not important for these tests. What is important for the tests, is that the values of x, y and direction remain the same after applying the sequence of commands [3].

Next, we focus on receiving one command.

In this case there are a lot of possible examples, but we are only interested in those that are orthogonal. Following our recommendations to get orthogonal examples, you can get to the following set of 16 examples that can be used to drive all the one command behavior (we're using any_x, any_y where we can):
(4, any_y, "E"), "b" -> (3, any_y, "E")
(any_x, any_y, "S"), "l" -> (any_x, any_y, "E")
(any_x, 6, "N"), "b" -> (any_x, 5, "N")
(any_x, 3, "N"), "f" -> (any_x, 4, "N")
(5, any_y, "W"), "f" -> (4, any_y, "W")
(2, any_y, "W"), "b" -> (3, any_y, "W")
(any_x, any_y, "E"), "l" -> (any_x, any_y, "N")
(any_x, any_y, "W"), "r" -> (any_x, any_y, "N")
(any_x, any_y, "N"), "l" -> (any_x, any_y, "W")
(1, any_y, "E"), "f" -> (2, any_y, "E")
(any_x, 8, "S"), "f" -> (any_x, 7, "S")
(any_x, any_y, "E"), "r" -> (any_x, any_y, "S")
(any_x, 3, "S"), "b" -> (any_x, 4, "S")
(any_x, any_y, "W"), "l" -> (any_x, any_y, "S")
(any_x, any_y, "N"), "r" -> (any_x, any_y, "E")
(any_x, any_y, "S"), "r" -> (any_x, any_y, "W")
There are already important properties about the problem that we can learn from these examples:
  1. The position of the rover is irrelevant for rotations.
  2. The direction the rover is facing is relevant for every command. It determines how each command will be applied.
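Both properties show up clearly if the 16 examples are encoded as data. The following Clojure sketch is one possible encoding, with hypothetical names (left-of, right-of, delta, move, execute) and a rover represented as a map; it covers only the four known commands.

```clojure
;; Rotations depend only on the direction, never on the position.
(def left-of  {"N" "W", "W" "S", "S" "E", "E" "N"})
(def right-of {"N" "E", "E" "S", "S" "W", "W" "N"})

;; Displacement of one step forward for each direction.
(def delta {"N" [0 1], "S" [0 -1], "E" [1 0], "W" [-1 0]})

;; sign is 1 to move forward and -1 to move backward.
(defn move [{:keys [x y direction] :as rover} sign]
  (let [[dx dy] (delta direction)]
    (assoc rover :x (+ x (* sign dx)) :y (+ y (* sign dy)))))

;; Applies a single known command ("f", "b", "l" or "r").
(defn execute [rover command]
  (case command
    "f" (move rover 1)
    "b" (move rover -1)
    "l" (update rover :direction left-of)
    "r" (update rover :direction right-of)))
```

Every command looks the direction up in a table, while only "f" and "b" touch the coordinates.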
Sometimes it can also be useful to think of different ways of grouping the examples to see how they may relate to each other.

For instance, we might group the examples above by the direction the rover faces initially:
Facing East
(1, any_y, "E"), "f" -> (2, any_y, "E")
(4, any_y, "E"), "b" -> (3, any_y, "E")
(any_x, any_y, "E"), "l" -> (any_x, any_y, "N")
(any_x, any_y, "E"), "r" -> (any_x, any_y, "S")
Facing West
(5, any_y, "W"), "f" -> (4, any_y, "W") ...
or, by the command the rover receives:
Move forward
(1, any_y, "E"), "f" -> (2, any_y, "E")
(5, any_y, "W"), "f" -> (4, any_y, "W")
(any_x, 3, "N"), "f" -> (any_x, 4, "N")
(any_x, 8, "S"), "f" -> (any_x, 7, "S")
Move backward
(4, any_y, "E"), "b" -> (3, any_y, "E")
(2, any_y, "W"), "b" -> (3, any_y, "W")
Trying to classify the examples helps us explore different ways in which we can use them to make the code grow by discovering what Mateu Adsuara calls dimensions of complexity of the problem[4]. Each dimension of complexity can be driven using a different set of orthogonal examples, so this knowledge can be useful to choose the next example when doing TDD.

Which of the two groupings shown above might be more useful to drive the problem?

In this case, I think that the grouping "by the command the rover receives" is more useful, because each group will help us drive a whole behavior (the command). If we were to use the grouping "by the direction the rover faces initially", we'd end up with partially implemented behaviors (commands) after using each group of examples.

Once we have the one command examples, let's continue using the "0, 1, many, oops" heuristic and find examples for the several commands category.

We can think of many different examples:
(7, 4, "E"), "rb" -> (7, 5, "S")
(7, 4, "E"), "fr" -> (8, 4, "S")
(7, 4, "E"), "ffl" -> (9, 4, "N")
The thing is that any of them might be thought of as a composition of several commands:
(7, 4, "E"), "r" -> (7, 4, "S"), "b" -> (7, 5, "S")
Then the only new behavior these examples would drive is composing commands.

So it turns out that there's only one orthogonal example in this category. We might choose any of them, like the following one for instance:
(7, 4, "E"), "frrbbl" -> (10, 4, "S")
This doesn't mean that when we're later doing TDD, we have to use only this example to drive the behavior. We can use more overlapping examples if we're unsure how to implement it and we need to use triangulation[5].

Finally, we can consider the "oops" category which for us is unknown commands. In this case, we need to find out how we'll handle them and this might involve some conversations.

Let's say that we find out that we should ignore unknown commands, then this might be an example:
(any_x, any_y, any_direction), "*" -> (any_x, any_y, any_direction)
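The "many" and "oops" categories each add a single behavior: composing commands and ignoring unknown ones. A Clojure sketch, with hypothetical names (turns, steps, delta, execute, execute-all) and the rover represented as an [x y direction] vector, makes this visible: composition is a fold over the command string, and unknown commands fall through to a default branch.

```clojure
;; Lookup tables for the four known commands.
(def turns {\l {"N" "W", "W" "S", "S" "E", "E" "N"}
            \r {"N" "E", "E" "S", "S" "W", "W" "N"}})
(def steps {\f 1, \b -1})
(def delta {"N" [0 1], "S" [0 -1], "E" [1 0], "W" [-1 0]})

;; Applies one command; unknown commands leave the rover unchanged.
(defn execute [[x y d :as rover] command]
  (cond
    (turns command) [x y ((turns command) d)]
    (steps command) (let [[dx dy] (delta d)
                          n (steps command)]
                      [(+ x (* n dx)) (+ y (* n dy)) d])
    :else rover))

;; Several commands are just single commands composed from left to right.
(defn execute-all [rover commands]
  (reduce execute rover commands))

(execute-all [7 4 "E"] "frrbbl") ; => [10 4 "S"]
```

The empty command string needs no extra code either, since reducing over it returns the initial rover.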
Before finishing, I’d like to remark that it’s important to keep this technique as lightweight and informal as possible, writing the examples on a piece of paper or on a whiteboard, and never, ever, write them directly as tests (which I’ve also seen many times).

There are two important reasons for this:
  1. Preventing implementation details from leaking into a process meant for thinking about the problem space.
  2. Avoiding getting attached to the implementation of tests, which can create some inertia and push you to take implementation decisions without having explored the problem well.

4. Conclusion.

Writing a list of examples before starting to do TDD is an often overlooked technique that can be very useful to reflect on a given problem. We also talked about how looking for orthogonal examples can make your list of examples much more effective and saw some useful heuristics that might help you find them.

Then we worked through a small example in tiny steps, applying one of the heuristics and comparing different alternatives, to illustrate the technique and make it more explicit.

With practice, this technique becomes more and more a mental process. You'll start doing it in your mind and find orthogonal examples faster. At the same time, you’ll also start losing awareness of the process[6].

Nonetheless, writing a list of examples or other similar lightweight exploration techniques can still be very helpful for more complicated cases. This technique can also be very useful to think about a problem when you’re working on it with someone else (pairing, mob programming, etc.), because it enhances communication.

5. Acknowledgements.

Many thanks to Alfredo Casado, Álvaro Garcia, Abel Cuenca, Jaime Perera, Ángel Rojo, Antonio de la Torre, Fran Reyes and Manuel Tordesillas for giving me great feedback to improve this post and for all the interesting conversations.

6. References.

[1] This concept of orthogonal examples is directly related to Mateu Adsuara's dimensions of complexity idea because each dimension of complexity can be driven using a different set of orthogonal examples. For a definition of dimensions of complexity, see footnote [4].
[2] Another very useful heuristic is described in James Grenning's TDD Guided by ZOMBIES post.
[3] This is somewhat related to Brian Marick’s metaconstants which can be very useful to write tests in dynamic languages. They’re also hints about properties that might be helpful in property-based testing.
[4] Dimension of Complexity is a term used by Mateu Adsuara in a talk at Socrates Canarias 2016 to name an orthogonal functionality. In that talk he used dimensions of complexity to classify the examples in his tests list in different groups and help him choose the next test when doing TDD.
He talked about it in these three posts:
Other names for the same concept that I've heard are axes of change, directions of change or vectors of change.
[5] Even though triangulation is probably the most popular, there are two other strategies for implementing new functionality in TDD: obvious implementation and fake it. Kent Beck in his Test-driven Development: By Example book describes the three techniques and says that he prefers to use obvious implementation or fake it most of the time, and only use triangulation as a last resort when design gets complicated.
[6] This loss of awareness of the process is the price of expertise according to the Dreyfus model of skill acquisition.

Sunday, March 25, 2018

Books I read (January - March 2018)

- The Plateau Effect: Getting from Stuck to Success, Bob Sullivan & Hugh Thompson
- The Thirty-Nine Steps, John Buchan
- Memento Mori, Muriel Spark
- Cosmonauta, Pep Brocal
- The Man Who Was Thursday: A Nightmare, G. K. Chesterton
- Ébano, Alberto Vázquez-Figueroa
- The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life, Mark Manson
- The Importance of Being Earnest, Oscar Wilde

- The Maltese Falcon, Dashiell Hammett
- This Is Water, David Foster Wallace
- Judas, Amos Oz
- Never Let Me Go, Kazuo Ishiguro
- Microservice Architecture: Aligning Principles, Practices, and Culture, Mike Amundsen, Matt McLarty, Ronnie Mitra, Irakli Nadareishvili

- On Anarchism, Noam Chomsky
- The Fire Next Time, James Baldwin
- Esperanza en la oscuridad, La historia jamás contada del poder de la gente (Hope in the Dark: Untold Histories, Wild Possibilities), Rebecca Solnit
- The Dispossessed: An Ambiguous Utopia, Ursula K. Le Guin
- Release It!: Design and Deploy Production-Ready Software, Michael T. Nygard

Saturday, March 10, 2018

Kata: Generating bingo cards with clojure.spec, clojure/test.check, RDD and TDD

Clojure Developers Barcelona has been running for several years now. Since we're not many yet, we usually do mob programming sessions as part of what we call "sagas". For each saga, we choose an exercise or kata and solve it during the first one or two sessions. After that, we start imagining variations on the exercise using different Clojure/ClojureScript libraries or technologies we feel like exploring and develop those variations in following sessions. Once we feel we can't imagine more interesting variations or we get tired of a given problem, we choose a different problem to start a new saga. You should try doing sagas, they are a lot of fun!

Recently we've been working on the Bingo Kata.

The initial implementation

These were the tests we wrote to check the randomly generated bingo cards:

and the code we initially wrote to generate them was something like (we didn't save the original one):

As you can see the tests are not concerned with which specific numeric values are included on each column of the bingo card. They are just checking that they follow the specification of a bingo card. This makes them very suitable for property-based testing.
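To make the idea concrete, here is a minimal sketch (not the code from the session, which we didn't keep) of a hand-written generator together with a property-style check. The names column-ranges, random-column, create-card and valid-column?, the map-of-columns card shape and the B 1-15 to O 61-75 ranges are all assumptions made for illustration, and the free space that some bingo variants use is ignored.

```clojure
;; Assumed ranges per column, both ends inclusive.
(def column-ranges
  {:b [1 15] :i [16 30] :n [31 45] :g [46 60] :o [61 75]})

(defn random-column
  "Picks 5 distinct numbers between lower and upper (both inclusive)."
  [[lower upper]]
  (vec (take 5 (shuffle (range lower (inc upper))))))

(defn create-card
  "Builds a card as a map from column key to a vector of 5 numbers."
  []
  (into {} (for [[k bounds] column-ranges]
             [k (random-column bounds)])))

(defn valid-column?
  "Checks the rules of a column, not its concrete values."
  [column [lower upper]]
  (and (= 5 (count column))
       (apply distinct? column)
       (every? #(<= lower % upper) column)))

;; The check holds for any generated card, whatever numbers it contains.
(let [card (create-card)]
  (every? (fn [[k bounds]] (valid-column? (get card k) bounds))
          column-ranges))
```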

Introducing clojure.spec

In the following session of the Bingo saga, I suggested creating the bingo cards using clojure.spec.
spec is a Clojure library to describe the structure of data and functions. Specs can be used to validate data, conform (destructure) data, explain invalid data, generate examples that conform to the specs, and automatically use generative testing to test functions.
For a brief introduction to this wonderful library see Arne Brasseur's Introduction to clojure.spec talk.

I'd used clojure.spec at work before. At my current client Green Power Monitor, we've been using it for a while to validate the shape (and in some cases types) of data flowing through some important public functions of some key namespaces. We started using pre and post-conditions for that validation (see Fogus' Clojure’s :pre and :post to know more), and from there, it felt like a natural step to start using clojure.spec to write some of them.

Another common use of clojure.spec specs is to generate random data conforming to the spec to be used for property-based testing.

In the Bingo kata case, I thought that we might use this ability to randomly generate data conforming to the spec in production code. This meant that instead of writing code to randomly generate bingo cards and then testing that the results were as expected, we might describe the bingo cards using clojure.spec and then take advantage of that specification to randomly generate bingo cards using clojure/test.check's generate function.

So with this idea in our heads, we started creating a spec for bingo columns on the REPL bit by bit (for the sake of brevity what you can see here is the final form of the spec):

then we discovered clojure.spec's coll-of function which allowed us to simplify the spec a bit:
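The spec from that session is not reproduced here. As a rough sketch of what a column spec written with coll-of can look like, assume a B column of 5 distinct integers from 1 to 15; the spec names ::bingo-number and ::column are made up for this example.

```clojure
(require '[clojure.spec.alpha :as s])

;; The range is expressed as an arbitrary predicate on top of int?.
(s/def ::bingo-number (s/and int? #(<= 1 % 15)))

;; coll-of states the collection-level rules declaratively.
(s/def ::column
  (s/coll-of ::bingo-number :kind vector? :count 5 :distinct true))

(s/valid? ::column [1 7 15 3 9]) ;; => true
(s/valid? ::column [1 1 15 3 9]) ;; => false, repeated number
(s/valid? ::column [1 7 16 3 9]) ;; => false, 16 is out of range
```

A spec like this validates well, but a range written as an opaque predicate gives the generator no hint about which integers to produce: it can only generate integers and discard the ones that don't fit.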

Generating bingo cards

Once we thought we had it, we tried to use the column spec to generate columns with clojure.test.check's generate function, but we got the following error:
ExceptionInfo Couldn't satisfy such-that predicate after 100 tries.
Of course we were trying to find a needle in a haystack...

After some trial and error on the REPL and reading the clojure.spec guide, we found clojure.spec's int-in function and we finally managed to generate the bingo columns:
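A minimal sketch of that step, assuming a B column of 5 distinct numbers from 1 to 15 (the spec name ::b-column is hypothetical, and test.check has to be on the classpath for generation to work). Note that the upper bound of int-in is exclusive, hence 16.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

;; int-in carries its own generator, so values are drawn directly
;; from the range instead of being generated and then filtered.
(s/def ::b-column
  (s/coll-of (s/int-in 1 16) :kind vector? :count 5 :distinct true))

;; Returns a random vector of 5 distinct integers between 1 and 15.
(gen/generate (s/gen ::b-column))
```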

Then we used the spec code from the REPL to write the bingo cards spec:

in which we wrote the create-column-spec factory function that creates column specs to remove duplication between the specs of different columns.

With this in place the bingo cards could be created in a line of code:
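The following is only a sketch of how such a factory and card spec might fit together, not the code from the session. The signature of create-column-spec, the ::b to ::o and ::card spec names, the column ranges and the create-card function are assumptions, and the free space of some bingo variants is left out.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as gen])

(defn create-column-spec
  "Builds a spec for a column of 5 distinct numbers between
  lower and upper, both inclusive (assumed signature)."
  [lower upper]
  (s/coll-of (s/int-in lower (inc upper))
             :kind vector? :count 5 :distinct true))

(s/def ::b (create-column-spec 1 15))
(s/def ::i (create-column-spec 16 30))
(s/def ::n (create-column-spec 31 45))
(s/def ::g (create-column-spec 46 60))
(s/def ::o (create-column-spec 61 75))

;; A card is a map with one unqualified key per column.
(s/def ::card (s/keys :req-un [::b ::i ::n ::g ::o]))

(defn create-card
  "Generates a random card straight from the spec."
  []
  (gen/generate (s/gen ::card)))
```

With this shape, the spec is both the description of a valid card and the source of new cards, so there is no separate generation logic to keep in sync with the rules.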

Introducing property-based testing

"Property-based tests make statements about the output of your code based on the input, and these statements are verified for many different possible inputs."
Jessica Kerr (Property-based testing: what is it?)

Having the specs, it was very easy to change our bingo card test to use property-based testing instead of example-based testing, just by using the generator created by clojure.spec:

Notice that the code reuses the check-column function we wrote for the example-based tests.

This change was so easy because:
  1. clojure.spec can produce a generator for clojure/test.check from a given spec.
  2. The initial example tests, as I mentioned before, were already checking the properties of a valid bingo card. This means that they weren't concerned with which specific numeric values were included on each column of the bingo card, but instead, they were just checking that the cards followed the rules for a bingo card to be valid.
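A sketch of how a spec's generator can feed a test.check property, under some assumptions: the ::column-under-test spec and the 1 to 15 range are invented for the example, and in-range-and-distinct? is a stand-in for check-column, whose real signature is not shown in this post.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.test.check :as tc]
         '[clojure.test.check.properties :as prop])

;; Assumed column spec: 5 distinct numbers from 1 to 15.
(s/def ::column-under-test
  (s/coll-of (s/int-in 1 16) :kind vector? :count 5 :distinct true))

(defn in-range-and-distinct?
  "Hypothetical stand-in for the check-column helper."
  [column lower upper]
  (and (= 5 (count column))
       (apply distinct? column)
       (every? #(<= lower % upper) column)))

;; s/gen returns a test.check generator, so it can be used in for-all.
(def columns-are-valid
  (prop/for-all [column (s/gen ::column-under-test)]
    (in-range-and-distinct? column 1 15)))

;; Runs the property against 100 generated columns and returns a result map.
(tc/quick-check 100 columns-are-valid)
```

In a test suite, the same property would typically be wrapped with defspec from clojure.test.check.clojure-test so that it runs with the rest of the clojure.test tests.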

Going fast with REPL driven development (RDD)

The next user story of the kata required us to check a bingo card to see if its player has won. We thought this might be easy to implement because we only needed to check that the numbers in the card were contained in the set of called numbers, so instead of doing TDD, we played a bit on the REPL and did REPL-driven development (RDD):

Once we had the implementation working, we copied it from the REPL into its corresponding namespace

and wrote the quicker but ephemeral REPL tests as "permanent" unit tests:

In this case RDD allowed us to go faster than TDD, because RDD's feedback cycle is much faster. Once the implementation is working on the REPL, you can choose which REPL tests you want to keep as unit tests.
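As a rough illustration of the check described above, here is a sketch that is not the original implementation. The function name winner? and the card shape (a map from column key to a vector of numbers) are assumptions.

```clojure
(require '[clojure.set :as set])

(defn winner?
  "True when every number on the card has been called."
  [card called-numbers]
  (set/subset? (set (mapcat val card)) (set called-numbers)))

;; The kind of quick checks one might try on the REPL
;; and later keep as unit tests:
(winner? {:b [1 2] :i [16 17]} #{1 2 16 17 30}) ;; => true
(winner? {:b [1 2] :i [16 17]} #{1 2 16})       ;; => false
```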

Sometimes I use only RDD, as in this case; other times I use a mix of TDD and RDD following this cycle:
  1. Write a failing test (using examples that are a bit more complicated than the typical ones you use when doing only TDD).
  2. Explore and triangulate on the REPL until I make the test pass with some ugly but complete solution.
  3. Refactor the code.
Other times I just use TDD.

I think what I use depends a lot on how easy I feel the implementation might be.

Last details

The last user story required us to create a bingo caller that randomly calls out Bingo numbers. To develop this story, we used TDD and an atom to keep the not-yet-called numbers. These were our tests:

and this was the resulting code:
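The original listing is not reproduced here. A minimal sketch of a caller along those lines, assuming numbers from 1 to 75 and the hypothetical names create-caller and call-number!, might look like this (swap-vals! requires Clojure 1.9 or later):

```clojure
(defn create-caller
  "Returns an atom holding the not-yet-called numbers in random order."
  []
  (atom (shuffle (range 1 76))))

(defn call-number!
  "Removes the next number from the caller and returns it,
  or returns nil once every number has been called."
  [caller]
  (let [[not-yet-called _] (swap-vals! caller rest)]
    (first not-yet-called)))

;; Usage: each call yields a number that has not been called before.
(def caller (create-caller))
(call-number! caller)
(call-number! caller)
```

Shuffling once up front and then consuming the sequence keeps the state change in a single atomic swap, so no number can be called twice.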

And it was done! See all the commits here if you want to follow the process (many intermediate steps happened on the REPL). You can find all the code on GitHub.


This experiment was a lot of fun because we got to play with both clojure.spec and clojure/test.check, and we learned a lot. While explaining what we did, I talked a bit about property-based testing and how I use REPL-driven development.

Thanks again to all my colleagues in Clojure Developers Barcelona!