Wednesday, September 9, 2015

Kata: "Sieve of Eratosthenes" test-driven and explained step by step

Yesterday we practiced doing the Sieve of Eratosthenes kata at a Barcelona Software Craftsmanship event.

My partner Fernando Mora and I used TDD in Scala to write the code.

Today I did it once again in Clojure.

I'd like to explain here how I did it step by step in order to share it with the Barcelona Software Craftsmanship members.

I started by writing this test:

    (ns eratosthenes-sieve.core-test
      (:use midje.sweet)
      (:use [eratosthenes-sieve.core]))

    (facts
      "about Eratosthenes sieve"
      (fact
        "it returns all the primes up to a given number"
        (primes-up-to 2) => [2]))

which I quickly got to green by just hard-coding the response:

    (ns eratosthenes-sieve.core)

    (defn primes-up-to [n]
      [2])

In this first test I used wishful thinking to get the function name and signature I wanted.

Then I wrote the following test:

    (ns eratosthenes-sieve.core-test
      (:use midje.sweet)
      (:use [eratosthenes-sieve.core]))

    (facts
      "about Eratosthenes sieve"
      (fact
        "it returns all the primes up to a given number"
        (primes-up-to 2) => [2]
        (primes-up-to 3) => [2 3]))

which drove me to generalize the code, replacing the hard-coded list with code that generates a list valid for both the first and the second test:

    (ns eratosthenes-sieve.core)

    (defn primes-up-to [n]
      (range 2 (inc n)))

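One detail worth spelling out here: the tests expect vectors, while range returns a sequence. A minimal sketch of why the tests still pass, assuming Midje's => falls back on Clojure's ordinary equality when the right-hand side is a plain value (the name candidates is only used for illustration):

```clojure
;; Sequential collections compare by content, not by concrete type,
;; so the seq produced by `range` equals a vector with the same items.
(def candidates (range 2 (inc 3)))   ; what primes-up-to returns for n = 3

(= [2 3] candidates)        ; true
(vector? candidates)        ; false: it is a seq, not a vector
(= [2] (range 2 (inc 2)))   ; true
```

So there was no need to convert the result with vec just to satisfy the expectations.
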
The next test was the first one that drove me to start eliminating multiples of a number, in this case the multiples of 2:

    (ns eratosthenes-sieve.core-test
      (:use midje.sweet)
      (:use [eratosthenes-sieve.core]))

    (facts
      "about Eratosthenes sieve"
      (fact
        "it returns all the primes up to a given number"
        (primes-up-to 2) => [2]
        (primes-up-to 3) => [2 3]
        (primes-up-to 5) => [2 3 5]))

I quickly made it pass by using filter to keep only 2 itself and the numbers that are not multiples of 2:

    (ns eratosthenes-sieve.core)

    (defn primes-up-to [n]
      (filter #(or (not= 0 (mod % 2)) (= % 2))
              (range 2 (inc n))))

Alternatively, I could have taken a smaller step by eliminating just 4, then gone on triangulating to eliminate 6 as well, and finally refactored out the duplication by eliminating all the multiples of 2 other than 2 itself.

Since the implementation was obvious and I could rely on filter, I decided not to follow that path and took a larger step.
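For comparison, the smaller steps I skipped might have looked roughly like this. It is only a sketch: the step-1 to step-3 function names and the extra test mentioned in the comments are hypothetical, since I never wrote them in the actual session.

```clojure
;; Hypothetical baby steps, not taken in the actual session.

;; Step 1: just enough to make (primes-up-to 5) => [2 3 5] pass.
(defn primes-up-to-step-1 [n]
  (remove #(= % 4) (range 2 (inc n))))

;; Step 2: a further test such as (primes-up-to 7) => [2 3 5 7]
;; forces 6 out as well, which exposes the duplication.
(defn primes-up-to-step-2 [n]
  (remove #(or (= % 4) (= % 6)) (range 2 (inc n))))

;; Step 3: remove the duplication by naming what 4 and 6 have in common.
(defn primes-up-to-step-3 [n]
  (remove #(and (even? %) (not= % 2)) (range 2 (inc n))))

(primes-up-to-step-1 5)   ; (2 3 5)
(primes-up-to-step-2 7)   ; (2 3 5 7)
(primes-up-to-step-3 7)   ; (2 3 5 7)
```

All three end up in the same place as the one-liner above, only with two more red-green cycles along the way.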

I've noticed that my TDD baby steps tend to be larger in functional languages. I think there are two reasons: the REPL, which complements TDD by providing a faster feedback loop for triangulating and trying things out, and the power of the sequence functions in those languages (Scala's and Clojure's, in the case of this kata).

Once the test was passing I started to refactor that ugly one-liner and got to this more readable version:

    (ns eratosthenes-sieve.core)

    (defn- integers-up-to [n]
      (range 2 (inc n)))

    (defn- multiple-of? [n p]
      (and (zero? (mod n p)) (not= n p)))

    (defn primes-up-to [n]
      (remove #(multiple-of? % 2) (integers-up-to n)))

in which I extracted two helpers and used remove instead of filter to better express the idea of sieving.
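The swap from filter to remove does not change behavior, because remove is filter with the predicate negated. A small sketch to check that at the REPL (multiple-of? is repeated, and made public, so the snippet stands on its own; numbers is just an illustrative name):

```clojure
(defn multiple-of? [n p]
  (and (zero? (mod n p)) (not= n p)))

(def numbers (range 2 11))   ; 2 to 10

;; Both expressions drop 4, 6, 8 and 10 and keep everything else.
(remove #(multiple-of? % 2) numbers)                ; (2 3 5 7 9)
(filter (complement #(multiple-of? % 2)) numbers)   ; (2 3 5 7 9)
```

The gain is only in how the code reads: remove says "take these out", which is what sieving does.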

My next goal was to write a test to drive me to generalize the code a bit more by eliminating the hard-coded number 2 in the body of primes-up-to.

To do that I just needed a test that forced the code to eliminate the multiples of 3 as well:

    (ns eratosthenes-sieve.core-test
      (:use midje.sweet)
      (:use [eratosthenes-sieve.core]))

    (facts
      "about Eratosthenes sieve"
      (fact
        "it returns all the primes up to a given number"
        (primes-up-to 2) => [2]
        (primes-up-to 3) => [2 3]
        (primes-up-to 5) => [2 3 5]
        (primes-up-to 11) => [2 3 5 7 11]))

Again I could have triangulated to first eliminate 9 and then 15 (12 is already gone as a multiple of 2) before refactoring, but there was an easier way at this point: introducing recursion.

But before doing that, I quickly went to green by calling the sieve function twice to eliminate the multiples of 2 and 3:

    (ns eratosthenes-sieve.core)

    (defn- integers-up-to [n]
      (range 2 (inc n)))

    (defn- multiple-of? [n p]
      (and (zero? (mod n p)) (not= n p)))

    (defn- sieve [numbers prime]
      (remove #(multiple-of? % prime) numbers))

    (defn primes-up-to [n]
      (sieve (sieve (integers-up-to n) 2) 3))

With this tiny step I both highlighted the recursive pattern and provided a safe place from which to start using refactoring to introduce recursion.

In this case I think that trying to introduce recursion directly to make the test pass would have been too large a step, so I played it safe.
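One way to see the pattern in that nested call is to write it as a fold. This is just an illustration, not a step I took during the kata; the helpers are repeated (and made public) so the snippet runs on its own:

```clojure
(defn multiple-of? [n p]
  (and (zero? (mod n p)) (not= n p)))

(defn sieve [numbers prime]
  (remove #(multiple-of? % prime) numbers))

;; The nested call used to get to green...
(sieve (sieve (range 2 12) 2) 3)    ; (2 3 5 7 11)

;; ...is a left fold of `sieve` over the primes to sieve with.
(reduce sieve (range 2 12) [2 3])   ; (2 3 5 7 11)
```

The fold only works when the primes to sieve with are known in advance, which is exactly what the kata has to compute. That is why each next prime has to be picked from the already sieved numbers.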

Once in green again I safely refactored the code to introduce recursion:

    (ns eratosthenes-sieve.core)

    (defn- integers-up-to [n]
      (range 2 (inc n)))

    (defn- multiple-of? [n p]
      (and (zero? (mod n p)) (not= n p)))

    (defn- sieve [numbers prime]
      ;; Remove the multiples of prime, then recur with the next survivor
      ;; greater than prime until there is none left.
      (let [sieved (remove #(multiple-of? % prime) numbers)
            next-prime (first (drop-while #(<= % prime) sieved))]
        (if (nil? next-prime)
          sieved
          (recur sieved next-prime))))

    (defn primes-up-to [n]
      (sieve (integers-up-to n) 2))

Notice that this version of the code is the first one that solves the kata.
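To watch the recursion at work, here is a sketch of the same function with an extra accumulator that records each prime used for sieving. The name sieve-with-trace, the used parameter and the map it returns are hypothetical additions for this illustration only:

```clojure
(defn multiple-of? [n p]
  (and (zero? (mod n p)) (not= n p)))

;; Same recursion as `sieve`, plus `used`, which collects every prime
;; whose multiples were removed.
(defn sieve-with-trace [numbers prime used]
  (let [sieved (remove #(multiple-of? % prime) numbers)
        next-prime (first (drop-while #(<= % prime) sieved))
        used (conj used prime)]
    (if (nil? next-prime)
      {:primes sieved :sieved-with used}
      (recur sieved next-prime used))))

(sieve-with-trace (range 2 12) 2 [])
;; {:primes (2 3 5 7 11), :sieved-with [2 3 5 7 11]}
```

Every number that survives takes its turn as the sieving prime, so the recursion stops only when the last survivor has been used.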

From this point on I just refactored the code trying to make it a bit more readable until I got to this version:

    (ns eratosthenes-sieve.core)

    (defn- integers-up-to [n]
      (range 2 (inc n)))

    (defn- multiple-of? [n p]
      (and (zero? (mod n p)) (not= n p)))

    (defn- next-prime [prime numbers]
      (first (drop-while #(<= % prime) numbers)))

    (defn- sieve-multiples-of [prime numbers]
      (remove #(multiple-of? % prime) numbers))

    (defn- sieve [numbers]
      ;; Keep sieving with each successive prime; stop when no number
      ;; greater than the current prime survives.
      (loop [primes numbers prime 2]
        (let [sieved (sieve-multiples-of prime primes)]
          (if-let [prime (next-prime prime sieved)]
            (recur sieved prime)
            primes))))

    (defn primes-up-to [n]
      (sieve (integers-up-to n)))

Finally, I wrote a more thorough test. It made the stepping-stone tests that had helped me drive the solution redundant, so I deleted them:

    (ns eratosthenes-sieve.core-test
      (:use midje.sweet)
      (:use [eratosthenes-sieve.core]))

    (facts
      "about Eratosthenes sieve"
      (fact
        "it returns all the primes up to a given number"
        (primes-up-to 100) => [2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97]))

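Another way to gain confidence, beyond a hand-written list of expected primes, is to cross-check the sieve against a naive primality test at the REPL. The sketch below repeats the final version so that it runs on its own; prime? is a hypothetical trial-division helper that is not part of the kata code:

```clojure
(defn- multiple-of? [n p]
  (and (zero? (mod n p)) (not= n p)))

(defn- next-prime [prime numbers]
  (first (drop-while #(<= % prime) numbers)))

(defn- sieve-multiples-of [prime numbers]
  (remove #(multiple-of? % prime) numbers))

(defn- sieve [numbers]
  (loop [primes numbers prime 2]
    (let [sieved (sieve-multiples-of prime primes)]
      (if-let [prime (next-prime prime sieved)]
        (recur sieved prime)
        primes))))

(defn primes-up-to [n]
  (sieve (range 2 (inc n))))

;; Hypothetical reference implementation: plain trial division.
(defn- prime? [n]
  (and (> n 1)
       (not-any? #(zero? (mod n %)) (range 2 n))))

;; The sieve and trial division agree for every limit from 2 to 100.
(every? #(= (primes-up-to %) (filter prime? (range 2 (inc %))))
        (range 2 101))   ; true
```

A check like this is slower than the single fact above, but it also covers all the smaller limits that the deleted stepping-stone tests used to exercise.
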
If you want to have a closer look at the process I followed, please check the commits list where I've also included the REPL history. You can also see all the code in this GitHub repository.

That's all.

I'd like to thank the Barcelona Software Craftsmanship members for practicing together every two weeks, especially Álvaro García for facilitating this last kata, and eBay España for kindly having us yesterday (and at many previous events) in their Barcelona office.

2 comments:

  1. Nice play by play TDD!!

    In the refactored version I thought that the p in sieve meant prime, but it doesn't. I like better the version of sieve on step 6. May be sieve could only remove the prime given and have another function controlling next prime / sieve and recursion which will only take the (range (inc n))

    Thanks a lot for sharing!! Reading this post was like being there while you solved the kata.

    --
    Francesc

  2. I followed your advice.
    Thanks Francesc!
    Best regards,
    M
