Friday, September 2, 2016

Kata: Varint in Clojure using Midje and test.check

In last week's Barcelona Software Craftsmanship Dojo we did the Varint kata created by Eric Le Merdy.

Since I was facilitating it, I couldn't code during the event.

Once I got home, I did it using Clojure.

These are the tests, written with the Midje and test.check libraries:

(ns varint.core-test
  (:require
    [varint.core :refer :all]
    [midje.sweet :refer :all]
    [clojure.test.check.clojure-test :refer [defspec]]
    [clojure.test.check.generators :as gen]
    [clojure.test.check.properties :as prop]))

(facts
  "about varint"

  (facts
    "encoding numbers under 128"
    (encode 1) => "00000001"
    (encode 8) => "00001000"
    (encode 127) => "01111111")

  (facts
    "encoding numbers greater or equal than 128"
    (encode 300) => "1010110000000010")

  (facts
    "decoding varints"
    (decode "1010110000000010") => 300))

(defspec coding-and-decoding
  1000
  (prop/for-all [num (gen/large-integer* {:min 0})]
    (= (-> num encode decode) num)))

and this is the resulting code:

(ns varint.core)

(defn- pad-left [length element bin-num]
  (concat (repeat (- length (count bin-num)) element) bin-num))

;; Prefix every 7-bit group with "1" except the last one, which gets "0".
(defn- add-most-significat-bits [bytes]
  (flatten (concat (map #(cons "1" %) (butlast bytes))
                   (cons "0" (last bytes)))))

(defn- partition-in-blocks-of [block-size coll]
  (partition-all block-size block-size coll))

;; Split the bits into 7-bit groups, least significant group first,
;; left-padding the most significant group with zeros.
(defn- bin-str->bytes [bin-str]
  (->> bin-str
       reverse
       (partition-in-blocks-of 7)
       (map reverse)
       (map (partial pad-left 7 "0"))))

(defn- int->bin-str [num]
  (Long/toBinaryString num))

(def ^:private drop-most-significat-bits (partial map rest))

;; Split a varint string into 8-bit blocks and strip each continuation bit.
(defn- varint->bytes [varint]
  (->> varint
       (partition-in-blocks-of 8)
       drop-most-significat-bits))

;; Put the most significant group first again and flatten into one bit sequence.
(defn- bytes->bin-str [bytes]
  (-> bytes
      reverse
      flatten))

(defn- int-pow [b exp]
  (reduce * (repeat exp b)))

(defn- char->int [ch]
  (Integer/parseInt (str ch)))

(def ^:private bin-str->bits (partial map char->int))

;; Sum bit * 2^position, with position 0 at the least significant bit.
(defn- bits->int [bits]
  (->> bits
       reverse
       (map-indexed #(* %2 (int-pow 2 %1)))
       (reduce +)))

(defn- bin-str->int [bin-str]
  (-> bin-str
      bin-str->bits
      bits->int))

(defn- bytes->varint [bytes]
  (->> bytes
       add-most-significat-bits
       (apply str)))

(defn encode [num]
  (-> num
      int->bin-str
      bin-str->bytes
      bytes->varint))

(defn decode [varint]
  (-> varint
      varint->bytes
      bytes->bin-str
      bin-str->int))
I used a mix of a bit of TDD, a lot of REPL-driven development (RDD) and some property-based testing.

Basically, the cycle I followed was like this:
  1. Write a failing test (using more complicated examples than the typical ones you use when doing only TDD).
  2. Explore and triangulate on the REPL until the test passes with an ugly but complete solution.
  3. Refactor the code to make it more readable.
I've noticed that when I mix TDD and RDD, I end up taking bigger steps and keeping fewer tests than when doing only TDD. I think this is because most of the baby steps and "triangulation tests" are done on the REPL, which gives you a much faster feedback cycle.
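
To give an idea of what the exploration in step 2 can look like, here is a hypothetical REPL session for the 300 example. It is a reconstruction for illustration only, not an excerpt from the committed REPL history, and the names `bits`, `groups`, `pad-7`, `padded` and `flagged` are made up for this sketch.

```clojure
;; Hypothetical REPL exploration for 300 (not the actual REPL history).
(def bits (Long/toBinaryString 300))
bits
;; => "100101100"

;; Group the bits in blocks of 7, starting from the least significant end.
(def groups (->> bits reverse (partition-all 7) (map reverse)))
(map #(apply str %) groups)
;; => ("0101100" "10")

;; Left-pad every group with zeros up to 7 bits.
(defn pad-7 [group]
  (concat (repeat (- 7 (count group)) \0) group))

(def padded (map pad-7 groups))
(map #(apply str %) padded)
;; => ("0101100" "0000010")

;; Prefix every group with 1, except the last one, which gets 0.
(def flagged
  (concat (map #(cons \1 %) (butlast padded))
          [(cons \0 (last padded))]))

(apply str (apply concat flagged))
;; => "1010110000000010"
```

Each intermediate value can be inspected before moving on, which is where most of the baby steps end up happening.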

This way I wrote both the encode and decode functions.

Once I had them working with a few examples, I added a property-based test using Clojure's test.check library (a QuickCheck-like library for Clojure) to automatically explore more examples by checking that the property "decoding an encoded number returns the number" held for 1000 randomly generated examples.
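
The round-trip property is not the only one that could be checked. As a sketch of a possible additional property (not part of the original tests), the following checks the structure of the encoding: its length is a multiple of 8, every block but the last starts with 1, and the last block starts with 0. So that it runs on its own, it uses `encode-sketch`, a hypothetical stand-in written with bit operations that is assumed to behave like `varint.core/encode` for non-negative numbers; `byte->bin-str`, `continuation-bits` and `well-formed-prop` are also names invented here. It assumes test.check is on the classpath.

```clojure
(ns varint.extra-properties-sketch
  (:require [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]))

;; Hypothetical helper: binary string of b (0..255), left-padded to 8 digits.
(defn- byte->bin-str [b]
  (let [s (Long/toBinaryString b)]
    (str (apply str (repeat (- 8 (count s)) "0")) s)))

;; Hypothetical stand-in for varint.core/encode (an assumption of this sketch).
(defn encode-sketch [n]
  (loop [n n, acc []]
    (let [low-7 (bit-and n 0x7f)
          more  (unsigned-bit-shift-right n 7)]
      (if (zero? more)
        (apply str (conj acc (byte->bin-str low-7)))
        (recur more (conj acc (byte->bin-str (bit-or 0x80 low-7))))))))

;; First bit of every 8-bit block of the encoded string.
(defn continuation-bits [varint]
  (map first (partition 8 varint)))

(def well-formed-prop
  (prop/for-all [n (gen/large-integer* {:min 0})]
    (let [varint (encode-sketch n)
          flags  (continuation-bits varint)]
      (and (zero? (mod (count varint) 8))
           (every? #{\1} (butlast flags))
           (= \0 (last flags))))))

;; Run the property against 1000 generated numbers.
(tc/quick-check 1000 well-formed-prop)
```

In the real project the property would call `encode` from `varint.core` directly and could be declared with `defspec`, like the round-trip test.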

I think that these three techniques are complementary and that combining them makes me more productive.

To document the development process, I committed after each TDD green step and after each refactoring. I also committed the REPL history.

See all the commits here if you want to follow the process.

You can find all the code on GitHub.

2 comments:

  1. I wrote another version of varint in Clojure.
    https://github.com/miner/varint

    1. Great job! It's much better than what I did.
      Thanks for sharing.
      Best regards,
      M
