Showing posts with label Unit Testing.

Tuesday, August 23, 2022

Listening to test smells: detecting lack of cohesion and violations of encapsulation

Introduction.

We’d like to show another example of how difficulties found while testing can signal design problems in our code[1].

We believe that a good design is one that supports the evolution of a code base at a sustainable pace, and that testability is a requirement for evolvability. This is not a new idea; we can find it in many places.

Michael Feathers says “every time you encounter a testability problem, there is an underlying design problem”[2].

Nat Pryce and Steve Freeman also think that this relationship between testability and good design exists[3]:

“[…] We’ve found that the qualities that make an object easy to test also make our code responsive to change”

and also use it to detect design problems and know what to refactor:

“[…] where we find that our tests are awkward to write, it’s usually because the design of our target code can be improved”

and to improve their TDD practice:

“[…] sensitise yourself to find the rough edges in your tests and use them for rapid feedback about what to do with the code. […] don’t stop at the immediate problem (an ugly test) but look deeper for why I’m in this situation (weakness in the design) and address that.”[4]

This is why they devoted talks, several posts and a chapter of their GOOS book (chapter 20) to listening to the tests[5] and even added it to the TDD cycle steps:

TDD cycle steps including listening to the tests.

Next we’ll show you an example of how we’ve recently applied this at one of our clients.

The problem.

Recently I was asked to help a pair that was developing a new feature in a legacy code base of one of our clients. They were having problems with the following test[6]:

that was testing the RealTimeGalleryAdsRepository class:

They had managed to test-drive the functionality, but they were unhappy with the results. The thing that was bothering them was the resetCache method in the RealTimeGalleryAdsRepository class. As its name implies, its intent was to reset the cache. This would have been fine if resetting the cache had been a requirement, but that was not the case. The method had been added only for testing purposes.

Looking at the code of RealTimeGalleryAdsRepository, you can see why. The cachedSearchResult field is static, and that was breaking the isolation between tests. Even though each test used a different instance of RealTimeGalleryAdsRepository, they were all sharing the same value of the cachedSearchResult field, because static state belongs to the class. So a new public method, resetCache, was added to the class only to ensure isolation between tests.
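The original class is not reproduced in this extract, but a minimal sketch of the kind of code the pair described might look like the following. Everything except the RealTimeGalleryAdsRepository, cachedSearchResult, resetCache and useCache names is hypothetical, including the stand-in collaborator types:

import java.time.Clock;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for collaborators that are not shown in the post.
interface AdsRepository { SearchResult search(); }
record SearchResult(List<Ad> ads) {}
record Ad(List<String> photoUrls) {}
record GalleryAd(String photoUrl) {}

public class RealTimeGalleryAdsRepository {
    private static final long CACHE_TTL_IN_MILLIS = 5 * 60 * 1000;

    // Static state: every instance, and therefore every test, shares these values.
    private static SearchResult cachedSearchResult;
    private static long cacheExpirationTime;

    private final AdsRepository adsRepository;
    private final Clock clock;
    private final boolean useCache; // flag argument selecting between two behaviours

    public RealTimeGalleryAdsRepository(AdsRepository adsRepository, Clock clock, boolean useCache) {
        this.adsRepository = adsRepository;
        this.clock = clock;
        this.useCache = useCache;
    }

    public List<GalleryAd> getGalleryAds() {
        if (!useCache) {
            return mapToGalleryAds(adsRepository.search());
        }
        if (cachedSearchResult == null || clock.millis() > cacheExpirationTime) {
            cachedSearchResult = adsRepository.search();
            cacheExpirationTime = clock.millis() + CACHE_TTL_IN_MILLIS;
        }
        return mapToGalleryAds(cachedSearchResult);
    }

    // Added only so that tests could isolate themselves from each other.
    public void resetCache() {
        cachedSearchResult = null;
    }

    private List<GalleryAd> mapToGalleryAds(SearchResult searchResult) {
        List<GalleryAd> galleryAds = new ArrayList<>();
        for (Ad ad : searchResult.ads()) {
            if (!ad.photoUrls().isEmpty()) {
                galleryAds.add(new GalleryAd(ad.photoUrls().get(0)));
            }
        }
        return galleryAds;
    }
}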

Adding code to your production code base just to enable unit testing is a unit testing anti-pattern[7], but they didn’t know how to get rid of the resetCache method, and that’s why I was called in to help.

Let’s examine the tests in RealTimeGalleryAdsRepositoryTests to see if they can point to more fundamental design problems.

Another thing we can notice is that the tests can be divided into two sets that are testing two very different behaviours:

  • One set of tests, comprised of maps_all_ads_with_photo_to_gallery_ads, ignore_ads_with_no_photos_in_gallery_ads and when_no_ads_are_found_there_are_no_gallery_ads, is testing the code that obtains the list of gallery ads;

  • whereas the other set, comprised of when_cache_has_not_expired_the_cached_values_are_used and when_cache_expires_new_values_are_retrieved, is testing the life and expiration of some cached values.

This lack of focus was a hint that the production class might lack cohesion, i.e., it might have several responsibilities.

It turns out that there was another code smell that confirmed our suspicion. Notice the boolean parameter useCache in the RealTimeGalleryAdsRepository constructor. That is a clear example of a flag argument[8]. useCache was making the class behave differently depending on its value:

  1. It cached the list of gallery ads when useCache was true.
  2. It did not cache them when useCache was false.

After seeing all this, I told the pair that the real problem was the lack of cohesion and that we’d have to go more object-oriented in order to fix it. After that refactoring, the need for the resetCache method would disappear.

Going more OO to fix the lack of cohesion.

To strengthen cohesion we need to separate concerns. Let’s see the problem from the point of view of the client of the RealTimeGalleryAdsRepository class (this point of view is generally very useful because the test is also a client of the tested class) and think about what it would want from it. It would be something like “obtain the gallery ads for me”. That is the responsibility of RealTimeGalleryAdsRepository, and that’s the role the GalleryAdsRepository interface represents.

Notice that to satisfy that responsibility we do not need a cache; we only need to get some ads from the AdsRepository and map them (the original functionality also included some enrichments using data from other sources, but we removed them from the example for the sake of simplicity). Caching is an optimization that we might do or not; it’s a refinement or embellishment of how we satisfy the responsibility, but it’s not necessary to satisfy it. In this case, caching changes the “how” but not the “what”.

This matches very well with the Decorator design pattern because this pattern “comes into play when there are a variety of optional functions that can precede or follow another function that is always executed”[9]. Using it allows us to attach additional behaviour (caching) to the basic required behaviour that satisfies the role the client needs (“obtain the gallery ads for me”). This way, instead of having a flag parameter (like useCache in the original code) to control whether we cache or not, we can add caching by composing objects that implement the GalleryAdsRepository interface. One of them, RealTimeGalleryAdsRepository, would be in charge of getting ads from the AdsRepository and mapping them to gallery ads; and the other one, CachedGalleryAdsRepository, would cache the gallery ads.

So we moved the responsibility of caching the ads to the CachedGalleryAdsRepository class which decorated the RealTimeGalleryAdsRepository class.

This is the code of the CachedGalleryAdsRepository class:
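The snippet itself is not embedded in this extract. A possible sketch of such a decorator, reusing the hypothetical GalleryAd type from the earlier sketch and assuming a GalleryAdsRepository interface with a single getGalleryAds method, could be:

import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Role that both the decorator and the decorated repository implement.
interface GalleryAdsRepository {
    List<GalleryAd> getGalleryAds();
}

public class CachedGalleryAdsRepository implements GalleryAdsRepository {
    private final GalleryAdsRepository decorated;
    private final Clock clock;
    private final Duration timeToLive;

    private List<GalleryAd> cachedGalleryAds;
    private Instant expirationTime;

    public CachedGalleryAdsRepository(GalleryAdsRepository decorated, Clock clock, Duration timeToLive) {
        this.decorated = decorated;
        this.clock = clock;
        this.timeToLive = timeToLive;
    }

    @Override
    public List<GalleryAd> getGalleryAds() {
        if (cacheHasExpired()) {
            // Refresh by delegating to the decorated repository; no static state involved.
            cachedGalleryAds = decorated.getGalleryAds();
            expirationTime = clock.instant().plus(timeToLive);
        }
        return cachedGalleryAds;
    }

    private boolean cacheHasExpired() {
        return cachedGalleryAds == null || clock.instant().isAfter(expirationTime);
    }
}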

and these are its tests:

Notice how we find here again the two tests that were previously testing the life and expiration of the cached values in the tests of the original RealTimeGalleryAdsRepository: when_cache_has_not_expired_the_cached_values_are_used and when_cache_expires_new_values_are_retrieved.

Furthermore, looking at them more closely, we can see how, in this new design, those tests are also simpler because they don’t know anything about the inner details of RealTimeGalleryAdsRepository. They only know about the logic related to the life and expiration of the cached values, and that when the cache is refreshed they call a collaborator that implements the GalleryAdsRepository interface. This means that now we’re caching gallery ads instead of an instance of SearchResult, and we don’t know anything about the AdsRepository.

On a side note, we also improved the code by using the Duration value object from java.time to remove the primitive obsession smell caused by using a long to represent milliseconds.

Another very important improvement is that we don’t need the static field anymore.

And what about RealTimeGalleryAdsRepository?

If we have a look at its new code, we can notice that its only concern is obtaining the list of gallery ads by mapping the results of its collaborator, AdsRepository; it does not know anything about caching values. So the new design is more cohesive than the original one. Notice how we removed both the resetCache method, which was previously polluting its interface only for testing purposes, and the flag argument, useCache, in the constructor.
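Again, the actual snippet is not reproduced here; under the same assumptions as the previous sketches, the refactored class might look roughly like this:

import java.util.ArrayList;
import java.util.List;

public class RealTimeGalleryAdsRepository implements GalleryAdsRepository {
    private final AdsRepository adsRepository;

    public RealTimeGalleryAdsRepository(AdsRepository adsRepository) {
        this.adsRepository = adsRepository;
    }

    @Override
    public List<GalleryAd> getGalleryAds() {
        // Its only concern: get ads from the AdsRepository and map the ones with photos.
        List<GalleryAd> galleryAds = new ArrayList<>();
        for (Ad ad : adsRepository.search().ads()) {
            if (!ad.photoUrls().isEmpty()) {
                galleryAds.add(new GalleryAd(ad.photoUrls().get(0)));
            }
        }
        return galleryAds;
    }
}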

We also reduced its number of collaborators because there’s no need for a Clock anymore. That collaborator was needed for a different concern that is now taken care of in the decorator CachedGalleryAdsRepository.

These design improvements are reflected in its new tests. They are now more focused and can only fail if obtaining the gallery ads breaks. Having only one reason to fail comes from testing a more cohesive unit with only one reason to change. Notice how these tests coincide with the subset of the original RealTimeGalleryAdsRepository tests that were concerned with the same behaviour:

Persisting the cached values between calls.

You might be asking yourselves how we are going to ensure that the cached values persist between calls now that we don’t have a static field anymore.

Well, the answer is that we don’t need to keep a static field in our classes for that. The only thing we need is that the composition of CachedGalleryAdsRepository and RealTimeGalleryAdsRepository is created only once, and that we use that single instance for the lifetime of the application. That is a concern that we can address using a different mechanism.

We usually find in legacy code bases that this need to create something only once, and to use that single instance for the lifetime of the application, is met using the Singleton design pattern described in the design patterns book. The Singleton design pattern’s intent is to “ensure that only one instance of the singleton class ever exists; and provide global access to that instance”. The second part of that intent, “providing global access”, is problematic because it introduces global state into the application. Using global state creates high coupling (in the form of hidden dependencies and possible actions at a distance) that drastically reduces testability.

Instead we used the singleton pattern[10]. Notice the lowercase letter. The lowercase ’s’ singleton avoids those testability problems because its intent is only to “ensure that only one instance of some class ever exists because its new operator is called only once”. The problematic global-access part is removed from the intent. This is achieved by not mixing object instantiation with business logic: separate factories know how to create and wire up all the dependencies using dependency injection.

We might create this singleton, for instance, by using a dependency injection framework like Guice and its @Singleton annotation.

In this case we coded it ourselves:
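The factory is not shown in this extract either; a minimal sketch of a hand-coded lowercase ’s’ singleton, with hypothetical names and an arbitrary time to live, could be:

import java.time.Clock;
import java.time.Duration;

public class GalleryAdsRepositoryFactory {
    private static GalleryAdsRepository uniqueInstance;

    // Called only from instantiation logic in other factories, never from business logic.
    public static synchronized GalleryAdsRepository createCachedGalleryAdsRepository(AdsRepository adsRepository) {
        if (uniqueInstance == null) {
            // new is called only once for the whole composition (lowercase 's' singleton).
            uniqueInstance = new CachedGalleryAdsRepository(
                new RealTimeGalleryAdsRepository(adsRepository),
                Clock.systemUTC(),
                Duration.ofMinutes(5)); // the time to live is an arbitrary value for the example
        }
        return uniqueInstance;
    }
}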

Notice the factory method that returns a unique instance of the GalleryAdsRepository interface that caches values. This factory method is never used by business logic; it’s only used by instantiation logic in factories that know how to create and wire up all the dependencies using dependency injection. This doesn’t introduce testability problems because the unique instance will be injected through constructors by factories wherever it is needed.

Conclusions.

We have shown a recent example, found while working for a client, that illustrates how testability problems, if we listen to them, usually point to underlying design problems. In this case, the problems in the tests were pointing to a lack of cohesion in the production code being tested: the original class had too many responsibilities.

We refactored the production code to separate concerns by going more OO and applying the Decorator design pattern. The result was more cohesive production classes that led to more focused tests, and the design problems we had detected in the original design were removed.

Acknowledgements.

I’d like to thank my Codesai colleagues for reading the initial drafts and giving me feedback.

Notes.

[1] We showed another example of this relationship between poor testability and design problems in a previous post: An example of listening to the tests to improve a design.

[2] Listen to his great talk about this relationship: The Deep Synergy Between Testability and Good Design

[3] This is the complete paragraph from chapter 20, Listening to the tests, of the GOOS book: “Sometimes we find it difficult to write a test for some functionality we want to add to our code. In our experience, this usually means that our design can be improved — perhaps the class is too tightly coupled to its environment or does not have clear responsibilities. When this happens, we first check whether it’s an opportunity to improve our code, before working around the design by making the test more complicated or using more sophisticated tools. We’ve found that the qualities that make an object easy to test also make our code responsive to change.”

[4] This quote is from their post Synaesthesia: Listening to Test Smells.

[5] Have a look at this interesting series of posts about listening to the tests by Steve Freeman. It’s a raw version of the content that you’ll find in chapter 20, Listening to the tests, of their book.

[6] We have simplified the client’s code to remove some distracting details and try to highlight its key problems.

[7] Vladimir Khorikov calls this unit testing anti-pattern Code pollution.

[8] A flag argument is an argument that tells a function or a class to behave in a different way depending on its value. This might be a signal of poor cohesion in the function or class.

[9] Have a look at the discussion in the chapter devoted to the Decorator design pattern in the great book Design Patterns Explained: A New Perspective on Object-Oriented Design.

[10] Miško Hevery talks about the singleton pattern with lowercase ‘s’ in his talk Global State and Singletons at 10:20: “Singleton with capital ’S’. Refers to the design pattern where the Singleton has a private constructor and has a global instance variable. Lowercase ’s’ singleton means I only have a single instance of something because I only called the new operator once.”


Monday, July 29, 2019

Playing Fizzbuzz with property-based testing

Introduction.

Lately, I’ve been playing a bit with property-based testing.

I practised doing the FizzBuzz kata in Clojure and used the following constraints for fun[1]:

  1. Add one property at a time before writing the code to make the property hold.
  2. Make the failing test pass before writing a new property.

The kata step by step.

To create the properties, I partitioned the first 100 integers according to how they are transformed by the code. This was very easy using two of the operations on sets that Clojure provides (difference and intersection).

The first property I wrote checks that the multiples of 3 but not 5 are Fizz:

and this is the code that makes that test pass:

Next, I wrote a property to check that the multiples of 5 but not 3 are Buzz (I show only the new property for brevity):

and this is the code that makes the new test pass:

Then, I added a property to check that the multiples of 3 and 5 are FizzBuzz:

which was already passing with the existing production code.

Finally, I added a property to check that the rest of the numbers are just cast to a string:

which I made pass with this version of the code:

The final result.

These are the resulting tests where you can see all the properties together:

You can find all the code in this repository.

Conclusions.

It was a lot of fun doing this kata. It is a toy example that didn’t make me dive a lot into test.check’s generators documentation because I could take advantage of Clojure’s set functions to write the properties.

I think the resulting properties are quite readable even if you don’t know Clojure. On the other hand, the resulting implementation is probably not similar to the ones you’re used to seeing, and it shows Clojure’s conciseness and expressiveness.

Footnotes:

[1] I'm not saying that you should do property-based testing with these constraints. They probably make no sense in real cases. The constraints were only meant to make the exercise fun.

Thursday, June 27, 2019

An example of listening to the tests to improve a design

Introduction.

Recently, in the B2B team at LIFULL Connect, we improved the validation of the clicks our API receives, using a service that detects whether a click was made by a bot or a human being.

So we used TDD to add this new validation to the previously existing validation that checked if the click contained all mandatory information. This was the resulting code:

and these were its tests:

The problem with these tests is that they know too much: they are coupled to many implementation details. They not only know the concrete validations we apply to a click and the order in which they are applied, but also details about what gets logged when a concrete validation fails. There are multiple axes of change that will make these tests break. The tests are fragile against those axes of change and, as such, they might become a future maintenance burden if changes along those axes are required.

So what might we do about that fragility when any of those changes come?

Improving the design to have less fragile tests.

As we said before, the test fragility was hinting at a design problem in the ClickValidation code. The problem is that it concentrates too much knowledge because it’s written in a procedural style: it queries every concrete validation to know if the click is ok, combines the results of all those validations and knows when to log validation failures. Those are too many responsibilities for ClickValidation, and they are the cause of the fragility in its tests.

We can revert this situation by changing to a more object-oriented implementation in which responsibilities are better distributed. Let’s see how that design might look:

1. Removing knowledge about logging.

After this change, ClickValidation will know nothing about logging. We can use the same technique to avoid knowing about any similar side-effects which concrete validations might produce.

First we create an interface, ClickValidator, that any object that validates clicks should implement:

Next we create a new class, NoBotClickValidator, that wraps the BotClickDetector and adapts[1] it to implement the ClickValidator interface. This wrapper also enriches BotClickDetector’s behavior by taking charge of logging in case the click is not valid.
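Neither the interface nor the wrapper is embedded in this extract. A sketch of what they might look like, using hypothetical stand-ins for Click and BotClickDetector (whose real code cannot be shown) and assuming an SLF4J Logger for the logging side-effect:

import org.slf4j.Logger;

// Hypothetical stand-ins; the real Click and BotClickDetector are not shown in the post.
class Click {}
interface BotClickDetector {
    boolean isBot(Click click);
}

// The role that every concrete click validation implements.
interface ClickValidator {
    boolean isValid(Click click);
}

// Adapts BotClickDetector to the ClickValidator interface and enriches it with logging.
class NoBotClickValidator implements ClickValidator {
    private final BotClickDetector botClickDetector;
    private final Logger logger;

    NoBotClickValidator(BotClickDetector botClickDetector, Logger logger) {
        this.botClickDetector = botClickDetector;
        this.logger = logger;
    }

    @Override
    public boolean isValid(Click click) {
        boolean madeByHuman = !botClickDetector.isBot(click);
        if (!madeByHuman) {
            logger.warn("Click rejected because it was made by a bot: {}", click);
        }
        return madeByHuman;
    }
}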

These are the tests of NoBotClickValidator that take care of the delegation to BotClickDetector and the logging:

If we used NoBotClickValidator in ClickValidation, we’d remove all knowledge about logging from ClickValidation.

Of course, that knowledge would also disappear from its tests. By using the ClickValidator interface for all concrete validations and wrapping validations with side-effects like logging, we’d make the ClickValidation tests robust to changes involving some of the possible axes of change that were making them fragile:

  1. Changing the interface of any of the individual validations.
  2. Adding side-effects to any of the validations.

2. Another improvement: don't use test doubles when it's not worth it[2].

There’s another way to make ClickValidation tests less fragile.

If we have a look at ClickParamsValidator and BotClickDetector (I can’t show their code here for security reasons), they have very different natures. ClickParamsValidator has no collaborators, no state and a very simple logic, whereas BotClickDetector has several collaborators, state and a complicated validation logic.

Stubbing ClickParamsValidator in ClickValidation tests is not giving us any benefit over directly using it, and it’s producing coupling between the tests and the code.

On the contrary, stubbing NoBotClickValidator (which wraps BotClickDetector) is really worth it, because, even though it also produces coupling, it makes ClickValidation tests much simpler.

Using a test double when you’d be better off using the real collaborator is a weakness in the design of the test, rather than in the code to be tested.

These would be the tests for the ClickValidation code with no logging knowledge, after applying this idea of not using test doubles for everything:

Notice how the tests now use the real ClickParamsValidator and how that reduces the coupling with the production code and makes the set up simpler.

3. Removing knowledge about the concrete sequence of validations.

After this change, the new design will compose validations in a way that will result in ClickValidation being only in charge of combining the result of a given sequence of validations.

First we refactor the click validation so that the validation is now done by composing several validations:
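The refactored snippet is not embedded here; reusing the hypothetical ClickValidator and Click types from the previous sketch, the composed validation might look like this:

import java.util.List;

public class ClickValidation {
    private final List<ClickValidator> validators;

    public ClickValidation(List<ClickValidator> validators) {
        this.validators = validators;
    }

    // Applies the validations in sequence and stops at the first one that fails,
    // behaving like an 'and' over the given validators.
    public boolean isValid(Click click) {
        for (ClickValidator validator : validators) {
            if (!validator.isValid(click)) {
                return false;
            }
        }
        return true;
    }
}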

The new validation code has several advantages over the previous one:

  • It does not depend on concrete validations any more
  • It does not depend on the order in which the validations are made.

It has only one responsibility: it applies several validations in sequence. If all of them are valid, it accepts the click; if any given validation fails, it rejects the click and stops applying the rest of the validations. If you think about it, it’s behaving like an and operator.

We may write these tests for this new version of the click validation:

These tests are robust to the changes making the initial version of the tests fragile that we described in the introduction:

  1. Changing the interface of any of the individual validations.
  2. Adding side-effects to any of the validations.
  3. Adding more validations.
  4. Changing the order of the validations.

However, this version of ClickValidationTest is so general and flexible that, using it, our tests would stop knowing which validations, and in which order, are applied to the clicks[3]. That sequence of validations is a business rule and, as such, we should protect it. We might keep this version of ClickValidationTest only if we had some outer test protecting the desired sequence of validations.

This other version of the tests, on the other hand, keeps protecting the business rule:

Notice how this version of the tests keeps in its setup the knowledge of which sequence of validations should be used, and how it only uses test doubles for NoBotClickValidator.

4. Avoid exposing internals.

The fact that we’re injecting into ClickValidation an object, ClickParamsValidator, that we realized we didn’t need to double is a smell which points to the possibility that ClickParamsValidator is an internal detail of ClickValidation instead of its peer. So by injecting it, we’re coupling the users of ClickValidation, or at least the code that creates it, to an internal detail of ClickValidation: ClickParamsValidator.

A better version of this code would hide ClickParamsValidator by instantiating it inside ClickValidation’s constructor:
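A sketch of that better version, assuming ClickParamsValidator implements the ClickValidator interface (its real code is not shown in the post):

import java.util.List;

public class ClickValidation {
    private final List<ClickValidator> validators;

    public ClickValidation(ClickValidator noBotClickValidator) {
        // ClickParamsValidator is created here instead of being injected: it is an internal
        // detail, so the knowledge of the sequence of validations lives inside ClickValidation.
        this.validators = List.of(new ClickParamsValidator(), noBotClickValidator);
    }

    public boolean isValid(Click click) {
        for (ClickValidator validator : validators) {
            if (!validator.isValid(click)) {
                return false;
            }
        }
        return true;
    }
}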

With this change ClickValidation recovers the knowledge of the sequence of validations which in the previous section was located in the code that created ClickValidation.

There are some stereotypes that can help us identify real collaborators (peers)[4]:

  1. Dependencies: services that the object needs from its environment so that it can fulfill its responsibilities.
  2. Notifications: other parts of the system that need to know when the object changes state or performs an action.
  3. Adjustments or Policies: objects that tweak or adapt the object’s behaviour to the needs of the system.

Following these stereotypes, we could argue that NoBotClickValidator is also an internal detail of ClickValidation and shouldn’t be exposed to the tests by injecting it. Hiding it, we’d arrive at this other version of ClickValidation:
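A sketch of this last version, under the same assumptions as the previous ones:

import java.util.List;
import org.slf4j.Logger;

public class ClickValidation {
    private final List<ClickValidator> validators;

    public ClickValidation(BotClickDetector botClickDetector, Logger logger) {
        // Only real dependencies (ports) are injected; both validators are internal details now.
        this.validators = List.of(
            new ClickParamsValidator(),
            new NoBotClickValidator(botClickDetector, logger));
    }

    public boolean isValid(Click click) {
        for (ClickValidator validator : validators) {
            if (!validator.isValid(click)) {
                return false;
            }
        }
        return true;
    }
}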

In this version we have to inject only the real dependencies of the validation, and no internal details are exposed to the client code. It is very similar to the version we’d have got using test doubles only for infrastructure.

The advantage of this version would be that its tests would know the least possible about ClickValidation. They’d only know ClickValidation’s boundaries, marked by the ports injected through its constructor, and ClickValidation’s public API. That reduces the coupling between tests and production code, and facilitates refactorings of the validation logic.

The drawback is that the number of test-case combinations in ClickValidationTest would grow, and many of those test cases would talk about situations happening at validation boundaries that might be far away from ClickValidation’s callers. This might make the tests hard to understand, especially if some of the validations have complex logic. When this problem gets severe, we may reduce it by injecting, and using test doubles for, very complex validators. This is a trade-off in which we accept some coupling with the internals of ClickValidation in order to improve the understandability of its tests. In our case, the bot detection was one of those complex components, so we decided to test it separately and inject it into ClickValidation so we could double it in ClickValidation’s tests. That is why we kept the penultimate version of ClickValidation, in which we were injecting the click-not-made-by-a-bot validation.

Conclusion.

In this post, we used an example to show how, by listening to the tests[5], we can detect possible design problems, and how we can use that feedback to improve both the design of our code and its tests when changes that expose those design problems are required.

In this case, the initial tests were fragile because the production code was procedural and had too many responsibilities. The tests were also fragile because they were using test doubles for some collaborators when it wasn’t worth doing so.

Then we showed how refactoring the original code to be more object-oriented and separating its responsibilities better could remove some of the fragility of the tests. We also showed how reducing the use of test doubles to only those collaborators that really need to be substituted can improve the tests and reduce their fragility. Finally, we showed how we can go too far in trying to make the tests flexible and robust, and accidentally stop protecting a business rule, and how a less flexible version of the tests can fix that.

When faced with fragility due to coupling between tests and the code being tested caused by using test doubles, it’s easy and very usual to “blame the mocks”. We believe it is more productive to listen to the tests and notice which improvements in our design they are suggesting. If we act on the feedback the test doubles give us about our design, we can use test doubles to our advantage, as powerful feedback tools[6] that help us improve our designs, instead of just suffering and blaming them.

Acknowledgements.

Many thanks to my Codesai colleagues Alfredo Casado, Fran Reyes, Antonio de la Torre and Manuel Tordesillas, and to my Aprendices colleagues Paulo Clavijo, Álvaro García and Fermin Saez for their feedback on the post, and to my colleagues at LIFULL Connect for all the mobs we enjoy together.

Footnotes:

[2] See Test Smell: Everything is mocked by Steve Freeman, where he talks about things you shouldn't be substituting with test doubles.
[3] Thanks Alfredo Casado for detecting that problem in the first version of the post.
[4] From Growing Object-Oriented Software, Guided by Tests > Chapter 6, Object-Oriented Style > Object Peer Stereotypes, page 52. You can also read about these stereotypes in a post by Steve Freeman: Object Collaboration Stereotypes.
[5] Difficulties in testing might be a hint of design problems. Have a look at this interesting series of posts about listening to the tests by Steve Freeman.
[6] According to Nat Pryce mocks were designed as a feedback tool for designing OO code following the 'Tell, Don't Ask' principle: "In my opinion it's better to focus on the benefits of different design styles in different contexts (there are usually many in the same system) and what that implies for modularisation and inter-module interfaces. Different design styles have different techniques that are most applicable for test-driving code written in those styles, and there are different tools that help you with those techniques. Those tools should give useful feedback about the external and *internal* quality of the system so that programmers can 'listen to the tests'. That's what we -- with the help of many vocal users over many years -- designed jMock to do for 'Tell, Don't Ask' object-oriented design." (from a conversation in Growing Object-Oriented Software Google Group).

I think that if your design follows a different OO style, it might be preferable to stick to a classical TDD style which nearly limits the use of test doubles only to infrastructure and undesirable side-effects.

Saturday, May 19, 2018

Improving legacy Om code (II): Using effects and coeffects to isolate effectful code from pure code

Introduction.

In the previous post, we applied the humble object pattern idea to avoid having to write end-to-end tests for the interesting logic of a hard to test legacy Om view, and managed to write cheaper unit tests instead. Then, we saw how those unit tests were far from ideal because they were highly coupled to implementation details, and how these problems were caused by a lack of separation of concerns in the code design.

In this post we’ll show a solution to those design problems using effects and coeffects that will make the interesting logic pure and, as such, really easy to test and reason about.

Refactoring to isolate side-effects and side-causes using effects and coeffects.

We refactored the code to isolate side-effects and side-causes from pure logic. This way, not only did testing the logic become much easier (the logic now lives in pure functions), but the tests also became less coupled to implementation details. To achieve this we introduced the concepts of coeffects and effects.

The basic idea of the new design was:

  1. Extracting all the needed data from globals (using coeffects for getting application state, getting component state, getting DOM state, etc).
  2. Using pure functions to compute the description of the side effects to be performed (returning effects for updating application state, sending messages, etc) given what was extracted in the previous step (the coeffects).
  3. Performing the side effects described by the effects returned by the called pure functions.

The main difference in the code of horizon.controls.widgets.tree.hierarchy after this refactoring was that the event handler functions were moved back into it again, and that they now used the process-all! and extract-all! functions to perform the side-effects described by effects and to extract the values of the side-causes tracked by coeffects, respectively. The event handler functions are shown in the next snippet (to see the whole code click here):

Now all the logic in the companion namespace was comprised of pure functions, with neither asynchronous nor mutating code:

Thus, its tests became much simpler:

Notice how the pure functions receive a map of coeffects that already contains all the values they need from the “world”, and they return a map with descriptions of the effects. This makes testing much easier than before and removes the need to use test doubles.

Notice also how the test code is now around 100 lines shorter. The main reason for this is that the new tests know much less about how the production code is implemented than the previous ones did. This made it possible to remove some tests that, in the previous version of the code, were testing branches we considered reachable when testing implementation details, but which are actually unreachable when considering the whole behaviour.

Now let’s see the code that is extracting the values tracked by the coeffects:

which is using several implementations of the Coeffect protocol:

All the coeffects were created using factories to localize in only one place the “shape” of each type of coeffect. This indirection proved very useful when we decided to refactor the code that extracts the value of each coeffect, replacing its initial implementation as a conditional with its current implementation using polymorphism with a protocol.

These are the coeffects factories:

Now there was only one place where we needed to test side causes (using test doubles for some of them). These are the tests for extracting the coeffects values:

A very similar code is processing the side-effects described by effects:

which uses different effects implementing the Effect protocol:

that are created with the following factories:

Finally, these are the tests for processing the effects:

Summary.

We have seen how by using the concept of effects and coeffects, we were able to refactor our code to get a new design that isolates the effectful code from the pure code. This made testing our most interesting logic really easy because it became comprised of only pure functions.

The basic idea of the new design was:

  1. Extracting all the needed data from globals (using coeffects for getting application state, getting component state, getting DOM state, etc).
  2. Computing in pure functions the description of the side effects to be performed (returning effects for updating application state, sending messages, etc) given what was extracted in the previous step (the coeffects).
  3. Performing the side effects described by the effects returned by the called pure functions.

Since the time we did this refactoring, we have decided to go deeper in this way of designing code and we’re implementing a full effects & coeffects system inspired by re-frame.

Acknowledgements.

Many thanks to Francesc Guillén, Daniel Ojeda, André Stylianos Ramos, Ricard Osorio, Ángel Rojo, Antonio de la Torre, Fran Reyes, Miguel Ángel Viera and Manuel Tordesillas for giving me great feedback to improve this post and for all the interesting conversations.

Improving legacy Om code (I): Adding a test harness

Introduction.

I’m working at GreenPowerMonitor as part of a team developing a challenging SPA to monitor and manage renewable energy portfolios using ClojureScript. It’s a two-year-old Om application which contains a lot of legacy code. When I say legacy, I’m using Michael Feathers’ definition of legacy code as code without tests. This definition views legacy code from the perspective of code being difficult to evolve because of a lack of automated regression tests.

The legacy (untested) Om code.

Recently I had to face one of these legacy parts when I had to fix some bugs in the user interface that was presenting all the devices of a given energy facility in a hierarchy tree (devices might be comprised of other devices). This is the original legacy view code:

This code contains not only the layout of several components but also the logic to both conditionally render some parts of them and to respond to user interactions. This interesting logic is full of asynchronous and effectful code that is reading and updating the state of the components, extracting information from the DOM itself and reading and updating the global application state. All this makes this code very hard to test.

Humble Object pattern.

It’s very difficult to make component tests for non-component code like the one in this namespace, which makes writing end-to-end tests look like the only option.

However, following the idea of the humble object pattern, we might reduce the untested code to just the layout of the view. The humble object pattern can be applied when code is too closely coupled to its environment to be testable. To apply it, the interesting logic is extracted into a separate, easy-to-test component that is decoupled from its environment.

In this case we extracted the interesting logic to a separate namespace, where we thoroughly tested it. With this we avoided writing the slower and more fragile end-to-end tests.

We wrote the tests using the test-doubles library (I’ve talked about it in a recent post) and some home-made tools that help testing asynchronous code based on core.async.

This is the logic we extracted:

and these are the tests we wrote for it:

See here how the view looks after this extraction. Using the humble object pattern, we managed to test the most important bits of logic with fast unit tests instead of end-to-end tests.

The real problem was the design.

We could have left the code as it was (in fact we did for a while) but its tests were highly coupled to implementation details and hard to write because its design was far from ideal.

Even though, by applying the humble object pattern idea, we had separated the important logic from the view, which allowed us to focus on writing tests with more ROI and to avoid end-to-end tests, the extracted logic still contained many concerns. It was not only deciding how to interact with the user and what to render, but also mutating and reading state, getting data from global variables and from the DOM, and making asynchronous calls. Its effectful parts were not isolated from its pure parts.

This lack of separation of concerns made the code hard to test and hard to reason about, forcing us to use heavy tools (the test-doubles library and our async-test-tools assertion functions) to be able to test it.

Summary.

First, we applied the humble object pattern idea so that we could write unit tests for the interesting logic of a hard-to-test legacy Om view, instead of having to write more expensive end-to-end tests.

Then, we saw how those unit tests were far from ideal because they were highly coupled to implementation details, and how these problems were caused by a lack of separation of concerns in the code design.

Next.

In the next post we’ll solve the lack of separation of concerns by using effects and coeffects to isolate the logic that decides how to interact with the user from all the effectful code. This new design will make the interesting logic pure and, as such, really easy to test and reason about.

Monday, April 9, 2018

test-doubles: A small spying and stubbing library for Clojure and ClojureScript

As you may know from a previous post I’m working for GreenPowerMonitor as part of a team that is developing a challenging SPA to monitor and manage renewable energy portfolios using ClojureScript.

We were dealing with some legacy code that was effectful and needed to be tested using test doubles, so we explored some existing ClojureScript libraries, but we didn't feel comfortable with them. On one hand, we found that some of them had different macros for different types of test doubles, which made tests that needed both spies and stubs become very nested; we wanted to produce tests with as little nesting as possible. On the other hand, being used to Gerard Meszaros’ vocabulary for test doubles, we found the naming used for the different types of test doubles in some of the existing libraries a bit confusing; we wanted to stick to Gerard Meszaros’ vocabulary.

So we decided we'd write our own stubs and spies library.

We started by manually creating our own spies and stubs for some time so that we could identify the different ways in which we were going to use them. After a while, my colleague André Stylianos Ramos and I wrote our own small DSL to create stubs and spies, using macros to remove all that duplication and boilerplate. The result was a small library that we've been using in our ClojureScript project for nearly a year and that we've recently adapted to make it work in Clojure as well.

I’m really glad to announce that GreenPowerMonitor has open-sourced our small spying and stubbing library for Clojure and ClojureScript: test-doubles.

In the following example written in ClojureScript, we show how we are using test-doubles to create two stubs (one with the :maps option and another with the :returns option) and a spy:

We could show you more examples here of how test-doubles can be used and the different options it provides, but we’ve already included a lot of explained examples in its documentation.

Please do have a look and try our library. You can get its latest version from Clojars. We hope it might be as useful to you as it has been for us.

Tuesday, August 22, 2017

In a small piece of code

This post appeared originally on Codesai’s Blog.

In a previous post we talked about positional parameters and how they can suffer from Connascence of Position (CoP). Then we saw how, in some cases, we might introduce named parameters to remove the CoP and transform it into Connascence of Name (CoN), while always being careful not to hide cases of Connascence of Meaning (CoM). In this post we’ll focus on languages that don’t provide named parameters and see different techniques to remove the CoP.

Let’s see an example of a method suffering from CoP:

In languages without named parameters (the example is written in Java), we can apply a classic[1] refactoring technique, Introduce Parameter Object, that can transform CoP into CoN. In this example, we introduced the ClusteringParameters object:

which eliminates the CoP transforming it into CoN:
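Neither the original method nor ClusteringParameters is embedded in this extract, so here is a hypothetical sketch of the refactoring; the concrete parameters of the clustering algorithm are made up for the example:

// Before: a signature suffering from CoP (hypothetical parameters).
// public Clusters cluster(double epsilon, int minPoints, int maxIterations, double convergenceThreshold)

// After: the semantically related parameters are grouped in a parameter object.
public class ClusteringParameters {
    private final double epsilon;
    private final int minPoints;
    private final int maxIterations;
    private final double convergenceThreshold;

    public ClusteringParameters(double epsilon, int minPoints,
                                int maxIterations, double convergenceThreshold) {
        this.epsilon = epsilon;
        this.minPoints = minPoints;
        this.maxIterations = maxIterations;
        this.convergenceThreshold = convergenceThreshold;
    }

    public double epsilon() { return epsilon; }
    public int minPoints() { return minPoints; }
    public int maxIterations() { return maxIterations; }
    public double convergenceThreshold() { return convergenceThreshold; }
}

// The method body now accesses each value by name (CoN) instead of by position:
// public Clusters cluster(ClusteringParameters parameters)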

In this particular case, all the parameters passed to the function were semantically related, since they all were parameters of the clustering algorithm, but in many other cases all the parameters aren’t related. So, as we saw in our previous post for named parameters, we have to be careful of not accidentally sweeping hidden CoM in the form of data clumps under the rug when we use the Introduce Parameter Object refactoring.

In any case, what is clear is that introducing a parameter object produces much less expressive code than introducing named parameters. So how can we gain semantics while removing CoP in languages without named parameters?

One answer is using fluent interfaces[2] which is a technique that is much more common than you think. Let’s have a look at the following small piece of code:

This is just a simple test. However, even in this small piece of code, we can find two examples of removing CoP using fluent interfaces, and another example that, while not removing CoP, completely removes its impact on expressiveness. Let’s look at them in more detail.

The first example is an application of the builder pattern using a fluent interface[3].

Applying the builder pattern provides a very specific[4] internal DSL that we can use to create a complex object avoiding CoP and also getting an expressiveness comparable or even superior to the one we’d get using named parameters.

In this case we composed two builders, one for the SafetyRange class:

and another for the Alarm class:

Composing builders, you can manage to create very complex objects in a maintainable and very expressive way.
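The builders themselves are not embedded in this extract. A sketch of what one of them might look like, assuming a SafetyRange value class with a two-argument constructor and GOOS-style test data builders:

public class SafetyRangeBuilder {
    // Sensible defaults that each test can override through the fluent interface.
    private double lowerThreshold = 0.0;
    private double upperThreshold = 100.0;

    public static SafetyRangeBuilder aSafetyRange() {
        return new SafetyRangeBuilder();
    }

    public SafetyRangeBuilder withLowerThreshold(double lowerThreshold) {
        this.lowerThreshold = lowerThreshold;
        return this;
    }

    public SafetyRangeBuilder withUpperThreshold(double upperThreshold) {
        this.upperThreshold = upperThreshold;
        return this;
    }

    public SafetyRange build() {
        return new SafetyRange(lowerThreshold, upperThreshold);
    }
}

// Composed with an analogous AlarmBuilder, the construction reads almost like prose
// and the connascence is of name, not of position:
// Alarm alarm = anAlarm()
//         .withSafetyRange(aSafetyRange().withLowerThreshold(10.0).withUpperThreshold(35.0).build())
//         .build();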

Let’s see now the second interesting example in our small piece of code:

This assertion using hamcrest is so simple that the JUnit alternative is much clearer:

but for more than one parameter the JUnit interface starts having problems:

Which one is the expected value and which one is the actual one? We never manage to remember…

Using hamcrest removes that expressiveness problem:

Thanks to the semantics introduced by hamcrest[6], it’s very clear that the first parameter is the actual value and the second parameter is the expected one. The internal DSL defined by hamcrest produces declarative code with high expressiveness. To be clear hamcrest is not removing the CoP, but since there are only two parameters, the degree of CoP is very low[7]. The real problem of the code using the JUnit assertion was its low expressiveness and using hamcrest fixes that.
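A small sketch of the difference, using a hypothetical Account class (the static imports are JUnit 4’s and hamcrest’s real ones):

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.is;
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class AccountTest {

    @Test
    public void deposits_increase_the_balance() {
        Account account = new Account(); // hypothetical class used only for the example
        account.deposit(100);

        // JUnit's two-parameter assertion: it is easy to swap expected and actual by mistake.
        assertEquals(100, account.balance());

        // hamcrest makes the roles explicit: the actual value first, then a matcher on the expected one.
        assertThat(account.balance(), is(equalTo(100)));
    }
}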

For us it’s curious to see how, in trying to achieve expressiveness, some assertion libraries that use fluent interfaces have (probably without being aware of it) eliminated CoP as well. See this other example using Jasmine:

Finally, let’s have a look at the last example in our initial small piece of code which is also using a fluent interface:

This is Mockito’s way of defining a stub for a method call. It’s another example of fluent interface which produces highly expressive code and avoids CoP.
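A minimal sketch of that stubbing style, with a hypothetical ExchangeRateService collaborator (when and thenReturn are Mockito’s real API):

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class ExchangeRateStubbingTest {

    // Hypothetical collaborator; only the fluent stubbing style is the point here.
    interface ExchangeRateService {
        double rateFor(String from, String to);
    }

    @Test
    public void stubs_a_method_call_with_a_fluent_interface() {
        ExchangeRateService exchangeRates = mock(ExchangeRateService.class);

        // Reads almost like a sentence: "when rateFor EUR->USD is called, then return 1.08".
        when(exchangeRates.rateFor("EUR", "USD")).thenReturn(1.08);

        assertEquals(1.08, exchangeRates.rateFor("EUR", "USD"), 0.001);
    }
}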

Summary.

We started by seeing how, in languages that don’t allow named parameters, we can remove CoP by applying the Introduce Parameter Object refactoring, and how the resulting code was much less expressive than the one using the Introduce Named Parameters refactoring. Then we saw how we can leverage fluent interfaces to remove CoP while writing highly expressive code, mentioned internal DSLs, and showed how this technique is more common than one might think at first by examining a small piece of code.


Footnotes:

[2] Of course, fluent interfaces are also great in languages that provide named parameters.
[3] Curiously, there are alternative ways to implement the builder pattern that use option maps or named parameters. Some time ago we wrote about an example of the second way: Refactoring tests using builder functions in Clojure/ClojureScript.
[4] The only purpose of that DSL is creating one specific type of object.
[5] For us the best explanation of the builder pattern and how to use it to create maintainable tests is in chapter 22, Constructing Complex Test Data, of the wonderful Growing Object-Oriented Software, Guided by Tests book.
[6] hamcrest is a framework for writing matcher objects allowing 'match' rules to be defined declaratively. We love it!