Showing posts with label Object-Oriented Design. Show all posts

Monday, February 14, 2022

Notes on LSP from Agile Principles, Practices and Patterns book

I continue sharing my notes on SOLID to prepare the ground for the upcoming The Big Branch Theory Podcast episode about the Liskov Substitution Principle.

Ok, so these are the raw notes I took while reading the chapter devoted to the Liskov Substitution Principle (LSP) in Robert C. Martin’s Agile Principles, Practices and Patterns in C# book (I added some personal annotations between brackets):

  • “The primary mechanisms behind the OCP are abstraction and polymorphism” <- [but in some languages inheritance is needed to have polymorphism]

  • ”.. questions addressed by the LSP”

    • “What are the design rules that govern this particular use of inheritance hierarchies?”

    • “What are the characteristics of the best inheritance hierarchies?”

    • “What are the traps that will cause us to create hierarchies that do not conform to OCP?”

  • LSP -> “Subtypes must be substitutable for their base types”

  • “Violating LSP often results in the use of runtime type checking in a manner that grossly violates OCP”

  • “a violation of LSP is a latent violation of OCP”

  • “… more subtle way of violating LSP” -> “.. use of IS-A relationship is sometimes thought to be one of the fundamental techniques of OOA, a term frequently used but seldom defined […]. However this kind of thinking can lead to some subtle yet significant problems. Generally, these problems are not foreseen until we see them in code”

  • Invariants -> “those properties that must always be true regardless of state”

  • “… when the creation of a derived class causes us to make changes to the base class, it often implies that the design is faulty”

  • “Validity is not intrinsic”

    • “LSP leads us to a very important conclusion: A model, viewed in isolation, cannot be meaningfully validated. The validity of a model can be expressed only in terms of its clients.”

    • “When considering whether a particular design is appropriate, one cannot simply view the solution in isolation. One must view it in terms of the reasonable assumptions made by the users of that design”

    • “Therefore, as with all other principles, it is often best to defer all but the most obvious LSP violations until the related fragility has been smelled”

  • “IS-A is about behavior” -> “… it is behavior that software is really all about. LSP makes it clear that in OOD, the IS-A relationship pertains to behavior that can be reasonably assumed and that clients depend on” <- [related to behavioural approach to modelling shown in David West’s Object Thinking, or Rebecca Wirfs-Brock’s Designing Object-Oriented Software]

  • “How do you know what your clients will really expect? There is a technique for making those reasonable assumptions explicit and thereby enforcing LSP…” -> Design By Contract (DBC)

  • “Using DBC, the author of a class explicitly states the contract of that class. The contract informs the author of any client code of the behaviors that can be relied on. The contract is specified by declaring preconditions and postconditions for each method. The preconditions must be true for the method to execute. On completion, the method guarantees that the postconditions are true.”

  • “…the rule for preconditions and postconditions of derivatives, as stated by Meyer, is: ‘A routine redeclaration [in a derivative] may only replace the original precondition by one equal or weaker, and the original postcondition by one equal or stronger’” <- “X is weaker than Y if X does not enforce all the constraints of Y. It does not matter how many new constraints X enforces”

  • “In other words, when using an object through its base class interface, the user knows only the preconditions and postconditions of the base class. Thus, derived objects must not expect such users to obey preconditions that are stronger than those required by the base class. Also, derived classes must conform to all the postconditions of the base. That is, their behaviors and outputs must not violate any of the constraints established for the base class.” “Users of the base class must not be confused by the output of the derived class.” <- [a form of the Least Astonishment Principle]

  • “Contracts can […] be specified by writing unit tests. By thoroughly testing the behavior of a class, the unit tests make the behavior of the class clear. Authors of the client code will want to review the unit tests in order to know what to reasonably assume about the classes they are using”

  • “It’s a big advantage not to have to know or care what kind of [sth] you are using. It means that the programmer can decide which kind of [sth] is needed in each particular instance, and none of the client functions will be affected by that decision”

  • “…the problem with conventions: they have to be continually resold to each developer”

  • “There are occasions when it is more expedient to accept a subtle flaw in polymorphic behavior than to attempt to manipulate the design into complete LSP compliance. Accepting compromise instead of pursuing perfection is an engineering trade-off. A good engineer learns when compromise is more profitable than perfection. However, conformance to LSP should not be surrendered lightly. The guarantee that a subclass will always work where its base classes are used is a powerful way to manage complexity. Once it is forsaken we must consider each subclass individually.”

  • “Factoring is a powerful tool. If qualities can be factored out of two subclasses, there is the distinct possibility that other classes will show up later that need those qualities too”

  • Rebecca Wirfs-Brock, on factoring:

    • “We can state that if a set of classes all support a common responsibility, they should inherit that responsibility from a common superclass”

    • “If a common superclass does not already exist, create one, and move the common responsibility to it. After all such a class is demonstrably useful […]. Isn’t it conceivable that a later extension of your system might add a new subclass that will support those same responsibilities in a new way? This new superclass will probably be an abstract class”

  • “Some simple heuristics can give you some clues about LSP violations. These heuristics all have to do with derivative classes that somehow remove functionality from their base class. A derivative that does less than its base is usually not substitutable for that base and therefore violates LSP” <- “The presence of degenerate functions in derivatives is not always indicative of an LSP violation, but it’s worth looking at them when they occur” [see Refused Bequest code smell]

  • “The OCP is at the heart of many of the claims made for OOD. […] The LSP is one of the prime enablers of OCP”

  • “The substitutability of subtypes allows a module, expressed in terms of a base type, to be extensible without modification. That substitutability must be something that developers can depend on implicitly. Thus, the contract of the base type has to be well and prominently understood, if not explicitly enforced, by the code”

  • “The […] IS-A is too broad to act as a definition of a subtype. The true definition of a subtype is substitutable, where substitutability is defined by either an explicit or implicit contract”
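To make Meyer's rule about preconditions and postconditions a bit more tangible, here is a minimal sketch in Python. It is not taken from the book: the `Account`, `OverdraftAccount` and `CappedAccount` classes and the `pay` client are hypothetical names I made up for illustration. One derivative weakens the base precondition and stays substitutable; the other strengthens it and surprises a client that only knows the base contract.

```python
class Account:
    """Base contract for withdraw() (hypothetical example).

    Precondition:  0 < amount <= balance
    Postcondition: balance is reduced by exactly `amount`
    """

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        if not 0 < amount <= self.balance:
            raise ValueError("precondition violated")
        self.balance -= amount


class OverdraftAccount(Account):
    """Weaker precondition (an overdraft is allowed), same postcondition."""

    def __init__(self, balance: int, overdraft: int) -> None:
        super().__init__(balance)
        self.overdraft = overdraft

    def withdraw(self, amount: int) -> None:
        if not 0 < amount <= self.balance + self.overdraft:
            raise ValueError("precondition violated")
        self.balance -= amount


class CappedAccount(Account):
    """Stronger precondition (amount <= 100): base-class clients cannot know it."""

    def withdraw(self, amount: int) -> None:
        if amount > 100:
            raise ValueError("cap exceeded")
        super().withdraw(amount)


def pay(account: Account, amount: int) -> int:
    """Client written against the base contract only."""
    if 0 < amount <= account.balance:  # honours the base precondition
        account.withdraw(amount)
    return account.balance
```

With this sketch, `pay` works unchanged with `Account` and `OverdraftAccount`, but raises a `ValueError` with `CappedAccount(500)` and an amount of 200, even though the client respected everything the base class asked for.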

This post was also published in Codesai's blog.

Saturday, January 30, 2021

Notes on OCP from Agile Principles, Practices and Patterns book

Some time ago I wrote a post sharing my notes on SRP from Agile Principles, Practices and Patterns book because I was making an effort to get closer to the sources of some object-oriented concepts. I didn’t continue sharing my notes on SOLID because I thought they might not be interesting for our readers. However, seeing the success of the Single responsibility ¿Principle? episode of The Big Branch Theory Podcast for which I used my notes on SRP, I’ve decided to share the rest of my notes on SOLID on Codesai’s blog.

Ok, so these are the raw notes I took while reading the chapter devoted to Open-closed Principle in Robert C. Martin’s Agile Principles, Practices and Patterns in C# book (I added some personal annotations between brackets):

  • OCP -> “Software entities (classes, modules, functions, etc) should be open for extension but closed for modification”

  • “When a single change to a program results in a cascade of changes to dependent modules, the design smells of fragility” <- [No local consequences. See Beck’s Local Consequences principle from Implementation Patterns] “OCP advises us to refactor the system so that further changes of that kind will not cause more modifications. If OCP is applied well, further changes of that kind are achieved by adding new code, not by changing old code that already works”

  • “It’s possible to create abstractions that are fixed and yet represent an unbounded group of possible behaviors”

  • “[A module that uses such abstractions] can be closed for modification, since it depends on an abstraction that is fixed. Yet the behavior of the module can be extended by creating new derivatives of the abstraction”

  • “Abstract classes are more closely associated to their clients than to the classes that implement them” <- [related with Separated Interface from Fowler’s P of EAA]

  • ”[Strategy and Template Method patterns] are the most common ways to satisfy OCP. They represent a clear separation of generic functionality from the detailed implementation of that functionality”

  • Anticipation

    • “[When a] program conforms to OCP. It is changed by adding code rather than by changing existing code”

    • “In general no matter how “closed” a module is, there will always be some kind of change against which it is not closed”

    • “Since closure <- [“closure” here means protection against a given axis of variation or change, see Craig Larman’s Protected Variation: The Importance of Being Closed] can’t be complete, it must be strategic. That is the designer must choose the kinds of changes against which to close the design, must guess at the kinds of changes that are most likely, and then construct abstractions to protect against those changes.”

    • “This is not easy. It amounts to making educated guesses about the likely kinds of changes that the application will suffer over time.” “Also conforming to OCP is expensive. It takes development time and money to create the appropriate abstractions. These abstractions also increase the complexity of the software design”

    • “We want to limit the application of OCP to changes that are likely”

    • “How do we know which changes are likely? We do the appropriate research, we ask the appropriate questions, and we use our experience and common sense.” <- [also requires knowing about the domain. A bit easier to predict in technological boundaries. Listen to the conversation in the Single Responsibility ¿Principle? podcast] “And after all that, we wait until the changes happen!” <- [see Yagni] “We don’t want to load the design with lots of unnecessary abstractions. Rather we want to wait until we need the abstraction and then put them in”

  • “Fool me once”

    • “… we initially write our code expecting it not to change. When a change occurs, we implement the abstractions that protect us from future changes of that kind.” <- [One heuristic: we get to OCP through refactoring to avoid Speculative Generality. Most useful heuristic in unknown territory.]

    • “If we decide to take the first bullet, it is to our advantage to get the bullets flying early and frequently. We want to know what changes are likely before we are very far down the development path. The longer we wait to find out what kind of changes are likely, the more difficult it will be to create the appropriate abstractions.”

    • “Therefore, we need to stimulate changes”

      • “We write tests first” -> “testing is one kind of usage of the system. By writing tests first, we force the system to be testable. Therefore, changes in testability will not surprise us later. We will have built the abstractions that make the system testable. We are likely to find that many of these abstractions will protect us from other kinds of changes later.” <- [incrementally (tests “right after”) might also work]

      • “We use short development cycles”

      • “We develop features before infrastructure and frequently show those features to stake-holders”

      • “We develop the most important features first”

      • “We release the software early and often”

  • “Closure is based on abstraction”

  • “Using a data-driven approach to achieve closure” <- [OCP is not only an OO principle, see Craig Larman’s Protected Variation: The Importance of Being Closed for more]

    • “If we must close the derivatives […] from knowledge of one another, we can use a table-driven approach”

    • “The only item that is not closed against [the rule that involves] the various derivatives is the table itself. And that table can be placed in its own module, separated from all the other modules, so that changes to it do not affect any of the other modules”

  • “In many ways the OCP is at the heart of OOD.”

  • “Yet conformance to [OCP] is not achieved by using an OOP language. Nor is it a good idea to apply rampant abstraction to every part of the application. Rather it requires a dedication on the part of the developers to apply abstraction only to those parts of the program that exhibit frequent change. <- [applying Beck’s Rate of Change principle from Implementation Patterns]”

  • “Resisting premature abstraction is as important as abstraction itself <- [related to Sandi Metz’s “duplication is far cheaper than the wrong abstraction”]”
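As an illustration of two ideas from these notes (a fixed abstraction that admits new derivatives, and a table-driven approach to close the derivatives from knowledge of one another), here is a minimal Python sketch. It is not the book's code: `Shape`, `Circle`, `Square`, `DRAW_ORDER` and `draw_all` are names I assumed for the example.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """The fixed abstraction that client code depends on."""

    @abstractmethod
    def draw(self) -> str:
        ...


class Circle(Shape):
    def draw(self) -> str:
        return "circle"


class Square(Shape):
    def draw(self) -> str:
        return "square"


# The table is the only place that knows the relative drawing order
# of the derivatives. It could live in its own module.
DRAW_ORDER = [Circle, Square]


def draw_all(shapes):
    """Closed against new kinds of shapes and against changes in their order."""
    ordered = sorted(shapes, key=lambda shape: DRAW_ORDER.index(type(shape)))
    return [shape.draw() for shape in ordered]
```

Adding a new kind of shape means writing a new derivative and adding one entry to the table; `draw_all` and the existing shapes stay untouched. The table itself is, as the notes say, the one item that is not closed against that kind of change.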

For me getting closer to the sources of SOLID principles was a great experience that helped me to remove illusions of knowledge I had developed due to the telephone game effect caused by initially learning about SOLID through blog posts and talks. I hope these notes on OCP might be useful to you as well, and motivate you to read a bit closer to some of the sources.

This post was also published in Codesai's blog. There's also a previous version of this post in this blog.

Thursday, June 27, 2019

An example of listening to the tests to improve a design

Introduction.

Recently in the B2B team at LIFULL Connect, we improved the validation of the clicks our API receives using a service that detects whether the clicks were made by a bot or a human being.

So we used TDD to add this new validation to the previously existing validation that checked if the click contained all mandatory information. This was the resulting code:

and these were its tests:

The problem with these tests is that they know too much. They are coupled to many implementation details. They not only know the concrete validations we apply to a click and the order in which they are applied, but also details about what gets logged when a concrete validation fails. There are multiple axes of change that will make these tests break. The tests are fragile against those axes of change and, as such, they might become a future maintenance burden if changes along those axes are required.

So what might we do about that fragility when any of those changes come?

Improving the design to have less fragile tests.

As we said before, the test fragility was hinting at a design problem in the ClickValidation code. The problem is that it’s concentrating too much knowledge because it’s written in a procedural style in which it is querying every concrete validation to know if the click is ok, combining the result of all those validations and knowing when to log validation failures. Those are too many responsibilities for ClickValidation, and that is the cause of the fragility in the tests.
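The original listing is not reproduced in this text, so the following is only a rough Python sketch of what such a procedural style might look like. Only the class names ClickValidation, ClickParamsValidator and BotClickDetector come from the post; the method names (`is_valid`, `is_bot`, `log`), the click represented as a dictionary, the stand-in implementations and the log message are all assumptions made for illustration.

```python
class ClickParamsValidator:
    """Hypothetical stand-in: checks that mandatory fields are present."""

    MANDATORY_FIELDS = ("id", "url")

    def is_valid(self, click: dict) -> bool:
        return all(click.get(field) for field in self.MANDATORY_FIELDS)


class BotClickDetector:
    """Hypothetical stand-in for the real (much more complex) bot detection."""

    def is_bot(self, click: dict) -> bool:
        return click.get("user_agent", "").lower().startswith("bot")


class ListLogger:
    """Hypothetical logger that keeps messages in memory."""

    def __init__(self) -> None:
        self.messages = []

    def log(self, message: str) -> None:
        self.messages.append(message)


class ClickValidation:
    def __init__(self, params_validator, bot_detector, logger) -> None:
        self._params_validator = params_validator
        self._bot_detector = bot_detector
        self._logger = logger

    def is_valid(self, click: dict) -> bool:
        # Knows each concrete validation and its particular interface...
        if not self._params_validator.is_valid(click):
            return False
        # ...the order in which the validations run...
        if self._bot_detector.is_bot(click):
            # ...and what gets logged when one of them fails.
            self._logger.log(f"click {click.get('id')} rejected: made by a bot")
            return False
        return True
```

Any test of this class has to set up both collaborators and the logger, and assert on the logged message, which is the coupling described above.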

We can revert this situation by changing to a more object-oriented implementation in which responsibilities are better distributed. Let’s see how that design might look:

1. Removing knowledge about logging.

After this change, ClickValidation will know nothing about logging. We can use the same technique to avoid knowing about any similar side-effects which concrete validations might produce.

First we create an interface, ClickValidator, that any object that validates clicks should implement:

Next we create a new class NoBotClickValidator that wraps the BotClickDetector and adapts[1] it to implement the ClickValidator interface. This wrapper also enriches BotClickDetector’s behavior by taking charge of logging in case the click is not valid.
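A minimal Python sketch of the interface and the wrapper described above might look like this. The names ClickValidator and NoBotClickValidator come from the post; the method names (`is_valid`, `is_bot`, `log`), the log message, and the `FakeBotDetector` and `ListLogger` helpers are assumptions for illustration, not the original code.

```python
from abc import ABC, abstractmethod


class ClickValidator(ABC):
    """Anything that validates clicks implements this interface."""

    @abstractmethod
    def is_valid(self, click: dict) -> bool:
        ...


class NoBotClickValidator(ClickValidator):
    """Adapts a bot detector to ClickValidator and logs rejected clicks."""

    def __init__(self, bot_detector, logger) -> None:
        self._bot_detector = bot_detector
        self._logger = logger

    def is_valid(self, click: dict) -> bool:
        if self._bot_detector.is_bot(click):  # is_bot() is an assumed method name
            self._logger.log(f"click {click.get('id')} rejected: made by a bot")
            return False
        return True


class FakeBotDetector:
    """Hypothetical stand-in: flags the click ids it was told are bots."""

    def __init__(self, bot_ids) -> None:
        self._bot_ids = set(bot_ids)

    def is_bot(self, click: dict) -> bool:
        return click.get("id") in self._bot_ids


class ListLogger:
    """Hypothetical logger that keeps messages in memory."""

    def __init__(self) -> None:
        self.messages = []

    def log(self, message: str) -> None:
        self.messages.append(message)
```

The logging decision now lives next to the validation that causes it, so code that only sees a ClickValidator never has to know that some validations log.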

These are the tests of NoBotClickValidator that takes care of the delegation to BotClickDetector and the logging:

If we used NoBotClickValidator in ClickValidation, we’d remove all knowledge about logging from ClickValidation.

Of course, that knowledge would also disappear from its tests. By using the ClickValidator interface for all concrete validations and wrapping validations with side-effects like logging, we’d make ClickValidation tests robust to changes involving some of the possible axes of change that were making them fragile:

  1. Changing the interface of any of the individual validations.
  2. Adding side-effects to any of the validations.

2. Another improvement: don't use test doubles when it's not worth it[2].

There’s another way to make ClickValidation tests less fragile.

If we have a look at ClickParamsValidator and BotClickDetector (I can’t show their code here for security reasons), they have very different natures. ClickParamsValidator has no collaborators, no state and a very simple logic, whereas BotClickDetector has several collaborators, state and a complicated validation logic.

Stubbing ClickParamsValidator in ClickValidation tests is not giving us any benefit over directly using it, and it’s producing coupling between the tests and the code.

On the contrary, stubbing NoBotClickValidator (which wraps BotClickDetector) is really worth it, because, even though it also produces coupling, it makes ClickValidation tests much simpler.

Using a test double when you’d be better off using the real collaborator is a weakness in the design of the test, rather than in the code to be tested.

These would be the tests for the ClickValidation code with no logging knowledge, after applying this idea of not using test doubles for everything:

Notice how the tests now use the real ClickParamsValidator and how that reduces the coupling with the production code and makes the set up simpler.

3. Removing knowledge about the concrete sequence of validations.

After this change, the new design will compose validations in a way that will result in ClickValidation being only in charge of combining the result of a given sequence of validations.

First we refactor the click validation so that the validation is now done by composing several validations:

The new validation code has several advantages over the previous one:

  • It does not depend on concrete validations any more
  • It does not depend on the order in which the validations are made.

It has only one responsibility: it applies several validations in sequence, if all of them are valid, it will accept the click, but if any given validation fails, it will reject the click and stop applying the rest of the validations. If you think about it, it’s behaving like an and operator.
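The "and operator" behaviour can be sketched in a few lines of Python. This is only an illustration under assumptions: the `is_valid` method name, the constructor taking a sequence of validators, and the `RecordingValidator` helper are not from the original code.

```python
from abc import ABC, abstractmethod


class ClickValidator(ABC):
    @abstractmethod
    def is_valid(self, click: dict) -> bool:
        ...


class ClickValidation:
    """Combines the results of whatever sequence of validators it is given."""

    def __init__(self, validators) -> None:
        self._validators = list(validators)

    def is_valid(self, click: dict) -> bool:
        # all() stops at the first validator that rejects the click,
        # so the remaining validators are never asked.
        return all(validator.is_valid(click) for validator in self._validators)


class RecordingValidator(ClickValidator):
    """Hypothetical helper: returns a fixed answer and counts its calls."""

    def __init__(self, answer: bool) -> None:
        self._answer = answer
        self.calls = 0

    def is_valid(self, click: dict) -> bool:
        self.calls += 1
        return self._answer
```

Given an accepting, a rejecting and another accepting validator, in that order, the click is rejected and the third validator is never called.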

We may write these tests for this new version of the click validation:

These tests are robust to the changes, described in the introduction, that made the initial version of the tests fragile:

  1. Changing the interface of any of the individual validations.
  2. Adding side-effects to any of the validations.
  3. Adding more validations.
  4. Changing the order of the validation.

However, this version of ClickValidationTest is so general and flexible, that using it, our tests would stop knowing which validations, and in which order, are applied to the clicks[3]. That sequence of validations is a business rule and, as such, we should protect it. We might keep this version of ClickValidationTest only if we had some outer test protecting the desired sequence of validations.

This other version of the tests, on the other hand, keeps protecting the business rule:

Notice how this version of the tests keeps in its setup the knowledge of which sequence of validations should be used, and how it only uses test doubles for NoBotClickValidator.

4. Avoid exposing internals.

The fact that we’re injecting into ClickValidation an object, ClickParamsValidator, that we realized we didn’t need to double is a smell which points to the possibility that ClickParamsValidator is an internal detail of ClickValidation instead of its peer. So by injecting it, we’re coupling ClickValidation users, or at least the code that creates it, to an internal detail of ClickValidation: ClickParamsValidator.

A better version of this code would hide ClickParamsValidator by instantiating it inside ClickValidation’s constructor:

With this change ClickValidation recovers the knowledge of the sequence of validations which in the previous section was located in the code that created ClickValidation.
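As a rough Python sketch of this idea (the method names, the mandatory fields and the `StubValidator` helper are assumptions for illustration), the constructor creates ClickParamsValidator itself and only receives the validator that is worth substituting in tests:

```python
class ClickParamsValidator:
    """Hypothetical stand-in: checks that mandatory fields are present."""

    MANDATORY_FIELDS = ("id", "url")

    def is_valid(self, click: dict) -> bool:
        return all(click.get(field) for field in self.MANDATORY_FIELDS)


class ClickValidation:
    def __init__(self, no_bot_click_validator) -> None:
        # ClickParamsValidator is an internal detail, so it is created here.
        # The sequence of validations is a business rule that lives in this class.
        self._validators = [ClickParamsValidator(), no_bot_click_validator]

    def is_valid(self, click: dict) -> bool:
        return all(validator.is_valid(click) for validator in self._validators)


class StubValidator:
    """Hypothetical test double standing in for NoBotClickValidator."""

    def __init__(self, answer: bool) -> None:
        self._answer = answer

    def is_valid(self, click: dict) -> bool:
        return self._answer
```

The code that creates ClickValidation no longer needs to know that ClickParamsValidator exists.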

There are some stereotypes that can help us identify real collaborators (peers)[4]:

  1. Dependencies: services that the object needs from its environment so that it can fulfill its responsibilities.
  2. Notifications: other parts of the system that need to know when the object changes state or performs an action.
  3. Adjustments or Policies: objects that tweak or adapt the object’s behaviour to the needs of the system.

Following these stereotypes, we could argue that NoBotClickValidator is also an internal detail of ClickValidation and shouldn’t be exposed to the tests by injecting it. Hiding it we’d arrive at this other version of ClickValidation:

in which we have to inject the real dependencies of the validation, and no internal details are exposed to the client code. This version is very similar to the one we’d have got using test doubles only for infrastructure.

The advantage of this version would be that its tests would know the least possible about ClickValidation. They’d know only ClickValidation’s boundaries marked by the ports injected through its constructor, and ClickValidation’s public API. That will reduce the coupling between tests and production code, and facilitate refactorings of the validation logic.

The drawback is that the combinations of test cases in ClickValidationTest would grow, and many of those test cases would talk about situations happening in the validation boundaries that might be far apart from ClickValidation’s callers. This might make the tests hard to understand, especially if some of the validations have complex logic. When this problem gets severe, we may reduce it by injecting very complex validators and using test doubles for them. This is a trade-off in which we decide to accept some coupling with the internals of ClickValidation in order to improve the understandability of its tests. In our case, the bot detection was one of those complex components, so we decided to test it separately and inject it into ClickValidation so that we could double it in ClickValidation’s tests. That is why we kept the penultimate version of ClickValidation, in which we were injecting the click-not-made-by-a-bot validation.

Conclusion.

In this post, we played with an example to show how, by listening to the tests[5], we can detect possible design problems, and how we can use that feedback to improve both the design of our code and its tests when changes that expose those design problems are required.

In this case, the initial tests were fragile because the production code was procedural and had too many responsibilities. The tests were fragile also because they were using test doubles for some collaborators when it wasn’t worth doing so.

Then we showed how refactoring the original code to be more object-oriented and separating its responsibilities better could remove some of the fragility of the tests. We also showed how restricting the use of test doubles to those collaborators that really need to be substituted can improve the tests and reduce their fragility. Finally, we showed how we can go too far in trying to make the tests flexible and robust, and accidentally stop protecting a business rule, and how a less flexible version of the tests can fix that.

When faced with fragility due to coupling between tests and the code being tested caused by using test doubles, it’s easy and very usual to “blame the mocks”, but, we believe, it would be more productive to listen to the tests to notice which improvements in our design they are suggesting. If we act on the feedback that test doubles give us about our design, we can use test doubles to our advantage, as powerful feedback tools[6] that help us improve our designs, instead of just suffering and blaming them.

Acknowledgements.

Many thanks to my Codesai colleagues Alfredo Casado, Fran Reyes, Antonio de la Torre and Manuel Tordesillas, and to my Aprendices colleagues Paulo Clavijo, Álvaro García and Fermin Saez for their feedback on the post, and to my colleagues at LIFULL Connect for all the mobs we enjoy together.

Footnotes:

[2] See Test Smell: Everything is mocked by Steve Freeman where he talks about things you shouldn't be substituting with test doubles.
[3] Thanks Alfredo Casado for detecting that problem in the first version of the post.
[4] From Growing Object-Oriented Software, Guided by Tests > Chapter 6, Object-Oriented Style > Object Peer Stereotypes, page 52. You can also read about these stereotypes in a post by Steve Freeman: Object Collaboration Stereotypes.
[5] Difficulties in testing might be a hint of design problems. Have a look at this interesting series of posts about listening to the tests by Steve Freeman.
[6] According to Nat Pryce mocks were designed as a feedback tool for designing OO code following the 'Tell, Don't Ask' principle: "In my opinion it's better to focus on the benefits of different design styles in different contexts (there are usually many in the same system) and what that implies for modularisation and inter-module interfaces. Different design styles have different techniques that are most applicable for test-driving code written in those styles, and there are different tools that help you with those techniques. Those tools should give useful feedback about the external and *internal* quality of the system so that programmers can 'listen to the tests'. That's what we -- with the help of many vocal users over many years -- designed jMock to do for 'Tell, Don't Ask' object-oriented design." (from a conversation in Growing Object-Oriented Software Google Group).

I think that if your design follows a different OO style, it might be preferable to stick to a classical TDD style which nearly limits the use of test doubles only to infrastructure and undesirable side-effects.

Sunday, September 3, 2017

Data clumps, primitive obsession and hidden tuples

During the writing of a recent post about connascence for Codesai's blog some of us were discussing whether we could consider a data clump a form of Connascence of Meaning (CoM) or not. In the end, we agreed that data clumps are indeed a form of CoM and that introducing a class for the missing abstraction reduces their connascence to Connascence of Type (CoT).

I had wondered in the past why we use a similar refactoring to eliminate both primitive obsession and data clump smells. Thinking about them from the point of view of connascence has helped me a lot to understand why.

I also had an alternative and curious line of reasoning that gets to the same conclusion, in which a data clump gets basically reduced to an implicit form of primitive obsession. The reasoning is as follows:

The concept of primitive obsession might be extended to consider the collections that a given language offers as primitives. In such cases, encapsulating the collection reifies a new concept that might attract code that had nowhere to "live" and thus was scattered all over. So far so good.

From the point of view of connascence, primitive obsession is a form of CoM that we transform into CoT by introducing a new type and then we might find Connascence of Algorithm (CoA) that we'd remove by moving the offending code inside the new type.

The composing elements of a data clump only make sense when they go together. This means that they're conceptually (but implicitly) grouped. In this sense a data clump could be seen as a "hidden or implicit tuple".

With this "hidden collection" in mind, it is now easier to see how closely related the data clump and primitive obsession smells are. In this sense, we remove a data clump by encapsulating a collection, its "implicit or hidden tuple", inside a new class. Again, from the point of view of connascence, this encapsulation reduces CoM to CoT and might make evident some CoA that will make us move some behavior into the new class that becomes a value object.
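A small Python sketch may help to see the "hidden tuple" and what encapsulating it buys us. The example is hypothetical (the `is_within` function and the `DateRange` class are names made up for illustration): two dates that only make sense together travel as separate parameters, and then get reified as a value object that attracts the behaviour that depended on their meaning.

```python
from dataclasses import dataclass
from datetime import date


# Before: the clump (start, end) is an implicit tuple. Every caller must know
# which date is which and what "inside the range" means (CoM).
def is_within(day: date, start: date, end: date) -> bool:
    return start <= day <= end


# After: the tuple is made explicit as a type (CoT), and the logic that was
# repeated by callers (CoA) moves inside the new value object.
@dataclass(frozen=True)
class DateRange:
    start: date
    end: date

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("end must not be before start")

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end
```

Since the dataclass is frozen and compares by value, two `DateRange` instances with the same dates are equal, which is the usual behaviour we expect from a value object.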

This "implicit tuple" reasoning helped me to make more explicit the mental process that was leading me to end up doing very similar refactorings to remove both code smells.

However I think that CoM unifies both cases much more easily than relating the two smells.

The fact that the collection (the grouping of the elements of a data clump) is implicit also makes it more difficult to recognize a data clump as CoM in the first place. That's why I think that a data clump is a more implicit example of CoM than primitive obsession, and, thus, we might consider its CoM to be stronger than that of primitive obsession.

A curious reasoning, right?

Wednesday, August 23, 2017

Notes on OCP from Agile Principles, Practices and Patterns book

This post continues with the series of posts publishing my notes about SOLID principles taken from Robert C. Martin's wonderful Agile Principles, Practices and Patterns in C# book.

  • OCP -> "Software entities (classes, modules, functions, etc) should be open for extension but closed for modification" <- [Martin's definition. The principle originates with Bertrand Meyer, who gave it a slightly different definition]
  • "When a single change to a program results in a cascade of changes to dependent modules, the design smells of fragility" <- [related with violating Beck's Local Consequences principle from Implementation Patterns] "OCP advises us to refactor the system so that further changes of that kind will not cause more modifications. <- [related with Cockburn's & Larman's Protected Variations] If OCP is applied well, further changes of that kind are achieved by adding new code, not by changing old code that already works"
  • "It's possible to create abstractions that are fixed and yet represent an unbounded group of possible behaviors"
  • "[A module that uses such abstractions] can be closed for modification, since it depends on an abstraction that is fixed. Yet the behavior of the module can be extended by creating new derivatives of the abstraction"
  • "Abstract classes are more closely associated to their clients than to the classes that implement them" <- [related with Separated Interface from Fowler's P of EAA book]
  • "[Strategy and Template Method patterns] are the most common ways to satisfy OCP. They represent a clear separation of generic functionality from the detailed implementation of that functionality"
  • Anticipation
    • "[When a] program conforms to OCP, it is changed by adding code rather than by changing existing code"
    • "In general no matter how "closed" a module is, there will always be some kind of change against which it is not closed"
    • "Since closure can't be complete, it must be strategic. That is, the designer must choose the kinds of changes against which to close the design, must guess at the kinds of changes that are most likely, and then construct abstractions to protect against those changes."
    • "This is not easy. It amounts to making educated guesses about the likely kinds of changes that the application will suffer over time."
    • "Also, conforming to OCP is expensive. It takes development time and money to create the appropriate abstractions. These abstractions also increase the complexity of the software design"
    • "We want to limit the application of OCP to changes that are likely"
    • "How do we know which changes are likely? We do the appropriate research, we ask the appropriate questions, and we use our experience and common sense. And after all that, we wait until the changes happen!" <- [related to Yagni, also talking about learning about your domain] "We don't want to load the design with lots of unnecessary abstractions." <- [related to Metz's The Wrong Abstraction] "Rather we want to wait until we need the abstraction and then put them in"
  • "Fool me once"
    • "... we initially write our code expecting it not to change. When a change occurs, we implement the abstractions that protect us from future changes of that kind."
    • "If we decide to take the first bullet, it is to our advantage to get the bullets flying early and frequently. We want to know what changes are likely before we are very far down the development path. The longer we wait to find out what kind of changes are likely, the more difficult it will be to create the appropriate abstractions."
    • "Therefore, we need to stimulate changes"
      • "We write tests first" -> "testing is one kind of usage of the system. By writing tests first, we force the system to be testable. Therefore, changes in testability will not surprise us later. We will have built the abstractions that make the system testable. We are likely to find that many of these abstractions will protect us from other kinds of changes later" <- [related to Feathers' The Deep Synergy Between Testability and Good Design]
      • "We use short development cycles"
      • "We develop features before infrastructure and frequently show those features to stakeholders"
      • "We develop the most important features first"
      • "We release the software early and often"
  • "Closure is based on abstraction"
  • "Using a data-driven approach to achieve closure"
    • "If we must close the derivatives [...] from knowledge of one another, we can use a table-driven approach"
    • "The only item that is not closed against [the rule that involves] the various derivatives is the table itself. And that table can be placed in its own module, separated from all the other modules, so that changes to it do not affect any of the other modules"
  • "In many ways the OCP is at the heart of OOD."
  • "Yet conformance to [OCP] is not achieved by using an OOP language. Nor is it a good idea to apply rampant abstraction to every part of the application. Rather it requires a dedication on the part of the developers to apply abstraction only to those parts of the program that exhibit frequent change."
  • "Resisting premature abstraction is as important as abstraction itself" <- [related to Metz's The Wrong Abstraction]
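To make the notes about fixed abstractions and new derivatives a bit more concrete, here is a minimal Java sketch using the Strategy pattern. All the names in it (DiscountPolicy, Checkout, and so on) are hypothetical and do not come from the book; it is only one possible illustration of a module that is extended by adding code instead of changing it.

```java
// Hypothetical example: none of these names come from Martin's book.

// The fixed abstraction: clients depend only on this interface.
interface DiscountPolicy {
    double apply(double amount);
}

// Two derivatives. Adding a third one requires no change to Checkout.
class NoDiscount implements DiscountPolicy {
    public double apply(double amount) {
        return amount;
    }
}

class PercentageDiscount implements DiscountPolicy {
    private final double percentage;

    PercentageDiscount(double percentage) {
        this.percentage = percentage;
    }

    public double apply(double amount) {
        return amount - amount * percentage / 100.0;
    }
}

// Closed for modification: it knows nothing about the concrete policies.
class Checkout {
    private final DiscountPolicy policy;

    Checkout(DiscountPolicy policy) {
        this.policy = policy;
    }

    double total(double amount) {
        return policy.apply(amount);
    }
}
```

In this sketch, supporting a new kind of discount means writing a new DiscountPolicy implementation; Checkout stays untouched. It is only closed against that one kind of change, which is the "strategic closure" the notes talk about.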

Tuesday, August 22, 2017

In a small piece of code

This post appeared originally on Codesai’s Blog.

In a previous post we talked about positional parameters and how they can suffer from Connascence of Position (CoP). Then we saw how, in some cases, we might introduce named parameters to remove the CoP and transform it into Connascence of Name (CoN), while always being careful not to hide cases of Connascence of Meaning (CoM). In this post we’ll focus on languages that don’t provide named parameters and see different techniques to remove the CoP.

Let’s see an example of a method suffering from CoP:

In languages without named parameters (the example is written in Java), we can apply a classic[1] refactoring technique, Introduce Parameter Object, that can transform CoP into CoN. In this example, we introduced the ClusteringParameters object:

which eliminates the CoP, transforming it into CoN:
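A minimal sketch of this refactoring might look like the following. Only the ClusteringParameters name is taken from the post; the three parameters, the describe method and the other class names are assumptions made up for illustration.

```java
// Hypothetical sketch: only the ClusteringParameters name comes from the post.

// Before: three positional ints. Swapping two arguments still compiles (CoP).
class PositionalClustering {
    static String describe(int clusters, int maxIterations, int minPoints) {
        return clusters + " clusters, " + maxIterations + " iterations, "
                + minPoints + " min points";
    }
}

// After: a parameter object. Callers now depend on field names (CoN),
// so the order in which the values are set no longer matters.
// Plain fields keep the sketch short; setters would be more conventional.
class ClusteringParameters {
    int clusters;
    int maxIterations;
    int minPoints;
}

class Clustering {
    static String describe(ClusteringParameters parameters) {
        return parameters.clusters + " clusters, "
                + parameters.maxIterations + " iterations, "
                + parameters.minPoints + " min points";
    }
}
```

A caller would create a ClusteringParameters instance, assign each field by name in any order, and pass the object to Clustering.describe.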

In this particular case, all the parameters passed to the function were semantically related, since they all were parameters of the clustering algorithm, but in many other cases the parameters aren’t all related. So, as we saw in our previous post for named parameters, we have to be careful not to accidentally sweep hidden CoM in the form of data clumps under the rug when we use the Introduce Parameter Object refactoring.

In any case, what’s clear is that introducing a parameter object produces much less expressive code than introducing named parameters. So how can we gain semantics while removing CoP in languages without named parameters?

One answer is using fluent interfaces[2], a technique that is much more common than you might think. Let’s have a look at the following small piece of code:

This is just a simple test. However, just in this small piece of code, we can find two examples of removing CoP using fluent interfaces and another example that, while not removing CoP, completely removes its impact on expressiveness. Let’s look at them in more detail.

The first example is an application of the builder pattern using a fluent interface[3].

Applying the builder pattern provides a very specific[4] internal DSL that we can use to create a complex object while avoiding CoP and getting expressiveness comparable, or even superior, to what we’d get using named parameters.

In this case we composed two builders, one for the SafetyRange class:

and another for the Alarm class:

By composing builders, you can create very complex objects in a maintainable and very expressive way.
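As a rough idea of what two composed builders with a fluent interface could look like, here is a minimal sketch. Only the SafetyRange and Alarm class names come from the post; the threshold fields, the builder method names and the isTriggeredBy method are assumptions.

```java
// Hypothetical sketch: only the SafetyRange and Alarm names come from the post.
class SafetyRange {
    final double lowerThreshold;
    final double higherThreshold;

    // The positional constructor is only called from the builder,
    // so its CoP stays confined to one place.
    SafetyRange(double lowerThreshold, double higherThreshold) {
        this.lowerThreshold = lowerThreshold;
        this.higherThreshold = higherThreshold;
    }
}

class SafetyRangeBuilder {
    private double lowerThreshold;
    private double higherThreshold;

    static SafetyRangeBuilder aSafetyRange() {
        return new SafetyRangeBuilder();
    }

    SafetyRangeBuilder withLowerThreshold(double value) {
        this.lowerThreshold = value;
        return this; // returning this is what makes the interface fluent
    }

    SafetyRangeBuilder withHigherThreshold(double value) {
        this.higherThreshold = value;
        return this;
    }

    SafetyRange build() {
        return new SafetyRange(lowerThreshold, higherThreshold);
    }
}

class Alarm {
    private final SafetyRange safetyRange;

    Alarm(SafetyRange safetyRange) {
        this.safetyRange = safetyRange;
    }

    boolean isTriggeredBy(double value) {
        return value < safetyRange.lowerThreshold
                || value > safetyRange.higherThreshold;
    }
}

class AlarmBuilder {
    private SafetyRangeBuilder safetyRange = SafetyRangeBuilder.aSafetyRange();

    static AlarmBuilder anAlarm() {
        return new AlarmBuilder();
    }

    // Composition: the alarm builder receives another builder.
    AlarmBuilder with(SafetyRangeBuilder safetyRange) {
        this.safetyRange = safetyRange;
        return this;
    }

    Alarm build() {
        return new Alarm(safetyRange.build());
    }
}
```

With these builders, a test could create its alarm with AlarmBuilder.anAlarm().with(SafetyRangeBuilder.aSafetyRange().withLowerThreshold(5.0).withHigherThreshold(8.0)).build(), where every value is introduced by a method name and not by its position.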

Let’s see now the second interesting example in our small piece of code:

This assertion using hamcrest is so simple that the JUnit alternative is much clearer:

but for more than one parameter the JUnit interface starts having problems:

Which one is the expected value and which one is the actual one? We never manage to remember…

Using hamcrest removes that expressiveness problem:

Thanks to the semantics introduced by hamcrest[6], it’s very clear that the first parameter is the actual value and the second parameter is the expected one. The internal DSL defined by hamcrest produces declarative code with high expressiveness. To be clear, hamcrest is not removing the CoP, but since there are only two parameters, the degree of CoP is very low[7]. The real problem of the code using the JUnit assertion was its low expressiveness, and using hamcrest fixes that.

For us it’s curious to see how, in trying to achieve expressiveness, some assertion libraries that use fluent interfaces have (probably without being aware of it) eliminated CoP as well. See this other example using Jasmine:

Finally, let’s have a look at the last example in our initial small piece of code which is also using a fluent interface:

This is Mockito’s way of defining a stub for a method call. It’s another example of fluent interface which produces highly expressive code and avoids CoP.

Summary.

We started by seeing how, in languages that don’t allow named parameters, we can remove CoP by applying the Introduce Parameter Object refactoring, and how the resulting code was much less expressive than the one using the Introduce Named Parameter refactoring. Then we saw how we can leverage fluent interfaces to remove CoP while writing highly expressive code, mentioned internal DSLs, and showed how this technique is more common than one might think at first by examining a small piece of code.

Footnotes:

[2] Of course, fluent interfaces are also great in languages that provide named parameters.
[3] Curiously, there are alternative ways to implement the builder pattern that use options maps or named parameters. Some time ago we wrote about an example of using the second way: Refactoring tests using builder functions in Clojure/ClojureScript.
[4] The only purpose of that DSL is creating one specific type of object.
[5] For us the best explanation of the builder pattern and how to use it to create maintainable tests is in chapter 22, Constructing Complex Test Data, of the wonderful Growing Object-Oriented Software Guided by Tests book.
[6] hamcrest is a framework for writing matcher objects allowing 'match' rules to be defined declaratively. We love it!

Tuesday, August 15, 2017

Notes on SRP from Agile Principles, Practices and Patterns book

I think that if you rely only on talks, community events, tweets and posts to learn about a concept, you can sometimes end up with diluted (or even completely wrong) versions of the concept due to broken telephone game effects. For this reason, I think it's important to try instead to get closer to the sources of the concepts you want to learn.

Lately I've been studying object-oriented concepts, making an effort to get closer to the sources. These are the resulting notes on the Single Responsibility Principle that I took from the chapter devoted to it in Robert C. Martin's wonderful Agile Principles, Practices and Patterns in C# book:

  • "This principle was described in the work of [Larry Constantine, Ed Yourdon,] Tom DeMarco and Meilir Page-Jones. They called it cohesion, which they defined as the functional relatedness of the elements of a module" <- [!!!]
  • "... we modify that meaning a bit and relate cohesion to the forces that cause a module, or a class, to change"
  • [SRP definition] -> "A class should have only one reason to change"
  • "Why was it important to separate [...] responsibilities [...]? The reason is that each responsibility is an axis of change" <- [related to Mateu Adsuara's complexity dimensions]
  • "If a class has more than one responsibility the responsibilities become coupled" <- [related to Long Method, Large Class, etc.] <- [It also eliminates the possibility of using composition at every level (functions, classes, modules, etc.)] "Changes to one responsibility may impair or inhibit the class's ability to meet the others. This kind of coupling leads to fragile designs" <- [For R. C. Martin, fragility is a design smell; a design is fragile when it's easy to break]
  • [Defining what responsibility means]
    • "In the context of the SRP, we define a responsibility to be a reason for change"
    • "If you can think of more than one motive for changing a class, that class has more than one responsibility. This is sometimes difficult to see"
  • "Should [...] responsibilities be separated? That depends on how the application is changing. If the application is not changing in ways that cause the [...] responsibilities to change at different times, there is no need to separate them." <- [applying Beck's Rate of Change principle from Implementation Patterns] "Indeed separating them would smell of needless complexity" <- [Needless Complexity is a design smell for R. C. Martin. It's equivalent to Speculative Generality from Refactoring book]
  • "An axis of change is an axis of change only if the changes occur" <- [related to Speculative Generality and Yagni] "It's not wise to apply SRP, or any other principle if there's no symptom" <- [I think this applies at class and module level, but it's still worth it to always try to apply SRP at method level, as a responsibility identification and learning process]
  • "There are often reasons, having to do with the details of hardware and the OS [example with a Modem implementing two interfaces DataChannel and Connection], that force us to couple things that we'd rather not couple. However by separating their interfaces, we [...] decouple[..] the concepts as far as the rest of the application is concerned" <- [Great example of using ISP and DIP to hide complexity from the clients] "We may view [Modem] as a kludge, however, note that all dependencies flow away from it." <- [thanks to DIP] "Nobody needs to depend on this class [Modem]. Nobody except main needs to know it exists" <- [main is the entry point where the application is configured using dependency injection] "Thus we've put the ugly bit behind a fence. Its ugliness need not leak out and pollute the rest of the app"
  • "SRP is one of the simplest of the principles but one of the most difficult to get right"
  • "Conjoining responsibilities is something that we do naturally"
  • "Finding and separating those responsibilities is much of what software design is really about. Indeed the rest of the principles we discuss come back to this issue in one way or another"
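The Modem example mentioned in the notes above can be sketched roughly as follows. The interface names come from the notes, but the method signatures, the Greeter client and the in-memory behavior are assumptions made only for illustration (and the sketch is in Java, not in the C# used by the book).

```java
// Hypothetical sketch: method signatures and the Greeter client are assumptions.
interface Connection {
    void dial(String phoneNumber);
    void hangup();
}

interface DataChannel {
    void send(char character);
}

// The "kludge": one class forced to implement both responsibilities.
class Modem implements Connection, DataChannel {
    private boolean connected = false;
    private final StringBuilder sent = new StringBuilder();

    public void dial(String phoneNumber) {
        connected = true;
    }

    public void hangup() {
        connected = false;
    }

    public void send(char character) {
        if (!connected) {
            throw new IllegalStateException("not connected");
        }
        sent.append(character);
    }

    String sentData() {
        return sent.toString();
    }
}

// A client depends on one narrow interface, never on Modem itself.
class Greeter {
    private final DataChannel channel;

    Greeter(DataChannel channel) {
        this.channel = channel;
    }

    void greet() {
        for (char character : "hi".toCharArray()) {
            channel.send(character);
        }
    }
}
```

Only the code that wires the application together would need to create a Modem and hand it to Greeter as a DataChannel, so the coupling between both responsibilities stays behind the fence.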
Agile Principles, Practices and Patterns in C# is a great book that I recommend reading. For me, getting closer to the sources of the SOLID principles has been a great experience that has helped me remove illusions of knowledge I had developed due to the telephone game effect of having learned them through blogs and talks.


Monday, July 31, 2017

Two examples of Connascence of Position

This post appeared originally on Codesai’s Blog.

As we saw in our previous post about connascence, Connascence of Position (CoP) happens when multiple components must be adjacent or appear in a particular order. CoP is the strongest form of static connascence, as shown in the following figure.
Connascence forms sorted by descending strength (from Kevin Rutherford's XP Surgery).
A typical example of CoP appears when we use positional parameters in a method signature, because any change in the order of the parameters will force all the clients using the method to change.

The degree of the CoP increases with the number of parameters, being zero when we have only one parameter. This is closely related to the Long Parameter List smell.
In some languages, such as Ruby, Clojure, C#, Python, etc, this can be refactored by introducing named parameters (see Introduce Named Parameter refactoring)[1].

Now changing the order of parameters in the signature of the method won’t force the calls to the method to change, but changing the name of the parameters will. This means that the resulting method no longer presents CoP. Instead, now it presents Connascence of Name, (CoN), which is the weakest form of static connascence, so this refactoring has reduced the overall connascence.

The benefits don’t end there. If we have a look at the calls before and after the refactoring, we can see how the call after introducing named parameters communicates the intent of each parameter much better. Does this mean that we should use named parameters everywhere?

Well, it depends. There are some trade-offs to consider. Positional parameters produce shorter calls. Using named parameters gives us better code clarity and maintainability than positional parameters, but we lose terseness[2]. On the other hand, when the number of parameters is small, a well-chosen method name can make the intent of the positional arguments easy to guess and thus make the use of named parameters redundant.

We should also consider the impact that the degree and locality of each instance of CoP[3] can have on the maintainability and communication of intent of each option. On one hand, the impact on maintainability of using positional parameters is higher for public methods than for private methods (even higher for published public methods)[4]. On the other hand, a similar reasoning might be made about the intent of positional parameters: the positional parameters of a private method in a cohesive class might be much easier to understand than the parameters of a public method of a class a client is using, because in the former case we have much more context to help us understand.

The communication of positional parameters can be improved a lot with the parameter name hinting feature provided by IDEs like IntelliJ. In any case, even though they look like named parameters, they are still positional parameters and have CoP. In this sense, parameter name hinting might end up having a bad effect on your code by reducing the pain of having long parameter lists.

Finally, moving to named parameters can increase the difficulty of applying the most frequent refactoring: renaming. Most IDEs are great at renaming positional parameters, but not all are so good at renaming named parameters.

A second example.

There are also cases in which blindly using named parameters can make things worse. See the following example:

The activate_alarm method presents CoP, so let’s introduce named parameters as in the previous example:

We have eliminated the CoP and now there’s only CoN, right?

In this particular case, the answer would be no. We’re just masking the real problem, which was a case of Connascence of Meaning (CoM) (a.k.a. Connascence of Convention). CoM happens when multiple components must agree on the meaning of specific values[5]. CoM is telling us that there might be a missing concept or abstraction in our domain. The fact that lower_threshold and higher_threshold only make sense when they go together (we’re facing a data clump) is an implicit meaning or convention on which the different methods sharing those parameters must agree; therefore there’s CoM.

We can eliminate the CoM by introducing a new class, Range, to wrap the data clump and reify the missing concept in our domain, reducing the CoM to Connascence of Type (CoT)[6]. This refactoring plus the introduction of named parameters leaves us with the following code:

This refactoring is way better than only introducing named parameters because it not only provides a bigger coupling reduction, going down the scale from CoP to CoT instead of only from CoP to CoM, but also introduces more semantics by adding a missing concept (the Range object).

Later we’ll probably detect similarities[7] in the way some functions that receive the new concept are using it, and reduce them by moving that behavior into the new concept, converting it into a value object. It’s in this sense that we say that value objects attract behavior.
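As an illustration of a Range value object that has attracted some behavior, here is a minimal sketch written in Java (so without named parameters). Only the Range name and the two thresholds come from the post; the contains method, the validation and the AlarmActivation class are assumptions.

```java
// Hypothetical sketch: only Range and its two thresholds come from the post.
final class Range {
    private final double lowerThreshold;
    private final double higherThreshold;

    Range(double lowerThreshold, double higherThreshold) {
        if (lowerThreshold > higherThreshold) {
            throw new IllegalArgumentException(
                    "lower threshold above higher threshold");
        }
        this.lowerThreshold = lowerThreshold;
        this.higherThreshold = higherThreshold;
    }

    // Behavior attracted by the value object: callers no longer
    // compare a value against both thresholds themselves.
    boolean contains(double value) {
        return lowerThreshold <= value && value <= higherThreshold;
    }

    // Value objects are compared by their values, not by identity.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Range)) {
            return false;
        }
        Range that = (Range) other;
        return Double.compare(lowerThreshold, that.lowerThreshold) == 0
                && Double.compare(higherThreshold, that.higherThreshold) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Double.hashCode(lowerThreshold)
                + Double.hashCode(higherThreshold);
    }
}

class AlarmActivation {
    // The two thresholds now travel together as one concept (CoT).
    static boolean shouldActivate(double value, Range safeRange) {
        return !safeRange.contains(value);
    }
}
```

Any other function that used to receive both thresholds and compare against them could call contains as well, which removes the duplicated comparison logic.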

Summary.

We have presented two examples of CoP, a “pure” one and another one that was really hiding a case of CoM. We have related CoP and CoM with known code smells (Long Parameter List, Data Clump and Primitive Obsession), and introduced refactorings to reduce their coupling and improve their communication of intent. We have also discussed a bit about when to apply these refactorings and what we need to consider before doing so.

Footnotes:
[1] For languages that don't allow named parameters, see the Introduce Parameter Object refactoring.
[3] See our previous post About Connascence.
[4] For instance, Sandi Metz recommends in her POODR book to "use hashes for initialization arguments" in constructors (this was the way of having named parameters before Ruby 2.0 introduced keyword arguments).
[5] Data Clump and Primitive Obsession smells are examples of CoM.
[6] Connascence of Type, (CoT), happens when multiple components must agree on the type of an entity.
[7] Those similarities in the use of the new concept are examples of Connascence of Algorithm, which happens when multiple components must agree on a particular algorithm.

Thursday, January 26, 2017

About Connascence

This post appeared originally on Codesai’s Blog.

Lately at Codesai we’ve been studying and applying the concept of connascence in our code, and we have even given an introductory talk about it. We’d like this post to be the first of a series of posts about connascence.

1. Origin.

The concept of connascence is not new at all. Meilir Page-Jones introduced it in 1992 in his paper Comparing Techniques by Means of Encapsulation and Connascence. Later, he elaborated more on the idea of connascence in his What every programmer should know about object-oriented design book from 1995, and its more modern version (same book but using UML) Fundamentals of Object-Oriented Design in UML from 1999.
Ten years later, Jim Weirich brought connascence back from oblivion in a series of talks: Grand Unified Theory of Software Design, The Building Blocks of Modularity and Connascence Examined. As we’ll see later in this post, he not only brought connascence back to life, but also improved its exposition.
More recently, Kevin Rutherford wrote a very interesting series of posts in which he talked about using connascence as a guide to choose the most effective refactorings and about how connascence can be a more objective and useful tool than code smells to identify design problems[1].

2. What is connascence?

The concept of connascence appeared at a time, the early nineties, when OO was starting its path to become the dominant programming paradigm, as a general way to evaluate design decisions in an OO design. In the previous dominant paradigm, structured programming, fan-out, coupling and cohesion were fundamental design criteria used to evaluate design decisions. To make clear what Page-Jones understood by these terms, let’s see the definitions he used:
Fan-out is a measure of the number of references to other procedures by lines of code within a given procedure.
Coupling is a measure of the number and strength of connections between procedures.
Cohesion is a measure of the “single-mindedness” of the lines of code within a given procedure in meeting the purpose of that procedure.
According to Page-Jones, these design criteria govern the interactions between the levels of encapsulation that are present in structured programming: level-1 encapsulation (the subroutine) and level-0 (lines of code), as can be seen in the following table from Fundamentals of Object-Oriented Design in UML.

Encapsulation levels and design criteria in structured programming.

However, OO introduces at least level-2 encapsulation, (the class), which encapsulates level-1 constructs (methods) together with attributes. This introduces many new interdependencies among encapsulation levels, which will require new design criteria to be defined, (see the following table from Fundamentals of Object-Oriented Design in UML).

Encapsulation levels and design criteria in OO.

Two of these new design criteria are class cohesion and class coupling, which are analogous to structured programming’s procedure cohesion and procedure coupling, but, as you can see, there are other ones in the table for which there isn’t even a name.
Connascence is meant to be a deeper criterion behind all of them and, as such, it is a general way to evaluate design decisions in an OO design. This is the formal definition of connascence by Page-Jones:
Connascence between two software elements A and B means either
  1. that you can postulate some change to A that would require B to be changed (or at least carefully checked) in order to preserve overall correctness, or
  2. that you can postulate some change that would require both A and B to be changed together in order to preserve overall correctness.
In other words, there is connascence between two software elements when they must change together in order for the software to keep working correctly.
We can see how this new design criterion can be used for any of the interdependencies among encapsulation levels present in OO. Moreover, it can also be used for higher levels of encapsulation (packages, modules, components, bounded contexts, etc). In fact, according to Page-Jones, connascence is applicable to any design paradigm with partitioning, encapsulation and visibility rules[2].

3. Forms of connascence.

Page-Jones distinguishes several forms (or types) of connascence.
Connascence can be static, when it can be assessed from the lexical structure of the code, or dynamic, when it depends on the execution patterns of the code at run-time.
There are several types of static connascence:
  • Connascence of Name (CoN): when multiple components must agree on the name of an entity.
  • Connascence of Type (CoT): when multiple components must agree on the type of an entity.
  • Connascence of Meaning (CoM): when multiple components must agree on the meaning of specific values.
  • Connascence of Position (CoP): when multiple components must agree on the order of values.
  • Connascence of Algorithm (CoA): when multiple components must agree on a particular algorithm.
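To make a couple of these static forms more tangible, here is a minimal Java sketch that moves from CoM to the weaker CoT and CoN. The role example and all its names are hypothetical; they are not taken from Page-Jones.

```java
// Hypothetical sketch: the names below are illustrative only.

// Connascence of Meaning: every caller must know that 1 means "admin".
class MeaningBasedAccess {
    static boolean canDeleteUsers(int role) {
        return role == 1;
    }
}

// Weaker forms: callers agree only on a type (CoT)
// and on the names of its constants (CoN).
enum Role { ADMIN, EDITOR, VIEWER }

class NameBasedAccess {
    static boolean canDeleteUsers(Role role) {
        return role == Role.ADMIN;
    }
}
```

In the second version a caller can no longer pass a meaningless number, and renaming a constant is something most tools can do safely.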
There are also several types of dynamic connascence:
  • Connascence of Execution (order) (CoE): when the order of execution of multiple components is important.
  • Connascence of Timing (CoTm): when the timing of the execution of multiple components is important.
  • Connascence of Value (CoV): when there are constraints on the possible values some shared elements can take. It’s usually related to invariants.
  • Connascence of Identity (CoI): when multiple components must reference the same entity.
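Dynamic forms only show up when the code runs. As a small illustration of Connascence of Execution, the following Java sketch uses a hypothetical ReportWriter class (not taken from any of the cited sources) whose callers must agree on the order of two calls.

```java
// Hypothetical sketch of Connascence of Execution (order).
class ReportWriter {
    private StringBuilder buffer; // stays null until open() is called

    void open() {
        buffer = new StringBuilder();
    }

    // Callers must have called open() first: the order of execution
    // matters, and nothing in the signatures makes that visible.
    void write(String line) {
        if (buffer == null) {
            throw new IllegalStateException(
                    "open() must be called before write()");
        }
        buffer.append(line).append('\n');
    }

    String contents() {
        return buffer == null ? "" : buffer.toString();
    }
}
```

The compiler accepts a call to write before open, so the problem is only detected at run time, which is part of why dynamic forms are considered stronger than static ones.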
Another important form of connascence is contranascence, which exists when elements are required to differ from each other (e.g., have different names in the same namespace or be in different namespaces, etc). Contranascence may also be either static or dynamic.

4. Properties of connascence.

Page-Jones talks about two important properties of connascence that help measure its impact on maintainability:
  • Degree of explicitness: the more explicit a connascence form is, the weaker it is.
  • Locality: connascence across encapsulation boundaries is much worse than connascence between elements inside the same encapsulation boundary.
A nice way to reformulate this is using what are called the three axes of connascence[3]:

4.1. Degree.

The degree of an instance of connascence is related to the size of its impact. For instance, a software element that is connascent with hundreds of elements is likely to become a larger problem than one that is connascent with only a few.

4.2. Locality.

The locality of an instance of connascence talks about how close the two software elements are to each other. Elements that are close together (in the same encapsulation boundary) can typically present more, and stronger, forms of connascence than elements that are far apart (in different encapsulation boundaries). In other words, as the distance between software elements increases, the forms of connascence should be weaker.

4.3. Strength.

Page-Jones states that connascence has a spectrum of explicitness. The more implicit a form of connascence is, the more time consuming and costly it is to detect. Also a stronger form of connascence is usually harder to refactor. Following this reasoning, we have that stronger forms of connascence are harder to detect and/or refactor. This is why static forms of connascence are weaker (easier to detect) than the dynamic ones, or, for example, why CoN is much weaker (easier to refactor) than CoP.
The following figure by Kevin Rutherford shows the different forms of connascence we saw before, but sorted by descending strength.

Connascence forms sorted by descending strength (from Kevin Rutherford's XP Surgery).

5. Connascence, design principles and refactoring.

Connascence is simpler than other design principles, such as the SOLID principles, the Law of Demeter, etc. In fact, it can be used to see those principles in a different light, as they can also be seen using more fundamental principles like the ones in the first chapter of Kent Beck’s Implementation Patterns book.
We use code smells, which are a collection of code quality antipatterns, to guide our refactorings and improve our design, but, according to Kevin Rutherford, they are not the ideal tool for this task[4]. Sometimes connascence might be a better metric to reason about coupling than the somewhat fuzzy concept of code smells.
Connascence gives us a more precise vocabulary to talk and reason about coupling and cohesion[5], and thus helps us to better judge our designs in terms of coupling and cohesion, and decide how to improve them. In the words of Gregory Brown, “this allows us to be much more specific about the problems we’re dealing with, which makes it easier to reason about the types of refactorings that can be used to weaken the connascence between components”.
It provides a classification of forms of coupling in a system, and even better, a scale of the relative strength of the coupling each form of connascence generates. It’s precisely that scale of relative strengths that makes connascence a much better guide for refactoring. As Kevin Rutherford says:
"because it classifies the relative strength of that coupling, connascence can be used as a tool to help prioritize what should be refactored first"
Connascence explains why doing a given refactoring is a good idea.

6. How should we apply connascence?

Page-Jones offers three guidelines for using connascence to improve system maintainability:
  1. Minimize overall connascence by breaking the system into encapsulated elements.
  2. Minimize any remaining connascence that crosses encapsulation boundaries.
  3. Maximize the connascence within encapsulation boundaries.
According to Kevin Rutherford, the first two points constitute what he calls the Page-Jones refactoring algorithm[6].
These guidelines generalize the structured design ideals of low coupling and high cohesion and are applicable to OO or, as was said before, to any other paradigm with partitioning, encapsulation and visibility rules.
They might still be a little subjective, so some of us prefer a more concrete way to apply connascence, using Jim Weirich’s two principles or rules:
  • Rule of Degree[7]: Convert strong forms of connascence into weaker forms of connascence.
  • Rule of Locality: As the distance between software elements increases, use weaker forms of connascence.

7. What’s next?

In future posts, we’ll see examples of concrete forms of connascence, relating them with design principles, code smells, and refactorings that might improve the design.

Footnotes:
[1] See Kevin Rutherford's great post The problem with code smells.
[2] This explains the titles Jim Weirich chose for his talks: Grand Unified Theory of Software Design and The Building Blocks of Modularity.
[4] Again see Kevin Rutherford's great post The problem with code smells.
[5] The concepts of coupling and cohesion can be hard to grasp, just see this debate about them Understanding Coupling and Cohesion hangout.
[6] See Kevin Rutherford's post The Page-Jones refactoring algorithm.
[7] Even though he used the word degree, he was actually talking about strength.