Sunday, November 27, 2011

Interesting Talk: "Listening to test smells"

Most of you have probably already watched Steve Freeman's talk about what test smells can teach us about the design of our code. I watched it last week and found it great. I'm posting the link just in case:

Saturday, November 26, 2011

Sorting a vector of pointers to objects using STL and functors

Imagine you have a std::vector that contains pointers to objects of some class, and that you'd like to sort them based on the value of some member variable using an STL algorithm. Overloading the < operator of the class won't work in this case, because sort would compare the pointers themselves and not the objects they point to. That approach only works when the vector contains the objects themselves.

What can we do then?

We can solve this problem using functors or function objects.

Quoting the link above:
"Functors are functions with a state. In C++ you can realize them as a class with one or more private members to store the state and with an overloaded operator () to execute the function."
Let's see a simple example using functors to sort a vector of pointers to objects. As I cannot show examples from the code I write at work (it's top secret ;) ), I've just made up an example using my favorite food: chocolate.

This is the Chocolate class:
#ifndef CHOCOLATE_H
#define CHOCOLATE_H

#include <string>

class Chocolate {
  public:
    // Initialize all members in the constructor's initializer list.
    Chocolate(const std::string &name, double cocoaPercentage, double price)
      : cocoaPercentage(cocoaPercentage), price(price), name(name)
    {
    }

    // Getters are const because they do not modify the object.
    double getCocoaPercentage() const { return cocoaPercentage; }
    double getPrice() const { return price; }
    std::string getName() const { return name; }

  private:
    double cocoaPercentage;
    double price;
    std::string name;
};
#endif /* CHOCOLATE_H */

Imagine we'd like to sort the following vector:
vector<Chocolate*> chocolates;
using the STL sort algorithm and two different sorting criteria: by price and by cocoa percentage. 

There are two versions of STL sort. The first one takes two parameters, first and last, which are iterators delimiting the range [first, last) to be sorted. This version compares the elements using operator<. As I said above, it will not work here: the elements are pointers, so it would order them by address instead of by the chocolates they point to.
The second version accepts, besides first and last, a third parameter, comp, which according to sort's C++ reference is a:
"Comparison function object that, taking two values of the same type than those contained in the range, returns true if the first argument goes before the second argument in the specific strict weak ordering it defines, and false otherwise."
So, to sort our chocolates vector, we'll first have to create two functors: one for each sorting criterion.
I called them ComparatorByPrice and ComparatorByCocoaPercentage:
#ifndef CHOCOLATE_COMPARATORS
#define CHOCOLATE_COMPARATORS

#include "Chocolate.h"

// Orders chocolates by ascending price.
class ComparatorByPrice {
  public:
    bool operator() (Chocolate *a, Chocolate *b) const {
      return a->getPrice() < b->getPrice();
    }
};

// Orders chocolates by ascending cocoa percentage.
class ComparatorByCocoaPercentage {
  public:
    bool operator() (Chocolate *a, Chocolate *b) const {
      return a->getCocoaPercentage() < b->getCocoaPercentage();
    }
};

#endif /* CHOCOLATE_COMPARATORS */
Notice how we've implemented the () operator for both classes.

Once we have the functors, we just need to pass them to the sort function as its third parameter.
sort(chocolates.begin(), chocolates.end(), ComparatorByPrice());
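Notice the () after ComparatorByPrice: it creates a temporary ComparatorByPrice object, which is what sort expects as its third argument. Passing just the class name would not compile.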

Ok, now that we finally have all the required ingredients to sort our chocolates, let's see the sorting in action:
#include "Chocolate.h"
#include "ChocolateComparators.h"
#include <vector>
#include <algorithm>
#include <iostream>

using namespace std;

void showChocolates(vector<Chocolate*> & chocolates);

int main(int argc, char **argv)
{
  vector<Chocolate*> chocolates;
  chocolates.push_back(new Chocolate("ChocoKoko", 80., 20.));
  chocolates.push_back(new Chocolate("ChocoLolo", 70., 30.));
  chocolates.push_back(new Chocolate("ChocoBebo", 25., 10.));
  chocolates.push_back(new Chocolate("ChocoBrian", 24., 15.));
  chocolates.push_back(new Chocolate("ChocoMiko", 30., 45.));
  
  cout<<"Sorted by price:"<<endl;
  sort(chocolates.begin(), chocolates.end(), ComparatorByPrice());
  showChocolates(chocolates);
  
  cout<<endl<<"Sorted by cocoa percentage:"<<endl;
  sort(chocolates.begin(), chocolates.end(), 
               ComparatorByCocoaPercentage());
  showChocolates(chocolates);
  
  for(unsigned int i=0; i<chocolates.size(); ++i) {
    delete chocolates[i];
  }
  return 0;
}

void showChocolates(vector<Chocolate*> & chocolates) {
  for(unsigned int i=0; i<chocolates.size(); ++i) {
    cout<<chocolates[i]->getName()<< " " 
      <<chocolates[i]->getPrice()<<" "
      <<chocolates[i]->getCocoaPercentage()
      <<"%"<<endl;
  } 
}
This is the output we get:
$ g++ main.cpp -o sorters
$ ./sorters
Sorted by price:
ChocoBebo 10 25%
ChocoBrian 15 24%
ChocoKoko 20 80%
ChocoLolo 30 70%
ChocoMiko 45 30%

Sorted by cocoa percentage:
ChocoBrian 15 24%
ChocoBebo 10 25%
ChocoMiko 45 30%
ChocoLolo 30 70%
ChocoKoko 20 80%
In C++11 we could use lambdas instead of functors, but that will be material for a future post.

Update:
See how to do it using lambdas in this newer post: Sorting a vector of pointers to objects using STL and C++11 lambdas

Friday, November 25, 2011

Bash one-line for loops

Bash has become the standard shell on many Linux distributions.
One of my favorite features in Bash is the one-line loop, especially the for loop. Even though you might have heard many times that Bash scripting can become a nightmare due to its complicated and rigid syntax, one-line loops can save you a lot of time.
As an example, imagine that you have two folders and you need to copy all the files from one folder to the other:
$ ls
folder1  folder2
$ cp folder1/* ../other_folder
$ cp folder2/* ../other_folder
That's alright, but now imagine that you have 1000 folders. Will you manually execute 1000 commands? The answer is obviously no. This is where one-line bash loops come in handy:
$ for i in `ls -d */`;do cp $i/* ../other_folder;done

Let's have a look at how the previous line works. The basic syntax for a for loop in bash is: 
for variable in some_list; do command(s); done 
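In our example, we iterate over the directories in the current folder, listed by the command `ls -d */`. The backticks perform command substitution: the command between them is executed and its output is used in its place, here as the list to iterate over. Note that this breaks if a directory name contains spaces; iterating over the glob directly (for i in */) avoids that problem.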

As a second example, imagine that you need to count the number of lines of a bunch of files whose names consist of a prefix, such as "FILE_NAME_", and a numeric suffix ranging from 1 to N.
A possible solution using one-line bash loops would be the following (replace N with the actual number, since brace expansion does not expand variables):
$ for i in {1..N}; do cat "FILE_NAME_$i" | wc -l; done
You could also have a list of file names in a file and iterate over them:
$ cat files.list
file_1
file_2
file_3
$ for file in `cat files.list`;do echo $file; done
file_1
file_2
file_3

I hope this post will help you to take advantage of one of my favorite features in Bash :)

Monday, November 21, 2011

Technical friction

A great way to think about the technical debt metaphor that avoids misunderstandings, posted by chromatic at Modern Perl Books, a Modern Perl Blog:

Python idioms: Building strings

I'll start a series of posts about Python idioms.
What is a programming idiom? According to Wikipedia's Programming Idiom article, a programming idiom is defined as "the use of an unusual or notable feature that is built in to a programming language".
Python is very rich in useful idioms. Python programmers who take advantage of them are known as pythonistas. In that sense, I highly recommend reading the famous article Code like a Pythonista: Idiomatic Python by David Goodger.

Now that we have defined what a programming idiom is and introduced the term pythonista, let's talk about one of my favorite idioms: building strings from substrings.

Imagine that you have a list of strings,
bands = ['Machine Head','Metallica','Opeth','Veil of Maya']
and you want to concatenate the items in the list to form a single comma-separated string. If you come from the C or Java world, you would write something like this:
output = ''
for band in bands[:-1]:
    output += band + ', '
output += bands[-1] 
If you print the content of the output variable, you will get a list of the bands:
>>> print output
Machine Head, Metallica, Opeth, Veil of Maya
This is a very inefficient way to concatenate strings in Python: strings are immutable, so in each iteration of the for loop a temporary string is created for the concatenation and thrown away afterwards.

The pythonic way is faster and more elegant:
>>> print ', '.join(bands)
Machine Head, Metallica, Opeth, Veil of Maya
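
The same idiom applies when the pieces are produced inside a loop instead of already sitting in a list: collect them in a list and join once at the end. Here is a minimal sketch of that idea. The describe function and the numbering format are made up for this illustration, and the code is written so that it should run under both Python 2.6+ and Python 3:

```python
bands = ['Machine Head', 'Metallica', 'Opeth', 'Veil of Maya']

def describe(bands):
    """Build '1. Machine Head, 2. Metallica, ...' with a single join."""
    parts = []
    for position, band in enumerate(bands, 1):
        # Only collect the pieces here; the final string is built once below.
        parts.append('%d. %s' % (position, band))
    return ', '.join(parts)

numbered = describe(bands)

# join only accepts strings, so other types have to be converted first.
numbers = ', '.join(str(n) for n in [1, 2, 3])
```

Keep in mind that join raises a TypeError if any item is not a string, which is why the second example converts the numbers with str first.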

I coded a small example to compare the performance of both techniques:
from functools import wraps
import time

def timed(f):
  @wraps(f)
  def wrapper(*args, **kwds):
    start = time.clock()
    result = f(*args, **kwds)
    elapsed = time.clock() - start
    #print "%s took %d time to finish" % (f.__name__, elapsed)
    print "%.5gs" % (elapsed)
    return result
  return wrapper

bands = ['Machine Head','Metallica','Opeth','Veil of Maya']*100000

@timed
def func1():
  output = ''
  for band in bands[:-1]:
    output += band + ', '
  output += bands[-1]
  return output

@timed
def func2():
  output = ', '.join(bands)
  return output

func1()
func2()

The result is:

$ python test.py
0.07s
0.01s

If we make the bands list one order of magnitude larger:

$ python test.py
0.62s
0.1s

the absolute difference in performance becomes more noticeable.
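
As an alternative to a hand-written decorator, the standard library's timeit module can run the same comparison. The following is only a sketch under a few assumptions: the function names concat_loop and concat_join are mine, the list is smaller than in the runs above so that it finishes quickly, and the numbers it prints will differ from machine to machine:

```python
import timeit

bands = ['Machine Head', 'Metallica', 'Opeth', 'Veil of Maya'] * 1000

def concat_loop():
    # Repeated += concatenation, as in func1 above.
    output = ''
    for band in bands[:-1]:
        output += band + ', '
    output += bands[-1]
    return output

def concat_join():
    # Single join, as in func2 above.
    return ', '.join(bands)

# Both approaches must build exactly the same string.
same_result = concat_loop() == concat_join()

# timeit.repeat accepts a callable: each measurement calls it `number` times,
# and taking the minimum of the repeats reduces the influence of noise.
loop_time = min(timeit.repeat(concat_loop, number=20, repeat=3))
join_time = min(timeit.repeat(concat_join, number=20, repeat=3))

print('loop: %.5fs  join: %.5fs' % (loop_time, join_time))
```

Running each function several times and keeping the best measurement makes the comparison less sensitive to whatever else the machine is doing at that moment.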

PS: In my toy example, I used the timed decorator that can be found in this Stack Overflow thread.

Saturday, November 19, 2011

Dictionary of Algorithms and Data Structures

I've just found "a dictionary of algorithms, algorithmic techniques, data structures, archetypal problems, and related definitions":
Since one of the courses I'm taking at the UOC is about Data Structures, I think this site will prove really useful.

Friday, November 18, 2011

Question about different TDD schools at My Agile Education blog

Angela Harms has asked an interesting question at My Agile Education:
I think that J. B. Rainsberger's answer is great.
This is his conclusion:
"The Detroit style leads me to discover types as I go, and the London style leads me to guess-and-refine types as I go. I’m happy to feel comfortable doing both."

Configuring Eclipse for TDD in Python: The Nosetests way

In my daily work, Python is my main (and really beloved) programming language. For coding in Python, I usually work with vim on remote machines and with the Eclipse IDE on my desktop. In the next sections I'll explain how to turn Eclipse into a killer Python IDE for TDD, but you can have a look at the official Python Wiki for more options: Python IDE's and Python Editors.

1. Installing Eclipse
In the Eclipse downloads section, get the Eclipse Classic package that corresponds to your platform (I'll focus on Linux in this post, but it should be easy to follow the same steps in Windows and Mac OS platforms as well).
Installing Eclipse is as easy as extracting the downloaded package if you have an existing Java Virtual Machine installed in your system. If not, check this FAQ.

2. PyDev
PyDev is a Python IDE for Eclipse with a lot of goodies such as code completion, syntax highlighting, a debugger, unit testing support and refactoring options. You can find a very nice installation manual on the official project page: PyDev Manual (Installing). For lazy readers, it should be as easy as opening Eclipse, going to Help > Install New Software, adding a new software location with the Add button and filling in the location field with the URL http://pydev.org/updates. Once added, check the PyDev option and follow the instructions.

3. Nosetests
python-nose, or nosetests, is a unit testing framework that can discover and execute your tests without you having to register them first. Nose makes my life much easier when writing test functions that check for exceptions and when discovering and executing my test packages. However, I strongly suggest reading this article to understand the main motivations for using a unit test framework and why to choose nose.

Nose is available via setuptools:
$ easy_install nose

Now try to import nose in python:
$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nose
>>>

If nose installation is fine, you shouldn't see any error messages after importing the nose package in the Python console.

If you don't have setuptools installed on your system, there are a few more ways to install the nose package; check the official installation guide.

4. Configuring Eclipse for continuous-testing
First of all, create a new PyDev project: go to File > New > Project, navigate to PyDev and choose "PyDev Project". Click "Next", choose a project name (in our case "tdd-example"), select 2.6 or 2.7 as the Grammar Version (nose supports Python 3 syntax, but many plugins do not) and select the option "Add project directory to the PYTHONPATH?". Finally, click "Finish". You should now see your new tdd-example project in the PyDev Package Explorer view.

Now, go to Run > External Tools > External Tools Configurations. Under Program, create a new configuration. The name could be "NoseTestTool", the working directory will be "${project_loc}" and the location is the path to the nosetests binary, in my case "/usr/local/bin/nosetests".

Right-click on the tdd-example project, choose Properties and navigate to Builders. Click Import and choose the NoseTestTool created previously. Make sure it is the first builder in the list by selecting it and clicking the Up button. Now, click Edit. In the Main tab, add "--processes=4" to the Arguments field, where processes is the number of worker processes that will run your tests in parallel. My machine has 4 cores, so I set processes to 4.
In the Environment tab, create a new variable with name "PYTHONPATH" and value "${project_loc}".
In Build Options tab, check the options "Allocate Console (necessary for input)", "Launch in background", "After a Clean", "During manual builds" and "During auto builds".

To check that everything is configured correctly, create a new Python file in your project, test_all.py, and include the following code:

def test1():
    assert True
    
def test2():
    assert False 

As soon as you save the file, nose should run your tests automatically and show output like this:

.F
======================================================================
FAIL: test_all.test2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/nose-1.1.2-py2.6.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/dev/workspace/python/tdd-example/test_all.py", line 6, in test2
    assert False
AssertionError

----------------------------------------------------------------------
Ran 2 tests in 0.002s

FAILED (failures=1)

By default, nose looks for files and functions whose names start with "test". In our case, two functions are run, test1 and test2, and one of them fails (assert False), as seen in the previous output.
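
To make the naming convention more concrete, here is a small sketch of two test functions written in the style nose collects, together with a toy runner that mimics name-based discovery. The run_tests function is purely illustrative and is not how nose is implemented; it is only there so that the example runs without nose installed. The exception check is done in plain Python (nose also ships a raises decorator in nose.tools for this purpose):

```python
def test_addition():
    assert 1 + 1 == 2

def test_division_by_zero_raises():
    # Plain-Python way of checking that an expected exception is raised.
    try:
        1 / 0
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError('ZeroDivisionError was not raised')

def helper_not_collected():
    # The name does not start with "test", so the toy runner skips it.
    raise RuntimeError('should never run')

def run_tests(namespace):
    """Toy runner: call every function whose name starts with "test"."""
    results = {}
    for name, obj in sorted(namespace.items()):
        if name.startswith('test') and callable(obj):
            try:
                obj()
                results[name] = 'ok'
            except AssertionError:
                results[name] = 'FAIL'
    return results

# Pass a copy of the module namespace so it is not modified while iterating.
results = run_tests(dict(globals()))
print(results)
```

Saving only the test functions in a file whose name starts with "test" should be enough for nose to pick them up with the setup described above.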

I hope this howto will encourage people to practice continuous TDD in Python :)

Thursday, November 17, 2011

Watched at work: "STL with Stephan T. Lavavej"

Lately I've been very busy after work because I had some assignments to submit to the UOC (what I'm doing there), so I haven't posted the links to the last three videos we've watched at work.

We're currently watching a series of talks about the STL published on Microsoft's Channel 9 by Stephan T. Lavavej, a software engineer who develops and maintains Microsoft's version of the STL. I like his talks very much.
These are the ones we've watched so far:
  1. Sequence Containers.
  2. Associative Containers.
  3. Smart Pointers.
There are a lot of videos about C++ in C9.