# cumulativehypotheses

mostly professional blather

## Distribution of Complexity in Hudson

Suppose we were to take methods one by one, at random and without replacement, from the source code of Hudson 2.1.0. How would we expect the Cyclomatic Complexity of those methods to be distributed?

Here you will find some automation to discover the raw numbers, and here is a Mathematica Computable Document (get the free reader here) showing the analysis. If you have been playing along so far, you might expect the distribution of complexity to follow a power law.

Result:

This evidence suggests that the Cyclomatic Complexity per method in this version of Hudson is not distributed according to a discrete power–law distribution (the hypothesis that it is so distributed is rejected at the 5% level).

Probability of Complexity of Methods in Hudson

This chart shows the empirical probability of a given complexity in blue and that from the maximum–likelihood fitted power–law distribution in red. Solid lines show where the fitted distribution underestimates the probability of methods with a certain complexity occurring, dashed lines where it overestimates. As you can see, the fit is not great, especially in the tail.

Note that both scales are logarithmic.
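The reason logarithmic scales are useful here can be sketched as follows. Assume a discrete power law written with a generic exponent $\alpha$ (my notation, not necessarily the parametrisation used in the analysis document) and normalised by the Riemann zeta function:

$$
p(k) = \frac{k^{-\alpha}}{\zeta(\alpha)}, \quad k = 1, 2, 3, \dots
\qquad\Longrightarrow\qquad
\log p(k) = -\alpha \log k - \log \zeta(\alpha)
$$

Under that assumption the fitted distribution is a straight line of slope $-\alpha$ on log–log axes, so any systematic curvature in the empirical points, particularly in the tail, shows up as departure from that line.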

Other long-tailed distributions (e.g. log-normal) can be fitted onto this data, but the hypothesis that they represent the data is rejected at the 5% level.

Written by keithb

August 31, 2011 at 8:23 pm

# What’s the problem with TDD?

TDD is quite a simple process. Beck describes it here in these terms:

### TDD

1. Add a little test
2. Run all tests and fail
3. Make a little change
4. Run the tests and succeed
5. Refactor to remove duplication
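As a rough illustration of how small each step is meant to be, here is a minimal sketch of two passes through the cycle in Python. The shopping-cart problem and the names `total`, `test_empty_cart_totals_zero` and `test_total_adds_prices` are hypothetical, invented for this example; they do not come from Beck.

```python
# Hypothetical example: all names below are illustrative assumptions.

# Cycle 1, "add a little test": written before any production code exists.
def test_empty_cart_totals_zero():
    assert total([]) == 0

# Cycle 1, "run all tests and fail": at this point `total` is undefined,
# so calling the test raises NameError.

# Cycle 2, "add a little test": a second small test that a hard-coded
# `return 0` cannot satisfy.
def test_total_adds_prices():
    assert total([2, 3]) == 5

# Production code as it stands after cycle 2.
# Cycle 1's "little change" was just `return 0`; cycle 2's test forced
# the generalisation to a real sum.
def total(prices):
    return sum(prices)

# "Run the tests and succeed": both tests pass against the final code.
test_empty_cart_totals_zero()
test_total_adds_prices()

# "Refactor to remove duplication": nothing is duplicated yet at this
# size, so the step is a deliberate look rather than a change.
```

The point of the sketch is the ordering: no line of `total` exists until a failing test demands it.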

This turns out to be a hard thing to do. My observation has been that the more experienced and fluent a programmer someone is, the more difficult it is for them to stick to this process. What tends to happen is more like this:

### Pseudo-TDD

1. Think of a solution
2. Imagine a bunch of classes and functions that you just know you’ll need to implement (1)
3. Write some tests that assert the existence of (2)
4. Run all tests and fail
5. Implement a bunch of stuff
6. Run all tests and fail
7. Debug
8. Run the tests and succeed
9. Write a `TODO` saying to go back and refactor some stuff later.

Really good programmers can get away with this, for a bit. But even during that early period I think they are missing a trick. A couple of tricks, in fact.

Firstly, in Pseudo-TDD, steps 1, 2 and 3 can take a long, long time: tens of minutes perhaps, hours, or even days. This is time during which you aren’t running tests, aren’t getting feedback and aren’t learning anything. Step 7 must be assumed to take an amount of time that is unbounded above.

Secondly, in the Pseudo-TDD process the programmer must fall back on some technique for making design decisions and somehow getting them right exactly at the time when they know least about the problem and its solution. In TDD we have the advantages of evolutionary design: we can discover a good-enough design and then incrementally improve it. I think it is really hard for people who know themselves to be good programmers to let go of the design process in this way.

# TDD as if you Meant It

I began to wonder if there was some sort of exercise that folks could do, in safe controlled conditions, whereby they could experience the odd and surprising (and delightful) things that can happen when you really do TDD, as Beck describes it, with the hope that the experience would carry over to their daily work as programmers.

This would be a pair-programming exercise, since it’s often easier to maintain a level of discipline if you know someone is watching who can provide friendly, constructive feedback.

The problem to be solved should be simple enough that decent progress can be made on it during a typical conference “workshop” session (say, 90 minutes to 3 hours) and also have an obvious solution that experienced programmers would want to jump to implementing.

In the first couple of presentations of the session I used a problem from the game of Go. This worked reasonably well but I feel I spent too long explaining the game. Some folks who have picked the exercise up have used tic-tac-toe with good results. I tried that myself at NDC 2011 and was quite pleased with the result. From now on I will use that problem.

Another description of TDD is due to Bob Martin. It goes like this:

1. You are not allowed to write any production code unless it is to make a failing unit test pass.
2. You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
3. You are not allowed to write any more production code than is sufficient to pass the one failing unit test.

I took these rules as a starting point and then tried to produce stronger rules that would force the programmer (pair) to allow the design to evolve. I don’t think I’ve yet landed on the best set of rules, and people report difficulties with various parts of it, but if I were going to do the workshop today these are the rules I would enforce (subject to change and refinement at any time, last updated 3 Sept 2011):

## The Rules

1. Write exactly one new test, the smallest test you can that seems to point in the direction of a solution
2. See it fail
3. Make the test from (1) pass by writing the least implementation code you can in the test method.
4. Refactor to remove duplication, and otherwise as required to improve the design. Be strict about using these moves:
   1. you want a new method: wait until refactoring time, then… create new (non-test) methods by doing one of these, and in no other way:
      1. preferred: do Extract Method on implementation code created as per (3) to create a new method in the test class, or
      2. if you must: move implementation code created as per (3) into an existing implementation method
   2. you want a new class: wait until refactoring time, then… create non-test classes to provide a destination for a Move Method and for no other reason
      1. populate implementation classes with methods by doing Move Method, and no other way

The member of the pair without their hands on the keyboard must be very strict in enforcing these rules, especially 4.1 and 4.2.
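To make the refactoring moves in rule 4 concrete, here is a minimal sketch of three successive snapshots of the same code, using tic-tac-toe. It is one possible reading of the rules, not a model answer: the representation of a player's moves as a set of `(row, column)` pairs and the names `has_full_row`, `RowTests`, `RowTestsAfterMove` and `WinningLines` are all hypothetical. In a real session each snapshot would replace the previous one rather than sit beside it.

```python
# Hypothetical tic-tac-toe example; all names here are illustrative assumptions.
# A player's moves are modelled as a set of (row, column) pairs.

# Snapshot 1 (rule 3): the least implementation code, written inside the
# test methods themselves. The same expression appears twice.
def test_taking_a_whole_row_wins():
    moves = {(0, 0), (0, 1), (0, 2)}
    assert any(all((r, c) in moves for c in range(3)) for r in range(3))


def test_a_broken_row_does_not_win():
    moves = {(0, 0), (1, 1), (0, 2)}
    assert not any(all((r, c) in moves for c in range(3)) for r in range(3))


# Snapshot 2 (rule 4.1.1): the duplication is removed with Extract Method,
# and the new method stays in the test class for now.
class RowTests:
    def has_full_row(self, moves):
        return any(all((r, c) in moves for c in range(3)) for r in range(3))

    def test_taking_a_whole_row_wins(self):
        assert self.has_full_row({(0, 0), (0, 1), (0, 2)})

    def test_a_broken_row_does_not_win(self):
        assert not self.has_full_row({(0, 0), (1, 1), (0, 2)})


# Snapshot 3 (rules 4.2 and 4.2.1): a non-test class is created only as the
# destination of Move Method, and receives its methods only that way.
class WinningLines:
    def has_full_row(self, moves):
        return any(all((r, c) in moves for c in range(3)) for r in range(3))


class RowTestsAfterMove:
    def test_taking_a_whole_row_wins(self):
        assert WinningLines().has_full_row({(0, 0), (0, 1), (0, 2)})

    def test_a_broken_row_does_not_win(self):
        assert not WinningLines().has_full_row({(0, 0), (1, 1), (0, 2)})
```

Note that at no point is a class or method written speculatively: `WinningLines` exists only because an already-tested method needed somewhere to live.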

After some respectable time coding, contrast and compare solutions. Consider the classes created. How many? How big? What mutable state? Consider the methods created. How many? How long? Apply a few simple design metrics. How was the experience of working this way different from the usual? How could these ideas be applied in your day job?

# Experiences with the Workshop

If you haven’t tried the workshop yet, and would like to, you might want to stop reading now so that you don’t lose the “a-ha!”. That is to say: spoiler alert!

## Mine

This is a tough exercise for experienced programmers and doubly so for experienced TDD practitioners. I observe that pairs including folks who are not full-time programmers (BAs, testers, managers even) do much better.

I’ve come to recognise the point about 5 to 10 minutes after the start of the exercise proper where everyone quietens down and seems to be making progress. At this point I stop the exercise and ask who has (in the case of tic-tac-toe, say) created a class called something like `Board` with something like a 3×3 array of `int`s in it (or even better, of an `enum` with members like `BLANK` and `X` and `O`) and no tests for it. After a bit of cajoling it always turns out that several pairs have. Because they “know” they will “need” it. Or because without that class they “can’t write any tests”.

At this point the facilitator needs to be strong and force them to delete that code and start again. Yes, really.

It’s extraordinarily hard for some pairs to get going. Often it’s the ones who have just had their code deleted. They will just sit and stare at an empty editor window. This is the crucial learning step. If they are not allowed to write tests about the solution that doesn’t exist yet, what are they allowed to write tests about? There is only the problem. Here is where we start to see the connection between TDD, BDD (in so far as they are not identical—hint: they are identical) and DDD and eDSLs.

There is always a startling variety of solutions.

Some pairs can implement pretty much a whole tic-tac-toe game playing program without a class remotely like a `Board`.

## Others

At least these people have kindly written up their experiences of doing the exercise:

Those are just the ones I know about. I’d love to hear about more.

Written by keithb

August 30, 2011 at 2:31 pm

Posted in conference, TDD

## Distribution of Complexity in jMock

Suppose we were to take methods one by one, at random and without replacement, from the source code of jMock 2. How would we expect the Cyclomatic Complexity of those methods to be distributed?

Here you will find some automation to discover the raw numbers, and here is a Mathematica Computable Document (get the free reader here) showing the analysis.

Result:

This evidence suggests that the Cyclomatic Complexity per method in this version of jMock is distributed according to a discrete power–law distribution with shape parameter ρ ≈ 1.92.

Probability of Complexity of Methods in jMock 2

This chart shows the empirical probability of a given complexity in blue and that from the maximum–likelihood fitted power–law distribution in red. Solid lines show where the fitted distribution underestimates the probability of methods with a certain complexity occurring, dashed lines where it overestimates.

Note that both scales are logarithmic.

Other long-tailed distributions (e.g. log-normal) can be fitted onto this data, but the hypothesis that they represent the data is rejected at the 5% level.
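For readers without Mathematica, a maximum-likelihood fit of a discrete power law can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the analysis behind the result above: it assumes the probability of complexity k is proportional to k^(-alpha) for k = 1, 2, 3, …, it uses a hand-rolled approximation to the zeta function, and the function names and the synthetic counts are invented for illustration. The exponent `alpha` used here need not be the same quantity as the shape parameter ρ quoted above, since that depends on how the fitted distribution is parametrised.

```python
import math


def zeta(alpha, terms=10_000):
    """Approximate the Riemann zeta function for alpha > 1.

    Sums the first `terms` terms directly and estimates the remaining
    tail with an integral (a midpoint-style correction).
    """
    partial = sum(k ** -alpha for k in range(1, terms + 1))
    tail = (terms + 0.5) ** (1.0 - alpha) / (alpha - 1.0)
    return partial + tail


def log_likelihood(alpha, counts):
    """Log-likelihood of counts {complexity: number of methods}
    under p(k) = k**-alpha / zeta(alpha)."""
    n = sum(counts.values())
    sum_log_k = sum(c * math.log(k) for k, c in counts.items())
    return -alpha * sum_log_k - n * math.log(zeta(alpha))


def fit_alpha(counts, lo=1.05, hi=6.0, tol=1e-4):
    """Maximise the log-likelihood by ternary search.

    This works because the log-likelihood is concave in alpha.
    The search bounds are arbitrary assumptions.
    """
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, counts) < log_likelihood(m2, counts):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# SYNTHETIC data, not measurements from jMock or Hudson: counts built to
# be proportional to k**-2, so the fitted exponent should come out near 2.
sample_counts = {k: round(1_000_000 * k ** -2.0) for k in range(1, 1001)}
alpha_hat = fit_alpha(sample_counts)
print(f"fitted exponent: {alpha_hat:.3f}")
```

A fit like this only finds the best exponent; deciding whether the power-law hypothesis should be rejected needs a separate goodness-of-fit test, which this sketch does not attempt.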

Written by keithb

August 17, 2011 at 2:01 pm