Archive for July 2011
Distribution of Complexity in JUnit
Suppose we were to take methods one by one, at random and without replacement, from the source code of JUnit. How would we expect the Cyclomatic Complexity of those methods to be distributed?
Here you will find some automation to discover the raw numbers, and here is a Mathematica Computable Document (get the free reader here) showing the analysis.
Result:
This evidence suggests that the Cyclomatic Complexity per method in this version of JUnit is distributed according to a discrete power–law distribution with shape parameter ρ ≈ 1.43.
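For readers without Mathematica, the kind of fit behind that number can be sketched in a few lines of Python. This is a minimal illustration, not the analysis in the Computable Document. It assumes the zeta form p(k) = k^(-alpha) / zeta(alpha) for k = 1, 2, 3, ..., and the names `zeta`, `log_likelihood` and `fit_alpha`, as well as the sample data, are made up for the example. The parameterization used for ρ above is not stated here; if the fit was done with Mathematica's ZipfDistribution, whose exponent is ρ + 1, then alpha in this sketch would correspond to ρ + 1 rather than to ρ.

```python
import math


def zeta(s, terms=1000):
    """Approximate the Riemann zeta function for s > 1.

    Sums the first terms-1 terms directly and estimates the rest of the
    series with the leading Euler-Maclaurin correction terms.
    """
    head = sum(k ** -s for k in range(1, terms))
    tail = (terms ** (1 - s) / (s - 1)
            + 0.5 * terms ** -s
            + s * terms ** (-s - 1) / 12)
    return head + tail


def log_likelihood(alpha, n, sum_log):
    """Log-likelihood of n observations under p(k) = k**-alpha / zeta(alpha).

    sum_log is the sum of log(k) over all observations.
    """
    return -alpha * sum_log - n * math.log(zeta(alpha))


def fit_alpha(complexities, lo=1.01, hi=10.0, tol=1e-6):
    """Maximum-likelihood estimate of alpha by ternary search.

    The log-likelihood is concave in alpha, so narrowing the bracket
    [lo, hi] around the larger of two interior points finds the maximum.
    """
    n = len(complexities)
    sum_log = sum(math.log(k) for k in complexities)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, n, sum_log) < log_likelihood(m2, n, sum_log):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Hypothetical per-method complexities, NOT the JUnit measurements.
sample = [1] * 70 + [2] * 14 + [3] * 6 + [4] * 4 + [5] * 2 + [8] * 2 + [13, 21]
print(f"fitted exponent: {fit_alpha(sample):.2f}")
```

Feeding in the real per-method complexities in place of `sample` would give an estimate to compare against the one reported above, bearing the caveat about parameterization in mind.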
This chart shows the empirical probability of a given complexity in blue and that from the maximum–likelihood fitted power–law distribution in red. Solid lines show where the fitted distribution underestimates the probability of methods with a certain complexity occurring, dashed lines where it overestimates.
Note that both scales are logarithmic.
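The comparison drawn in the chart can be reproduced as a table. The sketch below is only an illustration: it assumes a fitted distribution of the form p(k) proportional to k^(-alpha), and the function names and the sample data are hypothetical. For each observed complexity it reports the empirical probability, the fitted probability, and whether the fit underestimates or overestimates it (the solid and dashed lines in the chart, respectively).

```python
from collections import Counter


def zeta_normaliser(alpha, terms=10_000):
    """Approximate sum of k**-alpha over k >= 1, for alpha > 1.

    Direct sum of the first terms-1 terms plus an integral estimate
    of the remaining tail.
    """
    head = sum(k ** -alpha for k in range(1, terms))
    return head + terms ** (1 - alpha) / (alpha - 1)


def compare(complexities, alpha):
    """Return (k, empirical, fitted, verdict) for each observed complexity k."""
    counts = Counter(complexities)
    n = len(complexities)
    norm = zeta_normaliser(alpha)
    rows = []
    for k in sorted(counts):
        empirical = counts[k] / n
        fitted = k ** -alpha / norm
        verdict = "underestimates" if fitted < empirical else "overestimates"
        rows.append((k, empirical, fitted, verdict))
    return rows


# Hypothetical data and exponent, NOT the JUnit measurements.
for k, empirical, fitted, verdict in compare([1] * 8 + [2] + [10], alpha=2.43):
    print(f"complexity {k}: empirical {empirical:.3f}, fitted {fitted:.4f}, fit {verdict}")
```

Plotting the two probability columns against k on logarithmic axes would give a chart of the same kind as the one above.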
Other long-tailed distributions (e.g. log-normal) can be fitted to this data, but the hypothesis that they describe the data is rejected at the 5% level.
What this blog is for
Over the years I have been doing some rather informal research into the relationship between Test-Driven Development and certain measurable properties of the resulting code. That older work is discussed on my Blogger blog. I want to put that research onto a firmer foundation and to do so in the style of “open science”.
I welcome your comments, criticisms and even contributions.
Thanks to Steve Freeman for help with the name.