TDD: giving in to get along
I like TDD. But if you’re reading this blog, you probably knew that already.
“coding like a bastard” considered harmful
I can remember a time when I thought, because I’d been taught so and it seemed to make sense, that software—programs—were designed on paper, using ▭s and →s of various kinds and, in several of my earlier jobs, more than a few ∀s and ∃s, and that the goodness of a design was determined by printing it out in a beautifully formatted document† and having an older, wiser, better—well, older and therefore presumably wiser, and more senior so presumably better, anyway—designer write comments and suggestions all over it in red pen during a series of grotesquely painful “review” meetings, and then trying to fix it up until the older etc. designer was happy with it. After which came an activity known at the time as “coding like a bastard”, after which came the agony of integration, after which came the dismaying emotional wasteland of the “test and debug” activity, which took a duration essentially unbounded above, even in principle.
Hmm, now I come to write it down like that, it seems like that was a colossally idiotic way to proceed.
There were some guidelines about what made a good design. There were the design patterns. There were Parnas’s papers, such as On the Criteria… and there were all these textbook ideas about various kinds of coupling and cohesion and…ah, yes, the textbooks. Well, there was Pressman’s, and there was Sommerville’s, and some more specialist volumes, and some also-rans. When I returned to university after a fairly hair-raising time in my first job as a programmer, wanting to learn how to do this software thing properly, we used whatever edition of Sommerville was current at the time—it’s now in its 9th—as our main textbook for the “software engineering” component: project management, planning, risk, that sort of thing.
So, it’s a bit…startling, we might say, to see Ian’s writeup of his experiment with TDD dealt with by Uncle Bob in quite such…robust, we might say, terms. Startling even for someone as…forthright, we might say, as I usually am myself.
As described, doing TDD wrongly
Thing is, though, Ian is, as described, doing TDD wrongly. And the disappointments that he reports with it are those commonly experienced by… by… by people who are very confident—rightly or wrongly and in Ian’s case, probably rightly—in their ability to design software well. I used to be very confident—perhaps wrongly, but I don’t think so—of my ability to design software. I mean to say, I could produce systems in C++ which worked at all—mid 1990s C++, at that—and this is no mean feat.
Interestingly, at the time I first heard about TDD by reading and then conversing with Kent and Ron and those guys on the wiki—C2, I mean, the wiki—I was already firmly convinced of the benefits of comprehensive automated unit testing, having been made to do that by a previous boss—who had himself learned it long before that—but of course we wrote the tests after we wrote the code, or, to be more honest about it, while debugging. And, yes, even with that experience behind me, I thought that TDD sounded just crazy. Because to someone used to the ▭s and →s, and to design as an activity that goes on away from a keyboard—and especially to someone who does that well, or believes that they do—it does sound crazy.
And so a lot of the objections to TDD that Ian makes in his blog post seem eerily familiar to me. And not only because I’ve heard them often from others since I started embracing TDD.
Thorough, if unnecessarily harsh
Well, anyway, Bob’s critique of what Ian reports is pretty thorough, if unnecessarily harshly worded in places, but there are a few observations that I’d add.
Ian says:
Test-first or test-driven development (TDD) is an approach to software development where you write the tests before you write the program.
Apart from the fact that writing tests first is merely necessary, but very much not sufficient, for doing TDD, so far so good.
You write a program to pass the test, extend the test or add further tests and then extend the functionality of the program to pass these tests.
Mmmm.
You build up a set of tests over time that you can run automatically every time you make program changes.
That does happen, yes…
The aim is to ensure that, at all times, the program is operational and passes all the tests.
Yep. I especially like the distinction between merely passing all the tests and also being operational.
You refactor your program periodically to improve its structure and make it easier to read and change.
and, sadly, this last sentence misses a key practice of TDD and largely invalidates what comes before. That practice: you refactor your code, with maniacal determination, as often as after every green bar.
Every. Green. Bar.
Technically, we could claim to be doing something “periodically” if we did it every 29th of February or every millisecond, but I think that to say we do something “periodically” points to a lower frequency. But in TDD we should be refactoring often. Very often. Many times an hour. To really do TDD requires that we spend quite a large proportion of all the time invested in any given programming exercise on refactoring. So, Ian has kind of fallen at the first hurdle, because he’s not really doing TDD right in the first instance.
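To make that rhythm concrete, here is a minimal sketch of a single red/green/refactor beat. All the names are invented for illustration; the point is that the refactoring step belongs inside every beat, not in some separate “periodic” clean-up.

```python
# A single beat of the TDD rhythm, shown as three snapshots.
# All names here are invented, purely for illustration.

# RED: write a test that fails because the code doesn't exist yet.
def test_total_price():
    assert total_price([("tea", 2), ("scone", 3)]) == 5

# GREEN: the least code that makes the test pass.
def total_price(items):
    return sum(price for _name, price in items)

# REFACTOR: on the green bar, improve the structure -- here, name the
# concept the test revealed -- while the test keeps passing.
def total_price(items):
    return sum(line_total(item) for item in items)

def line_total(item):
    _name, price = item
    return price

test_total_price()  # still green after the refactoring
```

Then the next failing test starts the next beat, and the refactoring step comes round again within minutes, not weeks.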
Now, it used to be a frequent complaint about TDD advocates that we sounded like Communists: it was claimed that we would immediately respond to anyone who said that “I tried TDD and it doesn’t work” by claiming that they weren’t even doing TDD, really, in the same way that fans of Communism would contend that it had never really been tried properly so, hey, it might work, you don’t know.
Not a useful response
The thing is, though, a lot of people who dismiss TDD really haven’t tried it properly—and a lot who say that they do TDD aren’t doing it right either and are missing some benefits, but that’s another story—so of course they didn’t get the advertised effect. And by now we have lots of examples of people who really have tried TDD properly and the interesting and positive results they’ve obtained. Ian did not try doing TDD properly.
And then since that wasn’t going so well, he stopped even trying to:
[…] as I started implementing a GUI, the tests got harder to write and I didn’t think that the time spent on writing these tests was worthwhile.
Well, yes, we know that writing automated tests for GUIs is 1) hard and 2) relatively low value. But this:
So, I became less rigid (impure, perhaps) in my approach, so that I didn’t have automated tests for everything and sometimes implemented things before writing tests.
is not a useful response.
One useful response is to use something like MVC, or MVP, or ports-and-adaptors, or one of the many other ways to make the GUI very, very thin, do automated tests behind that, and test the actual GUI by hand. But from this point on Ian has basically invalidated his own exercise in TDD: although he wasn’t really doing it to begin with, he was at least trying; then it turned out to be tough, and so he stopped trying. And also stopped learning. Which is a missed opportunity for him, and also for the rest of us. I encourage Ian to try again, maybe with some coaching, and see how that goes, because I would be genuinely interested to see how a seasoned software engineering academic gets on with that.
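As a sketch of what “very, very thin” can look like, here is a hypothetical presenter in Python. Everything here is invented for illustration: all the decisions live in the presenter, the view is reduced to a trivial interface, and the automated test drives the presenter through a fake view without ever touching a real widget.

```python
# Hypothetical example of a very thin GUI. The presenter owns all the
# logic; the "view" is just anything with a show_total method, so in
# tests we can pass in a trivial fake instead of real GUI widgets.

class FakeView:
    def __init__(self):
        self.displayed = None

    def show_total(self, text):
        self.displayed = text

class CheckoutPresenter:
    def __init__(self, view):
        self.view = view
        self.prices = []

    def add_price(self, pence):
        self.prices.append(pence)
        self.view.show_total(f"Total: {sum(self.prices)}p")

# The automated test exercises the presenter through the fake view; only
# the real view class, which wraps actual widgets, is checked by hand.
view = FakeView()
presenter = CheckoutPresenter(view)
presenter.add_price(250)
presenter.add_price(99)
assert view.displayed == "Total: 349p"
```

The real widget-wrapping view stays so thin that testing it by hand is cheap, which is the whole trick.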
Not your daddy’s COBOL compiler
Ian says:
Think-first rather than test-first is the way to go.
Well…
he also says:
I started programming at a time where computer time was limited and you had to spend time looking at and thinking about the program as a whole.
Yes. There’s a whole hour-long presentation that I have about this, but—the microeconomics of programming have changed in quite a fundamental way over the last few decades. Even since I started working.
In my second job as a programmer I worked on a product written in C++ where, no joke, a full build was something you started on Friday lunchtime and went down the pub, hoping that it would be finished by the time you strolled in late on Monday morning. Even incremental builds on just the sub-system I was working on took “go have a cup of tea” amounts of time. Running our comprehensive automated unit test suite (written post hoc, as described above) took “go have lunch” amounts of time.
The time period that Ian is talking about was much worse even than that. In that era the rare and expensive resource was machine cycles, and they needed to be dedicated to doing the useful, revenue-earning thing. Programmer thinking time was, relatively, cheap and abundant, so the mode of working tended to use lots of that to avoid wasting machine cycles on code that was not strongly expected to be correct.
If you wanted to work the way we do now—for example, with approximately one computer per programmer—you had to be, say, NASA, and you had to have, say, basically unlimited resources because your project was, say, considered to be a matter of national survival. But for most programmers, their employer could not afford that. The entire organisation might have as few as one computer. Maybe one to a department.
The whole edifice of traditional software engineering can be seen as a perfectly reasonable attempt to deal with the constraint that you can’t afford to have a programmer use machine cycles to do programming with. So you needed to find ways to write programs away from a computer. That’s what the ▭s and →s were trying to do. The people who came up with that stuff meant well, but ended up creating that world of colossally idiotic ways to proceed.
I was once sent on a COBOL programming course—it’s a long and dreary story—and on this course we worked within a simulation of those bad old good old days: programs were designed using what I later realised was Jackson Structured Programming, written out in pencil on pre-printed 80-column coding sheets, desk-checked, and then typed into a COBOL development system. One PC for a class of about 20 students—before which we formed a queue—and we each only had three goes at the compiler. If it took more than three compile/test/debug episodes to get your program running you failed the course.
Today, we are awash with machine cycles. I have many billions of them available to me here right now every second and all I’m using them for is writing this blog post. John von Neumann* must be spinning in his grave.
Don’t play dumb
If I were programming right now, rather than doing this, then I could use those billions of cycles to get prompt, concrete feedback from a large body of tests and from other tools about my current position in a long series of small design decisions.
Rather than thinking in big, speculative lumps I could think in tiny, tiny increments—always with the ever-important continual, frequent and determined refactoring.
There is a failure mode, though. Ian says:
[…] with TDD, you dive into the detail in different parts of the program and rarely step back and look at the big picture.
Don’t do that.
I don’t think that there’s anything in TDD that says not to step back and look at the big picture. There’s nothing that says to do that, it’s true, but why wouldn’t you? It’s disappointing to see a retired Professor of Software Engineering playing dumb like this—if he feels the need to step back and look at the big picture then he should. He shouldn’t refrain from doing that merely because he’s making an attempt to try out a technique that doesn’t say to do it. I mean, really!
Mighty thinking is not the winning strategy
Added to which, I don’t recall anyone ever saying that TDD is the only design technique—and it is a design technique—that anyone needs to use at any scale to produce a good system. What is said, by me for one, is that by using TDD to guide design thinking and, most importantly, to make it quick, easy, cheap and safe to explore different design options, we can get to better results sooner and more reliably than we can by mighty thinking, which was previously the only economically viable method.
I understand that this can be discomforting to those whose thoughts tend to the mighty. It’s almost as if, in contemporary** software development, mighty thinking has turned out not to be the winning strategy, long term.
Neither for individuals in their careers nor for their employers, nor for their industry. It might be time to come to terms with that. And for a certain kind of very smart, very capable, very confident designer of programs that means letting go. Letting go of the code, of the design, letting go of a certain sense of control and gaining in return a safe way to explore design options that you were too smart to think up yourself.
And that’s not easy.
† we had to use professional-quality document preparation systems to do that, because of all the ▭s and →s and ∀s and ∃s. Which was fun.
* He’s supposed to have responded to a demo of some tools written by a programmer to make programming easier by saying that “it is a waste of a valuable scientific computing instrument to use it to do clerical work”
** that is, since about 2006…
“as I started implementing a GUI, the tests got harder to write and I didn’t think that the time spent on writing these tests was worthwhile.”
In my opinion, this is the point at which TDD *starts*.
Until the tests are difficult to write, I’m basically writing, in the form of tests, a predetermined design, or a design that has been imposed a priori by the environment I’m working in (if it has to be a web app, I have to handle HTTP requests, for example).
When the next test is difficult to write, I now have to let the design be *driven* by the process of writing automated tests: refactor until that test is easy to write. This is usually not easy, but once achieved, development returns to a smooth test/implement/refactor cycle within the new design.
(Writing this brings to mind Kuhn’s model of scientific progress. Small difficulties writing tests are coped with until a large difficulty forces a paradigm shift into a new way of looking at things, which solves the old difficulties but inevitably introduces new small difficulties of its own).
Nat Pryce
March 21, 2016 at 9:42 am
In general I agree.
I say to people that while it’s true that “when the bar is green the code is clean” (does anyone still say that?), we are only really making progress on the next go around the loop, so “when the bar is red we’re forging ahead”, and that’s part of why we need to get back to a red bar ASAP.
However, I think the particular difficulties that come with trying to write automated tests for a GUI—which difficulties I believe are accidental, not essential—aren’t a good reference point for that. Or anything else.
keithb
March 21, 2016 at 9:56 am
Hello, Keith!
If I propose a new design technique to you, and give you 400 samples of large code bases designed using that technique, and you examine those samples and find that they are excellently designed software, then you might think that my design technique is possibly a good one.
If I only give you 10 samples, then you might still think the technique a good one, though you may feel somewhat less convinced.
If I give you no samples, then you will clearly be far less convinced: you’re essentially being asked to believe based on faith and not evidence, which is seldom wise.
If I give you two code bases and they are poorly designed, then I hope you would be highly skeptical of my design technique.
This is where I find myself with TDD.
On Ian’s blog, I asked the responding TDD community to present evidence of TDD’s benefit by showing large (or just non-trivial) open source Java written using TDD, which I then analyzed using JDepend. (I had intended JDepend to be just a first step, but it turned out to be sufficient.)
It was suggested I look at JUnit, and Bob Martin himself – HIMSELF! – suggested his own program, FitNesse.
The latter is interesting because I always use Uncle Bob’s own definition of bad design to judge software (preliminarily), that is: “A piece of software that fulfills its requirements and yet exhibits any or all of the following three traits has a bad design. 1. It is hard to change because every change affects too many other parts of the system. 2. When you make a change, unexpected parts of the system break. 3. It is hard to reuse in another application because it cannot be disentangled from the current application.”
JUnit turned out to be a very badly designed piece of software because its package dependencies were highly entangled, thus making it very hard to change any part of the system without other parts of the system being affected. Have a look for yourself: JDepend shows clearly how terribly interconnected the code is.
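To be concrete about what “entangled” means here: the package-level dependency graph contains cycles, which is exactly what JDepend’s cycle check reports. A toy sketch of that check, with invented package names, and in Python rather than Java purely for brevity:

```python
# Toy illustration of the cycle check that a tool like JDepend performs
# on package dependencies. The package names below are invented.

deps = {
    "app.ui":    ["app.core"],
    "app.core":  ["app.model"],
    "app.model": ["app.core"],   # core <-> model: a tangle
}

def has_cycle(graph):
    # Depth-first search: reaching a package that is already on the
    # current path means the dependency graph contains a cycle.
    visiting, done = set(), set()

    def visit(node):
        if node in visiting:
            return True
        if node in done:
            return False
        visiting.add(node)
        if any(visit(n) for n in graph.get(node, [])):
            return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(visit(n) for n in list(graph))

assert has_cycle(deps)                       # tangled
assert not has_cycle({"a": ["b"], "b": []})  # clean
```

JDepend does the real version of this over compiled Java packages; a code base where that check fires all over the place is one where changes in any package can ripple into any other.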
More shockingly, FitNesse itself was also very poorly designed, for the exact same reason. Its dependencies are awful. They are inexcusably bad. Tangled software like this would never, never, never get to trunk in our shop. Yet Uncle Bob offers this up as the good design achievable by TDD.
This was dreadfully disappointing.
So, Keith, I’m looking to you for some hope. Do you have any (non-trivial) open source, well-crafted Java designed using TDD? And that YOU think is good design because you’ve examined it, not presumed it’s good design because it’s written by an acknowledged TDD expert?
And, honestly, why doesn’t the TDD community have 400 such samples for us to look at?
(Note I was also pointed towards Spring – which turned out to be MUCH bigger than I had expected – but it, too, is poorly designed – though I had to go a little beyond JDepend to uncover the nastiness.)
Or am I wrong? Do you think FitNesse is well designed? Would you be happy giving it to your students as an example of the excellence TDD produces? Could you please take a peek at it, if for nothing else than to see what is being proposed as good design built using TDD?
I’m sorry if I sound belligerent; I’m just frustrated. I hear so, so, so much about how I should be using TDD, yet I feel I’m being asked to take it all on faith, and even counter to obvious evidence. Yet so many intelligent people – such as your good self – surely cannot all be wrong.
I’m left with a gross disconnect I cannot explain.
Help!
Monica
May 15, 2016 at 6:23 am
Regarding package dependencies as a measure of code quality, & speaking as someone who likes non-cyclical package dependencies: an important question to ask is ‘why is lack of tangling in package dependencies an important measure of code quality to you?’
lambdaclef
April 16, 2017 at 9:48 am