cumulativehypotheses

mostly professional blather

Is Testing “Waste”?

That is, in the technical sense used in Lean manufacturing, whose first two principles include:

  1. Specify value from the standpoint of the end customer by product family.
  2. Identify all the steps in the value stream for each product family, eliminating whenever possible those steps that do not create value.

The “steps that do not create value” are waste. If our product is, or contains a lot of, software, is the action of testing that software waste, that is, not creating value from the standpoint of the end customer?

At the time of writing I am choosing the carpet tiles for our new office. On the back of the sample book is a list of 11 test results for the carpet relating to various ISO, EN and BS standards, e.g. the EN 986* dimensional stability of these carpet tiles is < 0.2%—good to know! There are also the marks of Cradle to Cradle certification, GUT registration, BREEAM registration, a few other exotica and a CE mark. Why would the manufacturer go to all this trouble? Partly because of regulation: an office fitter would baulk at using carpet that did not meet certain mandatory standards. And partly because customers like to see a certain amount of testing.

Take a look around your home or office; I’ll bet you have a lot of small electrical items of various kinds. Low-voltage power supplies, in particular. Take a look at them. You will find on some the mark of Underwriters Laboratories, which indicates that the manufacturer has voluntarily had the product tested by UL for safety, and maybe for other things. If you’re in the habit of taking things apart, or building things, you might also be familiar with UL’s “recognised component” mark for parts of products. On British-made goods you might see the venerable British Standards Institution “Kite Mark”, or maybe on Canadian gear the CSA mark, on German kit one of the TÜV marks, and so on. These certifications are for the most part voluntary. Manufacturers will not be sanctioned for not obtaining these marks for their products, nor will—other than in some quite specialised cases†—anyone be sanctioned for buying a product which does not bear these marks.

Sometimes a manufacturer will obtain many marks for a product, and sometimes fewer, and sometimes none. I invite you to do a little survey of the electrical items in your office or home: how many marks does each one have? Do you notice a pattern?

I’ll bet that the more high-end a device—in the case of power supplies, the more high-end what they drive—the more marks the device will bear, and the more prestigious those marks will be. Cheaper gear will have fewer, less prestigious marks—ones that make you say “uh?!”††, and the very cheapest will have none.

If testing is waste, why do manufacturers do this?

How does your answer translate to software development?


* BS EN 986:1995—Textile floor coverings. Tiles. Determination of dimensional changes due to the effects of varied water and heat conditions and distortion out of plane.

† tanks, missiles, land-mines, leg-irons, electric cattle-prods, that sort of thing.

†† There are persistent rumours that some Chinese manufacturers of questionable business ethics have concocted a mark of their own which looks from a distance like the CE mark.

Written by keithb

November 4, 2015 at 8:41 pm

Posted in Uncategorized

Mocks

Well, this feels like a conversation from a long time ago. This presentation got tweeted about; it asserts that

Mocks kill TDD. [sic]

which seems bold. And also that

TDD = Design Methodology

which seems wrong. And also that

Test-first encourages you to design code well enough to test…and no further

which seems to have profoundly misunderstood TDD.

TDD

Just so we can all agree what we’re talking about, I think that TDD works like this:
repeat until done:

  • write a little test, reflecting the next thing that your code needs to do, but doesn’t yet
  • see it fail
  • make all tests—including the new one—pass, as quickly and easily as possible
  • refactor your working code to produce an improved design

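By way of a concrete illustration, here is one turn around that loop sketched in Java with JUnit 4. The Basket example and all its names are my invention, not anything from the presentation under discussion:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Step 1: write a little test for the next thing the code needs to do but doesn't yet.
public class BasketTest {
    @Test
    public void totalOfTwoLineItemsIsTheirSum() {
        Basket basket = new Basket(); // before Basket existed this didn't even compile:
        basket.add(100);              // that failure was step 2, seeing the test fail
        basket.add(250);
        assertEquals(350, basket.total());
    }
}

// Step 3: make all tests pass as quickly and easily as possible.
class Basket {
    private int runningTotal = 0;

    void add(int pencePrice) {
        runningTotal += pencePrice;
    }

    int total() {
        return runningTotal;
    }
}

// Step 4: refactor the now-working code towards a better design, then go round again.
```
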
I don’t see that as being a design methodology. It’s a small-scale process for making rapid progress towards done while knowing that you’ve not broken anything that was working, and which contains a publicly stated commitment to creating and maintaining a good design. There’s nothing there about what makes a good design—although TDD typically comes with guidance about well designed code being simple, well designed code lacking duplication and—often overlooked, this—well designed code being easy to change. I also often suggest that if the next test turns out to be hard to write, you should probably do some more refactoring.

Note that in TDD we don’t—or shouldn’t—test a design, that is, we shouldn’t come up with a design and then test for it. Instead we discover a design through writing tests. TDD doesn’t design for you, but it does give you a set of behaviours within which to do design. And I’m pretty sure that when followed strictly, TDD leads to designs that have measurably different properties than designs arrived at other ways. Which is why this blog existed in the first place (yes, I have been a bit lax about that stuff recently). UPDATE: a commentator on lobste.rs (no, me neither) quotes me saying that “TDD doesn’t design for you, but it does give you a set of behaviours within which to do design.” and asks: how is TDD not a design methodology, then?! And I answer: because it doesn’t provide a vocabulary of terms with which to talk about design, it doesn’t provide a goal for design, it doesn’t provide any criteria by which a design could be assessed, it doesn’t provide any guidance for doing design beyond this—do some, do it a little bit at a time, do it by improving the design of already working code. If that looks like a methodology to you, then OK.

But Ken does have a substantive objection to code that he’s seen written with mocks. Code which has tests like this:

A Terrible Test which Happens to use Mocks

and I certainly agree that this is a terrible test. There are far too many mocks in it, and their expectations are far too complex and far too specific. Worst of all, the expectations refer to other mocks. This is terrible stuff. You can’t tell what the hell’s going on, and this test will be extraordinarily brittle because it reaches far too far out into the system. It probably has a net negative value to the programmers who wrote it. That’s bad. Don’t do that.

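I can’t reproduce the test from the presentation here, but a made-up miniature in the same unhappy style, sketched with Mockito (every type and name below is invented for illustration), might look like the following. Note the mocks stubbed to return other mocks:

```java
import static org.mockito.Mockito.*;
import org.junit.Test;

// Invented types, just enough to compile.
interface Address { String country(); }
interface Customer { Address address(); }
interface Order { Customer customer(); }
interface OrderRepository { Order find(long id); }

class OrderProcessor {
    private final OrderRepository repository;
    OrderProcessor(OrderRepository repository) { this.repository = repository; }
    void process(long id) {
        // walks a whole object graph: a design smell which the test below faithfully mirrors
        String country = repository.find(id).customer().address().country();
        // ...pretend something country-specific happens here...
    }
}

public class OrderProcessorTest {
    @Test
    public void processesAnOrder() {
        OrderRepository repository = mock(OrderRepository.class);
        Order order = mock(Order.class);
        Customer customer = mock(Customer.class);
        Address address = mock(Address.class);

        when(repository.find(42L)).thenReturn(order);  // a mock returning a mock...
        when(order.customer()).thenReturn(customer);   // ...which returns another mock...
        when(customer.address()).thenReturn(address);  // ...reaching far out into the system
        when(address.country()).thenReturn("UK");

        new OrderProcessor(repository).process(42L);

        verify(order).customer();    // over-specific: any internal reshuffle breaks this
        verify(customer).address();  // test, even one that changes no observable behaviour
    }
}
```
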
Is this the fault of mocks? Not really. The code under test here wouldn’t be much different, I’ll bet, if it hadn’t been TDD’d—if this code even was TDD’d; I have my doubts, although people do do this sort of thing, I know. This confusing, brittle, unhelpful test has been written with mocks, but not because of mocks. One could speculate that it was written by someone who’d got far, far too carried away with the things that mock frameworks can do, and failed to apply good taste, common sense and any kind of design sensibility to what they were doing. Is that the fault of mocks? Not really. Show me a tool that can’t be abused and I’ll show you a tool that isn’t worth having.

Other Styles of Programming

Ken, of course, has an agenda, which is really to promote a functional style of programming in which mock objects are not much help in writing the tests. I think he’s right about that and it should be no surprise as mocks are about writing tests that have something to say about what method invocations happen in what order, and as you move towards a functional style that becomes less and less of a concern. So maybe Ken’s issue with mocks is that they don’t stop you from writing non-functional code—to which I say: that doesn’t mean that you have to.

If you can move to functional programming (spoiler: not everyone can) and if your problem is one that is best solved through a functional solution (spoiler: not all of them are), then off you go, and mocks will not be a big part of your world and fair enough and more power to you. But if not…

Now, I tweeted to this effect and that got Ron wondering about that kind of variation, and why it might be that Smalltalk programmers don’t use mocks when doing TDD. Ron kind-of conflates what he calls the “Detroit School” of TDD and “doing TDD in Smalltalk”, which is kind-of fair enough as Kent and he and the others developed their thinking about TDD in Smalltalk and that’s the style of TDD that was first widely discussed on the wiki and spread from there.

Ron says that he does use “test doubles” for:

“slow” operations, and operations across an interface to software that I don’t have control of

and of course mocks are very handy in those cases. But that’s not what they’re for. Ron says:

Perhaps our system relies on a slow operation, such as a database access […] When we TDD such a thing, we will often build a little object that pretends to be a database […] that responds instantly without actually exercising any of the real mechanism. This is dead center in the Mock Object territory,

Well, no. Again, you can use mocks for such tests, but you’ll only get much value from that if your test cares about, say, what the query to the database is (rather than merely using the result). And while it will make your tests go fast, that’s not the real motivation for the mock, handy as it may be.

A Brief History Lesson

Mocks were invented to solve a very specific problem: how to test Java objects which do not expose any state. Really not any. No public fields, no public getters. It was kind-of a whim of a CTO. And the solution was to pass in a collaborating object which would test the object which was the target of the test “from the inside” by expecting to be called with certain values (in a certain order, blah blah blah) by the object under test and failing the test otherwise.

A paper from 2001 by the originators of mocks describes the characteristics of a good mock very well:

A Mock Object is a substitute implementation to emulate or instrument other domain code. It should be simpler than the real code, not duplicate its implementation, and allow you to set up private state to aid in testing. The emphasis in mock implementations is on absolute simplicity, rather than completeness. […] We have found that a warning sign of a Mock Object becoming too complex is that it starts calling other Mock Objects – which might mean that the unit test is not sufficiently local. [emphasis added]

The object under test in a mock object test is surrounded by a little cloud of collaborating mocks which are simple, incomplete and local. UPDATE: Nat Pryce reminds me that process calculi, such as CSP, had an influence on the JMock approach to mocking.

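For contrast with the terrible test above, here is a sketch (Mockito again, names again invented) of a test in the spirit of that paper: one simple, local mock, one expectation, nothing reaching any further into the system:

```java
import static org.mockito.Mockito.*;
import org.junit.Test;

// One simple, incomplete, local collaborator for the object under test.
interface AuditLog { void recorded(String event); }

class Turnstile {
    private final AuditLog log;
    Turnstile(AuditLog log) { this.log = log; }
    void coinInserted() { log.recorded("coin"); }
}

public class TurnstileTest {
    @Test
    public void recordsEachCoin() {
        AuditLog log = mock(AuditLog.class); // the whole "cloud" of mocks is this one object

        new Turnstile(log).coinInserted();

        verify(log).recorded("coin"); // one expectation, stated once, entirely local
    }
}
```
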
Ron talks about Detroit/Smalltalk TDD-ers developing their test doubles by this means:

just code such a thing up […] Generally we’d build them up because we work very incrementally – I think more incrementally than London Schoolers often do – so it is natural for our mock objects to come into being gradually. [emphasis added]

I don’t know where he gets that impression about the “London School”. In my experience, in London and elsewhere, mocks made with frameworks also come into being gradually, one expectation or so at a time. How else? UPDATE: Rachel Davies reminds me that the originators of mocking had a background in Smalltalk programming anyway.

Ron speculates that mocks are likely to be more popular amongst programmers who work with libraries that they don’t control, and I expect so. Smalltalkers don’t do that much, almost everyone else does, lots. He speculates that mocks are likely to be more popular amongst programmers who work with distributed systems of various kinds, and I expect so. Smalltalkers don’t do that much, almost everyone else does, lots. Now, if we could all write our software in Smalltalk the world would undeniably be a better place, but…

In fact, I suspect that Smalltalkers write a lot of mocks, but that these tend to develop quite naturally into the real objects. The Smalltalk environment and tools afford that well. Almost everyone else’s environment and tooling fights against that every step of the way. And Smalltalkers won’t generally use a mocking framework, although there are some super cute ones, because they don’t have to overcome the stumbling blocks that languages like Java put in the way of anyone who actually wants to get anything done.

Tools

Anyway, there’s this thing about tools. Tools have affordances, and good tools strongly afford using them the right way and weakly—or not at all—afford using them the wrong way. And there are very special purpose tools, and there are tools that are very flexible. I read somewhere that the screwdriver is the most abused tool in the toolbox, because a steel rod that’s almost sharp at one end and has a handle at the other is just so damn useful. But that doesn’t mean that it’s a good idea to use one as a chisel. I grew up on a farm and I remember an old Ferguson tractor which was started by using a (very large) screwdriver to short between the starter motor solenoid and the engine block. Also not a good idea.

That we can do these things with them does not make screwdrivers bad. And the screwdriver does not encourage us to be idiots—it just doesn’t stop us. And so it is with mocks—they are enormously powerful and useful and flexible and will not stop us from being stupid. In particular, they will not stop us from doing our design work badly. And neither will TDD.

What I think they do do, in fact, is make the implementation of bad design conspicuously painful—remember that line about the next test being hard to write? But programmers tend to suffer from very bad target fixation when a tool becomes difficult to use and they put their head down and power through, when they should really stop and take a step back and think about what the hell they’re doing.

Written by keithb

November 3, 2015 at 9:41 pm

Posted in Uncategorized

So: What does “#NoEstimates” even mean, Anyway?

Jump to the TL;DR if you want.

What does #NoEstimates mean? It’s surprisingly difficult to tell. Depending on whom you ask, it might mean, as the name suggests, No Estimates! or it might mean Estimate All The Things—just don’t call it that! or it might mean something in between those, or it might mean something which has nothing to do with estimates at all, or it might be about questions not answers and it might be about just “starting a debate”1. A term that can mean so many things runs the risk of meaning nothing, or of just being the latest shiny buzzword to signal that you get it (not like those other silly folks, stuck in their ways).

When it comes to #NoEstimates what I’ve found is that the most concrete statement of what it might mean that anyone can point to is Vasco Duarte’s self–published book No Estimates: How to measure project success without estimating. (I’ll refer to it as “NE” in what follows, whereas the NoEstimates movement at large will be “#NE”.)

It’s a commendably brief book, and not so expensive. It’s also clearly a labour of love and I do respect that. The urge to share really cool ideas is a strong and respectable one. There’s a bit of a “business novel” style of story running through it, linking some tutorial style material, and this story tells the initially very sad tale of Carmen, a well-meaning but inexperienced project manager—even within 10 pages of the end of the book she still thinks that a Gantt chart is going to be of any use to her—and her profoundly idiotic and bullying boss, both of whom seem to work at a rather desperate and very old–fashioned outsource software development house, named Carlsson and Associates (I’ll call them “CA”).

Quotes?

The word “budget” occurs ten times in NE, variously in the contexts of: the difficulty of not exceeding one, the unreasonableness of demands made relative to them, the further unreasonableness of demanding that people conform to budgets that they neither determined nor can control, and so on. And those are all difficult and unreasonable things. However, a company like CA, which is taking part in a competitive bid in Carmen’s story is going to have to produce a proposal, containing a quote, a proposed budget for the work on the “Big Fish” government contract.

NE often focusses on the difference between an estimate, a commitment, and a forecast. Those are different things, but NE seems to want the distinction to hinge on whether or not you have data (that would be a “forecast”) or whether you’re just guessing (that’s an “estimate”). I’d like to suggest that amongst people who know what they are doing the distinction is much less clear-cut, and much of what NE calls forecasting looks a great deal like estimation to me.

But NE doesn’t seem to mention quotes (other than as in “what somebody said once”). Throughout the book there’s no indication that I can find of how exactly an #NE “practitioner” is supposed to produce a quote for a piece of work—which will be required at some point by anyone who isn’t working for an in–house team and needs to win a contract.

Update: I’ve been asked how quotes fit into an agile world. In my experience, if you are a supplier of development effort to clients then the quote is what gets you permission to start spending money. It’s not really “the budget”—although it might be described that way—it’s a starting point for an on-going conversation about value. Again, in my experience, a £60,000 proof–of–concept or a £120,000 Discovery activity can, through the establishment of a reputation for steady delivery of value, grow into a multi-million pound endeavour spanning several years without anyone ever deciding that this is what it should end up being. But sometimes you really do need to talk about the years and the millions, and if you can’t: no sale!

In the story Carmen sets about the estimation task (to produce the undeclared quote) in the worst way possible: she tries to construct a Work Breakdown Structure2, estimate the effort for the leaf nodes, and then roll that up into an estimate for the whole thing, which is madness. CA get the gig—after somehow having sight of their competitor’s bid, which suggests that the client is pretty sloppy. It also suggests that CA did a very common thing and priced their bid “to win”, that is, by producing a very low quote. It’s important to realise that the estimated effort (time/team size/cost, whatever…) to complete a piece of work is only one input to a quote. By quoting a price to win the work CA are following in the footsteps of many a supplier who has low-balled an alleged “fixed–price” for a piece of work, comfortable in the knowledge that the client will want to change their mind about the scope and can then be charged for change control for a very, very long time—which is where the unscrupulous supplier3 makes their profit. CA don’t seem to be even that smart, and Carmen’s boss seems to think that CA can somehow price to win with a fixed price and a fixed scope and then deliver against both. Carmen’s project is pre–doomed. Which can be a good thing. So long as everyone recognises that you have no chance of delivering, whatever you do, then it doesn’t matter what you do and all sorts of options which were previously unavailable can become plausible, because what the hell!

Now, Carmen’s boss is an idiot but weirdly, on page 62, he suddenly asks a smart question, albeit in a stupid way and for the wrong reason:

“Carmen, we have a review of the project with the Client next week. How are things going, what kind of progress can we show them?” Asked her boss.

“Good Morning sir. Yes, we do have the requirements delivery and the Earned Value Management reports that I showed you just yesterday.”

“That is great Carmen, but I was asking if we can show them working software. You know, to make a good impression.”

Turns out that there is no way to demonstrate any useful intermediate state of the implementation of the Big Fish system. Carmen’s project has become even more doomed than it was before CA won the gig. Although CA seem highly clueless, unfortunately Carmen’s situation is not so fictional as one might hope. But, and this I think speaks to the core of why #NE is so disappointing to so many people, CA have allowed their client to make them do stupid things and then CA have piled stupidity upon stupidity in how they respond to that. Competent suppliers just don’t behave the way that CA does, not these days.

Although all too plausible, the scenario in the story is also a sort of pastiche of what too many mainstream projects looked like more than ten years ago. I certainly saw projects like this when I started working in the industry in the early 90s. But these days, not so much…in between times, something changed.

Government

Big Fish is a government project and as NE explains, government projects are notoriously very expensive, very late, and often deliver almost nothing of any value. The astonishingly terrible UK project to build a new IT system for the NHS is cited. But, here’s the thing, that project came to a long, slow, shuddering halt, finally stopping altogether in 2013—and even governments can learn. Since 2011 new build projects in HM Government departments4 are run with oversight from the Government Digital Service, who know what they are doing. All GDS projects are iterative, incremental and evolutionary. Spending departments simply are not allowed to sign up for the kind of catastrophic deal with the Usual Suspects that led to those horror–story government IT projects of lore.

This was meant to be a chapter–by–chapter review of NE, but my eyes started to glaze over—which I realise is a poor trait in a book reviewer, but the reason why they did is interesting. Back to the story:

Carmen’s Big Fish project gets into exactly the sort of trouble that you’d expect, being driven by guesswork and wishful thinking, and she ends up appealing to the local #NE guru, Herman. In the charming illustrations by Ángel Medinilla this Herman is depicted as a portly, bearded, balding fellow. I certainly applaud the principle that portly, bearded, balding men are the fount of all wisdom. Anyway, Herman gives Carmen various items of good, commonplace and uncontroversial advice and between them they get the project back on track.

Now, through the first half of NE I’d been thinking: so far so unsurprising, when do we get to the new thing? And when Herman entered the story I thought: great! Here comes the punchline. But it just doesn’t come.


Errata to NE

Perhaps these can be addressed in a later version of the book. They are found in the PDF of version 1.0.

p16 J. B. Rainsberger has made many fine contributions to the state of the art, but did not introduce the concept of distinguishing essential from accidental complexity in 2013 (although I’m happy to believe that he spoke about it that year). This distinction was introduced by Fred Brooks in his famous paper No Silver Bullet[pdf] Essence and Accidents of Software Engineering. The distinction was part of the folklore of the industry when I started programming for money in the early 1990s, a long time before I met J.B.

p51 incorrectly characterises Set Based Concurrent Engineering[pdf] as the process of starting to build the production line for a product before you’ve finished developing it. It isn’t. Or rather, doing that is just (one part of) “Concurrent Engineering”. The “Set” is of alternative design choices, and they are all developed (concurrently) to a surprisingly high level of refinement, each eliminated through a tournament until one remains, which then goes into production. This SBCE process is followed in part to allow for the decision to go to production to be made as late as possible. Reinertsen, in his The Principles of Product Development Flow, criticises this approach as too often delaying the decision too long, beyond the point where the economic return on further delay starts to decline.

p64 wrongly states that RUP5 is a linear process model. It’s not. Or rather, it’s not supposed to be. Philippe Kruchten, who was the brains of the operation, built RUP to be very flexible and highly configurable and the first thing any RUP project was supposed to do was tailor the process within some very broad parameters by creating a “Development Case”. The non–negotiable bits of a RUP-derived process were meant to be [emphasis added]:

  1. Develop iteratively, with [technical] risk as the primary iteration driver
  2. Manage requirements
  3. Employ a component-based architecture
  4. Model software visually
  5. Continuously verify quality
  6. Control changes

It’s important to note that in Kruchten’s idea of what a RUP project should look like, the implementation, testing and deployment to production of code happens in every iteration of every phase of the project. However, what a lot of people (every RUP project I ever saw, in the UK or the USA, certainly) did was to carry on doing whatever linear, phased process they were doing before but rename bits of it using RUP terminology. Thus, the requirements gathering phase was renamed “Inception” and so on, and this worked about as well as you’d expect: very, very badly. And so the reputation of RUP was destroyed.

The aspect of RUP—when done right—that most lean/agile folks would object to most these days is the scheduling of work by risk rather than by value: we believe that agile technical practices tame technical risk for us, whatever order we develop features in. They’d probably not be too keen on visual modelling (it is a mistake not to use visual modelling), nor on controlling changes (we embrace change, don’t we?).


TL;DR

I think it was the great philosopher Robert Anton Wilson who said that the secret of leadership is to find some people who are going somewhere and get in front of them. I feel as if #NE, certainly as described in NE, might be doing something very much like that. Which isn’t a bad thing, necessarily, so much as it is disingenuous. Maybe that makes the #NE folks sound too cynical—which I don’t think they are. But there’s a huge gulf between the sort of pre–doomed idiocy of the way CA run their project to begin with in the story and what competent suppliers working with the current good practice of iterative, incremental, evolutionary development (the only way that has ever worked in the general case, currently known as “Agile”) do today. And the gap6 between that and what #NE recommends, and what NE explains very well, is very small to non–existent.

At least this book is the first place I have seen all of those current good practices collected together with a semi-coherent story about how to use them all together on the same project. That’s a very useful artefact to have. But I might wish that the continual identification of these good practices as an approach distinct from the leading edge of mainstream lean/agile practice (which it is not) were dropped. The book would be greatly improved thereby and would, specifically, look a lot less like snake-oil salesmanship—which I don’t believe it is, but it looks like it, especially with all the charlatan hard–sell techniques you have to get past on the site to buy the thing.

Punchline

So what is the substantive content of #NE (as revealed in NE)?

There is one specific practice, illustrated very well in the book, which may be unfamiliar to many people doing mainstream Agile: slicing stories until they are all about the same size7, at which point “velocity” becomes a count of stories completed, not the sum of estimates of stories completed. Note that this isn’t a new, nor particularly radical idea, merely unfamiliar to many.

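To make the arithmetic concrete, here is a sketch (in Java, with invented numbers) of forecasting from a story count once the stories are about the same size. No per-story estimates are summed anywhere:

```java
// Once stories are sliced to roughly the same size, "velocity" is a plain count
// of stories completed per iteration. All the numbers here are invented.
public class StoryCountForecast {
    public static void main(String[] args) {
        int[] storiesDonePerIteration = {7, 9, 8, 8}; // observed throughput, by counting
        int remainingStories = 40;

        int totalDone = 0;
        for (int done : storiesDonePerIteration) totalDone += done;
        double velocity = (double) totalDone / storiesDonePerIteration.length; // 8.0

        System.out.printf("~%.1f iterations to go%n", remainingStories / velocity); // ~5.0
    }
}
```
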
If you’ve drunk too much of the Scrum kool-aid (enough for the effects to become irreversible) then you will hold fast to the dictum that “Work may be of varying size, or estimated effort” [Scrum Guide, v1, p 9]. However, what might have slipped your mind is that the Scrum Guide says only this about how Sprint Planning works:

The input to this meeting is the Product Backlog, the latest product Increment, projected capacity of the Development Team during the Sprint, and past performance of the Development Team.

This allows for a great deal of latitude in how that goal is achieved—and the #NE proposition, as explained in NE, would seem to fit that fine, if you were so minded. My experience with Certified ScrumMasters and Professional Scrum Masters8, however, is that the actual courses they do lead them to have a fetishistic determination to estimate, and as the Scrum Guide says, “estimate” [9 occurrences] and “re-estimate” [2] stories, and even “[make] more precise estimates […] based on the greater clarity and increased detail [available on items at the top of the backlog]”. I’ll admit that the obsession that Scrum seems to have with estimating and re-estimating has struck me as odd, ever since I myself became a Certified ScrumMaster back in the 2000s. But is doing estimation the root of all evil? No.

Who is this for, again?

So, NE and #NE take a specific view on this specific issue: don’t estimate stories, slice them. And this is pretty much the only difference I can see between what #NE recommends and what any of the Agile teams that I think of as “getting it” do—and since many of them do slicing, often there’s no difference. Now, the detail material in NE explains with great subtlety and much appeal to thought experiments with probability distributions and what-not how not doing estimation is a waste–eliminating optimisation for your process—although they do not demonstrate that the effort of doing the slicing is actually less than the effort of doing the estimation, nor indeed that slicing is somehow value–adding and therefore not waste. But, Carmen’s story is one of utter foolish disregard for intelligence in project management brought under control by an Agile process which just so happens to use slicing instead of estimation—and the story also just so happens to leave out how you’d do the activities (such as providing a quote) that really do need estimates. This leaves me at a loss as to who NE (and #NE) is for: is it a subtle optimisation for people who are basically doing everything pretty much right? Is it a wake-up call for those in the lengthy tail of very, very late adopters of Agile processes? I don’t know, and I can’t tell.

With some brutal editing to strip out all the propaganda, NE would actually be very useful both as a thing to use to introduce current good practice in Agile to newbies, and as an aide memoire for current practitioners. But it has this incessant drumbeat insistence that the techniques presented are New! and Different! and Radical! when they simply are not, which I think makes it of little use for either group.

I do strongly suspect that if v2.0 of NE had, instead of the story of Carmen and the chaos at CA, a protagonist working at a company already operating current good practice in Agile development and then making the switch to #NE, then the differences, and the story, would be much less compelling—but maybe more useful.


1 When was the last time you heard anyone say that they “just want to start a debate” and anything remotely enlightening happened?

2 WBSs for software development are almost never valid. I have seen valid ones, but only in cases where a team is in almost a manufacturing mode, grinding out another instantiation of a very well-known product with only marginal changes from a bunch of other instantiations of it. This is dull, low-risk work and therefore low-margin, and most of it is done by low–cost development shops in Farawayvia (or, as it may be, Distantistan). Anyone doing any remotely interesting software development work simply will not be able to construct a valid—never mind useful—WBS and should not even bother trying.

3 I often refer to these jokers as “the Usual Suspects”. You know who they are.

4 Full disclosure: my employer is a supplier to more than one department of HM Government, where we run projects as mandated by GDS and it works so well that we’ve started to use the same DABL framework on private sector projects.

5 RUP is the process that will not lie down dead. Amongst those people who don’t seem to be comfortable running a development project without a vast and incomprehensible wall chart to follow, parts of the re-animated corpse of RUP are currently lurching around in two flavours: SAFe and SEMAT.

6 There’s this diagram in NE which could have been copy-pasted out of one of my own project proposals—I don’t suggest plagiarism, nor any sort of influence either way, it’s just a nice illustration of how NE doesn’t contain much of anything new, and of how #NE doesn’t contain much of anything that many people aren’t just doing anyway. It’s the one on p116 of the PDF, where Herman explains how to explain to a client what of their backlog they will, might, and won’t get—as best we know.

Illustrative Sketch of a structured backlog

You and I might imagine that constructing such a diagram might involve estimation…that’s certainly how I do mine. In fact, many of the techniques that Herman uses are estimation techniques, even though he insists otherwise, without really explaining why not. I think that this sort of thing is what leads Alistair Cockburn to conclude that #NE is a “bait–and–switch”: they spend far too much time explaining how they estimate stuff.

7 Yes, doing that would appear to require that you estimate and re-estimate the size of a story to see if it needs to be sliced down any further—I guess you have to just not call it estimation…

8 “ScrumMaster” or “Scrum Master”? What are the semiotics of that interposed whitespace? Or is it simply a matter of not infringing intellectual property rights? A “ScrumMaster” was, originally, someone who had mastery of doing Scrum. A “Scrum Master” seems more like the master–of–the–Scrum…

Written by keithb

October 12, 2015 at 10:58 am

Posted in #NoEstimates

#AsMuchEstimationAsYouNeedWhenYouNeedItAndThatsLessThanYouThinkAndNotSoOftenAsAllThatReallyButJustGetOverIt

Ron Jeffries and Steve McConnell have been discussing #NoEstimates.

Ron wants me to sign up to a google group to comment, and who has time for that? Worse, Steve wants me to become a registered user of Construx. So, instead I’ll comment here. I’m still paying for this site, after all.

As you might imagine, world-famous estimation guru McConnell isn’t so keen on #NoEstimates. Here’s Ron’s response to Steve’s response to Ron’s response to Steve’s video responding to the #NoEstimates thing.

One of the smartest things I ever read about estimation, and one that I quote freely is this: “The primary purpose of software estimation is not to predict a project’s outcome; it is to determine whether a project’s targets are realistic enough to allow the project to be controlled to meet them”—McConnell, 2006.

That was published about 10 years ago. In the context of the state of the art of software development ten years ago, this statement was quite radical—surprisingly many organisations today still don’t get it. In the ten years since then the state of the art has moved on to the point that some (not all, but some) development shops are now so good at controlling a project to meet its targets that creating an up-front determination of whether or not that can be done is really not so useful an exercise. Of course, part of that process has been to teach “the business” that they are wasting their time in trying to fix their targets far ahead into the future, because they will want to change them.

Another very smart thing, from only six years ago: “strict control is something that matters a lot on relatively useless projects and much less on useful projects. It suggests that the more you focus on control, the more likely you’re working on a project that’s striving to deliver something of relatively minor value.”—DeMarco, 2009

Very true. And since then that same progression in the state of the art has so reduced the cost of building working, tested software that the balance has moved further in the direction of not doing projects where the exact cost matters a lot. #NoEstimates is this pair of ideas carried to their natural conclusion.

It’s still not unusual to see IT departments tie themselves in knots over whether a project whose goal is to protect billions in revenue should have a budget of one million or one point five million. And to spend hundreds of thousands on trying to figure that out. The #NoEstimates message is that they don’t need to put themselves into that position.

It’s not free, of course: that state of the art in development has to be present. But if it is, on we go.

In the video, Steve tries some rhetorical jiu-jitsu and claims that if we follow the Agile Manifesto value judgement and prefer to collaborate with our customers rather than negotiate contracts with them, then, if they ask for estimates, we should, in a collaborative mood, produce estimates. That’s a bit like suggesting that if an alcoholic asks me for a drink, I should, in a cooperative and generous spirit, buy them one.

I’d like to suggest a root cause of the disagreement between Ron and Steve. I’m going to speculate about the sorts of people and projects that Ron works with and that Steve works with. Personally, I’ve worked in start-ups and in gigantic consultancies and I’ve done projects for blue-chip multinationals selling a service and for one-man-band product shops. My speculation is that in Steve’s world, IT is always and only a cost centre. It’s viewed by the rest of the business as a dark hole into which, for unclear reasons, a gigantic pile of money disappears every year. The organisation is of course very well motivated both to understand how big that hole is, and to try to make it smaller. Hence: estimation! In addition, Steve likes to present estimation as this coolly rational process of producing the best information we can from the meagre scraps of fact available, suitably and responsibly hedged with caveats and presented in a well-disciplined body of statistical inferences. And then the ugly political horse-trading of the corporation gets going. I think that believing this is a reasonable defence mechanism for a smart and thoughtful person caught in the essentially medieval form of life that exists inside large corporations (and, whisper it, all the more so in large American corporations). But it isn’t realistic. In those environments, estimation is political, always.

My speculation is that Ron, and many #NoEstimates advocates, work more in a world where the effort (and treasure) that goes into building some software is very clearly, and very closely in time and space, connected with the creation of value. And that this understanding of IT work as part of the value creation of the organisation and the quickness of the return leads to estimation being really not such a big deal. An overhead of limited utility. So why do that?

Your organisation, I’ll bet, falls somewhere between these two models, so you probably are going to have to do #AsMuchEstimationAsYouNeedWhenYouNeedItAndThatsLessThanYouThinkAndNotSoOftenAsAllThatReallyButJustGetOverIt

Written by keithb

August 2, 2015 at 9:10 am

Posted in Uncategorized

TDD as if You Meant It at London Software Craftsmanship

Please add your comments about the session to this post.

Written by keithb

August 29, 2012 at 4:10 pm

Posted in TDD

TDD as if You Meant It at XP Day London 2011

Attendees, please add your thoughts, and links to your code repo if you wish, as comments to this post.

Thanks.

Written by keithb

November 21, 2011 at 11:56 am

Posted in conference, Raw results, TDD

Hiring…

If you like the kind of work you see here, come join me in London. We’re hiring. Apply via LinkedIn or drop me a line.

Principal consultants take responsibility for particularly challenging solutions in demanding organisational environments. They closely interact with senior project managers, customer representatives at all levels including senior management, and guide project teams. Together with the responsible project managers, they lead technical and strategic initiatives to success, ranging from critical consulting mandates to complex delivery projects. Together with business development and business unit managers, they actively expand Zuhlke’s business and develop new opportunities. This can involve taking the leading technical role in large bids.

Lead consultants take decisions and provide advice regarding complex technical systems. They closely liaise with the software development team, the project manager, and customer representatives, often with a technical background. They ensure that sound technical decisions are made and subsequently realised in state-of-the-art solutions by the project team. They can take the leading role in technical consulting assignments within their specialisation area.

The role is based in London and the majority of the work takes place in the UK, but on occasion training and consulting engagements may be delivered anywhere in the world.

The competitive package includes 20 days of professional development time per year.

Written by keithb

October 3, 2011 at 5:13 pm

Posted in Uncategorized

TDD as if You Meant It at Agile Cambridge 2011

Attendees of the session at Agile Cambridge, please add links to your code in comments to this post.

Written by keithb

September 29, 2011 at 12:04 pm

Posted in Uncategorized

Iterative, Incremental Kanban

There’s something about Kanban which worries me. Not kanban, which is a venerable technique used to great effect by manufacturing and service organizations the world over for decades, but “Kanban” as applied to software development. More specifically, the examples of Kanban boards that I see worry me.

What you do now

David Anderson gives guidance on applying Kanban to software development something like this:

  • Start with what you do now
  • Agree to pursue incremental, evolutionary change
  • Respect the current process, roles, responsibilities & titles

Which is fine. What worries me is that the published examples of Kanban that I see so often seem to come from a place where what they do now is a linear, phased, one-shot process, and the current process, roles, responsibilities and titles are those of separate teams organized by technical specialism with handovers between them. Of course, there are lots of development organizations which do work that way.

But there are lots that do not. I’ve spent the last ten years and more as one of the large group of people promoting the idea that linear, phased processes with teams organized by technical specialism are a wasteful, high-risk, slow and error-prone way to develop software. And that a better alternative is an iterative, incremental, evolutionary process with a single cross–functional team. And that this has been known for literally as long as I’ve been alive so it shouldn’t be controversial (although it still too often is).

Best case: Kanban will be picked up by those who are still doing linear, phased etc. etc. processes and will help them move away from that. A good thing. Worst case: the plethora of Kanban examples showing phases and technology teams undoes a lot of hard work by a lot of people by making linear, phased processes respectable again. After all, Kanban is the hot new thing! (And so clearly better).

Kanban boards

Take a look at the example boards that teams using Kanban have kindly published (note: I wish every one of those teams great success and am grateful that they have chosen to publish their results).  The overwhelming theme is columns read from left to right with a sequence of names like “Analysis”, “Design”, “Review”, “Code”, “Test”, “Deploy”. Do you see a problem with this?

Taken as a diagnostic instrument there is discussion of ideas like this: if lots of items queue up in and before the “Test” column then the testers are overloaded and the developers should rally round to help with testing. Do you see a problem with this?

There is a way of talking about /[Kk]anban/ which strongly invites the inference that each work item must pass through every column on the board exactly once, in order. This discussion of kanban boards as value stream maps, while very interesting in its own right, makes very explicit that in the view of its author the reason a work item might return from a later column to an earlier one is because it is “defective” or has been “rejected”. How is one to understand iterative development, in which we plan to re-work perfectly acceptable, high quality work items, with such language?

Not Manufacturing

Iterative development plans to rework items. Not because they are of low quality, not because they are defective, not because they are unacceptable, but because we choose to limit the scope of them earlier so that we can get to learn something about them sooner. This is a product development approach. Kanban is mainly a manufacturing technique. Software development resembles manufacturing to a degree of approximately 0.0 so it’s a bit of a puzzle why this manufacturing technique has become quite so popular with software developers. Added to which the software industry has a catastrophically bad track record at adopting management ideas from manufacturing in an appropriate way. We in IT are perennially confused about manufacturing, product development and engineering, three related but very different kinds of activity.

An Example

So, what if “what you do now” is iterative and incremental? What if you don’t have named specialist teams? And yet you would like to obtain some of the unarguable benefits of visualising your work and limiting work in progress. What would your kanban board look like?

Here’s one possibility (click for full-size):

Iterative, Incremental kanban board

Some colleagues were working on a problem and their environment led to some very hard WIP limits: only two development workstations, only two test environments, only one deployment channel. But they are a cross-functional team, and they want to iteratively develop features. So, the column on the far left is a queue for new features and the column on the right holds things that are done (done recently; ancient history is in the space below). The circle in between is divided into three sectors, one for each of the three things that have WIP limits. Each sector has an inner and an outer part, to allow for two different kinds of activity: feature and integration. For example, both test environments might be in use, but one for integration testing of several features and one for iterative testing of one particular feature.

The sectors of the circle are unordered. Any story can be placed in, and moved to, or back to, any other sector at any time, any number of times, but respecting the WIP limits.

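For what it’s worth, the one rule that board embodies can be sketched in a few lines of Java (my own toy rendering, not code from any Kanban tool, and it ignores the inner/outer feature-versus-integration split): sectors are unordered, moves are free, and only the WIP limits constrain anything.

```java
import java.util.HashMap;
import java.util.Map;

// A toy model of the circular board: three unordered sectors, each with a hard
// WIP limit taken from the example (two development workstations, two test
// environments, one deployment channel). Nothing here cares about ordering.
public class CircularBoard {
    private final Map<String, Integer> limits = new HashMap<>();
    private final Map<String, Integer> occupancy = new HashMap<>();

    public CircularBoard() {
        limits.put("development", 2);
        limits.put("test", 2);
        limits.put("deployment", 1);
    }

    // A story may enter any sector, in any order, any number of times,
    // so long as the sector's WIP limit is respected.
    public boolean enter(String sector) {
        int inUse = occupancy.getOrDefault(sector, 0);
        if (inUse >= limits.get(sector)) return false; // WIP limit reached: wait
        occupancy.put(sector, inUse + 1);
        return true;
    }

    public void leave(String sector) {
        occupancy.put(sector, Math.max(0, occupancy.getOrDefault(sector, 0) - 1));
    }
}
```
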
Feedback

Why can’t I find more examples like this?

I expect that some Kanban experts are going to see this and comment that they don’t mean for groups using Kanban to adopt linear, phased processes and specialized teams. And I’m sure that many of them don’t. But that’s what the examples pretty much universally show—and we know that people have a tendency to treat examples (intended to be illustrative) as if they were normative.

I’d really like to hear more stories of whole–team iterative, incremental kanban. Please point some out.

Written by keithb

September 16, 2011 at 11:59 am

Posted in kanban

Distribution of Complexity in Hudson

Suppose we were to take methods one by one, at random and without replacement, from the source code of Hudson 2.1.0. How would we expect the Cyclomatic Complexity of those methods to be distributed?

Here you will find some automation to discover the raw numbers, and here is a Mathematica Computable Document (get the free reader here) showing the analysis. If you have been playing along so far you might expect the distribution of complexity to follow a power law.

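The notebook does the real work, but the shape of the fitting step is roughly this: a sketch, in Java, of the standard Clauset–Shalizi–Newman approximation to the maximum-likelihood exponent of a discrete power law. The data here are invented, and this is not the code behind the results below:

```java
// Approximate ML estimate of the exponent of a discrete power law fitted to
// per-method complexities: alpha ~ 1 + n / sum(ln(x_i / (xmin - 0.5))).
// A real analysis would also need a goodness-of-fit test, as the notebook has,
// before accepting or rejecting the power-law hypothesis.
public class PowerLawFit {
    public static void main(String[] args) {
        int[] complexities = {1, 1, 1, 2, 1, 3, 1, 2, 5, 1, 8, 2, 1, 13, 4}; // invented
        int xmin = 1; // fit only the tail at or above xmin

        int n = 0;
        double sumLogs = 0.0;
        for (int x : complexities) {
            if (x < xmin) continue;
            n++;
            sumLogs += Math.log(x / (xmin - 0.5));
        }
        double alphaHat = 1.0 + n / sumLogs;
        System.out.printf("alpha-hat = %.3f (n = %d)%n", alphaHat, n);
    }
}
```
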
Result:

This evidence suggests that the Cyclomatic Complexity per method in this version of Hudson is not distributed according to a discrete power–law distribution (the hypothesis that it is, is rejected at the 5% level).

Probability of Complexity of Methods in Hudson

This chart shows the empirical probability of a given complexity in blue and that from the maximum–likelihood fitted power–law distribution in red. Solid lines show where the fitted distribution underestimates the probability of methods with a certain complexity occurring, dashed lines where it overestimates. As you can see, the fit is not great, especially in the tail.

Note that both scales are logarithmic.

Other long-tailed distributions (e.g. log-normal) can be fitted onto this data, but the hypothesis that they represent the data is rejected at the 5% level.

Written by keithb

August 31, 2011 at 8:23 pm