cumulativehypotheses

mostly professional blather


On TDD and Double Entry Bookkeeping


“Uncle” Bob Martin offers here a number of claimed parallels between TDD and Double-entry Bookkeeping (DeB). I think he’s rather missed important aspects of—surprisingly—both of those things. They aren’t primarily about correctness, although they do help with that, they are both more about knowing the state of your world.

That TDD is somehow like DeB is not an especially new observation, people have been making that comparison for a decade or more, but in the recent post Bob says that:

  • Both [TDD and DeB] are disciplines used by experts who carefully manipulate complex documents full of arcane symbols that must, under pain of terrible consequences, be absolutely correct in both type and position.
  • Both involve representing a long sequence of granular gestures in two different forms on two different documents.
  • Both techniques update their documents one granular gesture at a time, and each such update concludes with a check to be sure that the two documents remain in balance with each other.

This is…highly questionable. Bob uses the parallel to try to make programmers who don’t TDD with much enthusiasm (or even at all) feel bad, asking:

[…] why are accountants able to maintain their discipline so assiduously and so completely? Can you imagine a group of accountants saying to each other: “Guys, we’re really under the gun. We’ve got to finish these accounts by Friday. So, do all the Assets and Equities. We’ll come back and do the Liabilities later.”

No. You can’t imagine that. Or, at least, you shouldn’t. Accountants can go to jail for doing things like that.

With a suitable analogy to sloppy programmers. Also…highly questionable.

Suppose that your business chooses to use DeB, then…oh, but wait, that is a choice. DeB is not required of all businesses.

There are two gold-standard benchmark type things for financial reporting: GAAP (from the USA) and IFRS (the rest of the world, ie most of it). The main point of these standards is to have a common approach to reporting between businesses, to allow for comparisons, and a clear and accurate approach to accounting within a business; thereby to afford good management and avoid fraud. These standards relate largely to how the business is described in its published financial records. They do not, so far as I know, explicitly mandate DeB. However, for a business that’s much more than a lone sole trader it is hard to meet the requirements for financial reporting any other way. Not that a small-scale sole trader would be reporting much, anyway, but that’s another story. It can be done: there is such a thing as Single Entry Bookkeeping and it is perfectly respectable, but it’s hard to extract from such records the kind of reporting that auditors, and even more so markets (if your business is publicly traded), want to see. And that’s the important bit. Both GAAP and IFRS relate to what, and how, a business publishes about its finances.

People tend to view accountancy as a rather arid discipline, and tend to get hung up on the vast quantities of arithmetic required. But this is to miss the point. The accounts of a firm are just that: they give an account of what went on during a trading period. They tell a story. These invoices were settled, this capital expenditure was incurred, that provision for whatever-it-was was made, and so on. The accounts tell the story of where the value of the company came from. The art of accountancy, and where it starts to overlap with management and finance, is in how that story is told.

  • Do bookkeepers and accountants make mistakes? Yes. Do they go to jail for them? No. If they aid and abet deliberate fraud by the company, then they may do, of course. And if they get a reputation for making lots of errors then they will find it hard to secure work, and may eventually lose professional accreditations. But jail for honest mistakes? No.
  • Are a company’s accounts free of error at all times? No. At the close of an accounting period the accountant will prepare a “Trial Balance”, which is exactly what it sounds like—a first attempt at constructing a valid balance sheet. A balance sheet shows a net position for each account at the end of the period, taking all the transactions during that period into account. Is the trial balance bound to be correct because of DeB? No. There can be errors that do not break the accounting identity. Does anyone go to jail if the Trial Balance does not…balance? No. They just have to do a lot more work and hunt down the error. And recall that it is the Directors of a business who sign off and publish the financial reports of a business, not the accountants. And recall that in most jurisdictions you need two sets of accountants: the every-day ones who help operate the business and the independent auditors who validate that this was done correctly. But it is still the Directors who sign off and publish the accounts.

Bob emphasises that DeB contains some built-in checks against error, he emphasises correctness. And at one level, DeB does help with correctness. If the fundamental accounting equation (the invariant: assets = liabilities + equity) is broken then an error has occurred somewhere. It could be a simple arithmetic error, and back in the day that was a very useful thing to be able to spot, or it could be a data entry error, and again, good to be able to spot that. But since the 1950s, starting with larger and extending now down to all but the very smallest companies, the arithmetic is fully automated and the data capture of receivables and payables and what-not is very largely so and the kinds of error that DeB flags all by itself are now very rare. The mistakes that DeB doesn’t catch are modelling errors.
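To make that concrete, here is a minimal sketch (my own illustrative code, with invented account names) of the one check DeB gives you for free, and of a modelling error that sails straight through it:

```python
from collections import defaultdict

# A toy double-entry ledger. Each transaction is a list of (account, amount)
# legs; by convention here, debits are positive and credits negative, so a
# well-formed transaction must sum to zero.
class Ledger:
    def __init__(self):
        self.balances = defaultdict(int)

    def post(self, legs):
        # This is the built-in check DeB provides: the legs must balance.
        if sum(amount for _account, amount in legs) != 0:
            raise ValueError("transaction does not balance")
        for account, amount in legs:
            self.balances[account] += amount

ledger = Ledger()
# A balanced transaction can still be a modelling error: booking a capital
# purchase to the wrong account balances perfectly, and DeB says nothing.
ledger.post([("cash", -1_000_000), ("office-supplies", 1_000_000)])
assert sum(ledger.balances.values()) == 0  # the invariant holds regardless
```

The posting check rejects a transaction whose legs do not sum to zero; it says nothing about whether the legs hit the right accounts.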

A large business may maintain many, many accounts for different purposes. The “Double” in DeB can be misleading: a transaction is not necessarily recorded in two accounts, but in at least two. Suppose that Consolidated Amalgamated wants to buy a new Turbo-Encabulator—those are pricey, so in addition to the (perfectly respectable) sleight-of-hand in depreciating that capital expenditure they also sell a bond to raise additional cash. As a result of that one transaction a bunch of accounts will be updated: cash on hand, debt, perhaps a couple of kinds of short and long term asset, maybe others depending on how sophisticated they want to be. There are many opportunities here for putting the wrong values in the “wrong”—that is, badly disadvantageous—places, even if the whole thing still balances out. Other kinds of check are required over and above the balancing.

DeB is more complicated than single entry and does need trained operators. If the correctness guarantees offered by DeB are relatively weak, why do it? One answer, and a very important one, is that with DeB the value of a company can be known exactly at any time. The invariant assets = liabilities + equity is true, in principle, all the time. Which means that the managers of a company, and its shareholders (if it has them) or other investors, can see the state of the company all the time. And, even better, can quickly assess the effect of a management decision or action on the finances of the company.

TDD does the equivalent for a codebase. It lets you find out its status at any time, and assess very quickly the impact of a change.
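As an illustration of the parallel (a hypothetical example of mine, not anything from Bob's post): the test suite plays the role of the trial balance, in that running it at any moment tells you whether the code's stated invariants still hold, and re-running it straight after an edit shows the impact of that edit at once.

```python
import unittest

# A function under development, and tests that state its invariants.
# Green tests are the codebase's "balanced books"; a red test after an
# edit localises the impact of the change immediately.
def apply_discount(price, rate):
    return round(price * (1 - rate), 2)

class DiscountTests(unittest.TestCase):
    def test_ten_percent_off(self):
        self.assertEqual(apply_discount(100.0, 0.10), 90.0)

    def test_zero_rate_changes_nothing(self):
        self.assertEqual(apply_discount(42.0, 0.0), 42.0)

# Running the suite at any moment answers "is the code in balance?"
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```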

If TDD is like DeB, then we can ask: in TDD, who is it that fills roles comparable to those of the auditors and the Directors? What is the equivalent of signing off the audited accounts?


Written by keithb

December 21, 2017 at 10:54 am

Posted in Uncategorized

Call for Volunteers: Programming Paradigm Competition


[There are updates, and also useful discussion in the comments]

Join the Slack team for this competition where the baseline requirement is on the backlog. Get an email address to me somehow and I will invite you. PLEASE NOTE that the use of Slack does not imply any sort of real-time interactive collaboration between participants, or between participants and myself. Slack enables but does not require “chat”. If you choose to participate you will be added to the Slack team and, I believe, be able to see what conversations have gone before. 

Let’s say that you think it’s really important that people who work as programmers write their code in a certain way.

Whether this is for aesthetic reasons, or economic ones, or…actually the reason why you think this is not important. Only that you do. And, for my purposes, how you think they should isn’t important either. Maybe you think that they should only write assembler using pen and paper, maybe you think they should always use the very latest IDE to write dynamically typed OO code using TDD, maybe you think they should write using only pure functions and immutable data, maybe you think they should have rigorous proofs of correctness of every line of code, maybe…no, I don’t care. Only that you do.

Here’s the thing: advocates of one programming language, paradigm, toolset, what-have-you, always have really good explanations for why their way ought to be the best—for whatever figure of merit they say they care about—but we don’t really know. This should really be a topic for academic research but unfortunately no one is prepared to pay for the sort of study that would really settle the question, which would be astonishingly expensive. For practical reasons, what studies there are tend to be based on small cohorts of students who work on toy problems (code katas and such like), for a short time, once, having received a few lectures on whatever the techniques in question are. There is no reason to think that the results of such experiments should carry over particularly well to the work of seasoned practitioners.

But, maybe we can take advantage of the enthusiasm of advocates to do a semi-controlled, informal study using seasoned practitioners.

This is an idea that I’ve been kicking around on twitter with John De Goes. He likes a certain kind of functional programming. I don’t, much. I like a certain kind of OO programming and John doesn’t, much. What if, we wondered, some people who really knew what they were doing were to use their preferred approach to address a realistic—but small, we are talking about volunteers—problem under something like industrial conditions? What might we, they and the programming world at large learn about what the various programming paradigms do and don’t lead to?

“Industrial conditions” here means things like: that the scope of the problem grows and changes over time, that you—therefore—don’t know and can’t even in principle find out what the entire scope is, that the very goal of the exercises changes under you. If such volunteer expert practitioners were to work on such a system then we could look at what they have to do as those things evolve and develop and we could try to get a handle on how different programming paradigms—in the hands of seasoned practitioners—perform against a figure of merit of great interest in industry: how hard is it to adapt the code to new or changing requirements.

And this would be a back-office-y, line-of-business-y problem, because that’s so many programmers’ bread and butter.

I have such a problem in hand.

The Exercise

There’s a business domain that I know reasonably well: the scheduling of advertising copy into commercial breaks between TV programs on a linear-broadcast station. You can, of course, buy COTS software to manage this problem. Such products are vast in scope and mind-boggling in expense. But the central problem is—to begin with—simple. I think that this is a good proxy for a broad class of business problems.

Data stores can be file based, or use an integrated in-memory database of some sort, but no separate database products, please. Processing can be mostly batch, with a very minimal interactive command-line interface if required.

Product Owner

Here’s how it will work: I will be the Product Owner.

There will be a Slack channel where stories will be published and volunteers will ask me questions from time to time about stories, and I will answer, from time to time—so everyone can see both questions and answers, asynchronously. I do not expect to hold any synchronous conversations about this on Slack. In fact, to provide an equitable environment for all participants, I will not hold any synchronous conversations. Such answers will be—for the purposes of the exercise—normative.

There will be a backlog of stories, quite a long one, recorded as a series of additions to a text file in a private github repo. So that I can’t prejudice the exercise stories will be written far in advance and published from time to time along with the hash of the checkin which added them to the backlog. At the close of the exercise the repo will be made public and everyone will be able to see that the backlog wasn’t manipulated along the way to favour or disfavour any one implementation.
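That tamper-evidence scheme can be sketched mechanically (my illustration, with invented story text; in the actual setup the git commit hash itself plays this role):

```python
import hashlib

# Each published backlog addition is accompanied by a digest of the backlog
# file at that point. Once the repo is made public, anyone can recompute the
# digest from the history and compare it with what was published at the time.
def backlog_digest(backlog_text):
    return hashlib.sha256(backlog_text.encode("utf-8")).hexdigest()

published_at_the_time = backlog_digest("Story 1: place a spot into a break\n")

# Later, with the now-public history:
recomputed = backlog_digest("Story 1: place a spot into a break\n")
assert recomputed == published_at_the_time  # the backlog was not rewritten
```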

I won’t intensively test anyone’s code, but I will review the means they choose to use to demonstrate that their code implements the story and take a view on whether or not I’m convinced by it. But that’s just, like, my opinion. I would like to be able to (build and) run all the solutions using the most vanilla setup—ie, what the download of a recent version gives me out-of-the-box for whatever the target language is. I’d much prefer to do that in a FreeBSD VM. Linux if I must. If you would like to use a .Net language, perhaps you could make sure your solution works with Mono? Maybe I can borrow a Windows machine from somewhere if not.

Practitioners

People who volunteer to be Practitioners will implement stories. Their code will be in a public github repo—sadly this does limit us to languages that have a sensible representation in text files.

Practitioners can work on the problem for as long or as little as they like—but should apply their nominated language, tools and approach with maximum sincerity. I’m more interested in seeing solutions done “right”—whatever that means in a Practitioner’s chosen paradigm—than done quickly. I can generate new stories basically forever—it’s a huge problem space—so we’ll run the exercise until no one is interested in doing any more of it, and then stop.

Solutions should include whatever the equivalent of a Makefile is so that I can build and run them locally.

Assessment

The figure of merit will be not how quickly, in real time, stories are implemented (this is not a race) but how small, proportionately, the impact of each new or changed requirement is on the already existing code.
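One crude way such a figure of merit might be operationalised (a sketch of mine, not part of the announced rules), feeding it numbers from something like `git diff --numstat` between the commits bracketing a story:

```python
# Hypothetical churn metric: what fraction of the pre-existing code did a
# story force us to touch? Smaller is better; purely additive stories,
# which change nothing that already exists, score zero.
def impact(lines_changed_in_existing_files, codebase_size_before):
    if codebase_size_before == 0:
        return 0.0  # the first story: everything is new, nothing is impacted
    return lines_changed_in_existing_files / codebase_size_before

# e.g. a story that forced edits to 120 lines of a 3000-line codebase:
assert impact(120, 3000) == 0.04
```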

Volunteer

If you’d like to volunteer to be a Practitioner, please reply to this posting and drop me and John a note on twitter.

There’s no reason why we couldn’t usefully have more than one Practitioner for a given paradigm, but it would be good to have at least one for each of:

  • Procedural language (eg C)
  • OO in a dynamically typed language (eg ruby, python) with TDD [this is the one I’d pick]
  • OO in a statically typed language (eg Java, C#) with TDD
  • OO of some sort without TDD
  • strict (and eager) dynamically typed functional-ish language (eg CommonLISP, Erlang)
  • non-strict (and lazy) statically typed functional language (eg Haskell) in a less polymorphic, more concrete style
  • non-strict (and lazy) statically typed functional language (eg Haskell) in a maximally polymorphic, most highly abstracted style [John likes this one]
  • Other (eg Prolog, Forth, what the hell: assembler)

Prizes

There will be one prize for the paradigm most clearly superior in the consensus opinion of the Practitioners as interpreted by the Judge (me). And that prize is GLORY! 

The Judge’s decision (ie, mine) is final, normative, and nugatory.

FAQ:

  • Am I biased? Yes.
  • Do I stand by to be pleasantly surprised? Yes.
  • Do you have to take my word for that? Yes.


If this does come off, I’ll attempt to find an academic interested in doing something with the repos.

Written by keithb

October 27, 2016 at 5:03 pm


Special–ism


If there is a common theme to this “Agile” journey that you’re almost certainly on if you’re reading this—and there might not be—then a good candidate for that could be a certain series of unpleasant, ego–challenging realisations.

It began with programmers, back in the late 90s, as eXtreme Programming took hold and the unpleasant, ego–challenging realisations dawned that: the common old–school programmers’ preferred mode of engagement with users—pretending that they don’t exist—and their preferred mode of engagement with their employers—surly passive–aggression leavened with withering cynicism—and their preferred mode of engagement with each other—isolation interspersed with bouts of one–up–manship—and their preferred mode of engagement with the subject matter at hand—wrestling it into submission via the effortful application of sheer individual intellectual brilliance—weren’t going to cut it any more.

Over time, the various other specialisms that go into most industrial–scale software development—and IT systems work more generally—have had also to come to terms with the idea that they aren’t that special. With the idea that locking themselves into their caves with a sign outside saying “Do Not Disturb, Genius at Work” and each finding a way to make their particular specialism into the one true bastion of intelligence—and the true single point of failure, and that not seen as a bad thing—is not going to cut it any more.

A cross–functional, self–organising team will of course contain, must contain, various individuals who have, through education, experience or inclination, a comparative advantage over their colleagues in some particular skill, domain, tool, or whatever. It would be perverse indeed for a cross–functional, self–organising team not to have the person who knows the most about databases look first at a data storage problem. And it would be foolish indeed to let that person do so by themselves—at least pair, maybe mob—so that the knowledge and experience is spread around. And it would be perverse indeed for a cross–functional, self–organising team to make a rule preventing any one member of the team having a go at any particular problem merely because they aren’t the one with the strongest comparative advantage in that topic. And it would be foolish indeed for such a team not to create a physical, technical and psychological environment where that could be done safely. And so on.

Different disciplines have embraced their not–special–ism at different times and with different levels of enthusiasm. “Lean UX” represents the current batch, as designers get to grips with the idea that they—in common with every other discipline—turn out not to be the special and unique snowflakes uniquely crucial to the success of any development endeavour. Where is your discipline on this journey?

Written by keithb

May 18, 2016 at 2:47 pm


TDD as if You Meant It at Agile Mancs 2016


Please share your experience in comments to this blog post. Would be most grateful if you also shared your code. Thanks.

Written by keithb

May 12, 2016 at 9:22 am


TDDaiYMI XPManchester


Please report on your experience with the session in comments to this post. And, if you’re happy to, please post a link to your code.

Written by keithb

April 14, 2016 at 4:36 pm


Some Mass Transport Metaphors


One day the software development industry will be grown-up enough to talk about what we do in its own terms, but for now we have, for some reason, to use metaphors. And we love, for some reason†, transport related ones. A recent entry is the (Agile) Release Train. It’s a terrible metaphor. Here are some other mass–transportation metaphors for how you might organise the release schedule for your software development activities, ordered from less to more meritorious in respect of two figures of merit: batch size, and the cost of delay if you miss one.

In these metaphors, people stand for features.

  1. Ocean liner—several thousands of people move slowly at great expense once in a while. Historically, this has been the status quo for scheduling releases.
  2. Wide–body airliner—several hundreds of people move quickly, a couple of times a day
  3. Train—several hundreds of people move at moderately high speeds several times a day

Some failure modes of the train metaphor…

Although the intent of a “release train” is that it leaves on time no matter what, and you can get on or off at any time until then, and if you miss this one, another will be along in a little while, in practice we see attempts to either:

  • cram more and more people onto the train in a desperate attempt to avoid having to wait for the next one, à la rush-hour in many large cities, or
  • add more and more rolling stock to the train to avoid having to run another one

More generally, what a metaphor based on trains will mean to you may depend a lot on your personal experience of trains. For my colleagues in Zürich trains are frequent, swift, punctual, reliable, capacious & cheap. For my colleagues in London, not so much any of those things…

  4. Tram—several dozens of people move significantly faster than walking pace every few minutes

† Actually, that might be because for quite a lot of the industrial period, say from about 1829 to about 1969, various forms of transport were the really hot technological stuff of the day.

Written by keithb

March 29, 2016 at 11:22 am


Circumflexual


So, there are reports that in France there is outrage amongst (right-wing, conservative) commentators that the current government of President Hollande (a socialist) has reconfirmed the orthographic changes proposed originally in 1990 and agreed by the government of President Chirac (right-wing, conservative) which—amongst other things—delete the circumflex from words where it makes no difference.

Some of these reports seem to follow the wikipedia article on the circumflex in mentioning that in English, apart from loan words, the circumflex is not used today but was once: in the days when posting a letter was priced by weight an ô was used to abbreviate ough. As in “thô” for “though”. This seems like a fine convention, and one that I intend to adopt in tweets and instant messages. Now that we can pretty much assume that both ends of any messaging app conversation will have good Unicode support we can do a range of interesting things.

For example, althô you can put newlines in tweets† it seems as if many messaging apps are designed on the assumption that no–one using them ever has two consecutive thoughts and interpret a [RETURN] as send. I’ve started using ¶ in messages. I wish it could be typed on an iPhone soft keyboard. For some reason § can be, which I think is no more obscure. Anyway, the pilcrow can be copied and pasted, as can ‘∀’ to mean “all” & ‘∃’ to mean “there’s a” or similar. I’d like to use ‘¬’ for “not” but that might be a step too far, althô I do see a lot of “!=” and “=/=” type of thing in my twitter stream. I also tend to use pairs of unspaced em–dashes for parenthetical remarks—like this—which saves two characters in a tweet vs. using actual parens (like this). The ellipsis comes in very handy in several ways… ¶ Over time I’m getting more relaxed about using ‘&’ which of course has a particularly long heritage, although not so long as is sometimes thôt. ¶ What other punctuation can we revive, re-purpose or re-use?

Update: how do we feel about ‘þ’ or ‘ð’, both easily available from the Icelandic keyboard, for the?


† I’ve used this to sneak footnotes into tweets. Of course, this will all become a bit pointless if the managers at Twitter really do continue to force fit their brilliant ideas into the product, rather than continuing their previously successful strategy of paving cowpaths.

Written by keithb

February 20, 2016 at 2:19 pm


Why I have such a strong negative reaction to #NoEstimates


TL;DR

You may be pleased to learn that this is probably the penultimate thing I have to say here about #NoEstimates.

Anyway, it’s for these reasons…

It’s conceptually incoherent

From what I can gather from following twitter discussions, and reading blogs, and articles, and buying and reading the book, in #NoEstimates land, supposing that someone were to come and ask you “how long will it take to develop XYZ piece of software?”, any one of the below could be an acceptable #NoEstimates answer, depending on which advocate’s line of reasoning you follow:

  1. Estimation is morally bankrupt and I shall have no part in continuing the evil of it. You are a monster! Get out of my office! But fund the endeavour anyway. Yes, I do mean an open-ended commitment to spend however much it turns out to take.
  2. Estimation is impossible even in principle so there is no way to answer that question, however roughly. But do please still fund the endeavour. No, I can’t indicate a reasonable budget.
  3. Estimation is impossible even in principle so there is no way to answer that question and even if there were I still wouldn’t because you can’t be trusted with such information. No, I can’t indicate a reasonable budget. It’ll be ok. Trust me. No, I don’t trust you; but trust me.
  4. Estimation is so difficult and the results so vague that you’re better off just starting the work and correcting as you go. It’ll be ok. Trust me. No, I still don’t trust you.
  5. Estimation is so difficult and the results so vague that you’re better off choosing to invest a small, but not too small, amount of money to do something and learn from it and then decide if you’ve come to trust me and want to do some more (or not, which would be disappointing but OK). For your kind of thing, I suggest we start with $BUDGET_FOR_A_BIT, expect to end up spending something in $TOP_DOWN_SYNTOPIC_ESTIMATED_SPEND_AS_A_RANGE
  6. Estimation is difficult to do with any useful level of confidence and the results of it hard to use effectively. What would you do with an estimate if I did provide it? How could we meet that need some other way?
  7. Here is a very rough estimate, heavily encrusted with caveats and hedges, of the effort required to deliver something of a size such as experience suggests that what you asked for will end up being. No, I will not convert that into a delivery date for you. Let me explain a better way to plan.
  8. OK, OK, since you insist, here is a grudgingly made estimate of a delivery date in which I have little faith, I hope it makes you happy. Please don’t use it against me.

For the record: my preferred answer is some sort of combination of 5 and 6, with a hint of 4, and 7 as a backup. And I have turned down potentially lucrative work on the basis of those kinds of answer being rejected.

That’s a huge range of options, many subsets of which are fundamentally, conceptually, incompatible with other subsets. Which means that #NoEstimates doesn’t really seem to me as if it’s much help in deciding what to do.

Except…one good thing about #NE is that it does rule out this answer: “let me fire up MS Project and do a WBS and figure out the critical path and…” which is madness, for software development, but you still see people trying to do it.

Also for the record: In my opinion far too many teams spend far too much time estimating far too many things in far too much detail, and in ways that aren’t sufficiently smart or useful.

Even in an “Agile” setting you see this, and for that I blame Scrum, which has had from the beginning this weird obsession with estimating and re-estimating, and re-re-estimating again and again and again. I don’t do that. And I certainly don’t do, and do not recommend, task-level estimates (or even having tasks smaller than a story).

I can’t understand what anyone’s saying

It seems as if the “no” in #NoEstimates doesn’t mean no. Or maybe it does. Or it might mean: prefer not to but do if you have to. Or it might mean: I’d prefer that you didn’t, but if it’s working for you carry on.

And the “estimate” in #NoEstimates doesn’t mean estimate. It means: an involuntary commitment to an unsubstantiated wild-arsed guess that someone in authority later punishes you for not meeting§. Or it might mean estimate after all, but only if the estimate is based on no data. If there’s data, then that’s a “forecast”, which is both a kind of estimate and not an estimate.

“Control” seems to be a magic word for some #NE people. It’s said to them in the morally neutral, cybernetics sense but they hear it in some Theory X sense, as if it always has the prefix “Command and ”. This creates the impression that they have no interest in being in control of development, budgets, etc. Which might or might not be true. Who can say?

So not only are the #NoEstimates concepts all over the place, they’re discussed in something close to a private vocabulary—maybe more than one private vocabulary. This is not an effective approach to an engineering topic.

Nevertheless: it’s strong medicine and it’s being handled sloppily

…which, if you’ve ever taken strong medicine you’ll know is a poor policy.

In the contexts for software development that I’m familiar with, the idea of making estimates as an input to a budgetary process at the level of months, quarters, years and maybe (hopefully) beyond is really deeply baked in. Maybe this is part of why Scrum has managed to find a much better fit in corporate land than, say, XP ever did, because a Scrum team can seem to still play that game.

For a development team to turn around and say even that estimates are too difficult to make useful, so let’s do something else instead, is very challenging to the conventions of the corporation. Conventions which I believe should be challenged, in principle. To turn around and say that estimation is (and always was) impossibly difficult and that management were doing bad things with the results of it is going to deeply challenge and upset many people in an unhealthy way. That’s not the way to effectively change organisational habits. We saw this before with Scrum.

Now, I happen to be of the opinion that estimation is hard, but can be done well, and you can learn to do it better, and the results of it are often misapplied. And I’ve come to the opinion that the most effective and/because least upsetting route to dealing with that is to re-educate managers to do their work in a better way such that they stop asking for estimates. I find that coaching managers to ask more helpful questions beats coaching programmers to give fewer unhelpful answers.

For the record, too: I agree that too many enterprises use developers’ estimates in a way that is invalid in theory, unhelpful in practice, and questionable in its effect on the long term health of the business (and the developers). But, also for the record, I do not agree that this is an inevitable consequence of some intrinsic problem with estimation.

But in the #NE materials that I’ve seen there’s not really much recognition of these organizational aspects of the proposed change. It seems mainly to be about convincing developers that they shouldn’t be producing estimates and explaining how misguided (at best) or evil (at worst) management are to ask for estimates in the first place.

We’re just not smart about this kind of thing

…and the treatment of #NoEstimates that I’ve seen fosters exactly the kind of not-smartness that can get us into a real mess.

The industry, and corporations within it, and teams within corporations have a tendency to lurch from one preposterous extreme to another, and to wildly mis-interpret very well intentioned recommendations. This is a particular problem when the recommendation is to do less of something that’s perceived as irksome.

eXtreme Programming offers a good example. When considering a proposed way of working I often find it useful to consider to what it has arisen as a reaction. On one hand, #NoEstimates seems to be partly a reaction against the very degenerate Scrum-and-Jira style of “Agile” development that many corporations are far too comfortable with. And on another hand, it seems to be a reaction against some really terrible management behaviour* that’s connected with estimation.

eXtreme Programming can be usefully read as a reaction against the insane conditions** in large corporate software shops in the mid 1990s. People who really wanted to be programmers rushed to XP with joyous relief. As it happens I took some convincing, because I kind-of wanted the models thing—and not just boxes-and-arrows, I love, love, love me some ∃s and ∀s—to work. But it doesn’t. So, you know, I’m able to recognise that my ideas need to change, and I’m prepared to do the work.

Anyway, in part of the rush to XP we found that people abandoned the writing of design documents—these days they’d be condemned as muda, but that Lean/kanban vocabulary wasn’t so widespread then—but unfortunately the design work that we were meant to do instead in other ways didn’t really get picked up. Similarly, BRDs and Use Cases and all that stuff went out of the window but good requirements practices tended to vanish too. And the results were not pretty.

And so, over a long and painful period we had to invent things like Software Craftsmanship to try to re-inject design thinking, and we—that is, Jeff Patton—had to introduce things like Story Mapping to get back to a way to talk about big, complex scope.

I expect the worst

I invite you to get back to me in, oh, say, 5 years and check on this…forecast: Either #NoEstimates will have burned out and no-one will really remember what it was, or…

  1. There will be a[t least one] subscription-taking membership organization devoted to #NoEstimates
    1. Leading members of that organization will make their living as #NoEstimates consultants and trainers, far removed from any actual development
    2. This organization will operate a multi-level marketing scheme in which the organization certifies, at great expense, senior trainers who train, at substantial expense, certified trainers who train, at not outrageous expense, certified #NoEstimates “practitioners”
  2. Adoption of #NoEstimates will turn out to lack some benefit of estimation that #NoEstimates advocates won’t see and can’t imagine, and some other practice will have to be invented to get it back.

And then the circle will be complete. I don’t think that we’re collectively smart enough to avoid this dreary, self-defeating outcome.

Update—as if by magic, this twitter exchange appears within 12 hours of my post (NB I mean no criticism of either party and I thank them for holding this conversation in public):

Noel asks: What is the first thing one should consider when contemplating a move to NoEstimates?
And Woody replies that it isn’t something you “move to”. It is about exploring our dependence on and use of estimates, and seeking better.

I expect that many conversations like this are taking place. And that’s how the subtle but valuable message fades away and the industry’s hunger to be told the right thing to do (and then, worse, do it) takes over.


§ A terrible idea which I am happy to join in condemning

† Which is largely in corporations, by which I mean for-profit enterprises which intend to be consistently profitable over a long period.

Although I have worked at, as it were, newly-founded companies with few staff, little money and one big idea, working out of adapted loft space, I don’t characterise those as “startups”. That’s because the business model was not to throw fairly random things at the wall in the hope that one of them stuck long enough to pay the bills until the next funding round arrived; repeat until the exit strategy kicks in or the money runs out—which is how I understand “startups” to work. That’s a world that I don’t understand very well because I haven’t done it.

So: some of the corporations I’ve worked at have been product companies, and some sold a service, and some were small and local and some were large and global, and plenty of variations on that. That’s the world I understand, and what I write here grows out of that understanding.

* Years ago I was asked to brief the management functions of a certain corporate entity about this new “Agile” thing that the technology division wanted to do. This was interesting.

The F&A people wanted to know when, during an iterative, incremental development process they were supposed to create the intangible asset on the balance sheet representing the value-add on the CAPEX on building the system, so that they could start amortising it.

The HR people wanted to know how, with a cross-functional, self-organizing team in place, they could do performance management and, and I quote, figure out “who to pay a bonus to and who to fire”.

I’ve recently heard of companies that link the “accuracy” (whatever that might mean…) of task estimates to bonus pay. And I agree with J.B. that it’s fucking disgusting. What I very much doubt is that fixing that state of affairs is primarily a bottom-up exercise in not estimating.

** Around that time I held—mercifully briefly—a job as a “Software Engineer” in which the ideal we were meant to strive for was that no code was created other than by forward-generation from a model in Rational Rose.

Written by keithb

December 12, 2015 at 7:32 pm

Posted in Uncategorized

Is Testing “Waste”?

leave a comment »

That is, in the technical sense used in Lean manufacturing, whose first two principles are:

  1. Specify value from the standpoint of the end customer by product family.
  2. Identify all the steps in the value stream for each product family, eliminating whenever possible those steps that do not create value.

The “steps that do not create value” are waste. If our product is, or contains a lot of, software, is the action of testing that software waste, that is, not creating value from the standpoint of the end customer?

At the time of writing I am choosing the carpet tiles for our new office. On the back of the sample book is a list of 11 test results for the carpet relating to various ISO, EN and BS standards, eg the EN 986* dimensional stability of these carpet tiles is < 0.2%—good to know! There are also the marks of Cradle to Cradle certification, GUT registration, BREEAM registration, a few other exotica and a CE mark. Why would the manufacturer go to all this trouble? Partly because of regulation: an office fitter would baulk at using carpet that did not meet certain mandatory standards. And partly because customers like to see a certain amount of testing.

Take a look around your home or office; I’ll bet you have a lot of small electrical items of various kinds. Low-voltage power supplies, in particular. Take a look at them. You will find on some the mark of Underwriters Laboratories (UL), which indicates that the manufacturer has voluntarily had the product tested by UL for safety, and maybe for other things. If you’re in the habit of taking things apart, or building things, you might also be familiar with the UL’s “recognised component” mark for parts of products. On British-made goods you might see the venerable British Standards Institution “Kite Mark”, or maybe on Canadian gear the CSA mark, on German kit one of the TÜV marks, and so on. These certifications are for the most part voluntary. Manufacturers will not be sanctioned for not obtaining these marks for their products, nor will—other than in some quite specialised cases†—anyone be sanctioned for buying a product which does not bear these marks.

Sometimes a manufacturer will obtain many marks for a product, and sometimes fewer, and sometimes none. I invite you to do a little survey of the electrical items in your office or home: how many marks does each one have? Do you notice a pattern?

I’ll bet that the more high-end a device—in the case of power supplies, the more high-end what they drive—the more marks the device will bear, and the more prestigious those marks will be. Cheaper gear will have fewer, less prestigious marks—ones that make you say “uh?!”††—and the very cheapest will have none.

If testing is waste, why do manufacturers do this?

How does your answer translate to software development?


* BS EN 986:1995—Textile floor coverings. Tiles. Determination of dimensional changes due to the effects of varied water and heat conditions and distortion out of plane.

† tanks, missiles, land-mines, leg-irons, electric cattle-prods, that sort of thing.

†† There are persistent rumours that some Chinese manufacturers of questionable business ethics have concocted a mark of their own which looks from a distance like the CE mark.

Written by keithb

November 4, 2015 at 8:41 pm

Posted in Uncategorized

Mocks

with one comment

Well, this feels like a conversation from a long time ago. This presentation, which got tweeted about, asserts that

Mocks kill TDD. [sic]

which seems bold. And also that

TDD = Design Methodology

which seems wrong. And also that

Test-first encourages you to design code well enough to test…and no further

which seems to have profoundly misunderstood TDD.

TDD

Just so we can all agree what we’re talking about, I think that TDD works like this:
repeat until done:

  • write a little test, reflecting the next thing that your code needs to do, but doesn’t yet
  • see it fail
  • make all tests—including the new one—pass, as quickly and easily as possible
  • refactor your working code to produce an improved design

I don’t see that as being a design methodology. It’s a small-scale process for making rapid progress towards done while knowing that you’ve not broken anything that was working, and which contains a publicly stated commitment to creating and maintaining a good design. There’s nothing there about what makes a good design—although TDD typically comes with guidance about well designed code being simple, well designed code lacking duplication and—often overlooked, this—well designed code being easy to change. I also often suggest that if the next test turns out to be hard to write, you should probably do some more refactoring.
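To make that loop concrete, here is one turn of it sketched in Python (an entirely made-up example, not code from any real project): a little test for the next thing the code needs to do, the quickest implementation that passes, and then a chance to refactor.

```python
# One turn of the TDD loop, in miniature. The "basket" example is
# hypothetical, chosen only to illustrate the steps.

# Step 1: write a little test for the next thing the code needs to do.
def test_totals_an_empty_basket():
    assert basket_total([]) == 0

def test_totals_line_item_prices():
    assert basket_total([("tea", 250), ("milk", 120)]) == 370

# Step 2: run it and watch it fail (basket_total doesn't exist yet).

# Step 3: make all tests pass as quickly and easily as possible.
def basket_total(items):
    # Each item is a (name, price-in-pence) pair; sum the prices.
    return sum(price for _name, price in items)

# Step 4: refactor the working code towards a better design. Here there
# is little to do, but this step is where the design work happens.

test_totals_an_empty_basket()
test_totals_line_item_prices()
```

Note that the process says nothing about what `basket_total` should look like inside; that is exactly the point made below about TDD not being a design methodology.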

Note that in TDD we don’t—or shouldn’t—test a design, that is, we shouldn’t come up with a design and then test for it. Instead we discover a design through writing tests. TDD doesn’t design for you, but it does give you a set of behaviours within which to do design. And I’m pretty sure that when followed strictly, TDD leads to designs that have measurably different properties than designs arrived at other ways. Which is why this blog existed in the first place (yes, I have been a bit lax about that stuff recently).

UPDATE: a commentator on lobste.rs (no, me neither) quotes me saying that “TDD doesn’t design for you, but it does give you a set of behaviours within which to do design.” and asks: how is TDD not a design methodology, then?! And I answer: because it doesn’t provide a vocabulary of terms with which to talk about design, it doesn’t provide a goal for design, it doesn’t provide any criteria by which a design could be assessed, and it doesn’t provide any guidance for doing design beyond this—do some, do it a little bit at a time, do it by improving the design of already working code. If that looks like a methodology to you, then OK.

But Ken does have a substantive objection to code that he’s seen written with mocks. Code which has tests like this:

A Terrible Test which Happens to use Mocks

and I certainly agree that this is a terrible test. There are far too many mocks in it, and their expectations are far too complex and far too specific. Worst of all, the expectations refer to other mocks. This is terrible stuff. You can’t tell what the hell’s going on, and this test will be extraordinarily brittle because it reaches out far too far into the system. It probably has a net negative value to the programmers who wrote it. That’s bad. Don’t do that.

Is this the fault of mocks? Not really. The code under test here wouldn’t be much different, I’ll bet, if it hadn’t been TDD’d—if this code even was TDD’d; I have my doubts, although I know people do do this sort of thing. This confusing, brittle, unhelpful test has been written with mocks, but not because of mocks. One could speculate that it was written by someone who’d got far, far too carried away with the things that mock frameworks can do, and failed to apply good taste, common sense and any kind of design sensibility to what they were doing. Is that the fault of mocks? Not really. Show me a tool that can’t be abused and I’ll show you a tool that isn’t worth having.

Other Styles of Programming

Ken, of course, has an agenda, which is really to promote a functional style of programming in which mock objects are not much help in writing the tests. I think he’s right about that and it should be no surprise as mocks are about writing tests that have something to say about what method invocations happen in what order, and as you move towards a functional style that becomes less and less of a concern. So maybe Ken’s issue with mocks is that they don’t stop you from writing non-functional code—to which I say: that doesn’t mean that you have to.

If you can move to functional programming (spoiler: not everyone can) and if your problem is one that is best solved through a functional solution (spoiler: not all of them are), then off you go, and mocks will not be a big part of your world and fair enough and more power to you. But if not…
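By way of illustration (a hypothetical sketch, nothing more): once the behaviour under test is a pure function, the test just checks values in against values out, and there is no sequence of method invocations for a mock to observe.

```python
# A pure function: no collaborators, no side effects, so nothing to mock.
# The discount example is invented purely for illustration.
def apply_discount(total_pence, rate):
    # Return the total with the given fractional discount applied,
    # rounded to the nearest penny.
    return round(total_pence * (1 - rate))

def test_discount_needs_no_test_doubles():
    # The whole observable behaviour is the return value.
    assert apply_discount(1000, 0.1) == 900
    assert apply_discount(0, 0.5) == 0

test_discount_needs_no_test_doubles()
```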

Now, I tweeted to this effect and that got Ron wondering about that kind of variation, and why it might be that Smalltalk programmers don’t use mocks when doing TDD. Ron kind-of conflates what he calls the “Detroit School” of TDD and “doing TDD in Smalltalk”, which is kind-of fair enough as Kent and he and the others developed their thinking about TDD in Smalltalk and that’s the style of TDD that was first widely discussed on the wiki and spread from there.

Ron says that he does use “test doubles” for:

“slow” operations, and operations across an interface to software that I don’t have control of

and of course mocks are very handy in those cases. But that’s not what they’re for. Ron says:

Perhaps our system relies on a slow operation, such as a database access […] When we TDD such a thing, we will often build a little object that pretends to be a database […] that responds instantly without actually exercising any of the real mechanism. This is dead center in the Mock Object territory,

Well, no. Again, you can use mocks for such tests, but you’ll only get much value from that if your test cares about, say, what the query to the database is (rather than merely using the result). And while it will make your tests go fast, that’s not the real motivation for the mock, handy as it may be.
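To sketch that distinction in Python with `unittest.mock` (the class and query here are entirely invented): the mock earns its keep when the test asserts something about what was sent to the database, not merely when it supplies a canned result.

```python
from unittest.mock import Mock

class CustomerLookup:
    """Object under test (hypothetical): fetches customers via a db collaborator."""
    def __init__(self, db):
        self.db = db

    def find(self, customer_id):
        return self.db.query(f"SELECT * FROM customers WHERE id = {customer_id}")

def test_sends_the_expected_query():
    db = Mock()
    db.query.return_value = {"id": 7, "name": "Ada"}  # canned result, stub-style
    assert CustomerLookup(db).find(7)["name"] == "Ada"
    # This is the mock-style part: the test cares what the query was,
    # not just that some result came back.
    db.query.assert_called_once_with("SELECT * FROM customers WHERE id = 7")

test_sends_the_expected_query()
```

If the test only needed the canned result, a plain stub (or a tiny hand-rolled fake) would do, fast or not.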

A Brief History Lesson

Mocks were invented to solve a very specific problem: how to test Java objects which do not expose any state. Really not any. No public fields, no public getters. It was kind-of a whim of a CTO. And the solution was to pass in a collaborating object which would test the object which was the target of the test “from the inside” by expecting to be called with certain values (in a certain order, blah blah blah) by the object under test and failing the test otherwise.

A paper from 2001 by the originators of mocks describes the characteristics of a good mock very well:

A Mock Object is a substitute implementation to emulate or instrument other domain code. It should be simpler than the real code, not duplicate its implementation, and allow you to set up private state to aid in testing. The emphasis in mock implementations is on absolute simplicity, rather than completeness. […] We have found that a warning sign of a Mock Object becoming too complex is that it starts calling other Mock Objects – which might mean that the unit test is not sufficiently local. [emphasis added]

The object under test in a mock object test is surrounded by a little cloud of collaborating mocks which are simple, incomplete and local. UPDATE: Nat Pryce reminds me that process calculi, such as CSP, had an influence on the JMock approach to mocking.
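A sketch of what that might look like in Python with `unittest.mock` (the class and method names are invented for illustration): one simple, local collaborator, a single expectation, and no mock calling another mock.

```python
from unittest.mock import Mock

class AccountReporter:
    """Object under test (hypothetical): formats a balance via a collaborating printer."""
    def __init__(self, printer):
        self.printer = printer

    def report(self, balance_in_pence):
        self.printer.print_line(f"balance: {balance_in_pence / 100:.2f}")

def test_reports_balance_through_printer():
    printer = Mock()  # simple, incomplete, local -- per the quoted paper
    AccountReporter(printer).report(1250)
    # One expectation about one collaboration; nothing reaching further
    # into the system.
    printer.print_line.assert_called_once_with("balance: 12.50")

test_reports_balance_through_printer()
```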

Ron talks about Detroit/Smalltalk TDD-ers developing their test doubles by this means:

just code such a thing up […] Generally we’d build them up because we work very incrementally – I think more incrementally than London Schoolers often do – so it is natural for our mock objects to come into being gradually. [emphasis added]

I don’t know where he gets that impression about the “London School”. In my experience, in London and elsewhere, mocks made with frameworks also come into being gradually, one expectation or so at a time. How else? UPDATE: Rachel Davies reminds me that the originators of mocking had a background in Smalltalk programming anyway.

Ron speculates that mocks are likely to be more popular amongst programmers who work with libraries that they don’t control, and I expect so. Smalltalkers don’t do that much, almost everyone else does, lots. He speculates that mocks are likely to be more popular amongst programmers who work with distributed systems of various kinds, and I expect so. Smalltalkers don’t do that much, almost everyone else does, lots. Now, if we could all write our software in Smalltalk the world would undeniably be a better place, but…

In fact, I suspect that Smalltalkers write a lot of mocks, but that these tend to develop quite naturally into the real objects. The Smalltalk environment and tools afford that well. Almost everyone else’s environment and tooling fights against that every step of the way. And Smalltalkers won’t generally use a mocking framework, although there are some super cute ones, because they don’t have to overcome the stumbling blocks that languages like Java put in the way of anyone who actually wants to get anything done.

Tools

Anyway, there’s this thing about tools. Tools have affordances, and good tools strongly afford using them the right way and weakly—or not at all—afford using them the wrong way. And there are very special purpose tools, and there are tools that are very flexible. I read somewhere that the screwdriver is the most abused tool in the toolbox, because a steel rod that’s almost sharp at one end and has a handle at the other is just so damn useful. But that doesn’t mean that it’s a good idea to use one as a chisel. I grew up on a farm and I remember an old Ferguson tractor which was started by using a (very large) screwdriver to short between the starter motor solenoid and the engine block. Also not a good idea.

That we can do these things with them does not make screwdrivers bad. And the screwdriver does not encourage us to be idiots—it just doesn’t stop us. And so it is with mocks—they are enormously powerful and useful and flexible and will not stop us from being stupid. In particular, they will not stop us from doing our design work badly. And neither will TDD.

What I think they do do, in fact, is make the implementation of bad design conspicuously painful—remember that line about the next test being hard to write? But programmers tend to suffer from very bad target fixation when a tool becomes difficult to use and they put their head down and power through, when they should really stop and take a step back and think about what the hell they’re doing.

Written by keithb

November 3, 2015 at 9:41 pm

Posted in Uncategorized