Agile Testing book club: Let them feel pain

This is the second part is a series of exercises where I highlight one detail from a chapter or two of Agile Testing by Janet Gregory and Lisa Crispin. Part one of the series can be found here. This installment comes from Chapter 3.

Let them feel pain

This chapter is largely about making the transition into agile workflows, and the growing pains that can come from that. I’ve mentioned before on this blog that when I went through that transition, I worried about maintaining the high standard of testing that we had in place. The book is coming from a slightly different angle, of trying to overcome reluctance to introducing good quality practices, but the idea is the same. This is the sentence that stuck out most to me in the whole chapter:

Let them feel pain: Sometimes you just have to watch the train wreck.

I did eventually learn this lesson, though it took probably 6 months of struggling against the tide and a tutorial session by Mike Sowers at STAR Canada on metrics before it really sunk in. Metrics are a bit of a bugaboo in testing these days, but just hold your breath and power through with me for a second. Mike was going over the idea of “Defect Detection Percentage”, which basically just asks what percentage of bugs you caught before releasing. The usefulness of it was that you can probably push it arbitrarily high, so that you catch 99% of bugs before release, but you have to be willing to spend the time to do it. On the other end, maybe your customers are happy with a few bugs if it means they get access to the new features sooner, in which case you can afford to limit the kinds of testing you do. If you maintain an 80% defect detection percentage and still keep your customers happy, it’s not worth the extra time testing it’d take to get that higher. Yes this all depends on how you count bugs, and happiness, and which bugs you’re finding, and maybe you can test better instead of faster, but none of that is the point. This is:

If you drop some aspect of testing and the end result is the same, then it’s probably not worth the effort to do it in the first place.

There are dangers here, of course. You don’t want to drop one kind of testing just because it takes a lot of time if it’s covering a real risk. People will be happy only until that risk manifests itself as a nasty failure in the product. As ever, this is an exercise of balancing concerns.

Being in a bad spot

Part of why this idea stuck with me at the time was that the rocky transition I was going through left me in a pretty bad mental space. I eventually found myself thinking, “Fine, if nobody else cares about following these established test processes like I do, then let everybody else stop doing them and we’ll see how you like it when nothing works anymore.”

This is the cynical way of reading the advice from Janet and Lisa to “let them feel pain” and sit back to “watch the train wreck”. In the wrong work environment you can end up reaching the same conclusion from a place of spite, much like I did. But it doesn’t have to come from a negative place. Mike framed it in terms of metrics and balancing cost and benefit in a way that provided some clarity for an analytical mind like mine, and I think Lisa and Janet are being a bit facetious here deliberately. Now that I’m working in a much more positive space (mentally and corporately) I have a much better way of interpreting this idea: the best motivation for people to change their practices is for them to have reason to change them.

What actually happened for us when we started to drop our old test processes was that everything was more or less fine. The official number of bugs recorded went down overall, but I suspect that was as much a consequence of the change in our reporting habits in small agile teams as anything else. We definitely still pushed bugs into production, but they weren’t dramatically worse than before. What I do know for sure is that nobody came running to me saying “Greg you were right all along, we’re putting too many bugs into production, please save us!”

If that had happened, then great, there would be motivation to change something. But it didn’t, so we turned our attention to other things entirely.

Introducing change (and when not to)

When thinking about this, I kept coming back to two other passages that I had highlighted earlier in the same chapter:

If you are a tester in an organization that has no effective quality philosophy, you probably struggle to get good quality practices accepted. The agile approach will provide you with a mechanism for introducing good quality-oriented practices.

and also

If you’re a tester who is pushing for the team to implement continuous integration, but the programmers simply refuse to try, you’re in a bad spot.

Agile might provide a way of introducing new processes, but it doesn’t mean that anybody is going to want to embrace or even try them. If you have to twist some arms to get a commitment to try something new for even one sprint, if it doesn’t have a positive (or at least neutral) impact you better be prepared to let the team drop it (or convince them that the effects need more time to be seen). If everybody already feels that the deployment process is going swimmingly, why do you need to introduce continuous integration at all?

It might be easy when it’s deciding not to keep something new, but when already established test processes were on the line, this was a very hard thing for me to do. In a lot of ways it was like being forced to admit that we had been wrong about what testing we needed to be doing, even though all of it had been justified at one time or another. We had to realize that certain tests were generating pain for the team, and the only way we could tell if that was really worth it was to drop them and see what happens.

The take away

Today I’m in a much different place. I’m no longer coping with the loss or fragmentation of huge and well established test processes, but rather looking at establishing new processes on my team and those we work with. As tempting as it is to latch onto various testing ideas and “best” practices I hear about, it’s likely wasted effort if I don’t first ask “where are we feeling the most pain?”

Testing like you’re laughing

I was in a brainstorming meeting recently. The woman running the meeting started setting up an activity by dividing in the board into several sections. In one, she wrote “Lessons Learned” and in a second she wrote “Problem Areas”. The idea was that we’d each come up with a few ideas to put into each category and then discuss.

I immediately asked, “What if one of the lessons I learned is that we have a problem area?”

To her credit, she gave a perfectly thoughtful and reasonable answer about how to differentiate the two categories. The details don’t matter; what was important was that others in the room started joking that, as the “only QA” in the room, I immediately started testing her activity and trying to break it. This was all in good fun, and I joked along saying “Sorry, I didn’t mean to be hard on you right away.”

“You’re the QA,” she said, “It’s your job!”

This tickled an idea in the back of my mind but it didn’t come to me right away. Later that day, though, I realized what the answer to that should have been:

“As long as I’m QA-ing with you, not at you.”

Footnote: There’s nothing significant in the use of “QA” over “testing” here; I’m using “QA” only because that’s the lingo used where I am. It works just as well if you replace “QA” with “tester” and “QA-ing” with “testing”, whether or not you care about the difference.

The Greg Score: 12 Steps to Better Testing

Ok, I’ll admit right off the bat that this post is not going to give you 12 steps to better testing on a silver platter, but bear with me.

A while back, I was trying to figure out a way for agile teams without a dedicated tester or QA expert on their team to recognize bottlenecks and inefficiencies in their testing processes. I wanted a way of helping teams see where they could improve their testing by sharing the expertise that already existed elsewhere in the company. I had been reading about maturity models, and though they can be problematic—more on that later—it lead me to try to come up with a simple list of good practices teams could aim to adopt.

When I started floating the idea with colleagues and circulating a few early drafts, a friend of mine pointed out that what I was moving towards was a lot like a testing version of the Joel Test:

The Joel Test: 12 Steps to Better Code

Now, to be clear, that Joel Test is 18 years old, and it shows. It’s outdated in a lot of ways, and even a little insulting (“you’re wasting money by having $100/hour programmers do work that can be done by $30/hour testers”). It might be more useful as a representation of where software development was in 2000 than anything else, but some parts of it still hold up. The concept was there, at least. The question for me was: could I come up with a similarly simply list of practices for testing that teams could use to get some perspective on how they were doing?

A testers’ version of the Joel Test

In my first draft I wrote out ideas for good practices, why teams would adopt it, and examples of how it would apply to some of the specific products we worked on. I came up with 20-30 ideas after that first pass. A second pass cut that nearly in half after grouping similar things together, rephrasing some to better expose the core ideas, and getting feedback from testers on a couple other teams. I don’t have a copy of the list that we came up with any more, but if I were to come up with one today off the top of my head it might include:

  1. Do tests run automatically as part of every build?
  2. Do developers get instant feedback when a commit causes tests to fail?
  3. Can someone set up a clean test environment instantly?
  4. Does each team have access to a test environment that is independent of other teams?
  5. Do you keep a record of tests results for every production release?
  6. Do you discuss as a team how features should be tested before writing any code?
  7. Is test code version controlled in sync with the code it tests?
  8. Does everybody contribute to test code?
  9. Are tests run at multiple levels of development?
  10. Do tests reliably pass when nothing is wrong?

I’m deliberately writing these to be somewhat general now, even though the original list could include a lot of technical details about our products and existing process. After I left the company, someone I had worked with on the idea joked with me that they had started calling the list the “Greg Score”. Unfortunately the whole enterprise was more of a spider than a starfish and as far as I know it never went anywhere after that.

I’m not going to go into detail about what I mean about each of these or why I thought to include it today, because I’m not actually here trying to propose this as a model (so you can hold off on writing that scathing take down of why this is a terrible list). I want to talk about the idea itself.

The problem with maturity models

When someone recently used the word “mature” in the online community in reference to testing, it sparked immediate debate about what “maturity” really means and whether it’s a useful concept at all. Unsurprisingly, Michael Bolton has written about this before, coming down hard against maturity models, in particular the TMMi. Despite those arguments, the only problem I really see is that the TMMi is someone else’s model for what maturity means. It’s a bunch of ideas about how to do good testing prioritized in a way that made sense to the people writing it at the time. Michael Bolton just happens to have a different idea of what a mature process would look like:

A genuinely mature process shouldn’t emphasize repeatability as a virtue unto itself, but rather as something set up to foster appropriate variation, sustainability, and growth. A mature process should encourage risk-taking and mistakes while taking steps to limit the severity and consequence of the mistakes, because without making mistakes, learning isn’t possible.

— Michael Bolton, Maturity Models Have It Backwards

That sounds like the outline for a maturity model to me.

In coming up with my list, there were a couple things to emphasize.

One: This wasn’t about comparing teams to say one is better than another. There is definitely a risk it could be turned into a comparison metric if poorly managed, but even if you wanted to it should prove impossible pretty quickly because:

Two: I deliberately tried to describe why a team would adopt each idea, not why they should. That is, I wanted to make it explicit that if the reasons a team would consider adopting a process didn’t exist, then they shouldn’t adopt it. If I gave this list to 10 teams, they’d all find at least one thing on it that they’d decide wasn’t important to their process. Given that, who cares if one team has 2/10 and another has 8/10, as long as their both producing the appropriate level of quality and value for their contexts? Maybe the six ideas in between don’t matter in the same way to each team, or wouldn’t have the same impact even if you did implement them.

Third: I didn’t make any claims that adopting these 10 or 12 ideas would equate to a “fully mature” or “complete” process, they were just the top 10 or 12 ideas that this workgroup of testers decided could offer the best ROI for teams in need. It was a way of offering some expertise, not of imposing a perfect system.

Different models for different needs

This list doesn’t have everything on it that I would have put on it two years ago, and it likely has things on it that I’ll disagree with two years from now. (Actually I wrote that list a couple days ago and I am already raising my eyebrow at a couple of them.) I have no reason to expect that this list would make a good model for anybody else. I don’t even have any reason to expect that it would make a good model for my own team since I didn’t get their input on it. Even if it was, I wouldn’t score perfectly on it. If you could then that means the list is out of date or no longer useful.

What I do suggest is to try to come up a list like this for yourself, in your own context. It might overlap with mine and it might not. What are the key aspects of your testing that you couldn’t do without, and what do you wish you could add? It would be very interesting to have as many testers as possible to write their own 10-point rubric for their ideal test process to see how much overlap there actually is, and then do it again in a year or two to see how it’s changed.


Agile Testing book club: Everyone is a Tester

If you’ve even dipped your toe into the online testing community, there’s a good chance that you’ve heard Agile Testing by Janet Gregory and Lisa Crispin as a recommended read. A couple weeks ago I got my hands on a copy and thought it would be a useful exercise to record the highlights of what I learn along the way. There is a lot in here, and I can tell that what resonates will be different depending on where I am mentally at the time. What I highlight will no doubt be just one facet of each chapter, and a different one from what someone else might read.

So, my main highlight from Chapters 1 and 2:

everyone is a tester

Janet and Lisa immediately made an interesting distinction that I would never have thought of before: they don’t use “developer” to refer to the people writing the application code, because everybody on an agile team is contributing to the development. I really like emphasizing this. I’m currently in an environment where we have “developers” writing code and “QA” people doing tests, and even though we’re all working together in an agile way, I can see how those labels can create a divide where there should not be one.

Similarly surprising and refreshing was this:

Some folks who are new to agile perceive it as all about speed. The fact is, it’s all about quality—and if it’s not, we question whether it’s really an “agile” team. (page 16)

The first time I encountered Agile, it was positioned by managers as being all about speed. Project managers (as they were still called) positioned it as all about delivering something of value sooner than possible otherwise, which is still just emphasizing speed in a different way. If asked myself, I probably would have said it was about being agile (i.e., able to adapt to change) because that was the aspect of it that made it worth adopting compared to the environment we worked in before. Saying it’s all about quality? That was new to me, but it made sense immediately, and I love it. Delivering smaller bits sooner is what lets you adapt and change based on feedback, sure, but you do that so you end up with something that everyone is happier with. All of that is about quality.

So now, if everybody on the team should be counted as a developer, and everything about agile is about delivering quality, it makes perfect sense that main drive for everybody on the team should be delivering that quality. The next step is obvious: “Everyone on an agile team is a tester.” Everyone on the team is a developer and everyone on the team is a tester. That includes the customer, the business analysts, the product owners, everybody. Testing has to be everybody’s responsibility for agile-as-quality to work. Otherwise how do you judge the quality of what you’re making? (Yes, the customer might be the final judge of quality means to them, but they can’t be the only tester any more than a tester can be.)

Now, the trick is to take that understanding and help a team to internalize it.

What the Bug!? (An attempt at knowledge sharing, two ways)

When I was making the transition from waterfall style projects to agile teams in a previous company, one of the main things I struggled with was the loss of the testing team as we all became generic “software engineers”. We all still did the same testing tasks, but none of us were “testers” any more. There were a lot of positive effects from the change, but I kept feeling like without dedicated testers focused on improving our testing craft, we’d stagnate.

What I was missing, as I eventually realized, was a testing community. A recent episode of the AB Testing podcast talked all about building a community of testers, which brought all of this back to mind. A few of the ideas Alan and Brent talked about I had actually tried at the time. One in particular was the idea of highlighting major fails of the week in a newsletter, even offering prizes for the “winner”. Back in 2016 I heard a similar idea at a STARCanada talk, where the engineering group at AppFolio would send an email to everybody in the company for every bug they found describing what was learned, again emphasizing that finding these bugs was a positive thing, not a blame game. (Sorry I can’t find now who gave that talk; if it was you let me know!)

The reason the idea stuck with me at the time was primarily because I had started to notice that as our newly agile teams specialized in subsets of what was previously a monolithic product, we started to loose visibility on bugs that other teams ran into. Different teams were getting bit by the same issues. It didn’t help that the code base was also old enough that newer people would run into old bugs, spend potentially hours debugging it, and then hear “oh yeah, that’s a known problem.” (My response to that was to scream “It isn’t known if people don’t know it!” silently to myself.)

Here’s what I tried to do about it:

What The Bug!?
Honestly, the part of the idea I’m most proud of might be the name

I wanted teams to start talking more about the bugs they found, both so that others could learn from them and so that we could all tune our spidey senses to the sorts of issues that were actually important. There wasn’t much appetite for an email newsletter—people didn’t seem to read the newsletters the company already had—but we ended up trying two alternatives, one of which was pretty successful and one of which really wasn’t.

Building A knowledge base

The first idea was to solicit short and easily digestible details about every production bug that got logged into our bug tracker. Anybody who logged a bug would get an email asking them to answer three questions. The key was that the answers had to be short—think one tweet—and written at a level that any developer in the company would be able to understand. Bonus points for memorable descriptions. The questions were roughly:

  1. What was the bug?
  2. What was important about it?
  3. What one lesson can we learn from it?

The answers were linked back to the original ticket and tagged with a bunch of meta data like which team found it, the part of the system it was found in, what language the code was in, and any other keywords that would make it easy to find again. The idea was that if I was going to start writing something in a new language or working on a new part of the system, I could go look up the related tags and immediately find a list of easily digestible problems that I should stay alert for. I think it was an okay idea, but there were issues.

First problem: People were pretty bad at writing descriptions of bugs that were short, but also useful and interesting. It not only took a lot of creativity, but in order to do it well you also really had to examine what was important about the bug in the first place. The example I used as a bad answer to the second question was “This caused an error every Tuesday”; What caused what kind of error every WHY TUESDAY!? This was especially problematic for the third question, where often the answers that came back were “testing is important” or “we’ll test this next time”. True, but shallow. What I was really hoping for would have been “There are different kinds of Unicode characters, so always consider testing different character planes, byte lengths, and reserved code points”. To really make the knowledge base as useful as it could have been, it would have needed committed editors who would talk to the people involved and craft a really good summary with them.

Second problem: The response rate was pretty lousy. It might have been that targeting every production bug was just too much. Not everybody is going to see something interesting in every bug, and not everybody is as interested in learning from them. That was part of the culture that I wanted to change—I wanted everybody else to be as excited about these bugs as I was!—but it wasn’t going to happen over night.

Third problem: It might seem minor, but the timing of the email prompt coming the day after the bug was logged was often just too soon to have digested what the real lesson learned was. This turned up as a problem as I asked around about why people weren’t taking part. They just didn’t have the answers yet.

All of this created a chicken-and-egg problem. Until people saw the value of this project, saw interesting summaries, and got excited to contribute their own experiences, we wouldn’t get the content we needed to build that interest or excitement. And, in all honesty, though there was a conscious effort with this to make an accessible library of bugs compared to the technical JIRA tickets, we were still basically asking people to log a bug in one database every time they logged a bug in another database. We needed something more active and engaging.

Welcome to the What The Bug!? Show

At the time we already had a weekly all-hands meeting every Friday afternoon where anybody could contribute a segment to demo something, talk about something interesting, or anything else they wanted. I was doing short segments on quality and testing topics every few weeks to try to promote testing in general, but it was largely a one-man show. A fellow tester/developer that I was working with on the What The Bug!? knowledge base had the idea to take just a couple minutes each week to present our favourite bugs of the week.

Turning a passive library of bugs into a weekly road show was a big success. We were basically getting the benefit of a bug-of-the-week newsletter with the added bonus of an already established live audience. Again, the branding helped to sell it, and because the two of us both embraced the idea that any production bug could be turned into an 2 minute elevator pitch for something interesting to learn, we were able to make pretty fun presentations. We at least got laughs and had people thinking about testing if only for 5 minutes, but the real highlight for me was that afterwards I had people come up to me and say “you know, I found something last week that would make a good segment next time.”

This was possible in part because she and I had both spent that time soliciting user-friendly descriptions from people logging bugs, so we knew what had been going on across all the teams for the last couple weeks. We had a lot to choose from for the feature. It would have been harder to do without that knowledge base, being limited to only the bugs we knew from the teams we worked directly with. I suspect, though, that once the weekly What The Bug!? segments built up enough momentum that people took on presenting their own bugs, the need for the email prompts and the knowledge base would fade away. An archive of the featured bugs could still be a useful resource for new people coming on board, but it would no longer be the primary driver.

Where are things now

I left the company shortly after starting What The Bug!?, but I recently had a chance to check in with an old colleague and inquire about where these two manifestations of the idea ended up. Unsurprisingly in retrospect, the knowledge base has largely fizzled. The combination of asking people to volunteer extra writing work and the low ROI of a poorly written archive doomed it pretty early on. On the flip side, bugs are still a regular topic at those weekly meetings, although I’m sorry to report that they dropped the What The Bug!? branding. If anybody else wants to try something similar, feel free to use the name, all I ask for is a hat tip. (A quick google of the phrase only turns up some etymologists and an edible insect company, but if someone else had the same idea in the context of software, do let me know and I’ll pass the hat tip on.)