Little Alchemy: A silly little test exercise

Lately I’ve been entertaining myself with a silly little mobile app called Little Alchemy 2, a simple revamp of an earlier concept game. It distracted me enough this week that I didn’t prepare a respectable blog post. While questioning my life choices, I started to convince myself that maybe I had been playing at testing this whole time. Testers, myself included, love to make analogies to science, and this is a game that simulates the precursor to modern chemistry. Of course it can be used as a tortured metaphor for software testing! Follow me on my journey…

Exploration

Screenshot of Little Alchemy

The main concept of the game is to drag icons of real-life objects on top of each other, combining them to make new items.

At first there are only a few options: Water, Earth, Fire, Air. The classics. Not hard to just play around with different combinations and see how this game works.

Fire + Water = Steam. Makes sense.

Water + Earth = Mud. Great.

Now I wonder what happens if I combine steam and mud…

After a while you start to get a sense for how this universe works.

Developing heuristics

After a while I start to see patterns. If I combine the same thing with itself, I get a bigger version of that thing; can I do that with everything? Fire seems to transform most items. When I find out how to make paper or wood, I’m probably going to try to throw those on the fire too.

Combinations quickly multiply.

Before long I have hundreds of items of all sorts. Too many to keep in my head. I start to think I’m probably missing combinations, or I’ve forgotten which combinations I’ve already tried. I know I tried putting wood into fire, but did I put wood into the campfire? Am I missing some obvious combination that’s right in front of me?

Using an Oracle

That’s when I start googling around for hints and suggestions. This part gets a bit cheaty, but at the time it was what I needed to do to get a handle on the problem and keep making progress. I found a site that would randomly show me an item, and I could use that to see if I had the pieces I needed to make it. No more guessing; I was given the right answer.

I suppose this is where I could veer off into automation, but where’s the fun in that? After a while, I started to exhaust the hint system anyway; with only random items, the ratio of new to known items started to decline. ROI went down. My oracle was getting less useful, not telling me anything new.

Brute force

There was still an intractable number of items, and I had seen enough unexpected combinations that I didn’t trust myself to reason them all out. So instead, I turned to brute force.

First item on the list. Try to combine it with every item below it. Repeat.

Now I should really think about automation, if my goal were just to find all combinations. This is a pure algorithm with finite steps, since the game promises a finite number of items. But things start to change after a few manual iterations. Happily, the game removes items that have no undiscovered combinations left, so in theory the complexity will peak and the game will start to get simpler again. (Wouldn’t it be nice if software always got simpler with time?) Moreover, a little bit of brute force starts to show patterns that I hadn’t been aware of before. I start to skip ahead: “aha! if that combination works, then I bet this other one will too…” One new strategy begets another!
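I never actually automated it, but the loop is simple enough to sketch. Here’s roughly what one brute-force pass might look like in Python; try_combine is a hypothetical hook standing in for however you’d actually drive the game, and all the other names are mine for illustration, not anything real:

```python
from itertools import combinations_with_replacement

def brute_force_pass(items, try_combine, tried):
    """One brute-force pass: attempt every unordered pair of items once.

    try_combine(a, b) is a hypothetical hook into the game; it returns
    the name of the new item, or None if the pair produces nothing.
    """
    discovered = set()
    # combinations_with_replacement yields each unordered pair exactly
    # once, including (a, a) -- self-combinations are valid moves here.
    for a, b in combinations_with_replacement(sorted(items), 2):
        pair = frozenset((a, b))
        if pair in tried:
            continue  # already attempted in an earlier pass
        tried.add(pair)
        result = try_combine(a, b)
        if result and result not in items:
            discovered.add(result)
    return discovered
```

Each pass feeds its discoveries back into the pool of items, and since the game promises a finite number of items, repeating until a pass finds nothing new is guaranteed to terminate.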

Inference and deduction

I reach a breaking point where items are being removed from play more and more often. This feels like the home stretch, but brute-forcing it will still take ages. Often, though, an item only has one or two combinations left to discover before it’s exhausted. I use this to my advantage.

Enter another oracle of sorts; this time, it’s the documentation of everything I’ve done so far. For any item still on the board, the game tells me how many undiscovered combinations it has, which items have been used to make it, and all the other items it has been used to make so far. This is all data I can analyse to look for yet more patterns, and to spot gaps in patterns I’ve missed so far. The rate at which I clear items off the board goes way up, and I’m still using the same manual interactions I’ve used all along.
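If I were scripting that analysis (I’m not), the prioritisation might look something like this; the record fields are my assumptions based on what the game displays, not any real API:

```python
from dataclasses import dataclass, field

@dataclass
class ItemRecord:
    """Mirrors what the game reports for an item still on the board."""
    name: str
    undiscovered: int  # combinations left to find
    made_from: set = field(default_factory=set)     # items combined to make it
    used_to_make: set = field(default_factory=set)  # items it has produced

def best_targets(board):
    """Items with the fewest undiscovered combinations come first.

    An item with only one or two combinations left is the cheapest to
    clear off the board, so that's where the next guesses should go.
    """
    return sorted(board, key=lambda item: item.undiscovered)
```

Nothing fancy: just sort by whatever is closest to being exhausted and spend guesses there first.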

End game

I’m not there yet. Do I still have more to learn? Is another strategy going to make itself obvious before I finish the game, or an old one resurface as the dynamics change?

And what am I going to explore next?

My first game of TestSphere

Today (as I write this; last week as it is published) I had my first experience playing TestSphere. I’ve had a deck for ages but only recently suggested trying to play it with the QA community of practice in my department. Going from never having played it at all to facilitating a session with a whole group was quite a leap, and I wasn’t at all sure how it would go. Here are some of my observations about the experience.

TestSphere cards laid out on a table

Seven thoughts about TestSphere

1. Ten’s a crowd: The weekly meeting of the group usually has anywhere from 4 to 16 people attending, with the typical number around 12. I planned on playing the standard game, which the box says is best for 4 to 8 people. I was prepared to split us into two groups if needed, but in the end I tried playing with the full group of 10 that came that day.

2. One for all or a bunch for each: The instructions say to reveal one or more cards depending on the experience level of the group, though it’s not clear to me which way those should correlate. I decided to go with one card of each colour so there would be a variety of types of things to think about. This turned out to be exactly the wrong number. Though I deliberately put us at a small table, people still had to pick up cards from the middle to read them. As soon as we started, 5 people were reading cards and 5 people were doing nothing. Should I do this again, I would try one extreme or the other: 1 or 2 cards that the whole group could focus on together, or 3-5 cards each to think about independently and have people play cards from their own hand. In the latter case I can then imagine combo play (“I have a card that applies to that story!” or “I have an experience with that too, plus this other concept from my hand”) but let’s not get carried away.

3. Combining cards: Nobody attempted to combine multiple cards into a single story, which I thought would be part of the fun of trying to “win”. This may have just been because people were passing cards around one at a time rather than looking at them as a group. I suspect it would have been easier to combine cards with fewer people, or with people who were already familiar with the cards.

4. Minimalism: We didn’t make use of most of the text on the cards. The examples are great and really show the amount of good work Beren Van Daele and the MoT put into designing the deck, but it was just too much to make use of in this format. While the extra text is useful to fully understand the concept, a minimal deck with just the concept, slogan, and a simple graphic might be less intimidating. (The Easter egg here is that Minimalism is one of the cards we talked about in our group today; going back and reading the card again, I’m really torn by this, since the examples really do illuminate the concept in a way the slogan alone doesn’t, and the three are so different from each other that even limiting it to one wouldn’t be quite the same.)

5. Waiting patiently: The group naturally developed a pattern of picking up new cards as soon as they came up and holding on to them until it was their turn to tell a story. I wouldn’t say that I expected it to be a raucous fight for cards and who got to tell their story first, but I didn’t expect it to be so calm and orderly either. Once or twice this resulted in someone who had picked up a card just to read it getting stuck telling a story about that card, whether they meant to or not.

6. Everybody had a story: The energy of the game varied quite a bit depending on who was speaking. Some people are just better storytellers or more comfortable with public speaking than others. Nonetheless, I was quite happy that nobody dominated the conversation too much, and by the end everybody had shared at least once. I had laid out a rule at the beginning that if two people had a story to share, we would defer to whoever hadn’t spoken yet, but we only had to invoke it once.

7. My QA is not your QA: Several times I was surprised by the stories people told given the card they picked up, often struggling to see what the connection was. To me this illustrates how differently people think, which would keep this interesting to play with another group of people. Not only that, but the cards would likely work quite well outside of QA circles too. At one point we had only one person left who hadn’t collected any cards yet. “I’m a developer,” he said, “I only have developer stories.” But when prompted he was able to pick up a card just as easily as anybody else.

The forgotten debrief

In the end, we shared about 15 stories in 50 minutes. Overall I think it was a good experience, and it was a neat way to hear more about everybody’s experiences on other teams. Unfortunately I didn’t manage time well and we got kicked out of the meeting room before I had a chance to debrief with anybody about their experience with the game. Some ideas for focus questions I had jotted down (roughly trying to follow an ORID model) were:

  1. What are some of the concepts and examples that came up on the cards?
  2. Were there concepts someone else talked about that you also had a story for? Were any concepts totally new to you?
  3. Did anything surprise you about the experiences others shared? What did you learn about someone that you didn’t know before? What did or didn’t work well about this experience?

and finally:

  4. Would you play again?