The ongoing weekend testing adventure – EWT03

A write-up of European Weekend Testing session 3 – EWT03

I had to skip the previous weekend testing session involving Bing maps (which was a pity – I absolutely adore map applications), but today I was able to participate in EWT03. It was again an interesting experience. Not too many participants this time, which made for a more cozy atmosphere. Today’s testers were Marlena Compton, Anna Baik, Markus Gärtner and myself.

The Mission: You work in a small-medium company, and your manager has been asked to evaluate switching the company over to using Google calendar. He needs a quick assessment from you before his conference call this afternoon. Use the FCC CUTS VIDS touring heuristic to guide you.

We started firing off a bunch of questions at the boss, but he was in a hurry and kind of unresponsive – something about an important golf game he had to attend. He just wanted a quick assessment of the tool; he apparently wasn’t into answering questions today. So much for questioning – we were on our own pretty soon. I wonder if there’s any way to keep an imaginary boss from leaving a virtual meeting. Still not sure about that.

We decided to divide the coverage since we started late and there was less than half an hour of testing time left. We ended up cutting the FCC CUTS VIDS into – surprise! – FCC, CUTS and VIDS (yes, sometimes you just go for the obvious). I settled for the VIDS:

  • Variability tour: look for things you can change in the application – and then try to change them.
  • Interoperability tour: what does this application interact with?
  • Data tour: identify the major data elements of the application.
  • Structure tour: find out everything you can about what comprises the physical product (code, interfaces, hardware, files, etc.).

In the meantime the boss had magically reappeared to give us an extra quarter of an hour – he probably forgot his bag of golf clubs. But I quickly realised that time would still be way too short to look at all aspects, so I concentrated on variability and interoperability – and on a list of existing bugs as well, something I remembered from the first session. In my quest for things that can be changed in the application, I settled for the system date: not really a part of the application, but still something that can change while you’re using it. Some interesting bugs ensued, including a very severe one in Skype group chat – not really part of the mission. I thought I had lost my connection, but it turned out that the order of the chat messages was totally messed up. It took a while before I realised what had happened, and I lost a good deal of time because of that. Later on, I switched to interoperability, but I also lost my internet connection for a while, which was annoying but in retrospect also pointed me to a synchronisation bug.

The debriefing, moderated by Anna, revealed some good points from the other participants. Marlena covered the FCC (feature, complexity, claims) part and did a feature tour first, which proved very helpful. She explored all functionalities to get more familiar with them before she went off to investigate other areas. She wondered how the rest of us could start testing without doing such a feature tour/exploration first. But in fact, I think we all started with a bit of general exploration. I know I did. It feels kind of natural, probing around the surface for a while before diving into the abyss of detail. Splitting the heuristic up among several testers, on the other hand, turned out to be a bit distracting, confusing and counter-productive. There is quite some overlap between the different areas, and sticking to only some of them might mean that you’re missing out on otherwise obvious problems. This led us to the use of sapience in testing: being able to pick the heuristics that are useful for the given features.

We didn’t get to brief the boss in the end, so you could say that our mission didn’t really succeed. But hey, some say that failure breeds success. Soon, things will be looking so bright we’ll have to wear shades.

Be careful about the claims you make

About heuristics and an unbreakable phone.

An oracle is a principle or mechanism that helps us recognise whether the software under test is working according to some criterion. A heuristic is a fallible way of solving a problem; it can also help us discover and explore. James Bach and Michael Bolton have already provided us with a lot of helpful heuristics. One of them is a consistency heuristic that goes by the mnemonic HICCUPPS. Michael wrote an article about his use of this particular heuristic here (pdf). The idea is that a product should be consistent with:
  • History (consistent with its past behavior)
  • Image (consistent with the image the organization wants to project)
  • Comparable Products (consistent with a similar product)
  • Claims (product behaves the way some document or person says it should – e.g. in advertisements)
  • User’s expectations (consistent with what the user wants or expects from the product)
  • Product (behavior should be consistent throughout the whole product) 
  • Purpose (consistent with its apparent purpose)
  • Statutes (behaving in compliance with legal or regulatory requirements) 

Although this is all about hardware rather than software, I was reminded of the ‘Claims’ part of this heuristic when I recently saw a video of a BBC reporter at the Consumer Electronics Show (CES) who broke the allegedly unbreakable SONIM phone on his first attempt. Examples don’t come any clearer than this. Don’t go telling people that your product is unbreakable unless you’re 100% sure it is. Then again, how can you ever be 100% sure? When you put up a fish tank for everyone to use, expect the unexpected. That’s an open invitation for creative people to think ‘out of the tank’. I guess they’ll have to revise their marketing tagline. How about ‘Unbreakable by anything but the edge of a fish tank’?

* edited typo: ‘Statutes’ instead of ‘Statuses’

Weekend Testing hits Europe – EWT01

The first European Weekend Testing session (January 16, 2010)

Last Saturday, weekend testing hit European ground. The Weekend Testing phenomenon has been the talk of testing town for a couple of months now. They originally started out as the Bangalore Weekend Testers and are gradually taking their brilliant mix of cybergathering / testing / learning / mentoring / communication / pairing to a higher level. I became intrigued when I first heard about them through a column by Fiona Charles and some namedropping at Eurostar shortly after. I liked the idea of personal and professional development, while serving the open-source community as well. My first thoughts were “what a shame we don’t have this over here” and “wouldn’t European testers like this too?”.

They do, apparently.

Markus Gärtner and Anna Baik – whom I met online in the Software Testing Club – set up the first European chapter and planned the first session with the help of Ajay Balamurugadas, a weekend testing veteran from Bangalore. On Saturday, January 16, some great testers – and me – gathered online for a unique experience. Among the participants: Markus, Anna and Ajay of course, Phil Kirkham, Jeroen De Cock, Anne-Marie Charrett, Thomas Ponnet, Maik Nogens and Jassi.

After some initial Babylonian confusion in a Skype group call, we decided to settle for a group chat. There was a round of introductions first, during which everyone stated their name, their experience and what they wanted to take away from this session. After that, Ajay (aka the great facilitator) revealed the mission and the target to us, while enigmatically stating that “there were traps”. Right then I knew that this wasn’t going to be just another round of testing. The target turned out to be a piece of open-source image-editing software by the name of Splashup. From that moment on, there was a lot of questioning going on – one of the most important testing skills. People explored, asked questions, gave feedback. Some started off on the wrong foot, but quickly recovered and learned from it. We made a lot of assumptions as well, and not always good ones. In the beginning I was a bit distracted by all the chat messages flying around while testing, but once I developed that second pair of eyes, it worked out just fine. The first hour flew by and everyone sent in their bug reports on time (yes, there *was* a deadline). Peer pressure, in a good way.

Next up was an interesting group discussion. Everyone shared their impressions, approaches, things they learned and the traps they fell into. Ajay did a great job in moderating this, and made sure all testers were involved. Here are a couple of things I took away from this first session:  

  • When confronted with testing within a really short timespan, it is good to quickly decide on your personal mission and focus, and stick to that. Try to stay focused on the areas you chose and don’t go wandering off – unless of course you accidentally open Pandora’s box and bugs are thrown at you. I stuck to the file opening and saving functions – and there was plenty to report about those.
  • I think the way we start testing largely depends on our domain knowledge as well. When confronted with a completely new product, you probably start exploring in an attempt to learn about the application first. Knowing the domain makes you less of an explorer – and more of a bug hunter, which may be dangerous – make sure you don’t block out the learning. I had worked with Splashup before, which gave me a (misplaced?) sense of ‘expertise’, so I dove in and went on an immediate bug hunt.
  • Always question the mission you get. Question your assumptions about the mission. The main goal might not always be a mere bughunt.
  • There are some great exploratory testing tools out there. I used Session Tester, which allowed me to quickly generate an HTML report at the end. It is great for note-taking too. I really like the ‘Prime Me’ function in there – great for getting new testing ideas when you’re stuck. I didn’t have to use it this time – ideas galore.
  • When working with image editing programs, I immediately tend to go into that ‘comparable product’ mode and start comparing the application with Photoshop. While that may be OK to get some idea of the expected behaviour of filters and effects, it might not be ideal for an open-source application. It makes you expect too much. Free tools often cater for other needs – their users have different expectations.
  • Afterwards we all agreed that it was strange that we didn’t pair up while testing, or that we didn’t do a short ‘who will do what’ briefing before we started. Divide and conquer, we didn’t. I think that was mainly because it was all pretty new to us. This will be easier once we get to know each other better. Another lesson learned in software testing.

As some scientist already pointed out (and actually proved too), time is relative. The planned two hours quickly became three. I have the impression that everyone was thrilled about this fresh kind of collaboration. Interacting with passionate fellow testers, talking to them, learning from them – it all made for a great experience. Sure, it is probably not always easy to attend the sessions, you actually have to plan for it – make sure the kids are entertained elsewhere. But the reward is significant.

People might argue “weekends, isn’t that supposed to be quality time away from work?” – I admit that this thought crossed my mind too. I now realise that this *is* quality time, and far away from work as well. Quality learning time. And fun fun fun – did I mention it’s lots of fun, too?

Parcel delivery frenzy

FedEx parcel delivery frenzy

When I stumbled upon this little FedEx screenshot, I couldn’t help wondering about the carbon footprint of the package involved. Sent from Ontario, California (yes, that is actually the FedEx hub in California, not Ontario, Canada) with destination ‘Netherlands AN’ – crossing the Atlantic four times in the process.

As good software testers, we tend to ask ourselves “is there a problem here?”

Ambiguous country codes, for sure. The North American branch probably interpreted ‘Netherlands AN’ as The Netherlands, while the European branch interpreted the code as the Netherlands Antilles, and after that some serious ping-ponging ensued (a toy sketch of that kind of mix-up follows below). Actually, the real destination is in the Caribbean – AN does stand for Antilles. Which makes me wonder: is there actual scanning of labels going on? If so, they’d better check their labelling/scanning/routing software. Or are employees just interpreting and sorting the countries themselves? If so, they’d better teach them some basic geography – no rest for the geographically challenged! There could also be two packages circulating with the same label, but I guess in that case the date/time figures and locations wouldn’t make much sense. Ah well… Maybe people should start thinking more about the ways programs could fail instead of just confirming the happy paths. But wait, let’s not only blame the testers – give the developers some credit as well, courtesy of Jerry Weinberg:

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.” 
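
For illustration only – I obviously have no idea what FedEx’s routing software actually looks like – here is a minimal Python sketch of how one destination string could legitimately be read in two different ways, depending on whether a system keys off the country name or off the (legacy) ISO code. Everything in it is made up for the example:

    # Hypothetical illustration - not FedEx's actual routing logic.
    # Two systems interpret the same destination string differently:
    # one matches on the country name, the other on the two-letter code,
    # and "Netherlands AN" satisfies both readings.
    ISO_CODES = {
        "NL": "Netherlands",
        "AN": "Netherlands Antilles",  # legacy ISO 3166 code for the Antilles
    }

    def route_by_name(destination: str) -> str:
        """Naive match on the country name - ignores the trailing code."""
        for name in ISO_CODES.values():
            if destination.lower().startswith(name.lower()):
                return name
        return "unknown"

    def route_by_code(destination: str) -> str:
        """Match on the two-letter code at the end of the destination string."""
        code = destination.strip().split()[-1].upper()
        return ISO_CODES.get(code, "unknown")

    destination = "Netherlands AN"
    print(route_by_name(destination))  # Netherlands -> ship it to Europe
    print(route_by_code(destination))  # Netherlands Antilles -> ship it to the Caribbean

Two perfectly defensible lookups, two different continents – which is all it takes for a parcel to start playing transatlantic ping-pong.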

Happy two thousand and System.NullReferenceException!

A comment on the Y2.01K bugs detected at Symantec and the DSVG bank group.

First of all, happy 2010 everyone!

And a happy new year for the German banks as well, although it must have been a not-so-happy one for the DSVG bank group. Yesterday, they issued a statement that some 20 million debit cards issued by the banks belonging to the group were affected by a “millennium bug”-like problem. Apparently the problem stemmed from a chip on the cards which, due to a programming fault, wouldn’t correctly process the number 2010. The group said cash machines were adjusted hours after the problem emerged to ensure that customers could withdraw money, but there may still be problems using some debit-card terminals. Those should be fixed by Monday, it said.

Monday? As in Monday, January 11th? As in “almost a week from now”? In the middle of the winter sales period, with the number of payment transactions generally hitting historical highs (who pays in cash nowadays?), that seems kind of disastrous to me.

The same problem hit Symantec as well. The Symantec Endpoint Protection Manager (SEPM) server handled all dates greater than December 31, 2009 11:59pm as “out of date”.

Back in 1999, computer experts widely believed that hardware and software systems would fail as the clocks rolled over to the year 2000, because computers and other devices that used only two digits to represent the year would mistake the year 2000 for the year 1900. In the end, however, the so-called “millennium bug” caused few problems, because a lot of companies had anticipated the turn of the millennium and thrown all possible resources at it. The computer software business was booming at the time. There were lots of new jobs. Granted, most of them were repetitive, rote programming and testing jobs, but jobs nonetheless. Freshly graduated students were sent off to boot camps to become ruthless programmers. I still dream about riding dinosaurs while yelling COBOL commands at them – but I digress. There was also a high demand for testers. A wide range of different people were transformed into software testers. Finally, companies were acknowledging the need for testing. Great! But apparently, not all testing (and programming, for that matter) was up to standard.

Now I don’t know the exact processing that happens at Symantec or in the DSVG bank terminals, but this does look like a sibling of that good old millennium bug. Maybe it was the result of a quick-and-dirty Y2K fix in which programmers put in a simple windowing rule: if the two-digit year is less than 10, treat it as 20xx; otherwise treat it as 19xx. In that case, this unwanted emergent behaviour doesn’t seem too hard to detect. Or does it? My two cents’ worth: this is what you get when you perform scripted testing based on poorly thought-out boundary value analysis (aka domain testing) without exploring the software to learn more about the risks. Or maybe they did explore, but were only focused on that one well-known boundary called 2000. In both cases, testers failed to acknowledge other boundaries in the software that a simple talk with the developers might have revealed. Michael Bolton talked about this in his Eurostar tutorial this year: the actual boundaries in a system may not be the ones we are told about – that’s why we need to explore. He also wrote this interesting article (pdf) on domain testing in Better Software magazine.
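
To make that hypothesis concrete, here is a minimal Python sketch of such a windowing fix – purely hypothetical, since I have no idea what code actually runs on the chips or in SEPM, and the pivot value of 10 is my own assumption, chosen to reproduce the reported behaviour:

    # Hypothetical quick-and-dirty Y2K "windowing" fix: two-digit years below
    # the pivot are assumed to be 20xx, everything else 19xx. A pivot of 10
    # works fine right up until January 1st, 2010.
    PIVOT = 10

    def expand_two_digit_year(yy: int) -> int:
        """Expand a two-digit year using a fixed windowing pivot."""
        return 2000 + yy if yy < PIVOT else 1900 + yy

    # Boundary values around the pivot: '09' still works, '10' silently
    # becomes 1910.
    for yy in (8, 9, 10, 11):
        print(f"'{yy:02d}' -> {expand_two_digit_year(yy)}")

Running it prints 2008, 2009, 1910 and 1911: the interesting boundary isn’t the year 2000 at all, but whatever pivot the programmers happened to pick – exactly the kind of boundary that only questioning and exploring will surface.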