How Often Do You Really Fix A "Failing" Automated Check?

Do we really fix a failing automated check? Or do we simply make one defunct and create a new one?

I saw a tweet the other day from an old colleague of mine.
It got me thinking: how often do we really fix a failing automated check? By 'fix' in this instance, my thoughts started with getting it passing again, getting it green. Even though I prefer not to talk about passing and failing, for the context of this post, passing means the check satisfied its algorithm and failing means it didn't.

Lots of discussion followed on Twitter after I tweeted, "You very rarely 'fix' a 'failing' automated check" (https://twitter.com/FriendlyTester/status/592972631705559040), but I'm going to try to summarise the thoughts here.

So let's run through an example. I have a sign-up form with 10 fields on it, and I also have many checks that are all passing on this form. A new field has been introduced to the form, a mandatory field, and this has caused 5 of the checks to fail. In order to get those checks passing again, they need to be instructed to populate the mandatory field. Does this class as fixing the check?

I don't believe so. For me this is a new check, it just so happens to reuse 95% of the code from a now defunct check. So we haven't fixed it, we have created a new one. A semantic argument, some of you may be thinking, but I don't think so. For me, it's a very important distinction.
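
To make that distinction concrete, here is a minimal sketch of the sign-up example, assuming Selenium-style UI checks written in Python. The URL, locators and data are all invented for illustration; the point is that the one-line "fix" changes the algorithm the check expresses, so it is really a new check that happens to share most of its code.

```python
# A hypothetical Selenium check illustrating the sign-up example.
# URL, locators and data are invented for this sketch.
from selenium import webdriver
from selenium.webdriver.common.by import By


def check_sign_up_with_valid_details(driver):
    # Original algorithm: "valid details in the existing fields
    # should create an account".
    driver.get("https://example.com/signup")
    driver.find_element(By.ID, "first-name").send_keys("Rich")
    driver.find_element(By.ID, "email").send_keys("rich@example.com")
    # ...the other existing fields...
    driver.find_element(By.ID, "submit").click()
    assert "Welcome" in driver.page_source


def check_sign_up_with_valid_details_including_referral(driver):
    # The "fixed" version: one extra line of code, but a different
    # algorithm: "valid details in all fields, including the new
    # mandatory one, should create an account". Mostly the same code,
    # but a new check.
    driver.get("https://example.com/signup")
    driver.find_element(By.ID, "first-name").send_keys("Rich")
    driver.find_element(By.ID, "email").send_keys("rich@example.com")
    driver.find_element(By.ID, "referral-code").send_keys("ABC123")  # new mandatory field
    driver.find_element(By.ID, "submit").click()
    assert "Welcome" in driver.page_source


if __name__ == "__main__":
    driver = webdriver.Chrome()
    try:
        check_sign_up_with_valid_details_including_referral(driver)
    finally:
        driver.quit()
```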

How often have you been in this situation: the build goes red, someone investigates, 5 checks are failing, and then you hear "fix those checks!"? If you haven't, replace 'check' with 'test' and try again. It's certainly something I have experienced multiple times.

Someone then runs the checks locally and immediately sees that the new field isn't being populated: "Oh look, it's not populating the mandatory field, I'll make it do that." They do it, run them, and they all pass! Job done...

Here lies the problem: how do they know that mandatory field is even meant to be there? The likelihood is they did know, which then leads to the question, why weren't the checks "fixed" as part of the initial development work? One problem could be the separation of coding the feature and creating automated checks, which happens in a lot of places, especially where 'testing' is a separate activity. It could be that they run the checks afterwards and then fix where required. But I feel it's because teams don't stop to think about the impact changes will have on the checks up front. Once the work is done, the checks are run like a safety net, failures investigated and 'fixed'.

So what's my gripe here? Well, I feel more people need to give their use of automated checks more focus. You could write the cleanest automation possible, but if you don't know what your checks are checking, if anything at all, what use is that to you? It's like building the wrong product right. We should be thinking at the time of developing new features: what new checks are going to be needed here, and importantly, what checks are not? Are the existing checks now checking enough, and what else should they be looking at?

Checking can be an important part of an approach to testing, evidently very important at companies that have created hundreds or thousands of automated checks. If you are going to rely on automated checks that much, then you need to apply a lot of critical thinking to which checks you create and continue to maintain. As I have repeated many times, your automated checks will only check what they are instructed to check; they are completely blind to everything else.

Designing effective checks (by effective I mean checks that add value to your testing and support you in tackling your testing problem) can be a difficult process and requires a lot of skill, though some checks are more obvious than others. It isn't something someone who doesn't understand testing should be doing. Now, turning those designs into automated checks, sure, I could see that being done by someone who doesn't understand testing, though in my opinion it would be far from ideal.

The reason I say it's far from ideal relates to this tweet from the start of the year.
Now of course this depends on your framework of choice and the architecture you are building on, but the creation of automated checks can be a process of much exploration. It can also be a case of adding some keywords to a spreadsheet, but even in that scenario much exploration would already have been done.

The application needs to be explored at a level you may not yet have explored it. In the example of the web, the HTML needs to be explored for testability: is this a site I can easily automate? At the API level we may discover JSON elements that we have no idea what they do, or where they come from; we need to work these things out, we need to test. Also, as we are automating a check we may become aware of other attributes/values that we should be checking, and adjust the check accordingly. Again though, this requires thought, requires skill. There is also the process of testing the automated check, something I have previously written about here.
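
As a hedged illustration of that API-level exploration, the sketch below pokes at a hypothetical JSON endpoint with Python's requests library, prints the fields that come back so we can work out what they are, and then extends the check with an extra assertion once an attribute is understood. The endpoint, field names and expected values are assumptions invented for the example, not anything from the post.

```python
# Exploring a hypothetical API endpoint before (and while) automating a check.
# The URL, fields and expected values are invented for this sketch.
import requests


def explore_new_account_endpoint():
    # Exploration: what does the response actually contain?
    response = requests.get("https://example.com/api/accounts/123")
    payload = response.json()
    for key, value in payload.items():
        print(f"{key}: {value!r}")  # which of these matter? where do they come from?


def check_new_account_defaults():
    # The check, extended after exploration: we discovered a 'status'
    # attribute and decided it was worth asserting on as well.
    response = requests.get("https://example.com/api/accounts/123")
    payload = response.json()
    assert response.status_code == 200
    assert payload["email"] == "rich@example.com"
    assert payload["status"] == "pending_verification"  # added after exploring


if __name__ == "__main__":
    explore_new_account_endpoint()
    check_new_account_defaults()
```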

I feel myself going slightly off topic, so let's try to wrap this up. Testing encompasses checking (James Bach & Michael Bolton, http://www.satisfice.com/blog/archives/856); your automated checks should be supporting your testing, helping you tackle your testing problem. Regularly review your automated checks, don't be afraid to delete them, and always be evaluating their effectiveness. If you find yourself 'fixing' lots of automated checks, take the time to stop and think about what you are really doing:
  1. How could the situation have been avoided?
  2. Could it have been done earlier?
  3. What is this check even checking?
  4. I have "fixed" this many times already, is this an effective check?
  5. What information is this checking giving me?
Don't chase green, chase automated checks that support your testing problem. Don't blindly "fix" automated checks. Also, for another post, something that we discussed in my automation workshop at LetsTest and Nordic Testing Days recently: do your checks clearly indicate their intention? Sure, we can read the code and see what it does, see what it is checking, but what about its original intention, where is that? I'll be writing about this soon; your "fix" may actually break the algorithm, and therefore misdirect testing effort.

Remember, the code isn't the check; the check is the algorithm. The code is the implementation of it. They may not always align, especially over time and with lots of fixes. Focus on both.
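
One way to keep the algorithm visible alongside the implementation (just a sketch of the idea, not something the post prescribes) is to state the check's intention in a docstring, so a future "fix" can be judged against what the check was meant to do. The check name, locators and message below are hypothetical.

```python
# A hypothetical check whose intention is recorded next to its implementation,
# so the algorithm and the code can be compared when someone comes to "fix" it.
from selenium.webdriver.common.by import By


def check_sign_up_rejects_missing_referral_code(driver):
    """Intention: submitting the sign-up form without the new mandatory
    referral code must be rejected with a validation message.

    If a later "fix" makes this pass by populating the field, the
    algorithm has changed and this is no longer the same check.
    """
    driver.get("https://example.com/signup")
    driver.find_element(By.ID, "first-name").send_keys("Rich")
    driver.find_element(By.ID, "email").send_keys("rich@example.com")
    # Deliberately leave the referral code empty.
    driver.find_element(By.ID, "submit").click()
    assert "Referral code is required" in driver.page_source
```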

P.S. Thanks to Maaret and Toby for their posts, here and here respectively. I intended to think about this more back at the time of the original tweet; their blogging gave me the nudge I needed.

P.P.S. I should add that I believe it's OK to say you are fixing a check if you are changing the implementation of the algorithm, as long as those changes don't alter the original algorithm. Such as changing the data, updating some locators, or even the URLs. Things along those lines.
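
To illustrate that last point with another hedged sketch: if the page changes the id of an existing field, updating the locator keeps the algorithm intact, so calling that a "fix" seems fair. The locators here are, again, invented for illustration.

```python
# The same hypothetical step before and after a legitimate "fix":
# the algorithm is untouched, only an implementation detail changed.
from selenium.webdriver.common.by import By

# Before the page rework, the email field was found by id="email":
#     driver.find_element(By.ID, "email").send_keys("rich@example.com")


def fill_email(driver, address):
    # After the rework the id became "email-address". Updating the locator
    # changes the implementation, not the algorithm, so this genuinely is
    # fixing the check rather than creating a new one.
    driver.find_element(By.ID, "email-address").send_keys(address)
```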

3 comments:

  1. Greetings Rich,

    To me there are two things that might need fixing:

    1) The intent of the check has changed i.e. the test idea behind the check is no longer the same
    2) The plumbing under the covers needs modifying somehow

    In any given situation when your check is failing, it could be either situation or both that need to be addressed. In your example, it sounds like you're doing 2) only. Which seems valid.

    I'm not sure if that helps any but I thought I'd share.

    Vernon

    Replies
    1. So number 1 is exactly what I am talking about. I don't believe you are fixing the check in that situation; you would delete it and create a new one.
      Number two I didn't address; I will add something at the end. Essentially though, if the algorithm stays the same but the implementation changes, then that is OK, I would class that as fixing. E.g. changing locators or data.

  2. I think this is a good post, assuming the perspective that you just want the checks to pass so you can move on to other tasks, or that people have the time to watch checks run to figure out what's going on.

    There's a better way, but it requires tracking business requirements of the product, say, in a list. A list allows the team to track behavior items, prioritize them, relate them to automated checks and tests and record and report on what's working and what's not. This can take the place of documentation for checks and even for the simpler test cases.

    Automation is better, faster, more reliable and more actionable if it's short and simple. Automation that is too long and complicated becomes brittle and tends to hide quality information, which is the opposite of what you want.

    For example, in the example described in the OP with the ten fields on the page and a new required field, there are at least six new business requirements with that product change. The highest-priority one is probably that the field is required. So, given GUI/web page automation, this is best measured with a new check, not a fixed one or one that adds a new field to fill. Fixing or changing existing checks is an invitation to spend too much time on less-important product behaviors, and too much time trying to diagnose check failures.

    The Atomic Check pattern (the least dependent pattern of MetaAutomation) shows an optimal approach to doing this. Working sample code is available to show a self-documenting atomic check, and I'm nearly complete with a much more flexible and scalable example of how to achieve this. Figure 3.1 of my book gives a simple UML activity diagram for how this works with automation.
