Who Tests The Checks

A common phrase to hear in the testing community, be it on Twitter or on forums, is "Who tests the tests?" In my case that would be "Who tests the checks?" or even "Who checks the checks?"

I am referring to automated checks: if you have written a check for the system, what's checking the check? And what's checking the check that's checking the original check? We could continue, but I hope you get the theme.

Well, the answer is simple: you do, and your team does. Thanks for reading... OK, I shall elaborate.

A common approach to creating checks is to create one locally against a local or deployed instance, work on it until it passes, then check it in. Job done. Well, it went green, so it's clearly working. But then later that day or week it starts failing, you have made the build red, and now the pressure is on to fix it.

The way I view this is that automation is a tool: it's a piece of software that you are creating to assist with testing the system. So if you are testing the system the developers are creating, which is a tool for your customers, why not test the system you are creating as a tool for your team?

If your check is passing, explore how you can make it fail, then make it fail. Importantly, ask whether the check gives you sufficient information about why it failed (I'm going to write more about this soon).
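A minimal sketch of what "sufficient information" can look like, assuming a hypothetical `check_error_banner` check; the point is that the assertion message names what was expected, what was seen, and where to look next:

```python
def check_error_banner(actual_text):
    # Hypothetical expected copy for illustration only
    expected = "Your session has expired"
    assert actual_text == expected, (
        f"Error banner mismatch: expected {expected!r} but the page showed "
        f"{actual_text!r} -- check whether the copy changed or whether the "
        "wrong banner element was located"
    )
```

When this check fails, the message itself points you towards the likely cause, rather than a bare "check failed".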

So here are a few examples of how I test my checks.

If your check produces some prerequisite data, what happens if the exact same data is already there? Should it handle this scenario? Does it give you good feedback? What happens if the method for creating this data fails? Perhaps it's direct to the DB and you alter the connection string, or it's via an API and you alter your credentials. Does the error direct you there, or just tell you the check failed?

What if tearing down the database after a test run fails, what happens to your check then? Should it create a new record with the same data, or should it error?
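One way to make that decision explicit is to separate "the setup failed" from "the check failed". This is a hedged sketch with made-up names (`SetupError`, `create_user`, a dict standing in for the DB); here the policy is to fail loudly on leftover data rather than silently reuse it:

```python
class SetupError(Exception):
    """Raised when prerequisite data could not be created."""

def create_user(db, name):
    if name in db:
        # Explicit policy: duplicates are an error, with a hint at the cause,
        # rather than a silent reuse of possibly stale data.
        raise SetupError(f"Prerequisite user {name!r} already exists -- "
                         "did a previous teardown fail?")
    db[name] = {"name": name}
    return db[name]
```

A `SetupError` in the report then tells you to investigate the test data, not the system under test.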

Alter Assertions
You have written a check because you want to check something about the system, so your check will have 1-N assertions in it. Alter them so that the check should now fail, and confirm that it does.

For example, if you are checking that some text on the screen is displayed, perhaps an error message, change the message you are expecting by 1-N characters and check that the check now fails. Then reverse the scenario: if you have access to the source code, change the text on the screen instead, which should yield the same result.
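The "change it by 1-N characters" step can be sketched as a tiny helper (the helper name and the example message are my own, for illustration): mutate the expected string by one character and confirm the comparison now fails, which proves the assertion is capable of failing at all.

```python
def mutate(text, index=0):
    """Return text with the character at `index` swapped for another."""
    replacement = "X" if text[index] != "X" else "Y"
    return text[:index] + replacement + text[index + 1:]

expected = "Invalid username or password"
broken = mutate(expected)
# A check asserting `broken` against the real screen text must now fail
assert broken != expected
```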

Run the tests at least three times
I have seen, and written, checks that have this unique ability to pass for a while, then fail once, then pass again. Some people refer to them as "flaky"; I remember the guys from Salesforce at the Selenium Conference calling them "flappers". Either way, you will write them, and the majority of the time you won't discover them until they are on the CI. I have seen several reasons why a check can be flaky; the majority of them are down to timing issues. So I have found that running a check at least three times locally increases my confidence that I have written a stable check.
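The repeat-run idea can be sketched as a small harness (all names here are hypothetical; `my_check` stands in for your real check): run the same check several times and collect any failures, so a flapper shows up before it reaches CI.

```python
def run_repeatedly(check, times=3):
    """Run a check `times` times; return (attempt, message) for each failure."""
    failures = []
    for attempt in range(1, times + 1):
        try:
            check()
        except AssertionError as err:
            failures.append((attempt, str(err)))
    return failures

def my_check():
    # Stand-in for a real check; a stable one fails zero times
    assert 1 + 1 == 2

assert run_repeatedly(my_check, times=3) == []
```

An empty failure list after several runs is not proof of stability, but it catches the obvious flappers cheaply.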

Alter the environment
Always creating your checks locally? If so, you may come across situations where your check is only passing because a locally deployed site on an awesome machine is fast! As soon as you run it on CI, it starts to fail. To mitigate this risk, I sometimes use Fiddler to slow down my connection and see how the check then performs. I have in the past also logged on to the CI machine or a VM and run my check in isolation to ensure it passes.

To get the most out of automated checks you want to be running them on CI. This comes with a potential concurrency issue: depending on your setup, the same slave could be running several tests in parallel, so could your test be impacted by another test? For example, one test deleting shared data, or clearing the DB while another test is still running. I sometimes call this test bleed.
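One common mitigation for test bleed, sketched here with a hypothetical helper name, is to give each check its own uniquely named data, so a parallel check deleting "its" records cannot touch yours:

```python
import uuid

def unique_username(prefix="check-user"):
    # uuid4 makes collisions between parallel runs vanishingly unlikely
    return f"{prefix}-{uuid.uuid4().hex[:8]}"
```

Each test then creates and deletes only records carrying its own suffix, which also makes orphaned data easy to attribute after a failed teardown.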

More Automation
So what about more automation to test the automation? I try to avoid this; however, I do feel there are certain scenarios where it could be an acceptable approach.
If you are using a third-party API/library and you decide to write some extension methods for it, then it could add some value to write some checks for those.
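For instance, if you wrap part of a third-party client in your own helper, a few direct checks on the helper can pay off. Everything below is hypothetical: `parse_retry_after` is an imagined wrapper around an HTTP response header, not a real library function.

```python
def parse_retry_after(headers):
    """Return the Retry-After delay in seconds, defaulting to 0."""
    value = headers.get("Retry-After", "0")
    return int(value) if value.isdigit() else 0

# Checks for the helper itself, independent of any live system
assert parse_retry_after({"Retry-After": "30"}) == 30
assert parse_retry_after({}) == 0
assert parse_retry_after({"Retry-After": "soon"}) == 0
```

These run in milliseconds, so they add little to suite runtime while pinning down the behaviour your other checks depend on.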

However, if you have gone to the effort of creating a suite of automated checks, you should be running them all the time, so you should find out very quickly when something in your architecture has broken. So you could take the view that there is little value in spending time creating automated checks for the checks.

Code/Peer review
As mentioned earlier in this post, automation is software where the customer is your team. So if you have a practice of doing peer/code reviews for your application code, do the same for your automation code. You will also get the "alter the environment" approach for free, as the reviewer will execute the test on their machine.

In summary, I take the view that your automation is software, software you or your team is producing for the team, so test it. In my experience it will save you time in the long run, as many hours have been spent investigating failing checks only to find it's something obvious.

Update: Feb 16th 2016. Bas Dijkstra just wrote something similar, also worth reading. http://www.ontestautomation.com/do-you-check-your-automated-checks/


  1. Good points! Another thing I like to do, when creating an automated check, is to deliberately assert naively stupid behavior instead of the "real" expected behavior. Then I watch the way the test fails.

    This gives me two important pieces of information:
    1. is the check checking what I want it to check?
    2. if the check fails, will it tell me what failed in a readable, actionable way?

    (This is kind of a restatement of your Alter Assertions technique; the only difference is that I start with a bogus assertion.)

    1. Thanks for commenting. Point 2 is what I was referring to for another post: ensuring your architecture clearly highlights the failure. Is it the test? Is it the architecture? Which exact method failed, but also, importantly, what was it actually trying to do? Will probably work on this next, as it's very relevant to recent work at $client.

  2. Great article, I disagree with the "More Automation" though.
    You state that "you should find out very quickly when something in your architecture has broken, so you could take the view that there is little value in spending time creating automated checks for the checks."
    In my experience the "checks for the checks" or "checks for the check infrastructure" serve two purposes:
    1. They are a living documentation of the capabilities of the check infrastructure, reducing the need to read outdated wiki pages, knowledge being shared by word of mouth, ...
    2. They reduce the bottleneck of a single person maintaining the check infrastructure (I've seen that happen way too often) - with the checks in place nobody should be afraid to make additions / changes. If the check infrastructure checks fail it's usually easier to understand what broke than if regular checks fail.

    That being said, having checks for the checks of the check infrastructure might be overdoing it...

    1. Thanks for commenting, glad you liked it.

      I highlighted both sides of the argument.
      Viewing it as living documentation is one idea; however, I tend to deal with that with good naming conventions and comments in the code where the logic isn't easy to follow.

      However, I can see a benefit when it comes to changing the architecture, but again I do think it depends on your suite of checks. If your check suite is fast to run, let's say under an hour, then feedback on your changes would be quick.

    2. I think you're suggesting running the suite to identify problems with the suite? Perhaps you'd refactor the test code and then run it to see whether everything still passes? That has some dangers...

      I enjoyed reading Michael Feathers' book about legacy code recently and wrote about it (http://qahiccupps.blogspot.co.uk/2013/07/the-oh-in-coding.html) while considering whether I'd want to ask for unit tests inside our test code. I decided not in the end.