Automated tests that need to be tested?

We are so fortunate today to have so many automated testing libraries, frameworks and tools available that make creating automated tests quite easy. Some even allow people who do not have any coding experience to create automated tests. If you’re new to automated testing, should you just dive in with one of those tools and crank out some tests?

Or perhaps you’ve got programming experience, and you’ve automated some regression tests, but they keep sporadically failing in your continuous integration. You and your team are spending way too much time diagnosing failures to see if changes to the production code caused regression failures or if it’s just something wrong with the automated test script. If you haven’t had the time to learn good automated testing patterns and principles, troubleshooting and maintaining automated scripts can slow your team’s ability to deliver new features.

While creating effective automated testing is one thing, investigating test failures and maintaining the tests is another, much bigger challenge! We need to be able to trust the results of our automated testing so that we can feel confident in what we deploy to production, especially if our team is practicing continuous delivery (CD). We should aim to avoid needing to test our automated tests!

I’d like to discuss some principles and patterns that can help you keep your automated testing as simple and trustworthy as possible. Automated testing should free your team up for more important testing activities that require their human capabilities, not suck up all your time. This is an especially important consideration for UI tests, where potentially complicated scenarios can easily lead to overly complicated tests. Don’t fear the learning curve. Experiment with patterns and find what works for your team.

Anti-patterns and how they attract us

In my experience, the quality of our automated regression tests is as important as, or perhaps even more important than, that of the production code. The tests (or checks, if that term is more meaningful to you) help guard against regression failures when changes are introduced. They also document how the system works. When I learned some of the widely accepted coding design patterns, I found many benefits in applying them to the code of my automated testing. Testers who lack this knowledge naturally approach automated testing in the same way they would manual testing. Unfortunately, that can lead to unreliable tests and maintenance headaches.

Even experienced programmers can fall into a trap of automation anti-patterns such as complicated tests with many steps that answer multiple questions. I’ve worked on apps where the data model is hideously complicated. If I set up the test data inside the test itself, it took many steps through the UI to do so. That led to a temptation to “chain” tests together and make them dependent on each other, despite knowing better based on my own programming experience.

Once, I ended up with 25 “chained” test scripts, each testing different functionality and with multiple assertions, operating on test data set up by the previous scripts in the chain. There were lots of problems with this. We couldn’t break up the suite to run in parallel to save time in our continuous integration. If one test failed, the others couldn’t run. What was worse was that often a test early in the chain would do something unexpected that wasn’t caught by an assertion, causing a failed assertion farther down the chain. That type of failure is crazy hard to track down. Tests whose results depend on what ran before them can really cost your team.
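To make the dependency problem concrete, here is a minimal sketch of two chained scripts. It is not the suite I described; the `shared` dictionary, the step names and the account data are all invented for illustration.

```python
# Hypothetical shared state that every script in the chain reads and writes.
shared = {}

def step_1_create_account():
    # Sets up the data that later steps silently rely on.
    shared["account"] = {"owner": "alice", "balance": 100}
    assert shared["account"]["owner"] == "alice"

def step_2_withdraw():
    account = shared["account"]  # raises KeyError unless step 1 ran first
    account["balance"] -= 30
    assert account["balance"] == 70

def run_alone(step):
    """Run one step with no earlier steps, as a parallel runner might."""
    shared.clear()
    try:
        step()
    except KeyError:
        return "cannot run: data from an earlier step is missing"
    return "passed"

def run_chain(steps):
    """Run the steps in order against the same shared state."""
    shared.clear()
    for step in steps:
        step()
    return "passed"
```

In this sketch the second step only passes when it runs after the first, which is why a chain like this cannot be split up or run in parallel.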

Long, multi-step tests like that, especially if they are at the UI level, have much more potential to be “flaky” or “fragile”, meaning they're subject to timeouts, unexpected system states, and small changes in the UI code. Even if you’re using a tool like mabl that can “self-heal” minor issues, and thus makes the test more stable, an end-to-end test with many steps is always going to be hard to troubleshoot. It’s also harder for the team to remember over time all the issues a long, involved test is supposed to check.

Keep it simple

Simple does not necessarily mean easy, but I find that some up-front investment in deciding what functionality or quality attributes to test with each test script or journey pays off over the long term. I like to bring together a small, cross-functional group of team members, including testers, developers, architects, database experts and operations people. Together, we draw the system architecture on a whiteboard and map what the existing tests cover at each level: unit, API or service, and UI. We can then see which high-risk areas lack regression test coverage and where automated testing would help most.

Think about what you want to learn from your tests. Each test should have a single clear purpose derived from one business rule that you want to check. Avoid any unnecessary details that may make it hard to understand the test or troubleshoot when it fails. This is the “One Clear Purpose” pattern. This is even more critical if your team is guiding development with business-facing tests or practicing behavior-driven development (BDD) or acceptance test-driven development (ATDD).

Each test should be independent. If it requires setup in terms of test data and a particular state in the app, such as navigating to a particular UI page with certain data contained on the page, there are several good ways to accomplish that setup. For example, your team can create “fixture” or “canonical” data for tests and run a script to refresh that data into a test environment database before running a test suite in your continuous integration (CI). Or, if your application has an API, your test script can send an API request to populate the data. If you’re testing with mabl you can use webhooks for this purpose.
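As a rough sketch of the fixture-data approach, the example below reloads a small canonical data set before each test, so no test depends on what another test left behind. The in-memory SQLite database, the table, the account names and the `withdraw` function are all assumptions standing in for a real test environment and application.

```python
import sqlite3

# Hypothetical canonical ("fixture") data; the accounts and balances are invented.
CANONICAL_ACCOUNTS = [("alice", 100), ("bob", 0)]

def fresh_database():
    """Return a new in-memory database loaded with the canonical data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", CANONICAL_ACCOUNTS)
    conn.commit()
    return conn

def balance_of(conn, owner):
    row = conn.execute("SELECT balance FROM accounts WHERE owner = ?", (owner,)).fetchone()
    return row[0]

def withdraw(conn, owner, amount):
    """Stand-in for the application code under test."""
    balance = balance_of(conn, owner)
    if amount > balance:
        raise ValueError("insufficient funds")
    conn.execute("UPDATE accounts SET balance = ? WHERE owner = ?", (balance - amount, owner))

def test_withdrawal_reduces_balance():
    conn = fresh_database()  # each test starts from the same known state
    withdraw(conn, "alice", 30)
    assert balance_of(conn, "alice") == 70

def test_overdraft_is_rejected():
    conn = fresh_database()
    try:
        withdraw(conn, "bob", 1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected the overdraft to be rejected")
    assert balance_of(conn, "bob") == 0
```

Because each test builds its own starting state, the two tests can run in any order, or in parallel, and give the same result.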

I know how exciting it can be to start using a cool new tool, but resist the temptation to “test everything at once” and create a test script with dozens of steps. Instead, create at least one test for each business rule, and don’t test more than one business rule in each test. Give your tests descriptive, unique names so that you know what each one tests. It may seem like an anti-pattern to have so many individual test scripts or journeys, but we have great tools such as source code control systems to manage that.

Whenever a particular test fails, you’ll know just from the name of the test what business rule that test was checking. Checking the results to discover if it’s a true regression failure or if something changed in the app that requires a change to the test code is a simple matter, because the test only checks one thing.

If your application really does require complicated test code that includes logic, such as IF statements, you will have to test your tests. In fact, I would recommend creating them with test-driven development (TDD), just as with production code. Your automated regression tests are essential to allow your team to keep making changes and deliver new value to your customers frequently.
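As one possible illustration, suppose a test helper has to branch on the kind of user it signs in. The helper below and its step names are hypothetical, but they show how a piece of test code that contains an IF statement can get its own small tests, one per branch.

```python
def login_steps(user):
    """Hypothetical test helper: list the UI steps needed to sign in as `user`."""
    steps = ["open login page", "enter username", "enter password"]
    if user.get("two_factor"):  # logic inside test code, so it needs its own tests
        steps.append("enter one-time code")
    steps.append("submit")
    return steps

def test_two_factor_users_enter_a_one_time_code():
    steps = login_steps({"name": "alice", "two_factor": True})
    assert "enter one-time code" in steps

def test_other_users_skip_the_one_time_code():
    steps = login_steps({"name": "bob"})
    assert "enter one-time code" not in steps

def test_submit_is_always_the_last_step():
    for user in ({"two_factor": True}, {"two_factor": False}):
        assert login_steps(user)[-1] == "submit"
```

Written test-first, each of these small tests would exist before the branch it covers.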

“But I already have all these long, multi-step automated test scripts…”

If your team spends a lot of time investigating automated test failures, or ignoring or turning off failing tests because “they’re flaky”, your automated testing is adding to your technical debt, which is slowing down your team. If your team is also doing, or trying to move towards, continuous delivery, and you can’t trust your automated regression tests, how can you feel confident deploying frequent changes to production?

You can start paying down that technical debt by budgeting time to refactor your existing tests and apply good patterns such as One Clear Purpose. I’ve found that it’s not too difficult to break multi-step tests down into single-purpose tests and rename them with descriptive names. Don’t be afraid to ask for help. Your fellow delivery team members can help find better ways to create the setup for data and application state that each script or journey needs in order to do the test. 

Another possibility is to identify a test tool that makes it easier for you to create maintainable, granular tests, and use it to start writing tests for new features going forward. I was on a team that produced a financial services app for 9 years. During that time, for various reasons we had to start introducing business logic into our UI, and doing “Ajax-y” things for better usability, which meant our existing UI test tool was no longer adequate. We kept all those older regression tests for as long as the features they tested were still present. We did research and experimented with newer tools to fit our new needs.

Experiment and grow step-by-step

Don’t be afraid to investigate new technology for automated testing. Spend time on testing forums, listen to testing podcasts, and go to meetups to learn what’s on the horizon. Your product is probably using newer technology all the time as well. You may need multiple tools for different purposes.

If you’re just starting out on your automated testing journey, you have a great opportunity to start simple and apply good automated testing and code design patterns to your test code. If your product requires a more complex automation approach, build it step by step. I find it is especially helpful for developers and testers to collaborate, since they have complementary skills and perspectives.

Thinking about how to test a new feature can result in better design and implementation of that feature. If you already have an established base of tests, you can always review them and see if they can be improved with these new ideas. Even teams that have been following these ideas for a while will benefit from looking over their tests for possible improvements.