A Framework for Using Generative AI in Software Testing

Generative AI, particularly popular tools like ChatGPT, has quickly become one of the hottest topics in software testing and software development. But how these emerging technologies will affect the industry is still unclear. According to Stack Overflow’s 2023 Developer Survey, 55.17% of respondents are interested in using AI for software testing, but only 2.85% say they have a high degree of trust in AI tools. The trust gap and the rising demand for AI skills are pushing quality professionals into uncharted waters.

This guide will help quality professionals understand the limits of current generative AI tools, where AI can be used in the testing lifecycle, and how quality teams can start building AI skills.

Understanding Generative AI Tools

Generative AI tools like ChatGPT are good at producing large quantities of information, but the quality of those results is often lacking. A recent Purdue University study found that ChatGPT answered 52% of software engineering questions incorrectly. Despite those inaccuracies, the study noted that answers were thorough and typically addressed all aspects of each question. Accuracy likely varies across tools and models and will change over time, but the study is a good indicator of where generative AI can be most useful to quality teams. Tasks that require comprehensive planning, large volumes of data, or exploration of new ideas are best suited for generative AI, while tasks demanding high accuracy or specialized knowledge are where human expertise still shines.

Building Generative AI Skills for Software Testing

Because current generative AI tools are often inaccurate, one of the most effective ways for software testing teams to build their AI skills is to practice and iteratively refine their prompts. ChatGPT and similar models often produce more accurate responses when given more refined queries and better information, so learning to structure problems clearly and use feedback to refine answers can significantly improve the value of generative AI for development teams.
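One way to make that refinement loop concrete is to keep the question, its context, and each round of reviewer feedback in a single structure. The sketch below is illustrative only: `PromptDraft`, `refine`, and `render` are hypothetical names, and the code only builds the prompt text rather than calling any AI service.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptDraft:
    """Hypothetical helper: a testing question plus context and reviewer feedback."""
    task: str
    context: List[str] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)

    def refine(self, note: str) -> "PromptDraft":
        """Return a new draft that carries one more piece of reviewer feedback."""
        return PromptDraft(self.task, list(self.context), self.feedback + [note])

    def render(self) -> str:
        """Build the text that would be pasted into a generative AI tool."""
        lines = [f"Task: {self.task}"]
        if self.context:
            lines.append("Context:")
            lines += [f"- {item}" for item in self.context]
        if self.feedback:
            lines.append("Corrections to the previous answer:")
            lines += [f"- {item}" for item in self.feedback]
        return "\n".join(lines)
```

Each call to `refine` leaves the earlier draft untouched, so a team could keep the full history of a query and see which additions actually improved the answers.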

To help everyone harness these resources efficiently, quality leaders should create a structure for sharing knowledge and reviewing generative AI responses. Processes that have already been proven to help build a culture of quality, such as peer reviews, pair programming, and hackathons, can help software testers and developers create more targeted queries and check ChatGPT responses for accuracy. These practices give everyone the opportunity to build generative AI skills without increasing the risk that faulty responses create issues in development pipelines.

Opportunities for Generative AI in the Software Testing Life Cycle

Breaking down the software testing life cycle will help quality professionals assess when to leverage generative AI.

Requirement Phase Testing: Also known as Requirement Analysis, this stage involves gathering functional and non-functional requirements, which will shape the team’s software testing strategy.

Generative AI tools like ChatGPT can be helpful in brainstorming new testing requirements, particularly if a team is looking to increase test coverage across a new aspect of quality like accessibility or performance. Quality engineers have the expertise to assess any generative AI suggestions for feasibility, and can tailor their prompts according to their customers and application.

Test Planning: This stage is when quality teams transform test requirements into a software testing strategy. In addition to determining their testing strategy, quality professionals will establish test environment needs, test limitations, and their testing schedule.

Test planning typically requires complex logic with application-specific and team-specific constraints, both of which limit the value of generative AI tools. ChatGPT is more useful for problems that require “common knowledge” (i.e., information that can be found on public web pages), though providing specific information on available tools and requirements can help augment test planning in some scenarios.

Test Case Development: This is when testing teams create and update test cases. Generative AI tools can reduce the effort needed for test case development by helping quality teams create test data. Given specific examples, ChatGPT can quickly create new test data that can be converted into a data table for a scalable, data-driven testing strategy.
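As a rough illustration of that conversion step, the sketch below parses CSV text into a data table and rejects rows that do not fit the expected shape before any test consumes them. The CSV content, the column names, and the `load_data_table` helper are all assumptions made for this example; they stand in for whatever a generative AI tool actually returns.

```python
import csv
import io

# Hypothetical CSV text, standing in for data a generative AI tool returned
GENERATED_CSV = """email,password,expected
user@example.com,Correct-Horse-9,accepted
user@example,Correct-Horse-9,rejected
,Correct-Horse-9,rejected
user@example.com,,rejected
"""

REQUIRED_COLUMNS = ("email", "password", "expected")
ALLOWED_OUTCOMES = {"accepted", "rejected"}


def load_data_table(csv_text):
    """Parse CSV text into a list of row dicts, rejecting malformed rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    # Data rows start on line 2 because line 1 is the header
    for line_no, row in enumerate(reader, start=2):
        if row["expected"] not in ALLOWED_OUTCOMES:
            raise ValueError(f"line {line_no}: unexpected outcome {row['expected']!r}")
        rows.append({c: row[c] for c in REQUIRED_COLUMNS})
    return rows
```

Validating the generated rows up front matters here because, as noted earlier, generated content is often inaccurate; a team member should still review the expected outcomes before the table drives any tests.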

As noted in the Purdue University study, one of ChatGPT’s strengths is the comprehensive nature of its responses. This makes it well-suited for summarizing new tests, though final results will need to be refined by a team member.

Test Environment Setup: Working with developers, quality engineers spend this phase deciding in which environments tests will be executed. They’ll determine the required architecture, set up the necessary environments, and perform smoke tests on the build. Like the test planning stage, this phase of the software testing life cycle requires specialized skills that are less likely to be supplemented by generative AI.

Test Execution: Quality teams carry out their testing strategy at this stage of the software testing life cycle. With a test automation solution that features auto-healing, this stage is already less time-consuming thanks to artificial intelligence. An effective test automation solution will further reduce the effort needed for this phase by allowing quality teams to quickly execute tests in parallel, then automatically share comprehensive test results to Jira, Slack, or Microsoft Teams. This makes it easier for quality teams to document and track defects through retesting.

Test Cycle Closure: At the final stage of the software testing life cycle, quality engineers will gather metrics and assess testing success. They’ll evaluate opportunities to improve testing efficiency, test coverage, product quality, and team efficiency. This phase is also a good opportunity to discuss efforts to leverage AI in software testing and how comfortable the team feels using these emerging tools.
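A minimal sketch of the metric-gathering step might look like the following. It assumes test results are available as (test name, status) pairs with the statuses "passed", "failed", and "skipped"; that format and the `summarize_run` name are hypothetical, and real tools will export results in their own shapes.

```python
from collections import Counter


def summarize_run(results):
    """Summarize (test_name, status) pairs; status is 'passed', 'failed' or 'skipped'."""
    counts = Counter(status for _, status in results)
    # Skipped tests are excluded from the pass rate because they never ran
    executed = counts["passed"] + counts["failed"]
    pass_rate = counts["passed"] / executed if executed else 0.0
    return {
        "total": len(results),
        "passed": counts["passed"],
        "failed": counts["failed"],
        "skipped": counts["skipped"],
        "pass_rate": round(pass_rate, 3),
    }
```

Tracking the same summary across cycles gives the team a simple baseline for judging whether changes to the process, including any use of generative AI, are improving outcomes.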

Thinking Long-Term about Generative AI in Software Development

Generative AI is still in the early stages of development, with more transformative changes sure to come in the future. By considering how popular tools like ChatGPT can augment existing quality engineering and software testing efforts, quality leaders can improve testing efficiency and start empowering their teams to navigate this new era.
