AI Test Generation

AI (artificial intelligence) is much in the news these days, and I started wondering whether it would be useful for test case generation. Using ChatGPT, I uploaded the C and C++ standards documents (in separate trials) and asked it to generate test cases for every normative statement in each standard. I also asked DeepSeek for similar results for C:

ChatGPT:

Upload: N5014.pdf
"Write test cases in C++ that test every chapter of the document."
Great — thanks for sharing the C++ Working Draft (N5014).

Upload: N3220.pdf
"Write C language test cases for each section of chapter 6."
I've bundled all the Chapter 6 C test cases into a single archive.
"Write C language test cases for each section of chapter 7."
I put together a compact, header-by-header Chapter 7 test suite and zipped it for you.

DeepSeek:

Upload: N3220.pdf
"Generate C test cases that test all normative statements."
Here are comprehensive test cases covering normative statements from the provided C24 draft (N3220), organized by section.

The results are...

AI-generated GPT C Test Results

AI-generated GPT C++ Test Results

AI-generated DeepSeek C Test Results

My conclusion: AI excels at producing tests in quantity, but the results do not match the quality of hand-written tests.
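For readers unfamiliar with the term, here is a minimal sketch of what a test for a normative statement can look like. It is not taken from the Plum Hall suites or from the AI-generated output; the helper names check and run_limits_checks are hypothetical, and the requirements checked are the well-known minimum limits the C standard places on <limits.h>.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical bookkeeping for this sketch; not part of any real suite. */
static int failures;

/* Record one requirement: report it and count it if the condition is false. */
static void check(int condition, const char *requirement)
{
    if (!condition) {
        ++failures;
        printf("FAIL: %s\n", requirement);
    }
}

/* Returns the number of failed checks; 0 means every requirement held. */
int run_limits_checks(void)
{
    failures = 0;
    check(CHAR_BIT >= 8, "CHAR_BIT shall be at least 8");
    check(sizeof(char) == 1, "sizeof(char) shall be 1");
    check(SCHAR_MAX >= 127, "SCHAR_MAX shall be at least 127");
    check(INT_MAX >= 32767, "INT_MAX shall be at least 32767");
    check(INT_MIN <= -32767, "INT_MIN shall be at most -32767");
    check(UINT_MAX >= 65535u, "UINT_MAX shall be at least 65535");
    check(LONG_MAX >= 2147483647L, "LONG_MAX shall be at least 2147483647");
    return failures;
}
```

Each check maps to a single "shall" requirement, so a failure points directly at the statement that was violated. Calling run_limits_checks() from main and returning its result as the exit status would turn the sketch into a standalone test program.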

Availability of the 25a Release

The 25a release of the Plum Hall Test Suites is available now. This release contains infrastructure improvements, bug fixes and preliminary tests for the proposed C++26 / C26 standards.

There are 113 new LVS test cases, documented in "newcases-lvs24a-lvs25a.txt", in multiple directories. These new test cases predominantly pertain to the C++23 and proposed C++26 standards. This release also contains initial support for testing freestanding C++ in t161.dir/. Directory t01a.dir/ contains some tests for modules, though more detailed testing is found in xvs25a.

The C++ suite now also includes tests for undefined behavior, which were previously available only for the C language. The tests are in conform/undeftests/u*.in. Software, especially FREESTANDING software, should not contain any constructs whose behavior is undefined according to the standards. ISO 26262 in particular focuses on dealing with undefined and unspecified behavior in C/C++ and on preventing runtime errors. Obeying language standards is recommended by all current safety standards.
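As an illustration of the kind of construct involved, signed integer overflow is undefined behavior in both C and C++. The following is a minimal sketch of avoiding it by testing the operands before adding. The function name checked_add is hypothetical, and the code is not taken from the undeftests directory.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper for this sketch: adds a and b only when the sum is
 * representable in int. The range test uses subtractions that cannot
 * themselves overflow, so no undefined behavior occurs on any input.
 */
bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false; /* a + b would overflow; *result is left untouched */
    }
    *result = a + b;
    return true;
}
```

The design choice is to check before the operation rather than after it: once a + b has overflowed, the program's behavior is already undefined, so a test on the result cannot be relied upon.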

LVS

There are 128 new LVS test cases, documented in “newcases-lvs24a-lvs25a.txt”, in multiple directories. These new test cases predominantly pertain to the proposed C++26 standards.

XVS

The XVS release also adds new test cases for modules and coroutines in the directories t01a.dir and t01b.dir respectively. There are 45 new test cases, documented in “newcases-xvs24a-xvs25a.txt”, in multiple directories. These new test cases predominantly pertain to the proposed C++26 standards.

CVS

There are 103 new CVS test case files, documented in “newcases-cvs24a-cvs25a.txt”, in multiple directories. These new test cases predominantly pertain to the proposed C26 standards. The new test case names are prefixed by c2x_ (C23) and c2y_ (C26).

The 25a release represents 5+ years of test case bug fixes, infrastructure improvements, and new test cases for C17, C20, C23, C++20, C++23, and C++26 language and library enhancements. There are many improvements to the test cases themselves, to the scripting, and to the reporting of results through the new HTML interfaces for reporting coverage, commentary on the intent of the test cases, and improved standards conformance reporting.

For CVS, the 25a release represents 3+ years of test case bug fixes, infrastructure improvements, and new test cases for C17, C20, C23, and C26 language and library enhancements. There are many improvements to the test cases themselves, to the scripting, and to the reporting of results through the new HTML interfaces for reporting test case coverage, commentary on the intent of the test cases, and improved standards conformance reporting.

Testing

It is very important that you review envsuite(.bat), flags.h and compiler-flags.h to choose the correct settings for your compiler and the standards version you wish to test against. envsuite(.bat) sets the standards year to test against and sets up compiler-specific parameters. flags.h sets general compilation flags for the standards year. compiler-flags.h sets up compilation flags for specific compilers. Modify these settings if your compiler is already in the list, or add custom settings for your compiler. After configuring for your environment and needs, it is recommended that you run the tests with the buildmax(.bat) script:

Installation