CITS3007 lab 7 (week 9) – Testing

This lab explores the role of testing in secure software development.

1. Code and tools

We will be using the Doxygen documentation tool. Install it in your VM with

$ sudo apt-get update
$ sudo apt-get install --no-install-recommends doxygen graphviz

We will also make use of the Check unit testing framework. Install it with

$ sudo apt-get install check

2. Testing, documentation and APIs

The aim of testing is to identify and remove defects from a project – mistakes in the source code or configuration/data files that cause it to deviate from its prescribed behaviour.1 Vulnerabilities are a particular class of defects where the resulting failure compromises security goals for a system.

To test a program, or part of one, we have to know what its intended behaviour is – otherwise there is nothing to check its actual behaviour against. Documentation is therefore an important part of any software project. The documentation for a function or other piece of code tells us what it should do, and testing tries to find situations in which the code does something else.

A program specification defines the behaviour expected of an entire program, and can be used directly for testing that program. However, it doesn’t say anything about the behaviour of individual functions. Those are normally documented within the source file that contains them, and the documentation for all public-facing functions, macros and data structures forms the API (“Application Programming Interface”) for that file.2

2.1. Documenting an API

Typically, the specification documentation for functions is contained in documentation blocks: specially formatted comments or annotations which can be extracted and displayed by documentation tools. For example, the documentation block below is from a previous year’s project:

/** Decrypt a given ciphertext using the Caesar cipher, using a specified key, where the
  * characters to decrypt fall within a given range (and all other characters are copied
  * over unchanged).
  *
  * Calling `caesar_decrypt` with some key $n$ is exactly equivalent to calling
  * `caesar_encrypt` with the key $-n$.
  *
  * \param range_low A character representing the lower bound of the character range to be
  *           encrypted
  * \param range_high A character representing the upper bound of the character range
  * \param key The encryption key
  * \param cipher_text A null-terminated string containing the ciphertext to be decrypted
  * \param plain_text A pointer to a buffer where the decrypted text will be stored. The
  *           buffer must be large enough to hold a C string of the same length as
  *           cipher_text (including the terminating null character).
  *
  * \pre `cipher_text` must be a valid null-terminated C string
  * \pre `plain_text` must point to a buffer of identical length to `cipher_text`
  * \pre `range_high` must be strictly greater than `range_low`.
  * \pre `key` must fall within range from 0 to `(range_high - range_low)`, inclusive.
  */
void caesar_decrypt(char range_low, char range_high, int key, const char * cipher_text, char * plain_text);

Documentation blocks normally have some way of formatting the documentation for easy reading, of documenting particular parts of a function (like parameters or the return value), and of referring to other, related functions. In this lab, we will use the Doxygen tool, which is expressly designed for extracting API documentation from C and C++ files. It uses Markdown conventions for formatting, and special tags (like \param, \return and \ref) to pick out particular portions of a function – these are described in the Doxygen tool’s documentation. You have likely encountered similar tools for other languages: Java uses the javadoc tool, Python projects typically use pydoc or sphinx, Rust uses rustdoc, and Haskell uses haddock.

Note that documentation blocks do not serve the same purpose as inline comments (comments contained within the body of a function). (In fact, in some languages, documentation blocks may not be comments at all. Python uses strings instead of comments, and Rust internally uses the #[doc] annotation.) Documentation blocks should always be included for any function that forms part of an API, so that other programmers know how to use that function, and documentation blocks can be as extensive as needed. If you are using a C library – say, the FLAC library, which allows you to encode, decode and manipulate audio files in the FLAC format – then your primary way of knowing what the functions in that library do is by referring to the API documentation. Not only do you not need to know what any inline comments say, but for commercial software libraries, you might not have any access to them or to the source code at all.3

In contrast to documentation blocks, inline comments are only for the use of programmers who need to fix or enhance existing functions, and typically should be used sparingly – excessive inline commenting makes code harder to read. In general, inline comments should not say what the code is doing – anyone who understands the programming language should be able to see that – but rather why it is doing it.

Inline commenting of your code

Don’t over-comment your code! In this unit, we value clarity and conciseness. Over-commenting can detract from both, and it will be difficult to achieve high marks if your code is excessively or unnecessarily commented.

You should assume that the person marking your code is an experienced C programmer and does not require explanations of basic language features.

If you do feel you need to add inline comments, then focus on explaining why you are doing something, rather than what you are doing. The code itself should make the ‘what’ clear; comments should provide additional context or reasoning.
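As a purely illustrative sketch – the function and constant names below are hypothetical, not taken from the project code – compare the two kinds of comment. The first merely restates what the code plainly does; the second records a reason the code alone cannot convey.

  #include <stdbool.h>

  #define MAX_LOGIN_ATTEMPTS 5   /* hypothetical policy constant */

  bool should_lock_account(int failed_attempts) {
      /* Bad ("what"): "return true if attempts >= 5" -- adds nothing. */
      /* Better ("why"): the limit mirrors the lockout policy in the project
       * specification; changing it here without updating the spec would
       * silently weaken that policy. */
      return failed_attempts >= MAX_LOGIN_ATTEMPTS;
  }

  int main(void) {
      return should_lock_account(3) ? 1 : 0;
  }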

2.2. Running doxygen

Ensure you have a copy of your group’s project code available, and download it into your development environment. (If you don’t have it readily available, any small C project you have to hand will do for experimenting with Doxygen.)

Change into the directory containing your source code, and run doxygen -g. This generates a file called Doxyfile, which configures the exact contents and formatting of the API documentation for a project. To work well with a C project, a few changes are needed – you can download a Doxyfile with these changes already made in the code .zip file for this lab.

The specific changes made to the generated Doxyfile are mostly settings that make Doxygen’s output better suited to a plain C (rather than C++) project.

Download and extract the Doxyfile from that zip file, and run the command doxygen. Doxygen will use the configuration in the Doxyfile to generate HTML documentation in an html subdirectory of your project.

The easiest way of viewing the HTML documentation is usually by using an editor like VS Code to open the HTML files in the development VM; alternatively, it’s possible to copy the files from the development VM to the host – see this StackOverflow answer.

Some of the functions in the project code already have documentation blocks written for them (e.g. handle_login), but others do not. As you work through the project, it’s a good idea to update the documentation blocks based on your understanding of what services the different functions perform, and what they require from the caller in return.

Doxygen features

Doxygen has a number of features we will not use, but which are very useful for exploring the structure of large codebases. It can create UML-style diagrams of the classes or functions contained in the code, as well as showing graphically which functions or methods make use of others. You can read more about those features here.

If you’re unfamiliar with good practices for writing API documentation, take a look at the documentation blocks already provided in the project code, and at the API documentation of established C libraries, for examples to model your own on.

2.3. Writing your project code

The .zip file for this lab contains an alternative Makefile, together with some additional files, that you may find useful for the project.

The new files include the following features:

A Makefile with a range of additional warning and error flags enabled

We already recommend you compile all code for this unit with -Wall -Wextra -pedantic-errors. The Makefile for this week’s lab includes a Make variable, EXTRA_CFLAGS, which defines a number of additional warning and error options that can help stop you from making mistakes in your code.

Development teams, particularly in security-sensitive environments, frequently use flags like these to enforce best practices and improve the overall quality of the code.

As we’ve seen in previous labs, -pedantic-errors enforces stricter compliance with the C standard – it turns the use of non-standard extensions into errors, which helps catch code that might happen to work on some compilers but does not conform to the C standard. This helps ensure that code behaves consistently across different compilers.

The option -Werror=vla turns warnings about Variable Length Arrays (VLAs) into errors. VLAs are arrays whose size is determined at runtime; if not carefully managed, they can lead to stack overflows, attacker-induced “stack clash” attacks, or unpredictable behaviour. Banning VLAs enhances the stability and predictability of the program, which is especially important in security-sensitive code.
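As a minimal sketch of what the flag rejects (the function names here are hypothetical), the first function below uses a VLA and will not compile under -Werror=vla, while the second uses a heap allocation whose failure can be detected and handled:

  #include <stdlib.h>

  void process(size_t n) {
      char buf[n];            /* VLA: a very large n silently overflows the stack;
                                 rejected outright under -Werror=vla */
      (void) buf;
  }

  void process_safely(size_t n) {
      char *buf = malloc(n);  /* heap allocation: failure is detectable... */
      if (buf == NULL)
          return;             /* ...and can be handled explicitly */
      /* ... use buf ... */
      free(buf);
  }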

You might be familiar with GCC warning about “implicitly defined functions”; the -Werror=implicit-function-declaration option turns that too into an error. In C, using a function before its declaration can lead to unpredictable behaviour – the compiler is forced to make assumptions about the parameters and return type which are often incorrect. By making this an error, code is forced to be more explicit, improving reliability and reducing the risk of bugs.
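As a sketch of the hazard (imagining we have forgotten the relevant #include), the call below compiles with only a warning under default settings, even though the compiler has had to guess at malloc’s signature; with -Werror=implicit-function-declaration it does not compile at all until a proper declaration is made visible:

  /* Deliberately missing: #include <stdlib.h>, which declares malloc() and free(). */

  int main(void) {
      /* With no declaration of malloc in scope, older C rules make the
       * compiler assume it returns int, so on a 64-bit system the returned
       * pointer can be silently truncated. GCC warns by default;
       * -Werror=implicit-function-declaration makes this a hard error. */
      char *buf = malloc(16);
      buf[0] = '\0';
      free(buf);
      return 0;
  }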

The GCC documentation pages explain in detail what the other flags do; in general, all of them aim to improve the reliability and robustness of your code by catching a variety of potential issues early.

A “banned.h” header file, listing functions you shouldn’t be using

A banned.h file is a special header file used in some development environments to explicitly prohibit the use of certain functions or programming practices that the development team has decided should never be used in their codebase. It helps enforce coding standards and ensures that developers avoid potentially dangerous or inappropriate behaviour in their code.

This file typically contains declarations of functions that are considered unsafe, inappropriate, or incompatible with the design goals of the project. The functions in this list are usually replaced by safer, more controlled alternatives that adhere to the project’s standards.

For example, the gets() function in C is notorious for its potential to cause buffer overflows, as it doesn’t check the size of the input. A team might take the very reasonable decision to ban the use of gets() (and possibly other functions like scanf()) in the project because these can easily be exploited by attackers if used improperly.
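As a short illustration (the buffer size is arbitrary), compare gets() with fgets(): gets() has no way of knowing how large the destination buffer is, so a long input line overflows it, while fgets() is told the size and stops there.

  #include <stdio.h>

  int main(void) {
      char buf[16];

      /* Unsafe: gets() cannot know that buf holds only 16 bytes, so any
       * longer input line overwrites adjacent stack memory. (Commented out
       * here: C11 removed gets(), and modern toolchains may not declare it.) */
      /* gets(buf); */

      /* Safer: fgets() is given the buffer size and will not write past it. */
      if (fgets(buf, sizeof buf, stdin) != NULL)
          printf("read: %s", buf);

      return 0;
  }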

FILE pointer-based I/O functions like fopen(), fprintf(), and fclose() might be banned in certain projects because they are not well-suited for structured logging or when multiple components need to interact cleanly with each other. Instead, the team may use custom logging functions or more explicit input/output management that avoids relying on global stdout or stderr (which can create conflicts or problems during testing, or in environments such as multi-threaded or multi-process systems). The banned.h header we include in this week’s code bans such functions for exactly this reason – no code in CITS3007 assessments is supposed to print to stdout or stderr unless explicitly asked to.

A typical banned.h might look like this:

  // banned.h - Header file to prohibit certain unsafe or inappropriate functions

  #ifndef BANNED_H
  #define BANNED_H

  // Prohibit the use of dangerous or insecure functions
  #define gets(...)        _Pragma("GCC warning \"gets() is banned. Use fgets() instead.\"") // Prevent usage of gets
  #define strcpy(...)      _Pragma("GCC warning \"strcpy() is banned. Use strncpy() instead.\"")
  #define sprintf(...)     _Pragma("GCC warning \"sprintf() is banned. Use snprintf() instead.\"")

  // Prohibit use of file pointer-based I/O functions
  #define fopen(...)       _Pragma("GCC warning \"fopen() is banned. Use custom logging functions instead.\"")
  #define fprintf(...)     _Pragma("GCC warning \"fprintf() is banned. Use custom logging functions instead.\"")
  #define fclose(...)      _Pragma("GCC warning \"fclose() is banned. Use custom logging functions instead.\"")

  #endif // BANNED_H

(This is not exactly the way ours operates, but gives you an idea of how they look.)

When the banned.h file is included in the source code (usually by a central header that is included throughout the project), the compiler will generate warnings (or errors, depending on the project settings) whenever one of the banned functions is used. This prevents developers from accidentally using functions that are considered unsafe, inappropriate, or incompatible with the project’s design principles.

A file of example unit tests, check_account.ts

Testing is important in any software project, but it is absolutely critical in a security-conscious environment. Security vulnerabilities often arise from subtle bugs: memory errors, unexpected edge cases, or incorrect assumptions about how code behaves. Thorough testing helps to catch these problems early, before they can become serious vulnerabilities.

In this lab, we show how to use the libcheck framework, which is specifically designed for C development. It provides process isolation, meaning that if a test crashes or triggers a memory error, it won’t crash the entire test suite – only the individual test process. This is extremely useful when writing security-sensitive code, because it allows you to aggressively test error handling and boundary cases without destabilising the test harness.

You are welcome to use any other C testing framework you are comfortable with, but consistent, thorough, and automated testing is expected. Good testing practices are essential not just for correctness, but also for security: well-tested code is much harder to exploit.

Take a look at some of the tests in check_account.ts and see if you can follow what they are doing. Consider other tests – are there edge cases you can identify? What if some string input is at the largest size it can plausibly be for a parameter, or the smallest size – what will happen? Are degenerate cases (e.g. empty strings) allowed for some or all string parameters, and if so, does your code behave correctly when they are passed? What about functions like account_update_password – how should they behave if implemented correctly? (One question to think about: when invoked twice with the same password, do you expect account_update_password to produce the same result, or a different one?)
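The exact contents of check_account.ts will differ, but as a sketch of what an edge-case test in the .ts format can look like – here using the caesar_decrypt function from the earlier documentation example, together with a hypothetical caesar.h header – an “empty string” test might be written as follows. (The checkmk tool that translates such files into C test code is described below.)

  #include "caesar.h"   /* hypothetical header declaring caesar_decrypt */

  #suite caesar_suite

  #test empty_ciphertext_gives_empty_plaintext
      /* Arrange: the smallest possible input, the empty string, and a
       * one-byte output buffer (deliberately non-zero so we can see the write). */
      char plain[1] = { 'X' };

      /* Act: decrypt the empty string over the range 'a'..'z' with key 3. */
      caesar_decrypt('a', 'z', 3, "", plain);

      /* Assert: the output should also be the empty string. */
      ck_assert_str_eq(plain, "");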

When testing your project, you’ll need to compile your code at a range of optimization levels and with a range of sanitizers (as well as making use of the static analysers we have looked at). Some warnings and sanitizers only work well when code is fairly highly optimized (option -O2 to GCC), and high optimization levels also often elicit bugs you otherwise wouldn’t have identified. But testing at low optimization levels is useful too – entirely different bugs may appear.

2.4. Writing and running tests

Once a specification is available for a function (or even before then!), it’s possible to start writing tests for it. A test is meant to look at the behaviour of a system or function in response to some input, and make sure that it aligns with what we expect.

It can be useful to think of a test as being composed of three parts:

  1. Arrange
  2. Act
  3. Assert

In C, we often also need to add a fourth part, “Cleanup”. Some languages have automatic memory and resource management – resources such as allocated memory are reclaimed (“garbage collected”) once they are no longer in use – but C is not one of these. In C, it’s up to the programmer to dispose of resources after use (for instance by free()-ing allocated memory and closing open files).

Arrange

means preparing whatever resources are required for our test. This could include initializing any needed data structures, creating and populating files or databases, or starting programs running (say, a webserver).

Act

means invoking the behaviour we want to test. In C, this will typically mean calling a function, which we call the function under test.

Assert

means to look at the resulting state of the system and see if it is what we expected. If a function returns a value, it might mean checking to make sure that value is the correct one. If the function instead writes data to a file or database, it might mean examining the file or database to see whether the changes made are the ones we expected. Sometimes asserting just requires comparing two values, but other times we might need to make a more thorough investigation.

Cleanup

means (in C) disposing of any resources the test has used, and making sure the test we’ve just run won’t interfere with the results of any future tests.
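Putting the four parts together: below is a self-contained sketch of a test written directly against the Check framework’s C API. The function under test, dup_upper, is purely hypothetical and is defined in the same file only so that the example compiles and runs; in your project, the function under test would come from your own code. (Linker flags vary between systems; pkg-config --cflags --libs check will typically list what is needed.)

  #include <check.h>
  #include <ctype.h>
  #include <stdlib.h>
  #include <string.h>

  /* Hypothetical function under test: returns a malloc'd upper-cased copy of s. */
  static char *dup_upper(const char *s) {
      size_t len = strlen(s);
      char *copy = malloc(len + 1);
      if (copy == NULL)
          return NULL;
      for (size_t i = 0; i < len; i++)
          copy[i] = (char) toupper((unsigned char) s[i]);
      copy[len] = '\0';
      return copy;
  }

  START_TEST(test_dup_upper_basic)
  {
      /* Arrange: choose an input with mixed case. */
      const char *input = "Secure123";

      /* Act: call the function under test. */
      char *result = dup_upper(input);

      /* Assert: the result should match our expectation. */
      ck_assert(result != NULL);
      ck_assert_str_eq(result, "SECURE123");

      /* Cleanup: release the memory the function under test allocated. */
      free(result);
  }
  END_TEST

  int main(void) {
      Suite *s = suite_create("dup_upper");
      TCase *tc = tcase_create("core");
      tcase_add_test(tc, test_dup_upper_basic);
      suite_add_tcase(s, tc);

      SRunner *sr = srunner_create(s);
      srunner_run_all(sr, CK_NORMAL);
      int failed = srunner_ntests_failed(sr);
      srunner_free(sr);
      return failed == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
  }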

C is a particularly challenging language to write tests for, because a misbehaving function under test can overwrite the stack frame of the function that’s calling it, meaning we can no longer rely on the results of our test.

It’s a good idea, therefore, to enable any dynamic checks we can that will help us catch misbehaviour like this – for instance, by using the Google sanitizers or tools like Valgrind. Additionally, the Check unit-testing framework, which we use in this lab, by default uses the fork() system call to run tests in a separate address space from the test framework, which prevents the framework from being affected by any memory corruption that occurs.

If you run make test, the Makefile will use the checkmk tool to translate the tests in check_account.ts into C code that uses the Check framework, compile that code, and run the resulting test program.

The use of checkmk isn’t necessary – we could write the tests by hand in C if we wanted – but it saves us having to write some repetitive boilerplate code.

Try running

$ make test

to see the Check framework in action. You should see that several tests are run; some may pass, and others may fail.

Check can output results in multiple formats. You might find the output of the check_account binary more readable if you run it with the following environment variables and options:

CK_TAP_LOG_FILE_NAME=- prove --verbose ./check_account

Here, ./check_account is our test-runner program (built by running make check_account). Setting CK_TAP_LOG_FILE_NAME=- tells it to write its results in the “TAP” test-output format to standard output, and prove is a Perl program which formats and summarizes those results (see man prove for details). Leaving off the --verbose flag to prove will result in just a summary being printed.

Try adjusting the Makefile so that your code is compiled and run with the UBSan and ASan sanitizers (look at previous labs for hints on how to do so). MSan (MemorySanitizer) is another dynamic analyser worth looking at, but note that it can’t be enabled at the same time as ASan – they instrument your code in incompatible ways.

It’s a good idea to enable the sanitizers while developing your project. If they detect memory errors, you may need to debug your program using gdb. The output of AddressSanitizer should include a stack trace which reveals where the bug was detected (include the -g option to gcc for more useful information). If you are running

$ gdb -tui ./check_account

to track down what causes a bug, you probably will want to set the environment variable CK_FORK to “no”, like this:

$ CK_FORK=no gdb -tui ./check_account

This inhibits Check’s usual behaviour of forking off a separate process in which to run each test.

2.5. Project work

Although your project should be completed within your own group, it’s fine to discuss with other students or the lab facilitators the general concepts of testing, and how you might come up with more tests for your code – in fact, this is encouraged. Besides the tests contained in the .ts file, what additional tests will you need? How will you ensure your test expectations are correct?


  1. We use the terms “defect” and “failure” generally in line with their definitions in ISO/IEC/IEEE standard 24765 (“Systems and software engineering – Vocabulary”). A failure is a deviation of the behaviour of a system from its specification, and a defect is an error or fault in the static artifacts (software code, configuration or data files, or hardware) of a system which, if uncorrected, can give rise to a failure.↩︎

  2. In some languages, like Java and Rust, the implementations of datatypes, functions or methods are located in the same place as their specification. Individual items can usually be declared public or private.
       In other languages, like C and C++, the implementations are in a different file (the .c or .cpp file) from the specifications (which appear in a header file, with extension .h or .hpp).
       Other languages again, like Ada and Haskell, are a sort of combination: the implementation appears in the body of the file, and the specification near the top, in a module or package “header”.
       Best practice in C is to document the public parts of a .c file in the header file, to keep the implementations in the .c file, and for everything that isn’t intended to be public to be made static (private).
       However, to keep things simple in the project, we expect project groups only to submit .c files, and to put documentation headers in the .c files if needed – not in the .h files.↩︎

  3. For an example of such a C library, see the Intel IPP multimedia library. Although the library is free for use, the source code is proprietary and not available.↩︎