This lab explores the role of testing in secure software development.
We will be using the Doxygen documentation tool. Install it in your VM with
$ sudo apt-get update
$ sudo apt-get install --no-install-recommends doxygen graphviz
We will also make use of the Check unit testing framework. Install it with
$ sudo apt-get install check
The aim of testing is to identify and remove defects from a project – mistakes in the source code or configuration/data files that cause it to deviate from its prescribed behaviour.1 Vulnerabilities are a particular class of defects where the resulting failure compromises security goals for a system.
To be able to test a program, or part of it, we have to know what its intended behaviour is, or by definition we can’t test it. Documentation is therefore an important part of any software project. The documentation for a function or other piece of code tells us what it should do, and testing tries to find situations in which the code does something else.
A program specification defines the behaviour expected of an entire program, and can be used directly for testing that program. However, it doesn’t say anything about the behaviour of individual functions. Those are normally documented within the source file that contains them, and the documentation for all public-facing functions, macros and data structures forms the API (“Application Programming Interface”) for that file.2
Typically, the specification documentation for functions is contained in documentation blocks: specially formatted comments or annotations which can be extracted and displayed by documentation tools. For example, the documentation block below is from a previous year’s project:
/** Decrypt a given ciphertext using the Caesar cipher, using a specified key, where the
* characters to decrypt fall within a given range (and all other characters are copied
* over unchanged).
*
* Calling `caesar_decrypt` with some key $n$ is exactly equivalent to calling
* `caesar_encrypt` with the key $-n$.
*
* \param range_low A character representing the lower bound of the character range to be
* encrypted
* \param range_high A character representing the upper bound of the character range
* \param key The encryption key
* \param cipher_text A null-terminated string containing the ciphertext to be decrypted
* \param plain_text A pointer to a buffer where the decrypted text will be stored. The
* buffer must be large enough to hold a C string of the same length as
* cipher_text (including the terminating null character).
*
* \pre `cipher_text` must be a valid null-terminated C string
* \pre `plain_text` must point to a buffer of identical length to `cipher_text`
* \pre `range_high` must be strictly greater than `range_low`.
* \pre `key` must fall within range from 0 to `(range_high - range_low)`, inclusive.
*/
void caesar_decrypt(char range_low, char range_high, int key, const char * cipher_text, char * plain_text);
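To make the preconditions concrete, here is a minimal sketch of how a caller might use caesar_decrypt (assuming the project's implementation is linked in; the ciphertext and key are invented for the example). Note in particular that the plain_text buffer is sized to hold a string of the same length as cipher_text, including the terminating null:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Prototype as documented above; the implementation is assumed to be
 * provided by the project's Caesar cipher source file. */
void caesar_decrypt(char range_low, char range_high, int key,
                    const char * cipher_text, char * plain_text);

int main(void) {
    const char *cipher_text = "KHOOR";   /* "HELLO" encrypted with key 3 */

    /* Precondition: the output buffer must be able to hold a string of the
     * same length as cipher_text, including the terminating null character. */
    char *plain_text = malloc(strlen(cipher_text) + 1);
    if (plain_text == NULL)
        return EXIT_FAILURE;

    /* Decrypt characters in the range 'A' to 'Z', using key 3. */
    caesar_decrypt('A', 'Z', 3, cipher_text, plain_text);
    printf("%s\n", plain_text);

    free(plain_text);
    return EXIT_SUCCESS;
}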
Documentation blocks normally have some way of formatting the documentation for easy reading, of documenting particular parts of a function (like parameters or the return value), and of referring to other, related functions. In this lab, we will use the Doxygen tool, which is expressly designed for extracting API documentation from C and C++ files. It uses Markdown conventions for formatting, and special tags (like \param, \return and \ref) to pick out particular portions of a function – these are described in the Doxygen documentation. You have likely encountered tools similar to Doxygen for other languages: Java uses the javadoc tool, Python projects typically use pydoc or sphinx, Rust uses rustdoc, and Haskell uses haddock.
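For instance, a documentation block that uses \param, \return and \ref together might look like the following sketch. (The function count_in_range is hypothetical – it is not part of the project code – and the \ref tag simply cross-links to the caesar_encrypt function mentioned earlier.)
#include <stddef.h>   /* for size_t */

/** Count how many characters of `text` fall within the (inclusive) range
 * `range_low` to `range_high` – the same kind of range used by
 * \ref caesar_encrypt.
 *
 * \param range_low A character representing the lower bound of the range
 * \param range_high A character representing the upper bound of the range
 * \param text A null-terminated string to examine
 *
 * \return The number of characters in `text` that lie within the range.
 */
size_t count_in_range(char range_low, char range_high, const char * text);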
Note that documentation blocks do not serve the same purpose
as inline comments (comments contained within the body of a function).
(In fact, in some languages, documentation blocks may not be comments at
all. Python uses strings
instead of comments, and Rust internally uses the #[doc]
annotation.) Documentation blocks should always be included
for any function that forms part of an API, so that other programmers
know how to use that function, and documentation blocks can be
as extensive as needed. If you are using a C library – say, the FLAC
library, which allows you to encode, decode and manipulate audio files
in the FLAC format – then your primary way of knowing what the functions
in that library do is by referring to the API documentation.
Not only do you not need to know what any inline comments say, but for
commercial software libraries, you might not have any access to them or
to the source code at all.3
In contrast to documentation blocks, inline comments are only for the use of programmers who need to fix or enhance existing functions, and typically should be used sparingly – excessive inline commenting makes code harder to read. In general, inline comments should not say what the code is doing – anyone who understands the programming language should be able to see that – but rather why it is doing it.
Inline commenting of your code
Don’t over-comment your code! In this unit, we value clarity and conciseness. Over-commenting can detract from both, and it will be difficult to achieve high marks if your code is excessively or unnecessarily commented.
You should assume that the person marking your code is an experienced C programmer and does not require explanations of basic language features.
If you do feel you need to add inline comments, then focus on explaining why you are doing something, rather than what you are doing. The code itself should make the ‘what’ clear; comments should provide additional context or reasoning.
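As a small illustration of the “why, not what” principle (the function and constant here are invented for the example):
#include <string.h>

#define MAX_USERNAME_LEN 32   /* hypothetical limit, for illustration only */

int username_ok(const char *username) {
    /* Unhelpful: "check if the length is greater than MAX_USERNAME_LEN"
     * – that only restates what the next line obviously does.
     *
     * Better: the limit matches the fixed-width field used to store
     * usernames, so longer names would be silently truncated. */
    if (strlen(username) > MAX_USERNAME_LEN)
        return 0;
    return 1;
}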
doxygen
Ensure you have a copy of your group’s project code available, and download it into your development environment (or, if you don’t have it readily to hand, any other C codebase you’d like to generate documentation for will do).
Change directory to the directory containing your source code, and run doxygen -g. This generates a file called Doxyfile, used to configure the exact contents and formatting of the API documentation for a project. To work well with a C project, a few changes are needed – you can download a Doxyfile with these changes in the code .zip file for this lab.
Specific changes we make to the original Doxyfile include:
- setting INPUT to src
- setting OPTIMIZE_OUTPUT_FOR_C to YES
- setting RECURSIVE to YES
- setting SOURCE_BROWSER to YES
Download and extract the Doxyfile from that .zip file, and run the command doxygen. The Doxygen tool will use the configuration contained in the Doxyfile to generate HTML documentation in an html subdirectory of your project.
The easiest way of viewing the HTML documentation is usually by using an editor like VS Code to open the HTML files in the development VM; alternatively, it’s possible to copy the files from the development VM to the host – see this StackOverflow answer.
Some of the functions in the project code already have documentation blocks written for them (e.g. handle_login), but others do not. As you work through the project, it’s a good idea to update the documentation blocks based on your understanding of what services the different functions perform, and what they require from the caller in return.
Doxygen features
Doxygen has a number of features we will not use, but which are very useful for exploring the structure of large codebases. It can create UML-style diagrams of the classes or functions contained in the code, as well as showing graphically which functions or methods make use of others. You can read more about those features here.
If you’re unfamiliar with good practices for writing API documentation, take a look at the API documentation for an established C library (such as the FLAC library mentioned above) to see how experienced developers describe their functions.
The .zip file for this lab contains an alternative Makefile, together with some additional files, that you may find useful for the project.
The new files include the following features:
We already recommend you compile all code for this unit with -Wall -Wextra -pedantic-errors. The Makefile for this week’s lab includes a Make variable, EXTRA_CFLAGS, which defines a number of other warning and error options that can stop you making mistakes in your code.
Development teams, particularly in security-sensitive environments, frequently use flags like these to enforce best practices and improve the overall quality of the code.
As we’ve seen in previous labs, -pedantic-errors
enforces stricter compliance with the C standard – it turns any
non-standard behaviour into an error, which helps catch potentially
problematic code that might work on some compilers but is technically
not conforming to the C standard. This is useful to ensure that code
behaves consistently across different compilers.
The option -Werror=vla
turns warnings about
Variable Length Arrays (VLAs) into errors. VLAs are
arrays whose size is determined at runtime, which can lead to stack
overflows, attacker-induced stack clashes, or unpredictable behaviour
if not carefully managed. Banning VLAs can enhance the stability and
predictability of the program, which is especially important in
security-sensitive code.
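For example, the following sketch (not taken from the project code) declares a VLA whose size depends on a runtime value; compiled with -Werror=vla, GCC rejects it, and a fixed-size buffer or malloc() would be used instead:
#include <stdio.h>

void print_padded(size_t n) {
    /* A variable length array: its size is only known at runtime, so a
     * large (or attacker-controlled) n can exhaust the stack.
     * With -Werror=vla, GCC refuses to compile this line. */
    char buf[n + 1];

    for (size_t i = 0; i < n; i++)
        buf[i] = ' ';
    buf[n] = '\0';
    printf("[%s]\n", buf);
}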
You might be familiar with GCC warning about “implicitly declared functions”; the -Werror=implicit-function-declaration
option turns that too into an error. In C, using a function before its
declaration can lead to unpredictable behaviour – the compiler is forced
to make assumptions about the parameters and return type which are often
incorrect. By making this an error, code is forced to be more explicit,
improving reliability and reducing the risk of bugs.
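For example, in the sketch below (again, not project code) square is called before any declaration of it has been seen; older compilers would silently assume it returns int, whereas -Werror=implicit-function-declaration makes the build fail at that line:
#include <stdio.h>

int main(void) {
    /* No declaration of square() is visible yet, so the compiler must guess
     * its return type and parameters. With
     * -Werror=implicit-function-declaration this is a compile error. */
    printf("%d\n", square(5));
    return 0;
}

int square(int x) {
    return x * x;
}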
The GCC documentation pages explain in detail what the other flags do; in general, all of them aim to improve code quality and reliability by catching a variety of potential issues early.
A banned.h
file is a special header file used in some
development environments to explicitly prohibit the use of certain
functions or programming practices that the development team has decided
should never be used in their codebase. It helps
enforce coding standards and ensures that developers avoid potentially
dangerous or inappropriate behaviour in their code.
This file typically contains declarations of functions that are considered unsafe, inappropriate, or incompatible with the design goals of the project. The functions in this list are usually replaced by safer, more controlled alternatives that adhere to the project’s standards.
For example, the gets() function in C is notorious for its potential to cause buffer overflows, as it doesn’t check the size of the input. A team might take the very reasonable decision to ban the use of gets() (and possibly other functions like scanf()) in the project because these can easily be exploited by attackers if used improperly.
FILE pointer-based I/O functions like fopen(), fprintf(), and fclose() might be banned in certain projects because they are not well-suited for structured logging or when multiple components need to interact cleanly with each other. Instead, the team may use custom logging functions or more explicit input/output management that avoids relying on global stdout or stderr (which can create conflicts or problems during testing, or in environments such as multi-threaded or multi-process systems). The banned.h header we include in this week’s code bans such functions for exactly this reason – no code in CITS3007 assessments is supposed to print to stdout or stderr unless explicitly asked to.
A typical banned.h
might look like this:
// banned.h - Header file to prohibit certain unsafe or inappropriate functions
#ifndef BANNED_H
#define BANNED_H
// Prohibit the use of dangerous or insecure functions
#define gets(...) _Pragma("GCC warning \"gets() is banned. Use fgets() instead.\"") // Prevent usage of gets
#define strcpy(...) _Pragma("GCC warning \"strcpy() is banned. Use strncpy() instead.\"")
#define sprintf(...) _Pragma("GCC warning \"sprintf() is banned. Use snprintf() instead.\"")
// Prohibit use of file pointer-based I/O functions
#define fopen(...) _Pragma("GCC warning \"fopen() is banned. Use custom logging functions instead.\"")
#define fprintf(...) _Pragma("GCC warning \"fprintf() is banned. Use custom logging functions instead.\"")
#define fclose(...) _Pragma("GCC warning \"fclose() is banned. Use custom logging functions instead.\"")
#endif // BANNED_H
(This is not exactly the way ours operates, but gives you an idea of how they look.)
When the banned.h
file is included in the source code
(usually by a central header that is included throughout the project),
the compiler will generate warnings (or errors, depending on the project
settings) whenever one of the banned functions is used. This prevents
developers from accidentally using functions that are considered unsafe,
inappropriate, or incompatible with the project’s design principles.
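As a sketch of how this works in practice, the file below includes a banned.h like the example above and then calls strcpy(); because the macro replaces the call with a _Pragma, GCC reports the embedded message at that line (and with -Werror the build fails):
#include <string.h>
#include "banned.h"   /* included after the system headers it overrides */

void copy_name(char *dest, const char *src) {
    /* banned.h defines strcpy as a macro, so this "call" expands to the
     * _Pragma() above, and GCC reports at this line:
     *     warning: strcpy() is banned. Use strncpy() instead.
     * (Note that with this illustrative approach the call itself is
     * macro-expanded away – the point is to make the build complain.) */
    strcpy(dest, src);
}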
The check_account.ts set of example unit tests
Testing is important in any software project, but it is absolutely critical in a security-conscious environment. Security vulnerabilities often arise from subtle bugs: memory errors, unexpected edge cases, or incorrect assumptions about how code behaves. Thorough testing helps to catch these problems early, before they can become serious vulnerabilities.
In this lab, we show how to use the libcheck
framework,
which is specifically designed for C development. It provides
process isolation, meaning that if a test crashes or
triggers a memory error, it won’t crash the entire test suite – only the
individual test process. This is extremely useful when writing
security-sensitive code, because it allows you to aggressively test
error handling and boundary cases without destabilising the test
harness.
You are welcome to use any other C testing framework you are comfortable with, but consistent, thorough, and automated testing is expected. Good testing practices are essential not just for correctness, but also for security: well-tested code is much harder to exploit.
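If you haven’t used Check before, a minimal hand-written test program looks something like the following sketch (the function under test, add_one, is invented for the example; compile and link it against libcheck, e.g. using pkg-config --cflags --libs check):
#include <check.h>
#include <stdlib.h>

/* A trivial function under test, invented for this example. */
static int add_one(int x) { return x + 1; }

START_TEST(test_add_one)
{
    ck_assert_int_eq(add_one(41), 42);
}
END_TEST

int main(void) {
    Suite *s = suite_create("example");
    TCase *tc = tcase_create("core");
    tcase_add_test(tc, test_add_one);
    suite_add_tcase(s, tc);

    SRunner *sr = srunner_create(s);
    /* By default, each test is run in its own fork()ed process. */
    srunner_run_all(sr, CK_NORMAL);
    int failed = srunner_ntests_failed(sr);
    srunner_free(sr);
    return failed == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}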
Take a look at some of the tests in check_account.ts
and
see if you can follow what they are doing. Consider other tests – are
there edge cases you can identify? What if some string input is
at the largest size it can plausibly be for a parameter, or the smallest
size – what will happen? Are degenerate cases (e.g. empty strings)
allowed for some or all string parameters, and if so, does your code
behave correctly when they are passed? What about functions like
account_update_password
– how should they behave if
implemented correctly? (One question to think about: when invoked twice
with the same password, do you expect
account_update_password
to produce the same result, or a
different one?)
When testing your project, you’ll need to compile your code with a
range of optimizations and sanitizers (as well as making use of
the static analysers we have looked at). Some warnings and sanitizers
only work well when code is quite highly optimized (option
-O2
to GCC), and high optimization levels also often elicit
bugs you otherwise wouldn’t have identified. But testing at low
optimization levels is useful too – entirely different bugs may
appear.
Once a specification is available for a function (or even before then!), it’s possible to start writing tests for it. A test is meant to look at the behaviour of a system or function in response to some input, and make sure that it aligns with what we expect.
It can be useful to think of a test as being composed of three parts: “Arrange”, “Act” and “Assert”.
In C, we often also need to add a fourth part,
“Cleanup”. Some languages have automatic
memory and resource management – when open files or allocated memory
are no longer in use, they are “garbage collected” – but C is not one of
these. In C, it’s up to the programmer to ensure they dispose of
resources after use (for instance by free()-ing allocated memory and closing open files).
Arranging means preparing whatever resources are required for our test. This could include initializing any data structures needed, creating and populating files or databases, or starting programs running (say, a webserver).
Acting means invoking the behaviour we want to test. In C, this will typically mean calling a function, which we call the function under test.
Asserting means looking at the resulting state of the system and seeing whether it is what we expected. If a function returns a value, it might mean checking to make sure that value is the correct one. If the function instead writes data to a file or database, it might mean examining the file or database to see whether the changes made are the ones we expected. Sometimes asserting just requires comparing two values, but other times we might need to make a more thorough investigation.
In C, cleanup means to dispose of any used resources, and to make sure the test we’ve just run won’t interfere with the results of any future tests.
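Putting the four parts together, a Check test for the caesar_decrypt function documented earlier might look like the following sketch (the implementation is assumed to be linked in separately, and the expected values assume a standard Caesar shift over the range 'A' to 'Z'):
#include <check.h>
#include <stdlib.h>
#include <string.h>

/* Prototype from the documentation block earlier in this lab sheet. */
void caesar_decrypt(char range_low, char range_high, int key,
                    const char * cipher_text, char * plain_text);

START_TEST(test_caesar_decrypt_basic)
{
    /* Arrange: set up the input, and an output buffer large enough to hold
     * a string of the same length as the ciphertext (plus the null). */
    const char *cipher_text = "KHOOR";   /* "HELLO" encrypted with key 3 */
    char *plain_text = malloc(strlen(cipher_text) + 1);
    ck_assert(plain_text != NULL);

    /* Act: call the function under test. */
    caesar_decrypt('A', 'Z', 3, cipher_text, plain_text);

    /* Assert: check the resulting state is what we expected. */
    ck_assert_str_eq(plain_text, "HELLO");

    /* Cleanup: dispose of resources so later tests aren't affected. */
    free(plain_text);
}
END_TEST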
C is a particularly challenging language to write tests for, because a misbehaving function under test can overwrite the stack frame of the function that’s calling it, meaning we can no longer rely on the results of our test.
It’s a good idea, therefore, to enable any dynamic checks we can that
will help us catch misbehaviour like this – for instance, by using the
Google sanitizers or
tools like Valgrind. Additionally,
the Check unit-testing framework, which we use in this lab, by default uses the fork() system call to run tests in a separate address space from the test framework, which prevents the framework from being affected by any memory corruption that occurs.
If you run make test, the Makefile will use the checkmk tool to generate “.c” files from the tests contained in check_account.ts, and will then compile and run them. The use of checkmk isn’t necessary – we could write the tests by hand in C if we wanted – but it saves us having to write some repetitive boilerplate code.
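For reference, a checkmk input file is mostly ordinary C, with directives such as #suite and #test marking where each suite and test begins; the real check_account.ts will differ, but a small hypothetical fragment might look like this:
/* check_caesar.ts – a hypothetical checkmk input fragment. */
#include <string.h>

void caesar_decrypt(char range_low, char range_high, int key,
                    const char * cipher_text, char * plain_text);

#suite caesar

#test decrypt_empty_string
    /* Degenerate case: an empty ciphertext should give an empty result.
     * A one-byte buffer satisfies the documented precondition. */
    char plain_text[1];
    caesar_decrypt('A', 'Z', 3, "", plain_text);
    ck_assert_str_eq(plain_text, "");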
Try running
$ make test
to see the Check framework in action. You should see that several tests were run, perhaps that some pass, and perhaps that others failed.
Check can output results in multiple formats. You might find the output of the check_account binary more readable if you build and run it with the following environment variables and options:
$ make check_account
$ CK_TAP_LOG_FILE_NAME=- prove --verbose ./check_account
Here, make check_account builds our test-runner program. CK_TAP_LOG_FILE_NAME=- tells it to output results using the “TAP” format for test results, and prove is a Perl program which formats those results and summarizes them (see man prove for details). Leaving off the “--verbose” flag to prove will result in just a summary being printed.
Try and adjust the Makefile so that your code is compiled and run with the UBSan and ASan sanitizers (look at previous labs for hints on how to do so). MSan (memory sanitizer) is another dynamic analyser worth looking at, but note that it can’t be enabled at the same time as ASan – they instrument your code in incompatible ways.
It’s a good idea to enable the sanitizers while developing your project. If they detect memory errors, you may need to debug your program using gdb. The output of AddressSanitizer should include a stack trace which reveals where the bug was detected. (Include the “-g” option to gcc for better information.) If running
$ gdb -tui ./check_account
to track down what causes a bug, you will probably want to set the environment variable CK_FORK to “no”, like this:
$ CK_FORK=no gdb -tui ./check_account
This inhibits Check’s usual behaviour of forking off a separate process in which to run each test.
Although your project should be completed within your own group, it’s fine to discuss with other students or the lab facilitators the general concepts of testing, and how you might come up with more tests for your code – in fact, this is encouraged. Besides the tests contained in the .ts file, what additional tests will you need? How will you ensure your test expectations are correct?
We use the terms “defect” and “failure” generally in line with their definitions in ISO/IEC/IEEE standard 24765 (“Systems and software engineering – Vocabulary”). A failure is a deviation of the behaviour of a system from its specification, and a defect is an error or fault in the static artifacts (software code, configuration or data files, or hardware) of a system which, if uncorrected, can give rise to a failure.↩︎
In some languages, like Java and Rust, the implementations of datatypes, functions or methods are located in the same place as their specification. Individual items can usually be declared public or private.
In other languages, like C and C++, the implementations are in a different file (the .c or .cpp file) to the specifications (which appear in a header file, with extension .h or .hpp).
Some other languages, like Ada and Haskell, are a sort of combination: the implementation appears in the body of the file, and a specification near the top in a module or package “header”.
Best practice in C is to document the public parts of a .c file in the header file, to keep the implementations in the .c file, and for everything that isn’t intended to be public to be made static (private).
However, to keep things simple in the project, we expect project groups only to submit .c files, and to put documentation headers in the .c files if needed – not in the .h files.↩︎
For an example of such a C library, see the Intel IPP multimedia library. Although the library is free for use, the source code is proprietary and not available.↩︎