CITS3007 lab 6 (week 7) – Static analysis
The aim of this lab is to familiarize you with some of the static analysis tools available for analysing C and C++ code, and to try a dynamic analysis/fuzzing tool (AFL).
Static vs dynamic analysis
Recall from lectures that static analysis tools analyse a program for defects without running it, whereas dynamic analyses are done at runtime.
You already have experience with one sort of static analysis tool – compilers. Compilers are an example of a static analysis tool, because (in addition to producing compiled output) nearly all compilers attempt to detect one sort of defect, namely type errors: cases where the programmer performs operations on a data item which are not appropriate for its type. (C is sometimes referred to as “weakly typed” because it is possible to implicitly convert between many types – for instance, to treat unsigned integral types as signed, or vice versa.) Compilers operate on the source code of a program, but static analysis tools also exist that analyse binary artifacts (such as binary executables or libraries) – the Ghidra reverse engineering framework is an example of one of these. Compilers typically only perform a fairly limited range of checks for possible defects, so it’s often useful to augment them with other static analysis tools.
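As a quick illustration (the file name and code here are ours, not from any real project), the following assignment silently changes a value via an implicit signed-to-unsigned conversion; the -Wconversion option recommended below will flag it, but by default the compiler accepts it without complaint:

/* conversion-demo.c - illustrative only: implicit signed/unsigned conversion */
#include <stdio.h>

int main(void) {
    int n = -1;
    unsigned int u = n;  // implicit conversion: no cast needed, no error by default
    printf("%u\n", u);   // prints 4294967295 on platforms where int is 32 bits
    return 0;
}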
You might have already experimented with one of the key dynamic analysers we look at, too: the Google sanitizers included with GCC. These perform dynamic analysis of your program, and therefore require your program to be run in order to work. They operate by injecting extra instructions into your program at compile time, deliberately altering the way your program behaves. Enabling them often requires little more than adding something like “-fsanitize=address,undefined -fno-omit-frame-pointer -g -O1” to your GCC options.
Then, many undefined behaviours (like going out of bounds) will trigger the sanitizers to display a stack trace and an explanation of the error – making C behave much more like a language with exceptions (Python or Java, say) when it comes to trying to diagnose bugs and vulnerabilities (as opposed to its usual behaviour, which is to give no visible sign at all that something could be wrong with the program).
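As a small illustration (again, the file name and code are ours), the following out-of-bounds read gives no visible sign of trouble in a plain build, but produces a detailed report when compiled with the sanitizer flags above:

/* oob-demo.c - illustrative only: an out-of-bounds array read */
#include <stdio.h>

int main(int argc, char **argv) {
    (void) argv;
    int a[4] = {1, 2, 3, 4};
    int i = argc + 3;      // equals 4 (one past the end) when run with no arguments
    printf("%d\n", a[i]);  // out-of-bounds read: undefined behaviour
    return 0;
}

$ gcc -fsanitize=address,undefined -fno-omit-frame-pointer -g -O1 oob-demo.c -o oob-demo
$ ./oob-demo
## ASan should report a stack-buffer-overflow here, with a stack trace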
When completing the unit project, it will be up to your group to decide what static and dynamic analysis tools to use on your code in order to find defects and possible vulnerabilities.
Before looking at standalone static analysis tools, we’ll first discuss options that are already built into your compiler. It’s important to ensure you’re making good use of the features already included in GCC.
At a minimum, the compiler options you use for CITS3007 work should include the following:
-std=c11 -pedantic-errors -Wall -Wextra -Wconversion
You can find a list of all GCC’s warning-related options here. You can easily find recommendations for more extensive warning options than the minimum ones above by Googling for them (one set of recommendations can be found here).
Other important practices to bear in mind are:
- You should make sure to compile at multiple levels of optimization. GCC can perform different analyses, and thus output different warnings, depending on what level of optimization you ask it for. The -O0 option disables all optimizations (GCC’s default behaviour), and -O1 and -O2 enable progressively more optimizations. You can get documentation on all of GCC’s optimization options here, and obtain a brief list by running gcc --help=optimizers.
- Compiling with debug symbols enabled (GCC’s -g option) can prevent some bugs from appearing – so, even if you use debug symbols to assist you in debugging your code, it’s important to compile and test your code without symbols added, as well.
- In later classes we will look at the sanitizers included with GCC in more detail. It’s a good idea to test your code both with and without sanitizers enabled (the ASan and UBSan sanitizers are particularly effective at detecting errors).
- It can be helpful to try compiling your code using different compilers – although all C compilers should detect errors mandated by the C standard, what other sorts of analyses they do and the warnings they produce can vary from tool to tool. On the CITS3007 standard development environment (CDE), the GCC and Clang compilers are both available.
- Sometimes it can be useful to ensure your code is compiled and run on multiple platforms – for example, MacOS, Windows, and BSD Unixes, in addition to Linux. The CITS3007 project is only required to compile and operate correctly on Linux systems. But it can still be useful to see how it behaves on other, closely related, systems – sometimes this can expose bugs or assumptions you didn’t detect on Linux. It will be up to you to decide if this is a strategy you wish to pursue.
In the CITS3007 standard development environment (CDE), download the source code for the dnstracer program, which we’ll be analysing, and extract it:
$ wget http://www.mavetju.org/download/dnstracer-1.9.tar.gz
$ tar xf dnstracer-1.9.tar.gz
$ cd dnstracer-1.9
Dnstracer is a somewhat old piece of code – the homepage appears not to have changed since 2002. It wasn’t written with the intention of conforming to a particular C standard, but we’ll see if we can adapt the codebase to work with the C11 standard, issued a little under 10 years after the codebase stopped being maintained. This sort of task is not uncommon: your team may be tasked with refactoring and improving old, “legacy” code which still seems to work, adding tests, and bringing it in line with modern standards.
Note that the Dnstracer download link is an “http” link rather than an “https” link – what problems could this cause? You can read more about Dnstracer by following the relevant links at http://www.mavetju.org/unix/general.php. It is used to graphically depict the chain of servers involved in a Domain Name System (DNS) query. The DNS protocol is an important part of the modern Internet, but we won’t be examining it in detail – Dnstracer is just a sample program used here to test analysis tools on.
It’s still used as a network reconnaissance tool as part of penetration testing.
We will be analysing the Dnstracer program, which is subject to a known vulnerability, CVE-2017-9430.
You can build Dnstracer by running the following commands in your development environment:
$ ./configure
## a number of outputs of automatically run tests should appear here
$ make
## We expect this to produce many warnings and errors -- see below
Let’s examine what these are doing.
./configure and the GNU Autotools
If we want to write a C program that can be compiled and run on many systems, we need some platform-independent way of detecting what operating system we are running on, what tools (compilers, linkers, and scripting tools) are available, and exactly what options they support – unfortunately, these can vary widely from platform to platform.
How can we detect these things, and use that knowledge when compiling our program? One way is to use the suite of tools known as GNU Autotools. Rather than write a Makefile ourselves, we create a template for a Makefile – named Makefile.in – and use the GNU Autotools to create a ./configure script which will gather details about the system it is running on, and use those details to generate two things: a Makefile, and a config.h file which should be #included in our C source files – this incorporates information about the system being compiled on, and defines symbols that let us know what functions and headers are available on that target system. (Specifically, Dnstracer is using the tools Autoconf and Automake – GNU Autotools contains other tools as well, which are outside the scope of this lab.)
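As a sketch of how code typically uses config.h (assuming a configure script generated with Autoconf’s AC_CHECK_HEADERS, which defines conventional HAVE_* macros; the details below are illustrative, not Dnstracer’s code):

#include "config.h"   /* must come first: defines the HAVE_* macros */

#ifdef HAVE_STRINGS_H
#include <strings.h>  /* strncasecmp is declared here on POSIX systems */
#endif

#include <stdio.h>

int main(void) {
#ifdef HAVE_STRINGS_H
    printf("%d\n", strncasecmp("Hello", "HELLO", 5));  /* prints 0 */
#else
    printf("strings.h not available on this platform\n");
#endif
    return 0;
}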
The GNU Autotools are sometimes criticised as not being very easy to use. Alternatives to the GNU Autotools for writing platform-independent code and building on a range of platforms include the tools Meson and CMake.
The ./configure script generates two files, a Makefile and config.h, which incorporate information about the system being compiled on. However, the content of those two files is only as good as the developer makes it – if they don’t enable the compiler warnings and checks that they should, then the final executable can easily be buggy. The output of the make command above should show us the final compilation command being run. (Type make clean, then make again, if you need to see what the output was.)
## Lots of output ... and eventually:
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c `test -f 'dnstracer.c' || echo './'`dnstracer.c
It also includes a warning about a possible vulnerability (marked with -Wformat-overflow).
We know from earlier classes that invoking GCC without specifying a C standard (like C11) and enabling extra warnings can easily result in code that contains bugs and security vulnerabilities – so the current version of the Makefile is insufficient for our purposes.
How can we enable extra warnings from GCC? If you run ./configure --help, you’ll see that we can supply a number of arguments to ./configure, and some of these are incorporated into the Makefile and used to invoke GCC. Let’s try to increase the amount of checking our compiler does (and improve error messages) by switching our compiler to clang, and enabling more compiler warnings:
$ CC=clang CFLAGS="-pedantic -Wall -Wextra" ./configure
## we expect this to produce many warnings and/or errors...
$ make clean all
If you look through the output, you’ll now see many warnings that include the following:
passing 'unsigned char *' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign]
Although this is useful information, there are so many of these warnings that it’s difficult to see other potentially serious issues. (And many of them may be harmless – an example of false positives from the compiler warnings.) So we’ll disable those. The compiler tells us that they’re enabled by the flag -Wpointer-sign, so we can use the flag -Wno-pointer-sign to disable them.
Run:
$ CC=clang CFLAGS="-pedantic -std=c11 -Wall -Wextra -Wno-pointer-sign" ./configure
$ make clean all
## _still_ many warnings and/or errors. but slightly fewer than before
The -pedantic flag tells the compiler to adhere strictly to the C11 standard (-std=c11); compilation now fails, however: the author of Dnstracer did not write properly compliant C code. In particular:
- #includes for strncasecmp, strdup and getaddrinfo are missing
- #includes for the struct addrinfo type are missing
Check man strncasecmp, and you’ll see it requires #include <strings.h>, which is missing from the C code. man strdup tells us that this is a Linux/POSIX function, not part of the C standard library; to inform the compiler we want to use POSIX functions, we should add the line #define _POSIX_C_SOURCE 200809L to our C code. This is called a feature-test macro. Furthermore, it’ll be useful to make use of static asserts, so we should include the assert.h header. Edit the dnstracer.c file, and add the following near the top of the file (e.g. just after the first block comment):
#define _POSIX_C_SOURCE 200809L
#define _DEFAULT_SOURCE
#include <assert.h>
The #defines need to appear before we start #include-ing header files. If we now run make clean all, we should have got rid of many compiler-generated warnings. But many more problems exist.
Writing portable C code – non-standard extensions to C
If we want to write portable C code – code that will work with other C compilers and/or other operating systems – it’s important to specify what C standard we’re wanting to adhere to (in this case, C11), and to request that the compiler strictly adhere to that standard.
When we invoke gcc or clang with the arguments “-std=c11 -pedantic”, this disables many compiler-specific extensions. For example: the C standard says it’s impermissible to declare a zero-length array (e.g. int myarray[0]), but by default, gcc will let you do so without warning. In general, disabling compiler-specific extensions is a good thing: it ensures we don’t accidentally use gcc-only features, and makes our code more portable to other compilers.
One reason some people don’t add those arguments is because (as we saw above) doing so may make their programs stop compiling. But this is because they haven’t been sufficiently careful about distinguishing between functions that are part of the C standard library, and functions which are extensions to C provided by their compiler, or are specific to the operating system they happen to be compiling on.
For example, fopen is part of the C standard library; if you run man fopen, you’ll see it’s included in the stdio.h header file. (And if you look under the “Conforming to” heading in the man page, you’ll see it says POSIX.1-2001, POSIX.1-2008, C89, C99 – fopen is part of the C89 (and later) versions of the C standard.)
On the other hand, strncasecmp is not part of the C standard library: it was originally introduced by BSD (the “Berkeley Software Distribution”), a previously popular flavour of Unix, and is now part of the POSIX standard for Unix-like operating systems. (If you look under the “Conforming to” heading in the man page, you’ll see it says 4.4BSD, POSIX.1-2001, POSIX.1-2008.)
Using -std=c11 -pedantic encourages you to be more explicit about what compiler- or OS-specific functions you’re using. strncasecmp is usually only found on Unix-like operating systems. It isn’t available, for instance, when compiling on Windows with the MSVC compiler; if you want similar functionality, you need the _strnicmp function.
Sometimes when using a function from a standard other than the C standards, your compiler will require you to specify exactly what extensions and standards you want to enable. For instance, man strdup (rather obliquely) tells you that adding #define _POSIX_C_SOURCE 200809L to your C code is one way of making the strdup function available.
Note that you should put the above #define before any #includes: the #define is called a “feature-test macro”, and it’s acting as a sort of signal to the compiler, telling it what parts of any later-appearing header files to process, and what to ignore.
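A minimal sketch of the correct ordering (illustrative code, not Dnstracer’s):

#define _POSIX_C_SOURCE 200809L  /* feature-test macro: must precede all #includes */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strdup is declared only because of the #define above */

int main(void) {
    char *copy = strdup("feature-test macros");
    if (copy != NULL) {
        printf("%s\n", copy);
        free(copy);
    }
    return 0;
}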
Using -std=c11 -pedantic doesn’t guarantee your code conforms with the C standard (though it does help). Even with those flags enabled, it’s still quite possible to write non-conforming programs. As the gcc manual says:

Some users try to use -Wpedantic to check programs for strict ISO C conformance. They soon find that it does not do quite what they want: it finds some non-ISO practices, but not all – only those for which ISO C requires a diagnostic, and some others for which diagnostics have been added.
From a security point of view, it’s easier to audit code that’s explicit about what libraries it’s using than code which leaves that implicit; so specifying a C standard and -pedantic is usually desirable.
Project tip
Failing to use non-standard functions correctly has been a frequent source of lost marks in the unit project in previous years. Make sure you understand how to use non-standard functions correctly and experiment with them on your own.
We’ll identify some problems with Dnstracer using Flawfinder – read “How does Flawfinder Work?”, here: https://dwheeler.com/flawfinder/#how_work. Flawfinder is a linter or static analysis tool that checks for known problematic code (e.g. code that calls unsafe functions like strcpy and strcat). Install Flawfinder by running the command sudo apt install flawfinder in your development environment, and then try using it by running:
$ flawfinder *.c
The output typically includes a message at the end with tips on what Flawfinder’s output means. (You may get a somewhat different message at the time this lab is run.)
$ flawfinder *.c
# ... many lines omitted ...
Not every hit is necessarily a security vulnerability.
You can inhibit a report by adding a comment in this form:
// flawfinder: ignore
Make *sure* it's a false positive!
You can use the option --neverignore to show these.
In general, you’ll see a lot of output. This is a common problem when applying static analysis tools to an existing codebase for the first time. It’s usually preferable to start coding with static analysers already enabled – but sometimes we inherit legacy code and simply have to make do. Commercial static analysis companies will often provide technical experts to your organisation, as part of the process of setting the tool up, who can help configure the tool to as nearly as possible show only the warnings you want. But we are using a free and open source tool, and will have to do this job ourselves. Any good static analysis tool should allow us to ignore particular bits of code that would be marked problematic, either temporarily, or because we can prove to our satisfaction that the code is safe.
The output of flawfinder is not especially convenient for browsing. It’s often more convenient to integrate the output with the editor or IDE (integrated development environment) you’re using. You might like to see if you can find a way of integrating the output of flawfinder into your preferred editor or IDE (e.g. VS Code or Vim).
In flawfinder’s output, see if you can find mentions of “rr_types”. You should be able to identify that Flawfinder is giving us a general warning about any array with static bounds (which is, really, all arrays in C11). However, there’s another issue here – what is it?
The declared size of the array, and the number of the elements should match up; if someone changes one but not the other, that could introduce problems. We’ll add a more reliable way of checking this.
Remove the size 256 from the array declaration. Below it, add:
static_assert(sizeof(rr_types) / sizeof(rr_types[0]) == 256,
"rr_types should have 256 elements");
We’re now statically checking that the number of elements in rr_types (i.e., the size of the array in bytes, divided by the size of one element) is always 256.
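This idiom generalises; one common pattern (the macro name here is our own choice, not something in Dnstracer) is to wrap the element-count calculation in a macro so such checks read more clearly:

#include <assert.h>

// number of elements in an array with static bounds
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

static const char *colours[] = { "red", "green", "blue" };
static_assert(ARRAY_LEN(colours) == 3, "colours should have 3 elements");

int main(void) { return 0; }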
We’ll now look at warnings from a program called clang-tidy. It can be run from the command-line (see man clang-tidy for details); try running:
$ clang-tidy --checks='-clang-diagnostic-pointer-sign' --extra-arg="-DHAVE_CONFIG_H -I. -Wno-pointer-sign" dnstracer.c --
(NB: This is a “quick and dirty” way of getting clang-tidy to run; better instructions will be provided in future weeks.)
We need to give clang-tidy correct compilation arguments (like -I.), or it won’t know where the config.h header is and will mis-report errors. We want it not to report problems with pointers being coerced from signed to unsigned or vice versa (i.e., the same issue gcc flags with -Wpointer-sign), so we disable that check by putting a minus in front of clang-diagnostic-pointer-sign.
Let’s see how Dnstracer is supposed to be used. It will tell us the chain of DNS name servers that needs to be followed to find the IP address of a host. For instance, try
$ ./dnstracer -4 -s ns.uwa.edu.au www.google.com
$ ./dnstracer -4 -s ns.uwa.edu.au www.arranstewart.io
These commands say to use a local UWA nameserver (ns.uwa.edu.au) and to follow the chain of nameservers needed to get IP addresses for two hosts (www.google.com and www.arranstewart.io). The “Google” host is fairly dull; it seems the UWA nameserver stores that IP address directly itself. The second is a little more interesting, as it requires name servers run by Hurricane Electric to be queried.
Now re-compile with gcc at the -O2 optimization level, and try some specially selected input:
$ CC=gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all
$ ./dnstracer -v $(python3 -c 'print("A"*1025)')
You should see the message
*** buffer overflow detected ***: terminated
Aborted (core dumped)
This is the “denial of service” problem reported in CVE-2017-9430. A buffer overflow occurs, but gets caught by gcc’s inbuilt protections, which cause the program to crash. This is better than a buffer overflow being allowed to execute unchecked, but is still a problem: in general, a user should not be able to make a program segfault or throw an exception based on data they provide. Doing so for a server program – e.g. if a server ran code like Dnstracer’s and allowed users to provide input via, say, a web form – could result in one user being able to force the program to crash, creating a denial of service for other users. (In the present case, Dnstracer isn’t a server, though, so the risk is actually very minimal.)
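To see this class of protection in isolation, here is a minimal sketch (file name and code ours, not Dnstracer’s actual code) of the same unbounded-copy pattern. On Ubuntu, gcc turns on glibc’s fortified string functions by default when optimizing, so an over-long argument should trigger the same abort:

/* overflow-demo.c - illustrative only: an unbounded strcpy into a fixed buffer */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char buf[16];
    if (argc < 2)
        return 1;
    strcpy(buf, argv[1]);  // overflows buf for arguments of 16 or more characters
    printf("%s\n", buf);
    return 0;
}

$ gcc -O2 overflow-demo.c -o overflow-demo
$ ./overflow-demo $(python3 -c 'print("A"*100)')
## expected: *** buffer overflow detected ***: terminated

A bounds-checked alternative, such as snprintf(buf, sizeof buf, "%s", argv[1]), avoids the overflow entirely.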
Note that the problem doesn’t show up when compiling with clang, and only appears at the -O2 optimization level (which is often applied when software is being built for distribution to users). Recall that at higher optimization levels, the compiler tends to make stronger and stronger assumptions that no Undefined Behaviour ever occurs, and this can lead to vulnerabilities.
One can analyse the dumped core file in gdb to find the problematic code.
We need to run the following to ensure core dumps work properly on Ubuntu:
$ ulimit -c unlimited
$ sudo systemctl stop apport.service
$ sudo systemctl disable apport.service
ulimit and systemctl
User accounts on Linux have limits placed on things like how many files they can have open at once, and the maximum size of the stack in programs the user runs. You can see all the limits by running ulimit -a.
Some of these are “soft” limits which an unprivileged user can change – running ulimit -c unlimited says there should be no limit on the size of core files which programs run by the user may dump. Others have “hard” limits, like the number of open files. You can run ulimit -n 2048 to change the maximum number of open files for your user to 2048, but if you try a number above that, or try running ulimit -n unlimited, you will get an error message:
bash: ulimit: open files: cannot modify limit: Operation not permitted
The systemctl command is used to start and stop system services – programs which are always running in the background. In this case, we want to stop the apport service: it intercepts segfaulting programs, and tries to send information about the crash to Canonical’s servers. But we don’t want that – we want to let the program crash, and we want the default behaviour for segfaulting programs, which is to produce a memory-dump in a core file.
Run the bad input again, and you should get a message about a core dump being generated; then run gdb. The commands are as follows:
$ ./dnstracer -v $(python3 -c 'print("A"*1025)')
$ gdb -tui ./dnstracer ./core
In GDB, run the commands backtrace, then frame 7: you should see that a call to strcpy on about line 1628 is the cause.
We’ll try to find this flaw using a particular dynamic analysis technique called “fuzzing”. Static analysis analyses the static artifacts of a system (like the code source files); dynamic analysis actually runs the program. We’ll use a program called afl-fuzz¹ to find that bug for us, and identify input that will trigger it.
Fuzzers are very effective at finding code that can trigger program crashes, and afl-fuzz would normally be able to find this vulnerability (and probably many others) by itself if we just let it run for a couple of days. To speed things up, however – because in this case we already know what the vulnerability is – we’ll give the fuzzer some hints.
By default, afl-fuzz requires that our program take its input from standard input (though there are ways of altering this behaviour). So to get our program working with afl-fuzz, we’ll add the following code at the start of main (search in Vim for argv to find it, or use the Tagbar pane and search for main):
int new_argc = 2;
char **new_argv;
{
    // allocate space for new_argc pointers, plus a NULL terminator
    new_argv = malloc(sizeof(char*) * (new_argc + 1));
    // copy argv[0]
    size_t argv0_len = strlen(argv[0]);
    new_argv[0] = malloc(argv0_len + 1);
    strncpy(new_argv[0], argv[0], argv0_len);
    new_argv[0][argv0_len] = '\0';
    // read in argv[1] from standard input
    // (an enum gives a true compile-time constant, so buf is not a VLA)
    enum { BUF_SIZE = 4096 };
    char buf[BUF_SIZE];
    ssize_t res = read(0, buf, BUF_SIZE - 1);
    if (res < 0)
        res = 0;  // treat a read error as empty input
    buf[res] = '\0';
    new_argv[1] = malloc((size_t) res + 1);
    strncpy(new_argv[1], buf, (size_t) res);
    new_argv[1][res] = '\0';
    // set argv[2] to a NULL terminator
    new_argv[new_argc] = NULL;
}
argv = new_argv;
argc = new_argc;
The aim here is to get some input from standard input, but then to ensure the input we’ve just read will work properly with the rest of the codebase (which expects to operate on arguments in argv). So we create new versions of argv and argc (the declarations at the top of the block), containing the data we want (obtained from standard input by the read call), and then we replace the old versions of argv and argc with our new ones (the two assignments at the end). If what the code is doing is not clear, try stepping through it in a debugger to see what effect each line has.
AFL requires some sample, valid inputs to work with. Run the following:
$ mkdir -p testcase_dir
$ printf 'www.google.com' > testcase_dir/google
$ python3 -c 'print("A"*980, end="")' > testcase_dir/manyAs
Ideally, we should also allow afl-fuzz to instrument the code (i.e., insert extra instructions so it can analyse what the running code is doing) – though afl-fuzz will still work even without this step. Recompile Dnstracer by running the following:
$ CC=/usr/bin/afl-gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all
Instrumenting for afl-fuzz
Some of the dynamic analysis tools we have seen (like the Google sanitizers, ASan and UBSan) are built into GCC, so to use them, we just have to supply GCC with appropriate command-line arguments (e.g. -fsanitize=address,undefined).

However, AFL-fuzz is not part of GCC, and it takes a different approach. It provides a command, afl-gcc, which behaves very similarly to normal GCC, but additionally adds in the instrumentation that AFL-fuzz needs.

So we can perform the instrumentation by specifying the option CC=/usr/bin/afl-gcc to the ./configure command: this specifies a particular compiler that we want to use.
Then, to do the fuzzing, run
$ afl-fuzz -d -i testcase_dir -o findings_dir -- ./dnstracer
A “progress” screen should shortly appear, showing what AFL-fuzz is doing – something like this:
american fuzzy lop ++2.59d (dnstracer) [explore] {-1}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│ run time : 0 days, 0 hrs, 0 min, 18 sec │ cycles done : 2 │
│ last new path : 0 days, 0 hrs, 0 min, 0 sec │ total paths : 70 │
│ last uniq crash : none seen yet │ uniq crashes : 0 │
│ last uniq hang : none seen yet │ uniq hangs : 0 │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│ now processing : 66*0 (94.3%) │ map density : 0.02% / 0.23% │
│ paths timed out : 0 (0.00%) │ count coverage : 1.71 bits/tuple │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│ now trying : splice 4 │ favored paths : 26 (37.14%) │
│ stage execs : 60/64 (93.75%) │ new edges on : 30 (42.86%) │
│ total execs : 58.0k │ total crashes : 0 (0 unique) │
│ exec speed : 3029/sec │ total tmouts : 0 (0 unique) │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│ bit flips : n/a, n/a, n/a │ levels : 8 │
│ byte flips : n/a, n/a, n/a │ pending : 29 │
│ arithmetics : n/a, n/a, n/a │ pend fav : 0 │
│ known ints : n/a, n/a, n/a │ own finds : 68 │
│ dictionary : n/a, n/a, n/a │ imported : n/a │
│ havoc/rad : 29/29.2k, 39/28.0k, 0/0 │ stability : 100.00% │
│ py/custom : 0/0, 0/0 ├───────────────────────┘
│ trim : 50.13%/169, n/a │ [cpu:322%]
└────────────────────────────────────────────────────┘
The AFL-fuzz documentation gives an explanation of this screen here.
We’ve given afl-fuzz a very strong hint here about some valid input that’s almost invalid (testcase_dir/manyAs); but given time and proper configuration, many fuzzers will be able to identify such input for themselves.
After about a minute, afl-fuzz should report that it has found a “crash”; hit ctrl-c to stop it, and look in findings_dir/crashes for the identified bad input.
Crash files
Inside the findings_dir/crashes directory should be files containing input that will cause the program under test to crash. For instance, on one run of AFL-fuzz, a “bad input” file is produced called “findings_dir/crashes/id:000000,sig:06,src:000083,time:25801+000001,op:splice,rep:16”. The filename gives information about the crash that occurred and how the input was derived.
- “id:000000” is an ID for this crash – this is the first and only crash found, so the ID is 0.
- “sig:06” says what signal caused the program to crash. You can get a list of Linux signals and their numbers by running the command “kill -L”: signal 6 is “SIGABRT”, which is raised when a program calls the abort() function. abort() typically gets called by the process itself; in this case, the code added by gcc to detect buffer overflows detects an overflow has occurred, and “bails out” by calling abort().
- “src:000083” isn’t too important to understand, but matches up the crash with an item in AFL-fuzz’s “queue” of inputs to try (also available under the findings_dir).
- “time:25801+000001” gives information about when the crash occurred.
- “op:splice,rep:16” gives information about what AFL-fuzz did to one of our inputs to get the new input that caused the crash. In this case, it performed a “splice” operation (inserting new characters into the input string) 16 times.

Since gcc’s buffer overflow protections are enabled, we should expect a crash to occur exactly when the input is long enough to overflow the buffer – at that point, gcc’s protection code detects that something has been written outside the buffer bounds, and calls abort().
So all AFL-fuzz has to do to trigger a crash is lengthen the input string enough. But AFL-fuzz monitors the code paths the program under test is executing – that’s what the “instrumentation” step is for – and can thus “learn” to explore quite complicated input structures. See this post by the main developer of AFL-fuzz, Michał Zalewski, in which AFL-fuzz “learns” how to generate valid JPEG files, just from being given the input string “hello”.
In general, running a fuzzer on potentially vulnerable software is a pretty “cheap” activity: one can leave a fuzzer running for several days with simple, valid input, and check at the end of that period to see what problems have been discovered.
We’ve seen how we could have detected CVE-2017-9430 in advance, by letting a fuzzer generate random inputs and attempt to crash the Dnstracer program.
Can you work out the best way of fixing the problem, once detected?
Take a look at The Fuzzing Book (by Andreas Zeller, Rahul Gopinath, Marcel Böhme, Gordon Fraser, and Christian Holler) at https://www.fuzzingbook.org, in particular the “Introduction to Fuzzing” at https://www.fuzzingbook.org/html/Fuzzer.html.
Fuzzing doesn’t apply just to C programs; the idea behind fuzzing is to randomly generate inputs in the hope of revealing crashes or other bad behaviour in a program. The Fuzzing Book demonstrates these techniques by applying them to Python programs, but they are generally applicable to any language.
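To make the core idea concrete, here is a very naive fuzz loop in C (entirely illustrative: the file name, trial count, and alphabetic-only inputs are our choices, and real fuzzers like AFL generate and mutate inputs far more cleverly):

/* naive-fuzz.c - sketch of the basic fuzzing idea: feed random inputs to a
 * program and watch for crashes */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>

int main(void) {
    char input[1200];
    char command[1400];
    srand((unsigned) time(NULL));
    for (int trial = 0; trial < 1000; trial++) {
        // build a random argument of random length
        int len = rand() % (int) (sizeof input - 1);
        for (int i = 0; i < len; i++)
            input[i] = (char) ('A' + rand() % 26);
        input[len] = '\0';
        snprintf(command, sizeof command,
                 "./dnstracer -v %s >/dev/null 2>&1", input);
        // the shell reports a child killed by signal N as exit status 128+N
        int status = system(command);
        if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) > 128)
            printf("trial %d: crash (signal %d, input length %d)\n",
                   trial, WEXITSTATUS(status) - 128, len);
    }
    return 0;
}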
Fuzzing has been very successful at finding security vulnerabilities in software – often much more so than writing unit tests, for instance. An issue with unit tests is that human testers can’t generate as many tests as a fuzzer can (fuzzers will often generate at least thousands per second), and often have trouble coming up with test inputs that are sufficiently “off the beaten path” of normal program execution to trigger vulnerabilities.
Fuzzers often work well with some of the dynamic sanitizers which we’ve seen gcc and clang provide. The sanitizers (such as ASan, the AddressSanitizer, and UBSan, the Undefined Behaviour Sanitizer) help with making a program crash if memory-access problems or undefined behaviour are detected.
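For example, AFL’s documentation describes an AFL_USE_ASAN environment variable that asks afl-gcc to build with AddressSanitizer in addition to AFL’s own instrumentation. A sketch of how that might look for Dnstracer (check the notes on ASan in your installed AFL version’s documentation first, since fuzzing ASan-instrumented binaries needs extra care with memory limits):

$ AFL_USE_ASAN=1 CC=/usr/bin/afl-gcc CFLAGS="-g -O1" ./configure
$ make clean all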
You can read more about AFL-fuzz at https://afl-1.readthedocs.io/en/latest/fuzzing.html, and if you have time, experiment with the honggfuzz fuzzer (https://github.com/google/honggfuzz) or with using AFL-fuzz in combination with sanitizers.
This lab introduces several static analysis tools – but these should not be the only tools you use to analyse your code. In practice, different tools will detect different possible defects, so it’s important to use a range of tools to reduce the chances of defects creeping into your code.
It’s therefore recommended your group try other static analysis tools to see what benefit they provide. Some suggested tools include:
Cppcheck aims to have a low false-positive rate, and performs what is called “flow analysis” – it can detect when a construct in your code could cause problems later (or earlier) in the program.
This is a different tool to clang-tidy – the Clang project has a number of distinct static analysis tools associated with it. The Clang static analyser not only performs extensive static analysis of your code, but is capable of describing the problems it finds using easy-to-understand diagrams produced from your code.
The simplest way to run the static analyser is usually from the command-line.
Sparse is used for analysing the Linux kernel source code – it can catch mismatched types, implicit casts, and other type-related bugs that the compiler may not warn about. It supports the use of annotations on the code which can help the compiler and analyser more clearly understand the intent of your code.
Developed by NASA, Ikos performs what is called abstract interpretation on a program – a static analysis technique that approximates the set of all possible paths through the program, and what state the program would be in at each point.
¹ “AFL” stands for “American Fuzzy Lop”, a type of rabbit; afl-fuzz was developed by Google. Read about it further at https://github.com/google/AFL. (If you have time, you might like to try using another fuzzer, honggfuzz, by reading the documentation at https://github.com/google/honggfuzz.)