CITS3007 lab 6 (week 7) – Static analysis
The aim of this lab is to familiarize you with some of the static analysis tools available for analysing C and C++ code, and to try a dynamic analysis/fuzzing tool (AFL).
Static vs dynamic analysis
Recall from lectures that static analysis tools analyse a program for defects without running it, whereas dynamic analyses are done at runtime.
You already have experience with one sort of static analysis tool – compilers. Compilers are an example of a static analysis tool, because (in addition to producing compiled output) nearly all compilers attempt to detect one sort of defect, namely type errors: cases where the programmer performs operations on a data item which are not appropriate for its type. (C is sometimes referred to as “weakly typed” because it is possible to implicitly convert between many types – for instance, to treat unsigned integral types as signed, or vice versa.) Compilers operate on the source code of a program, but static analysis tools also exist that analyse binary artifacts (such as binary executables or libraries) – the Ghidra reverse engineering framework is an example of one of these. Compilers typically only perform a fairly limited range of checks for possible defects, so it’s often useful to augment them with other static analysis tools.
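As a quick illustration (the file name and code here are ours, not from any real project), the following assignment silently changes a value via an implicit signed-to-unsigned conversion; the -Wconversion option recommended below will flag it, but by default the compiler accepts it without complaint:

/* conversion-demo.c - illustrative only: implicit signed/unsigned conversion */
#include <stdio.h>

int main(void) {
    int n = -1;
    unsigned int u = n;  // implicit conversion: no cast needed, no error by default
    printf("%u\n", u);   // prints 4294967295 on platforms where int is 32 bits
    return 0;
}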
You might have already experimented with one of the key dynamic analysers we look at, too: the Google sanitizers included with GCC. These perform dynamic analysis of your program, and therefore require your program to be run in order to work. They operate by injecting extra instructions into your program at compile time, deliberately altering the way your program behaves. Enabling them often requires little more than adding something like “-fsanitize=address,undefined -fno-omit-frame-pointer -g -O1” to your GCC options.
Then, many undefined behaviours (like going out of bounds) will trigger the sanitizers to display a stack trace and an explanation of the error – making C behave much more like a language with exceptions (Python or Java, say) when it comes to trying to diagnose bugs and vulnerabilities (as opposed to its usual behaviour, which is to give no visible sign at all that something could be wrong with the program).
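As a small illustration (again, the file name and code are ours), the following out-of-bounds read gives no visible sign of trouble in a plain build, but produces a detailed report when compiled with the sanitizer flags above:

/* oob-demo.c - illustrative only: an out-of-bounds array read */
#include <stdio.h>

int main(int argc, char **argv) {
    (void) argv;
    int a[4] = {1, 2, 3, 4};
    int i = argc + 3;      // equals 4 (one past the end) when run with no arguments
    printf("%d\n", a[i]);  // out-of-bounds read: undefined behaviour
    return 0;
}

$ gcc -fsanitize=address,undefined -fno-omit-frame-pointer -g -O1 oob-demo.c -o oob-demo
$ ./oob-demo
## ASan should report a stack-buffer-overflow here, with a stack trace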
When completing the unit project, it will be up to your group to decide what static and dynamic analysis tools to use on your code in order to find defects and possible vulnerabilities.
Before looking at standalone static analysis tools, we’ll first discuss options that are already built into your compiler. It’s important to ensure you’re making good use of the features already included in GCC.
At a minimum, the compiler options you use for CITS3007 work should include the following:
-std=c11 -pedantic-errors -Wall -Wextra -Wconversion
You can find a list of all GCC’s warning-related options here. You can easily find recommendations for more extensive warning options than the minimum ones above by Googling for them (one set of recommendations can be found here).
Other important practices to bear in mind are:
- You should make sure to compile at multiple levels of optimization. GCC can perform different analyses, and thus output different warnings, depending on what level of optimization you ask it for. The -O0 option disables all optimizations (GCC’s default behaviour), and -O1 and -O2 enable progressively more optimizations. You can get documentation on all of GCC’s optimization options here, and obtain a brief list by running gcc --help=optimizers.
- Compiling with debug symbols enabled (GCC’s -g option) can prevent some bugs from appearing – so, even if you use debug symbols to assist you in debugging your code, it’s important to compile and test your code without symbols added, as well.
- In later classes we will look at the sanitizers included with GCC in more detail. It’s a good idea to test your code both with and without sanitizers enabled (the ASan and UBSan sanitizers are particularly effective at detecting errors).
- It can be helpful to try compiling your code using different compilers – although all C compilers should detect errors mandated by the C standard, what other sorts of analyses they do and the warnings they produce can vary from tool to tool. On the CITS3007 standard development environment (CDE), the GCC and Clang compilers are both available.
- Sometimes it can be useful to ensure your code is compiled and run on multiple platforms – for example, MacOS, Windows, and BSD Unixes, in addition to Linux. The CITS3007 project is only required to compile and operate correctly on Linux systems. But it can still be useful to see how it behaves on other, closely related, systems – sometimes this can expose bugs or assumptions you didn’t detect on Linux. It will be up to you to decide if this is a strategy you wish to pursue.
In the CITS3007 standard development environment (CDE), download the source code for the dnstracer program, which we’ll be analysing, and extract it:
$ wget http://www.mavetju.org/download/dnstracer-1.9.tar.gz
$ tar xf dnstracer-1.9.tar.gz
$ cd dnstracer-1.9
Dnstracer is a somewhat old piece of code – the homepage appears not to have changed since 2002. It wasn’t written with the intention of conforming to a particular C standard, but we’ll see if we can adapt the codebase to work with the C11 standard, issued a little under 10 years after the codebase stopped being maintained. This sort of task is not uncommon: your team may be tasked with refactoring and improving old, “legacy” code which still seems to work, adding tests, and bringing it in line with modern standards.
Note that the Dnstracer download link is an “http” link rather than an “https” link – what problems could this cause? You can read more about Dnstracer by following the relevant links at http://www.mavetju.org/unix/general.php. It is used to graphically depict the chain of servers involved in a Domain Name System (DNS) query. The DNS protocol is an important part of the modern Internet, but we won’t be examining it in detail – Dnstracer is just a sample program used here to test analysis tools on.
It’s still used as a network reconnaissance tool as part of penetration testing.
We will be analysing the Dnstracer program, which is subject to a known vulnerability, CVE-2017-9430.
You can build Dnstracer by running the following commands in your development environment:
$ ./configure
## a number of outputs of automatically run tests should appear here
$ make
## We expect this to produce many warnings and errors -- see below
Let’s examine what these are doing.
./configure and the GNU Autotools
If we want to write a C program that can be compiled and run on many systems, we need some platform-independent way of detecting what operating system we are running on, what tools (compilers, linkers, and scripting tools) are available, and exactly what options they support – unfortunately, these can vary widely from platform to platform.
How can we detect these things, and use that knowledge when compiling our program? One way is to use the suite of tools known as GNU Autotools. Rather than write a Makefile ourselves, we create a template for a Makefile – named Makefile.in – and use the GNU Autotools to create a ./configure script which will gather details about the system it is running on, and use those details to generate two things: a Makefile, and a config.h file which should be #included in our C source files – this incorporates information about the system being compiled on, and defines symbols that let us know what functions and headers are available on that target system. (Specifically, Dnstracer is using the tools Autoconf and Automake – GNU Autotools contains other tools as well, which are outside the scope of this lab.)
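As a sketch of how code typically uses config.h (assuming a configure script generated with Autoconf’s AC_CHECK_HEADERS, which defines conventional HAVE_* macros; the details below are illustrative, not Dnstracer’s code):

#include "config.h"   /* must come first: defines the HAVE_* macros */

#ifdef HAVE_STRINGS_H
#include <strings.h>  /* strncasecmp is declared here on POSIX systems */
#endif

#include <stdio.h>

int main(void) {
#ifdef HAVE_STRINGS_H
    printf("%d\n", strncasecmp("Hello", "HELLO", 5));  /* prints 0 */
#else
    printf("strings.h not available on this platform\n");
#endif
    return 0;
}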
The GNU Autotools are sometimes criticised as not being very easy to use. Alternatives to the GNU Autotools for writing platform-independent code and building on a range of platforms include the tools Meson and CMake.
The ./configure script generates two files, a Makefile and config.h, which incorporate information about the system being compiled on. However, the content of those two files is only as good as the developer makes it – if they don’t enable the compiler warnings and checks that they should, then the final executable can easily be buggy. The output of the make command above should show us the final compilation command being run. (Type make clean, then make again, if you need to see what the output was.)
## Lots of output ... and eventually:
gcc -DHAVE_CONFIG_H -I. -I. -I. -g -O2 -c `test -f 'dnstracer.c' || echo './'`dnstracer.c
It also includes a warning about a possible vulnerability (marked with -Wformat-overflow).
We know from earlier classes that invoking GCC without specifying a C standard (like C11) and enabling extra warnings can easily result in code that contains bugs and security vulnerabilities – so the current version of the Makefile is insufficient for our purposes.
How can we enable extra warnings from GCC? If you run ./configure --help, you’ll see that we can supply a number of arguments to ./configure, and some of these are incorporated into the Makefile and used to invoke GCC. Let’s try to increase the amount of checking our compiler does (and improve error messages) by switching our compiler to clang, and enabling more compiler warnings:
$ CC=clang CFLAGS="-pedantic -Wall -Wextra" ./configure
## we expect this to produce many warnings and/or errors...
$ make clean all
If you look through the output, you’ll now see many warnings that include the following:
passing 'unsigned char *' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign]
Although this is useful information, there are so many of these warnings that it’s difficult to see other potentially serious issues. (And many of them may be harmless – an example of false positives from the compiler warnings.) So we’ll disable those. The compiler tells us that they’re enabled by the flag -Wpointer-sign, so we can use the flag -Wno-pointer-sign to disable them.
Run:
$ CC=clang CFLAGS="-pedantic -std=c11 -Wall -Wextra -Wno-pointer-sign" ./configure
$ make clean all
## _still_ many warnings and/or errors. but slightly fewer than before
The -pedantic flag tells the compiler to adhere strictly to the C11 standard (-std=c11); compilation now fails, however: the author of Dnstracer did not write properly compliant C code. In particular:
- #includes for strncasecmp, strdup and getaddrinfo are missing
- #includes for the struct addrinfo type are missing
Check man strncasecmp, and you’ll see it requires #include <strings.h>, which is missing from the C code. man strdup tells us that this is a Linux/POSIX function, not part of the C standard library; to inform the compiler we want to use POSIX functions, we should add the line #define _POSIX_C_SOURCE 200809L to our C code. This is called a feature-test macro. Furthermore, it’ll be useful to make use of static asserts, so we should include the assert.h header. Edit the dnstracer.c file, and add the following near the top of the file (e.g. just after the first block comment):
#define _POSIX_C_SOURCE 200809L
#define _DEFAULT_SOURCE
#include <assert.h>
The #defines need to appear before we start #include-ing header files. If we now run make clean all, we should have got rid of many compiler-generated warnings. But many more problems exist.
Writing portable C code – non-standard extensions to C
If we want to write portable C code – code that will work with other C compilers and/or other operating systems – it’s important to specify what C standard we’re wanting to adhere to (in this case, C11), and to request that the compiler strictly adhere to that standard.
When we invoke gcc or clang with the arguments “-std=c11 -pedantic”, this disables many compiler-specific extensions. For example: the C standard says it’s impermissible to declare a zero-length array (e.g. int myarray[0]), but by default, gcc will let you do so without warning. In general, disabling compiler-specific extensions is a good thing: it ensures we don’t accidentally use gcc-only features, and makes our code more portable to other compilers.
One reason some people don’t add those arguments is because (as we saw above) doing so may make their programs stop compiling. But this is because they haven’t been sufficiently careful about distinguishing between functions that are part of the C standard library, and functions which are extensions to C provided by their compiler, or are specific to the operating system they happen to be compiling on.
For example, fopen is part of the C standard library; if you run man fopen, you’ll see it’s included in the stdio.h header file. (And if you look under the “Conforming to” heading in the man page, you’ll see it says POSIX.1-2001, POSIX.1-2008, C89, C99 – fopen is part of the C89 (and later) versions of the C standard.)
On the other hand, strncasecmp is not part of the C standard library: it was originally introduced by BSD (the “Berkeley Software Distribution”), a previously popular flavour of Unix, and is now part of the POSIX standard for Unix-like operating systems. (If you look under the “Conforming to” heading in the man page, you’ll see it says 4.4BSD, POSIX.1-2001, POSIX.1-2008.)
Using -std=c11 -pedantic encourages you to be more explicit about what compiler- or OS-specific functions you’re using. strncasecmp is usually only found on Unix-like operating systems. It isn’t available, for instance, when compiling on Windows with the MSVC compiler; if you want similar functionality, you need the _strnicmp function.
Sometimes when using a function from a standard other than the C standards, your compiler will require you to specify exactly what extensions and standards you want to enable. For instance, man strdup (rather obliquely) tells you that adding #define _POSIX_C_SOURCE 200809L to your C code is one way of making the strdup function available.
Note that you should put the above #define before any #includes: the #define is called a “feature-test macro”, and it’s acting as a sort of signal to the compiler, telling it what parts of any later-appearing header files to process, and what to ignore.
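A minimal sketch of the correct ordering (illustrative code, not Dnstracer’s):

#define _POSIX_C_SOURCE 200809L  /* feature-test macro: must precede all #includes */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strdup is declared only because of the #define above */

int main(void) {
    char *copy = strdup("feature-test macros");
    if (copy != NULL) {
        printf("%s\n", copy);
        free(copy);
    }
    return 0;
}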
Using -std=c11 -pedantic doesn’t guarantee your code conforms with the C standard (though it does help). Even with those flags enabled, it’s still quite possible to write non-conforming programs. As the gcc manual says:

Some users try to use -Wpedantic to check programs for strict ISO C conformance. They soon find that it does not do quite what they want: it finds some non-ISO practices, but not all – only those for which ISO C requires a diagnostic, and some others for which diagnostics have been added.
From a security point of view, it’s easier to audit code that’s explicit about what libraries it’s using than code which leaves that implicit; so specifying a C standard and -pedantic is usually desirable.
Project tip
Failing to use non-standard functions correctly has been a frequent source of lost marks in the unit project in previous years. Make sure you understand how to use non-standard functions correctly and experiment with them on your own.
We’ll identify some problems with Dnstracer using Flawfinder – read “How does Flawfinder Work?”, here: https://dwheeler.com/flawfinder/#how_work. Flawfinder is a linter or static analysis tool that checks for known problematic code (e.g. code that calls unsafe functions like strcpy and strcat). Install Flawfinder by running the command sudo apt install flawfinder in your development environment, and then try using it by running:
$ flawfinder *.c
The output typically includes a message at the end with tips on what Flawfinder’s output means. (You may get a somewhat different message at the time this lab is run.)
$ flawfinder *.c
# ... many lines omitted ...
Not every hit is necessarily a security vulnerability.
You can inhibit a report by adding a comment in this form:
// flawfinder: ignore
Make *sure* it's a false positive!
You can use the option --neverignore to show these.
In general, you’ll see a lot of output. This is a common problem when applying static analysis tools to an existing codebase for the first time. It’s usually preferable to start coding with static analysers already enabled – but sometimes we inherit legacy code and simply have to make do. Commercial static analysis companies will often provide technical experts to your organisation, as part of the process of setting the tool up, who can help configure the tool to as nearly as possible show only the warnings you want. But we are using a free and open source tool, and will have to do this job ourselves. Any good static analysis tool should allow us to ignore particular bits of code that would be marked problematic, either temporarily, or because we can prove to our satisfaction that the code is safe.
The output of flawfinder is not especially convenient for browsing. It’s often more convenient to integrate the output with the editor or IDE (integrated development environment) you’re using. You might like to see if you can find a way of integrating the output of flawfinder into your preferred editor or IDE (e.g. VS Code or Vim).
In flawfinder’s output, see if you can find mentions of “rr_types”. You should be able to identify that Flawfinder is giving us a general warning about any array with static bounds (which is, really, all arrays in C11). However, there’s another issue here – what is it?
The declared size of the array, and the number of the elements should match up; if someone changes one but not the other, that could introduce problems. We’ll add a more reliable way of checking this.
Remove the size 256 from the array declaration. Below it, add:
static_assert(sizeof(rr_types) / sizeof(rr_types[0]) == 256,
"rr_types should have 256 elements");
We’re now statically checking that the number of elements in rr_types (i.e., the size of the array in bytes, divided by the size of one element) is always 256.
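This idiom generalises; one common pattern (the macro name here is our own choice, not something in Dnstracer) is to wrap the element-count calculation in a macro so such checks read more clearly:

#include <assert.h>

// number of elements in an array with static bounds
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

static const char *colours[] = { "red", "green", "blue" };
static_assert(ARRAY_LEN(colours) == 3, "colours should have 3 elements");

int main(void) { return 0; }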
We’ll now look at warnings from a program called clang-tidy. It can be run from the command-line (see man clang-tidy for details); try running:
$ clang-tidy --checks='-clang-diagnostic-pointer-sign' --extra-arg="-DHAVE_CONFIG_H -I. -Wno-pointer-sign" dnstracer.c --
(NB: This is a “quick and dirty” way of getting clang-tidy to run; better instructions will be provided in future weeks.)
We need to give clang-tidy correct compilation arguments (like -I.), or it won’t know where the config.h header is and will mis-report errors. We want it not to report problems with pointers being coerced from signed to unsigned or vice versa (i.e., the same issue gcc flags with -Wpointer-sign), so we disable that check by putting a minus in front of clang-diagnostic-pointer-sign.
Let’s see how Dnstracer is supposed to be used. It will tell us the chain of DNS name servers that needs to be followed to find the IP address of a host. For instance, try
$ ./dnstracer -4 -s ns.uwa.edu.au www.google.com
$ ./dnstracer -4 -s ns.uwa.edu.au www.arranstewart.io
These commands say to use a local UWA nameserver (ns.uwa.edu.au) and to follow the chain of nameservers needed to get IP addresses for two hosts (www.google.com and www.arranstewart.io). The “Google” host is fairly dull; it seems the UWA nameserver stores that IP address directly itself. The second is a little more interesting, as it requires name servers run by Hurricane Electric to be queried.
Now re-compile with gcc at the -O2 optimization level, and try some specially selected input:
$ CC=gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all
$ ./dnstracer -v $(python3 -c 'print("A"*1025)')
You should see the message
*** buffer overflow detected ***: terminated
Aborted (core dumped)
This is the “denial of service” problem reported in CVE-2017-9430. A buffer overflow occurs, but gets caught by gcc’s inbuilt protections, which cause the program to crash. This is better than a buffer overflow being allowed to execute unchecked, but is still a problem: in general, a user should not be able to make a program segfault or throw an exception based on data they provide. Doing so for a server program – e.g. if a server ran code like Dnstracer’s and allowed users to provide input via, say, a web form – could result in one user being able to force the program to crash, creating a denial of service for other users. (In the present case, Dnstracer isn’t a server, though, so the risk is actually very minimal.)
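To see this class of protection in isolation, here is a minimal sketch (file name and code ours, not Dnstracer’s actual code) of the same unbounded-copy pattern. On Ubuntu, gcc turns on glibc’s fortified string functions by default when optimizing, so an over-long argument should trigger the same abort:

/* overflow-demo.c - illustrative only: an unbounded strcpy into a fixed buffer */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char buf[16];
    if (argc < 2)
        return 1;
    strcpy(buf, argv[1]);  // overflows buf for arguments of 16 or more characters
    printf("%s\n", buf);
    return 0;
}

$ gcc -O2 overflow-demo.c -o overflow-demo
$ ./overflow-demo $(python3 -c 'print("A"*100)')
## expected: *** buffer overflow detected ***: terminated

A bounds-checked alternative, such as snprintf(buf, sizeof buf, "%s", argv[1]), avoids the overflow entirely.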
Note that the problem doesn’t show up when compiling with clang, and only appears at the -O2 optimization level (which is often applied when software is being built for distribution to users). Recall that at higher optimization levels, the compiler tends to make stronger and stronger assumptions that no Undefined Behaviour ever occurs, and this can lead to vulnerabilities.
One can analyse the dumped core file in gdb to find the problematic code.
We need to run the following to ensure core dumps work properly on Ubuntu:
$ ulimit -c unlimited
$ sudo systemctl stop apport.service
$ sudo systemctl disable apport.service
ulimit and systemctl
User accounts on Linux have limits placed on things like how many files they can have open at once, and the maximum size of the stack in programs the user runs. You can see all the limits by running ulimit -a.
Some of these are “soft” limits which an unprivileged user can change – running ulimit -c unlimited says there should be no limit on the size of core files which programs run by the user may dump. Others have “hard” limits, like the number of open files. You can run ulimit -n 2048 to change the maximum number of open files for your user to 2048, but if you try a number above that, or try running ulimit -n unlimited, you will get an error message:
bash: ulimit: open files: cannot modify limit: Operation not permitted
The systemctl command is used to start and stop system services – programs which are always running in the background. In this case, we want to stop the apport service: it intercepts segfaulting programs, and tries to send information about the crash to Canonical’s servers. But we don’t want that – we want to let the program crash, and we want the default behaviour for segfaulting programs, which is to produce a memory-dump in a core file.
Run the bad input again, and you should get a message about a core dump being generated; then run gdb. The commands are as follows:
$ ./dnstracer -v $(python3 -c 'print("A"*1025)')
$ gdb -tui ./dnstracer ./core
In GDB, run the commands backtrace, then frame 7: you should see that a call to strcpy on about line 1628 is the cause.
We’ll try to find this flaw using a particular dynamic analysis technique called “fuzzing”. Static analysis analyses the static artifacts of a system (like the code source files); dynamic analysis actually runs the program. We’ll use a program called afl-fuzz¹ to find that bug for us, and identify input that will trigger it.
Fuzzers are very effective at finding code that can trigger program crashes, and afl-fuzz would normally be able to find this vulnerability (and probably many others) by itself if we just let it run for a couple of days. To speed things up, however – because in this case we already know what the vulnerability is – we’ll give the fuzzer some hints.
By default, afl-fuzz requires that our program take its input from standard input (though there are ways of altering this behaviour). So to get our program working with afl-fuzz, we’ll add the following code at the start of main (search in Vim for argv to find it, or use the Tagbar pane and search for main):
int new_argc = 2;
char **new_argv;
{
    // allocate space for new_argc pointers, plus a NULL terminator
    new_argv = malloc(sizeof(char*) * (new_argc + 1));
    // copy argv[0]
    size_t argv0_len = strlen(argv[0]);
    new_argv[0] = malloc(argv0_len + 1);
    strncpy(new_argv[0], argv[0], argv0_len);
    new_argv[0][argv0_len] = '\0';
    // read in argv[1] from standard input
    // (an enum gives a true compile-time constant, so buf is not a VLA)
    enum { BUF_SIZE = 4096 };
    char buf[BUF_SIZE];
    ssize_t res = read(0, buf, BUF_SIZE - 1);
    if (res < 0)
        res = 0;  // treat a read error as empty input
    buf[res] = '\0';
    new_argv[1] = malloc((size_t) res + 1);
    strncpy(new_argv[1], buf, (size_t) res);
    new_argv[1][res] = '\0';
    // set argv[2] to a NULL terminator
    new_argv[new_argc] = NULL;
}
argv = new_argv;
argc = new_argc;
The aim here is to get some input from standard input, but then to ensure the input we’ve just read will work properly with the rest of the codebase (which expects to operate on arguments in argv). So we create new versions of argv and argc (the declarations at the top of the block), containing the data we want (obtained from standard input by the read call), and then we replace the old versions of argv and argc with our new ones (the two assignments at the end). If what the code is doing is not clear, try stepping through it in a debugger to see what effect each line has.
AFL requires some sample, valid inputs to work with. Run the following:
$ mkdir -p testcase_dir
$ printf 'www.google.com' > testcase_dir/google
$ python3 -c 'print("A"*980, end="")' > testcase_dir/manyAs
Ideally, we should also allow afl-fuzz to instrument the code (i.e., insert extra instructions so it can analyse what the running code is doing) – though afl-fuzz will still work even without this step. Recompile Dnstracer by running the following:
$ CC=/usr/bin/afl-gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all
Instrumenting for afl-fuzz
Some of the dynamic analysis tools we have seen (like the Google sanitizers, ASan and UBSan) are built into GCC, so to use them, we just have to supply GCC with appropriate command-line arguments (e.g. -fsanitize=address,undefined).

However, AFL-fuzz is not part of GCC, and it takes a different approach. It provides a command, afl-gcc, which behaves very similarly to normal GCC, but additionally adds in the instrumentation that AFL-fuzz needs.

So we can perform the instrumentation by specifying the option CC=/usr/bin/afl-gcc to the ./configure command: this specifies a particular compiler that we want to use.
Then, to do the fuzzing, run
$ afl-fuzz -d -i testcase_dir -o findings_dir -- ./dnstracer
A “progress” screen should shortly appear, showing what AFL-fuzz is doing – something like this:
american fuzzy lop ++2.59d (dnstracer) [explore] {-1}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│ run time : 0 days, 0 hrs, 0 min, 18 sec │ cycles done : 2 │
│ last new path : 0 days, 0 hrs, 0 min, 0 sec │ total paths : 70 │
│ last uniq crash : none seen yet │ uniq crashes : 0 │
│ last uniq hang : none seen yet │ uniq hangs : 0 │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│ now processing : 66*0 (94.3%) │ map density : 0.02% / 0.23% │
│ paths timed out : 0 (0.00%) │ count coverage : 1.71 bits/tuple │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│ now trying : splice 4 │ favored paths : 26 (37.14%) │
│ stage execs : 60/64 (93.75%) │ new edges on : 30 (42.86%) │
│ total execs : 58.0k │ total crashes : 0 (0 unique) │
│ exec speed : 3029/sec │ total tmouts : 0 (0 unique) │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│ bit flips : n/a, n/a, n/a │ levels : 8 │
│ byte flips : n/a, n/a, n/a │ pending : 29 │
│ arithmetics : n/a, n/a, n/a │ pend fav : 0 │
│ known ints : n/a, n/a, n/a │ own finds : 68 │
│ dictionary : n/a, n/a, n/a │ imported : n/a │
│ havoc/rad : 29/29.2k, 39/28.0k, 0/0 │ stability : 100.00% │
│ py/custom : 0/0, 0/0 ├───────────────────────┘
│ trim : 50.13%/169, n/a │ [cpu:322%]
└────────────────────────────────────────────────────┘
The AFL-fuzz documentation gives an explanation of this screen here.
We’ve given afl-fuzz a very strong hint here about some valid input that’s almost invalid (testcase_dir/manyAs); but given time and proper configuration, many fuzzers will be able to identify such input for themselves.
After about a minute, afl-fuzz should report that it has found a “crash”; hit ctrl-c to stop it, and look in findings_dir/crashes for the identified bad input.
Crash files
Inside the findings_dir/crashes directory should be files containing input that will cause the program under test to crash. For instance, on one run of AFL-fuzz, a “bad input” file is produced called “findings_dir/crashes/id:000000,sig:06,src:000083,time:25801+000001,op:splice,rep:16”. The filename gives information about the crash that occurred and how the input was derived.
- “id:000000” is an ID for this crash – this is the first and only crash found, so the ID is 0.
- “sig:06” says what signal caused the program to crash. You can get a list of Linux signals and their numbers by running the command “kill -L”: signal 6 is “SIGABRT”, which is raised when a program calls the abort() function. abort() typically gets called by the process itself; in this case, the code added by gcc to detect buffer overflows detects an overflow has occurred, and “bails out” by calling abort().
- “src:000083” isn’t too important to understand, but matches up the crash with an item in AFL-fuzz’s “queue” of inputs to try (also available under the findings_dir).
- “time:25801+000001” gives information about when the crash occurred.
- “op:splice,rep:16” gives information about what AFL-fuzz did to one of our inputs to get the new input that caused the crash. In this case, it performed a “splice” operation (inserting new characters into the input string) 16 times.

Since gcc’s buffer overflow protections are enabled, we should expect a crash to occur exactly when the input is long enough to overflow the buffer – at that point, gcc’s protection code detects that something has been written outside the buffer bounds, and calls abort().
So all AFL-fuzz has to do to trigger a crash is lengthen the input string enough. But AFL-fuzz monitors the code paths the program under test is executing – that’s what the “instrumentation” step is for – and can thus “learn” to explore quite complicated input structures. See this post by the main developer of AFL-fuzz, Michał Zalewski, in which AFL-fuzz “learns” how to generate valid JPEG files, just from being given the input string “hello”.
In general, running a fuzzer on potentially vulnerable software is a pretty “cheap” activity: one can leave a fuzzer running for several days with simple, valid input, and check at the end of that period to see what problems have been discovered.
We’ve seen how we could have detected CVE-2017-9430 in advance, by letting a fuzzer generate random inputs and attempt to crash the Dnstracer program.
Can you work out the best way of fixing the problem, once detected?
Take a look at The Fuzzing Book (by Andreas Zeller, Rahul Gopinath, Marcel Böhme, Gordon Fraser, and Christian Holler) at https://www.fuzzingbook.org, in particular the “Introduction to Fuzzing” at https://www.fuzzingbook.org/html/Fuzzer.html.
Fuzzing doesn’t apply just to C programs; the idea behind fuzzing is to randomly generate inputs in the hope of revealing crashes or other bad behaviour in a program. The Fuzzing Book demonstrates these techniques by applying them to Python programs, but they are generally applicable to any language.
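To make the core idea concrete, here is a very naive fuzz loop in C (entirely illustrative: the file name, trial count, and alphabetic-only inputs are our choices, and real fuzzers like AFL generate and mutate inputs far more cleverly):

/* naive-fuzz.c - sketch of the basic fuzzing idea: feed random inputs to a
 * program and watch for crashes */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>

int main(void) {
    char input[1200];
    char command[1400];
    srand((unsigned) time(NULL));
    for (int trial = 0; trial < 1000; trial++) {
        // build a random argument of random length
        int len = rand() % (int) (sizeof input - 1);
        for (int i = 0; i < len; i++)
            input[i] = (char) ('A' + rand() % 26);
        input[len] = '\0';
        snprintf(command, sizeof command,
                 "./dnstracer -v %s >/dev/null 2>&1", input);
        // the shell reports a child killed by signal N as exit status 128+N
        int status = system(command);
        if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) > 128)
            printf("trial %d: crash (signal %d, input length %d)\n",
                   trial, WEXITSTATUS(status) - 128, len);
    }
    return 0;
}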
Fuzzing has been very successful at finding security vulnerabilities in software – often much more so than writing unit tests, for instance. An issue with unit tests is that human testers can’t generate as many tests as a fuzzer can (fuzzers will often generate at least thousands per second), and often have trouble coming up with test inputs that are sufficiently “off the beaten path” of normal program execution to trigger vulnerabilities.
Fuzzers often work well with some of the dynamic sanitizers which we’ve seen gcc and clang provide. The sanitizers (such as ASan, the AddressSanitizer, and UBSan, the Undefined Behaviour Sanitizer) help with making a program crash if memory-access problems or undefined behaviour are detected.
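For example, AFL’s documentation describes an AFL_USE_ASAN environment variable that asks afl-gcc to build with AddressSanitizer in addition to AFL’s own instrumentation. A sketch of how that might look for Dnstracer (check the notes on ASan in your installed AFL version’s documentation first, since fuzzing ASan-instrumented binaries needs extra care with memory limits):

$ AFL_USE_ASAN=1 CC=/usr/bin/afl-gcc CFLAGS="-g -O1" ./configure
$ make clean all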
You can read more about AFL-fuzz at https://afl-1.readthedocs.io/en/latest/fuzzing.html, and if you have time, experiment with the honggfuzz fuzzer (https://github.com/google/honggfuzz) or with using AFL-fuzz in combination with sanitizers.
This lab introduces several static analysis tools – but these should not be the only tools you use to analyse your code. In practice, different tools will detect different possible defects, so it’s important to use a range of tools to reduce the chances of defects creeping into your code.
It’s therefore recommended your group try other static analysis tools to see what benefit they provide. Some suggested tools include:
Cppcheck aims to have a low false-positive rate, and performs what is called “flow analysis” – it can detect when a construct in your code could cause problems later (or earlier) in the program.
This is a different tool to clang-tidy – the Clang project has a number of distinct static analysis tools associated with it. The Clang static analyser not only performs extensive static analysis of your code, but is capable of describing the problems it finds using easy-to-understand diagrams produced from your code.
The simplest way to run the static analyser is usually from the command-line.
Sparse is used for analysing the Linux kernel source code – it can catch mismatched types, implicit casts, and other type-related bugs that the compiler may not warn about. It supports the use of annotations on the code which can help the compiler and analyser more clearly understand the intent of your code.
Developed by NASA, Ikos performs what is called abstract interpretation on a program – a static analysis technique that approximates the set of all possible paths through the program, and what state the program would be in at each point.
¹ “AFL” stands for “American Fuzzy Lop”, a type of rabbit; afl-fuzz was developed by Google. Read about it further at https://github.com/google/AFL. (If you have time, you might like to try using another fuzzer, honggfuzz, by reading the documentation at https://github.com/google/honggfuzz.)