CITS3007 lab 6 (week 8) – Binary data formats, program analysis

This lab explores binary data formats and program analysis.

1 Reading and writing binary data

In C programming, working with files to store and retrieve data is a fundamental task. You have already encountered text files, which store data as a sequence of characters, encoded in a specific character set (for instance, ASCII or UTF-8) – the .c and .h source files we use for C programming are examples of such files. Text files are also used for configuration files and many document formats, and can be opened and modified with text editors such as vim.

Binary files, on the other hand, do not (primarily) contain human-readable text. Rather, they contain data that can only be easily read or displayed using a program, including images (such as JPEG or PNG images), executables (which come in formats like ELF, used on Linux, and PE, used on Windows), and binary document formats (like MS Word or Adobe PDF). They can also be used to store structured records (which, in C, we would describe and manipulate using structs). If you open a binary file with a text editor like vim (try opening an executable you have created in one of the previous labs, for instance), you will see a jumble of non-human-readable characters and symbols. Unlike text files, binary files lack a clear, human-interpretable structure when viewed in a text editor – the formats they are in are optimized for efficient processing by programs, not for human readability.

When working with file formats, you will often encounter the terms “serialization” and “deserialization” (sometimes called “marshalling” and “unmarshalling”). Serialization refers to the process of converting complex data structures, such as objects or structs in a programming language, into a sequence of bytes that can be easily written to a (typically binary) file or transmitted over a network. The serialized data represents the original data’s structure and values in (ideally) a compact and platform-independent manner. Conversely, deserialization is the process of reconstructing complex data structures from the serialized binary data. The structs and formats we use in this lab (and in the project) are very simple, but complex structs could include nested structs, unions, and pointers to other structs, making the tasks of serialization and deserialization more difficult.

C has the fread and fwrite functions for reading and writing binary data to a file. The linked cppreference.com pages show how the two functions can be used to read and write an array of doubles. They can also be used to read and write whole structs.

For instance, consider the following bank_account struct:

struct bank_account {
    int acct_num;
    char acct_name[20];
    double acct_balance;
};

This struct represents information about a bank account, and consists of an integer (acct_num) for the account number, a character array (acct_name) to store the account holder’s name, and a floating-point number (acct_balance) for the account balance.

File format complications

Note that in a more realistic example, we would not use a double to represent currency. When serializing or deserializing a struct, we also would probably not use the type int, the size of which can vary from platform to platform. We would either amend the struct so that it uses an exact-sized integer type like uint32_t (declared in stdint.h), or would need to cast to such a type while serializing.

Finally, we would have to account for the endianness of different platforms. “Endianness” refers to the byte-ordering scheme used to store multi-byte data types in RAM. In little-endian systems, the least significant byte is stored first, while in big-endian systems, the most significant byte comes first.

For example, consider the decimal number 17,412. If stored in a 2-byte short, this number would be stored with the value 68 in the most significant byte (MSB), and 4 in the least significant byte (LSB) (since \((68 \times 256) + 4 = 17,412\)). But in what order would those two bytes be stored in memory? On a little-endian system, the LSB (4) is stored first, at the lower address, followed by the MSB (68); on a big-endian system, the order is reversed.

Larger integer types will similarly be stored in “reverse” order on little-endian systems. If we had a 4-byte integer type, with the bytes from most to least significant being B1, B2, B3, and B4, then on a little-endian system, they would be stored in the order “B4 B3 B2 B1” in memory, and on a big-endian system, in the order “B1 B2 B3 B4”.
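The two byte orders can be made concrete with a short sketch. The function names put_u16_le, put_u16_be and host_is_little_endian are hypothetical (they are not part of the lab code); the sketch assumes 8-bit bytes and uses the value 17,412 (0x4404) only as an illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Store a 16-bit value with the least significant byte first. */
void put_u16_le(uint16_t value, unsigned char out[2]) {
  out[0] = (unsigned char)(value & 0xFFu);        /* LSB */
  out[1] = (unsigned char)((value >> 8) & 0xFFu); /* MSB */
}

/* Store a 16-bit value with the most significant byte first. */
void put_u16_be(uint16_t value, unsigned char out[2]) {
  out[0] = (unsigned char)((value >> 8) & 0xFFu); /* MSB */
  out[1] = (unsigned char)(value & 0xFFu);        /* LSB */
}

/* Return 1 if the host stores the LSB of a uint16_t at the lowest
   address (little-endian), and 0 otherwise. */
int host_is_little_endian(void) {
  uint16_t probe = 1;
  unsigned char first_byte;
  memcpy(&first_byte, &probe, 1); /* copy the byte at the lowest address */
  return first_byte == 1;
}
```

Calling put_u16_le(17412, buf) leaves the bytes 0x04, 0x44 in buf, while put_u16_be(17412, buf) leaves 0x44, 0x04. Because the shifts operate on the value rather than on its in-memory representation, both functions give the same result whatever the endianness of the host.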

The x86-64 architecture uses little-endian byte ordering, which is the most common byte ordering used in processors today; examples of big-endian systems include the PowerPC architecture (used for some early Apple Macintosh computers) and mainframes like the IBM Z-series. Additionally, network protocols (such as the Ethernet and IP protocols) typically use big-endian byte order for data transmission (to the extent that big-endian is often referred to as “network byte order”).

What byte ordering is used in binary file formats varies – a file format could use big-endian, little-endian, or even (rarely) both in the same file. Some examples are:

A file format could also specify that it uses the “native endianness” of the platform the file was created on, but then would not be portable between systems of different endianness.

For this laboratory (and for the project) we will assume that integer types are to be stored on disk in little-endian order. This means we can directly use the fread and fwrite functions to write integer types, without having to do any byte-reordering.
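Using fread and fwrite directly like this relies on the host itself being little-endian (as x86-64 is). If the same code had to run on a big-endian host as well, one possible approach is to build the on-disk bytes explicitly. The following is a minimal sketch under that assumption; write_u32_le and read_u32_le are hypothetical helper names, not functions from the lab code or the project.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Write 'value' as 4 bytes, least significant byte first.
   Returns 0 on success, -1 on error. */
int write_u32_le(FILE *fp, uint32_t value) {
  unsigned char bytes[4];
  for (int i = 0; i < 4; i++) {
    bytes[i] = (unsigned char)((value >> (8 * i)) & 0xFFu);
  }
  return fwrite(bytes, 1, sizeof bytes, fp) == sizeof bytes ? 0 : -1;
}

/* Read 4 little-endian bytes and reassemble them into '*out'.
   Returns 0 on success, -1 on error or short read. */
int read_u32_le(FILE *fp, uint32_t *out) {
  unsigned char bytes[4];
  if (fread(bytes, 1, sizeof bytes, fp) != sizeof bytes) {
    return -1;
  }
  uint32_t value = 0;
  for (int i = 0; i < 4; i++) {
    value |= (uint32_t)bytes[i] << (8 * i);
  }
  *out = value;
  return 0;
}
```

Writing the value 123456 with write_u32_le produces the bytes 40 e2 01 00 on any host, which is the same layout that a direct fwrite of a 4-byte int produces on a little-endian machine.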

1.1 Writing and reading a bank account struct

Code for this lab can be found in the lab-06-code.zip file, and includes the following program, write_bank_account.c:

#include <stdio.h>
#include <stdlib.h>

struct bank_account {
  int acct_num;
  char acct_name[20];
  double acct_balance;
};

int main() {
  // a bank_account instance
  struct bank_account account = {123456, "John Doe", 1000.50};

  const char * filename = "bank_account.bin";

  // open binary file for writing
  FILE *ofp = fopen(filename, "wb");

  if (ofp == NULL) {
    perror("Error opening file");
    exit(EXIT_FAILURE);
  }

  // write 'account' to file; there's 1 element to write,
  // which has size 'sizeof(struct bank_account)'.
  size_t els_written = fwrite(&account, sizeof(struct bank_account), 1, ofp);

  if (els_written != 1) {
    perror("Error writing to file");
    fclose(ofp);
    return 1;
  }

  fclose(ofp);

  printf("Bank account struct written to '%s'\n", filename);

  exit(EXIT_SUCCESS);
}

If we were writing an array of bank_account structs, the return value of fwrite would be the number of elements written. Here, since we have just one struct, we expect to get back the result 1; as with any C function, it’s important to always check the return value of fwrite to make sure an error hasn’t occurred. You can compile the program with the command:

$ make CC=gcc CFLAGS='-std=c11 -pedantic -Wall -Wextra -Wconversion' write_bank_account.o write_bank_account

Run the program; it will create a binary file named bank_account.bin containing the serialized bank_account struct. Since we can’t view binary files easily using less or vim, take a look at the contents with the program xxd:

$ xxd bank_account.bin
00000000: 40e2 0100 4a6f 686e 2044 6f65 0000 0000  @...John Doe....
00000010: 0000 0000 0000 0000 0000 0000 0044 8f40  .............D.@

xxd produces a hexadecimal dump of binary files, showing both hexadecimal and ASCII representations of the data; this is handy for debugging and verifying the contents of binary files. The value 123456 is 0x0001e240 in hexadecimal notation, and we can see the first four bytes of the file contain this number in little-endian order: 40e2 0100.

The 20 bytes after that hold the acct_name array: first the string “John Doe” (in hex, “4a6f 686e 2044 6f65”), then a series of zero bytes filling the rest of the array. The last 8 bytes of the file (0000 0000 0044 8f40) represent the double 1000.50.

Python for debugging and verifying binary formats

Although our code is written in C, it can often be convenient to use Python to verify and interpret the contents of binary files.

For instance, we can use Python’s hex() function to get numbers in hexadecimal format. If we run python3 to get a Python prompt, then typing hex(123456) at the prompt should display the result 0x1e240, which is 0x0001e240 when padded with zeroes to 4 bytes.

We can also use the struct library to find out what the double 1000.50 looks like as a sequence of bytes.

Try the following at the Python prompt:

>>> import struct
>>> struct.pack('d', 1000.50).hex()
'0000000000448f40'

The 'd' indicates that we want to convert something to bytes as if it were a C double, and calling .hex() on the result displays those bytes as a string of hexadecimal digits. Here the output is indicating that the double 1000.5 will convert to the sequence of bytes 0000 0000 0044 8f40, exactly what we saw in the output from xxd.
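A similar check can be sketched in C itself, by copying the bytes of a double into an unsigned char array and printing each one in hexadecimal. The function name double_to_hex is hypothetical, and the sketch assumes that double is an 8-byte IEEE 754 value (as it is on x86-64).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");

/* Format the in-memory bytes of 'value' as hexadecimal digits, in
   address order. 'out' must have room for 2 * sizeof(double) + 1 chars. */
void double_to_hex(double value, char *out) {
  assert(out != NULL);
  unsigned char bytes[sizeof(double)];
  memcpy(bytes, &value, sizeof bytes); /* well-defined way to view the bytes */
  for (size_t i = 0; i < sizeof bytes; i++) {
    /* each call writes two hex digits plus a terminating NUL */
    snprintf(out + 2 * i, 3, "%02x", bytes[i]);
  }
}
```

On a little-endian system, double_to_hex(1000.50, buf) fills buf with the string 0000000000448f40, matching the Python and xxd output; on a big-endian system the same bytes would appear in the opposite order.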

It often is also possible to perform tasks like this using the GDB debugger, which we examined in the second lab; but for many programmers, using Python will be more convenient.

The read_bank_account.c program contains corresponding code for reading and displaying the contents of our “bank_account.bin” file:

#include <stdio.h>
#include <stdlib.h>

struct bank_account {
  int acct_num;
  char acct_name[20];
  double acct_balance;
};

int main() {
  // a bank_account instance to store the read data
  struct bank_account account;

  const char * filename = "bank_account.bin";

  // open the binary file for reading
  FILE *file = fopen(filename, "rb");

  if (file == NULL) {
    perror("Error opening file");
    exit(EXIT_FAILURE);
  }

  // read from the file
  size_t els_read = fread(&account, sizeof(struct bank_account), 1, file);

  if (els_read != 1) {
    perror("Error reading from file");
    fclose(file);
    exit(EXIT_FAILURE);
  }

  fclose(file);

  // display the account information
  printf("Account Number: %d\n", account.acct_num);
  printf("Account Name: %s\n", account.acct_name);
  printf("Account Balance: %.2f\n", account.acct_balance);

  return 0;
}

If you compile and run it, you should see displayed exactly the struct contents that we wrote to the file.

Exercise

Amend the two programs so that instead of reading and writing a single struct, they read and write an array of 4 such structs. Compile and run them, and check that the output you get is what you expect.

Exercise

Our programs thus far both store a fixed number of records to a file (one struct in the initial write_bank_account.c and read_bank_account.c code, four structs in the code written for the previous exercise). How could we amend our programs (and the file format used) so that the file format included a count of the number of records stored?

By using these programs as examples, and reading the documentation on the cppreference.com site for fread and fwrite, you should be able to develop functions for your project which read and write in the file formats specified.

1.2 Struct layout, padding, and alignment

When a C compiler lays out a struct in memory, it does not necessarily place fields immediately adjacent to each other. Instead, it follows alignment rules determined by the target architecture.

Most processors are significantly more efficient when multi-byte values (such as int, double, or pointers) are stored at memory addresses that are multiples of their size (or some other platform-specific alignment boundary). To satisfy these requirements, the compiler may insert padding bytes between struct fields.

For example, consider:

struct example {
  char c;
  int x;
};

Although this looks like it should only require 1 + 4 = 5 bytes, the actual layout in memory is typically:

offset 0:  c
offset 1–3: padding
offset 4–7: x

So on a typical 64-bit Linux system, sizeof(struct example) will usually be 8, not 5.

Padding ensures that each field is correctly aligned in memory – int values are typically aligned to 4-byte boundaries, and double values are typically aligned to 8-byte boundaries. Misaligned accesses may be slower or, on some architectures, not supported at all.
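You can ask the compiler where it has placed each field by using the standard offsetof macro (from stddef.h) together with sizeof. The sketch below does this for struct example; the helper names padding_after_c and padding_after_x are hypothetical, and the exact numbers they return depend on the platform.

```c
#include <assert.h>
#include <stddef.h>

struct example {
  char c;
  int x;
};

/* x must start at an offset that is a multiple of int's alignment. */
static_assert(offsetof(struct example, x) % _Alignof(int) == 0,
              "field x is not correctly aligned");

/* Number of padding bytes the compiler inserted between c and x. */
size_t padding_after_c(void) {
  return offsetof(struct example, x) - sizeof(char);
}

/* Number of padding bytes (if any) after x, at the end of the struct. */
size_t padding_after_x(void) {
  return sizeof(struct example) - (offsetof(struct example, x) + sizeof(int));
}
```

On a typical 64-bit Linux system, padding_after_c() would be expected to return 3 and padding_after_x() to return 0, consistent with the layout shown above.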

Alignment rules and undefined behaviour

In C, accessing a value through a pointer that is not correctly aligned for its type can lead to undefined behaviour (UB).

For example:

char buffer[8];
int *p = (int *)(buffer + 1);  // potentially misaligned
int x = *p;                    // undefined behaviour in C

Even if this appears to “work” on some systems (notably x86-64), the behaviour is not guaranteed by the C standard. On different architectures, a misaligned access may be slower than an aligned one, or may not be supported at all and cause the program to crash.

1.3 Practical guidelines

So, what are the consequences of padding for binary file formats? Because compilers are free to insert padding, the in-memory representation of a struct is not automatically a portable file format.

In the bank_account examples in this lab, we have written a whole bank_account struct to disk and read it back, using calls like

  size_t els_written = fwrite(&account, sizeof(struct bank_account), 1, ofp);

But that is only safe to do when reading and writing are both done using the same compiler, on the same platform. Different compilers and different platforms may lay the struct out differently – if you write with one compiler/platform combination, but read it back on another, you may get garbage.

To avoid issues caused by padding and alignment, it is best to treat binary file formats as a sequence of fields, not as raw memory dumps of structs. In particular:

For example, instead of relying on:

fwrite(&account, sizeof(struct bank_account), 1, file);

a more robust approach is to write individual fields:

fwrite(&account.acct_num, sizeof(account.acct_num), 1, file);
fwrite(account.acct_name, sizeof(account.acct_name), 1, file);
fwrite(&account.acct_balance, sizeof(account.acct_balance), 1, file);

This makes the file format explicit and independent of compiler-inserted padding.
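A fuller sketch of the field-by-field approach, with return values checked and a matching read function, might look like the following. The function names write_account_fields and read_account_fields are hypothetical. The sketch still writes each field in the host's own size and byte order, so it removes the dependence on padding but not on the size of int or on endianness.

```c
#include <assert.h>
#include <stdio.h>

struct bank_account {
  int acct_num;
  char acct_name[20];
  double acct_balance;
};

/* Write each field in turn. Returns 0 on success, -1 if any write fails. */
int write_account_fields(FILE *fp, const struct bank_account *acct) {
  assert(fp != NULL && acct != NULL);
  if (fwrite(&acct->acct_num, sizeof acct->acct_num, 1, fp) != 1) return -1;
  if (fwrite(acct->acct_name, sizeof acct->acct_name, 1, fp) != 1) return -1;
  if (fwrite(&acct->acct_balance, sizeof acct->acct_balance, 1, fp) != 1) return -1;
  return 0;
}

/* Read the fields back in the same order. Returns 0 on success, -1 on
   error or short read. */
int read_account_fields(FILE *fp, struct bank_account *acct) {
  assert(fp != NULL && acct != NULL);
  if (fread(&acct->acct_num, sizeof acct->acct_num, 1, fp) != 1) return -1;
  if (fread(acct->acct_name, sizeof acct->acct_name, 1, fp) != 1) return -1;
  if (fread(&acct->acct_balance, sizeof acct->acct_balance, 1, fp) != 1) return -1;
  /* do not trust the file to contain a NUL-terminated name */
  acct->acct_name[sizeof acct->acct_name - 1] = '\0';
  return 0;
}
```

With this approach the file contains exactly sizeof(int) + 20 + sizeof(double) bytes per record, whatever padding the compiler chooses to use in memory.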

2 Static analysis

Static vs dynamic analysis

Recall from lectures that static analysis tools analyse a program for defects without running it, whereas dynamic analyses are done at runtime.

You already have experience with one sort of static analysis tool – compilers. Compilers are an example of a static analysis tool, because (in addition to producing compiled output) nearly all compilers attempt to detect one sort of defect, namely type errors: cases where the programmer performs operations on a data item which are not appropriate for its type. (C is sometimes referred to as “weakly typed” because it is possible to implicitly convert between many types – for instance, to treat unsigned integral types as signed, or vice versa.) Compilers operate on the source code of a program, but static analysis tools also exist that analyse binary artifacts (such as binary executables or libraries) – the Ghidra reverse engineering framework is an example of one of these. Compilers typically only perform a fairly limited range of checks for possible defects, so it’s often useful to augment them with other static analysis tools.

You might have already experimented with one of the key dynamic analysers we look at, too, the Google sanitizers included with GCC. These perform dynamic analysis of your program, and therefore require your program to be run in order to work. They operate by injecting extra instructions into your program at compile time, deliberately altering the way your program behaves. Enabling them often requires little more than adding something like “-fsanitize=address,undefined -fno-omit-frame-pointer -g -O1” to your GCC options.

Then, many undefined behaviours (like going out of bounds) will trigger the sanitizers to display a stack trace and an explanation of the error – making C behave much more like a language with exceptions (Python or Java, say) when it comes to trying to diagnose bugs and vulnerabilities (as opposed to its usual behaviour, which is to give no visible sign at all that something could be wrong with the program).

When completing the unit project, it will be up to your group to decide what static and dynamic analysis tools to use on your code in order to find defects and possible vulnerabilities.

2.1 Compiler options

Before looking at standalone static analysis tools, we’ll first discuss options that are already built into your compiler. It’s important to ensure you’re making good use of the features already included in GCC.

At a minimum, the compiler options you use for CITS3007 work should include the following:

  -std=c11 -pedantic-errors -Wall -Wextra -Wconversion

You can find a list of all GCC’s warning-related options here. You can easily find recommendations for more extensive warning options than the minimum ones above by Googling for them (one set of recommendations can be found here).

Other important practices to bear in mind are:

Compile at multiple optimization levels

You should make sure to compile at multiple levels of optimization. GCC can perform different analyses, and thus output different warnings, depending on what level of optimization you ask it for. The -O0 option disables all optimizations (GCC’s default behaviour), and -O1 and -O2 enable progressively more optimizations.

You can get documentation on all of GCC’s optimization options here, and obtain a brief list by running gcc --help=optimizers.

Compile with and without debugging symbols

Compiling with debug symbols enabled (GCC’s -g option) can prevent some bugs from appearing – so, even if you use debug symbols to assist you in debugging your code, it’s important to compile and test your code without symbols added, as well.

Compile and test with and without sanitizers

In later classes we will look at the sanitizers included with GCC in more detail. These perform dynamic analysis of your program, and therefore require your program to be run in order to work. It’s a good idea to test your code both with and without sanitizers enabled (the ASan and UBSan sanitizers are particularly effective at detecting errors).

Compile with different compilers

It can be helpful to try compiling your code using different compilers – although all C compilers should detect errors mandated by the C standard, what other sorts of analyses they do and the warnings they produce can vary from tool to tool. On the CITS3007 standard development environment (CDE), the GCC and Clang compilers are both available.

Compile on different platforms

Sometimes it can be useful to ensure your code is compiled and run on multiple platforms – for example, MacOS, Windows, and BSD Unixes, in addition to Linux. The CITS3007 project is only required to compile and operate correctly on Linux systems. But it can still be useful to see how it behaves on other, closely related, systems – sometimes this can expose bugs or assumptions you didn’t detect on Linux. It will be up to you to decide if this is a strategy you wish to pursue.

2.2 Setup

In the CITS3007 standard development environment (CDE), download the source code for the dnstracer program, which we’ll be analysing, and extract it:

$ wget http://www.mavetju.org/download/dnstracer-1.9.tar.gz
$ tar xf dnstracer-1.9.tar.gz
$ cd dnstracer-1.9

Dnstracer is a somewhat old piece of code – the homepage appears not to have changed since 2002. It wasn’t written with the intention to conform to a particular C standard, but we’ll see if we can adapt the codebase to work with the C11 standard, issued a little under 10 years after the codebase stopped being maintained. This sort of task is not uncommon: your team may be tasked with refactoring and improving old, “legacy” code, which still seems to work, adding tests, and bringing it in line with modern standards.

Note that the Dnstracer download link is an “http” link rather than an “https” link – what problems could this cause? You can read more about Dnstracer by following the relevant links at http://www.mavetju.org/unix/general.php. It is used to graphically depict the chain of servers involved in a Domain Name System (DNS) query. The DNS protocol is an important part of the modern Internet, but we won’t be examining it in detail – Dnstracer is just a sample program used here to test analysis tools on.

It’s still used as a network reconnaissance tool as part of penetration testing.

2.3 Building and analysis

2.3.1 Building

We will be analysing the Dnstracer program, which is subject to a known vulnerability, CVE-2017-9430. You can read more about the Dnstracer program at https://www.mavetju.org/unix/general.php.

You can build Dnstracer by running the following commands in your development environment:

$ ./configure
# a number of outputs of automatically run tests should appear here
$ make
# We expect this to produce many warnings and errors -- see below

Let’s examine what these are doing.

./configure and the GNU Autotools

If we want to write a C program that can be compiled and run on many systems, we need some platform-independent way of detecting what operating system we are running on, what tools (compilers, linkers, and scripting tools) are available, and exactly what options they support – unfortunately, these can vary widely from platform to platform.

How can we detect these things, and use that knowledge when compiling our program? One way is to use the suite of tools known as GNU Autotools. Rather than write a Makefile ourselves, we create a template for a Makefile – named Makefile.in – and use the GNU Autotools to create a ./configure script which will gather details about the system it is running on, and use those details to generate:

  1. a proper Makefile from the template, and
  2. a config.h file which should be #included in our C source files – this incorporates information about the system being compiled on, and defines symbols that let us know what functions and headers are available on that target system.

(Specifically, Dnstracer is using the tools Autoconf and Automake – GNU Autotools contains other tools as well which are outside the scope of this lab.)

The GNU Autotools are sometimes criticised as not being very easy to use. Alternatives to the GNU Autotools for writing platform-independent code and building on a range of platforms include the tools Meson and CMake.

The ./configure script generates two files, a Makefile and config.h, which incorporate information about the system being compiled on. However, the content of those two files is only as good as the developer makes it – if they don’t enable the compiler warnings and checks that they should, then the final executable can easily be buggy. The output of the make command above should show us the final compilation command being run. (Type make clean, then make again if you need to see what the output was.)

# Lots of output ... and eventually:
gcc -DHAVE_CONFIG_H -I. -I. -I.     -g -O2 -c `test -f 'dnstracer.c' || echo './'`dnstracer.c

The output of make also includes a warning about a possible vulnerability (marked with -Wformat-overflow).

We know from earlier classes that invoking GCC without specifying a C standard (like C11) and enabling extra warnings can easily result in code that contains bugs and security vulnerabilities – so the current version of the Makefile is insufficient for our purposes.

How can we enable extra warnings from GCC? If you run ./configure --help, you’ll see that we can supply a number of arguments to ./configure, and some of these are incorporated into the Makefile and used to invoke GCC. Let’s try to increase the amount of checking our compiler does (and improve error messages) by switching our compiler to clang, and enabling more compiler warnings:

$ CC=clang CFLAGS="-pedantic -Wall -Wextra" ./configure
# we expect this to produce many warnings and/or errors...
$ make clean all

If you look through the output, you’ll now see many warnings that include the following:

passing 'unsigned char *' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign]

Although this is useful information, there are so many of these warnings it’s difficult to see other potentially serious issues. (And many of them may be harmless – an example of false positives from the compiler warnings). So we’ll disable those. The compiler (here, Clang) tells us that they’re enabled using the flag -Wpointer-sign, so we can use the flag -Wno-pointer-sign to disable them. Run:

$ CC=clang CFLAGS="-pedantic -std=c11 -Wall -Wextra -Wno-pointer-sign" ./configure
$ make clean all
# _still_ many warnings and/or errors. but slightly fewer than before 

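The -pedantic flag tells the compiler to adhere strictly to the standard selected with -std=c11; compilation now fails, however: the author of Dnstracer did not write properly compliant C code. In particular, the code calls the functions strncasecmp and strdup without making their declarations available.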

Check man strncasecmp, and you’ll see it requires #include <strings.h>, which is missing from the C code. man strdup tells us that this is a Linux/POSIX function, not part of the C standard library; to inform the compiler we want to use POSIX functions, we should add a line #define _POSIX_C_SOURCE 200809L to our C code. This is called a feature-test macro. Furthermore, it’ll be useful to make use of static asserts, so we should include the assert.h header. Edit the dnstracer.c file, and add the following near the top of the file (e.g. just after the first block comment):

#define _POSIX_C_SOURCE 200809L
#define _DEFAULT_SOURCE
#include <assert.h>

The #defines need to appear before we start include-ing header files. If we now run make clean all, we should have got rid of many compiler-generated warnings. But many more problems exist.

Writing portable C code – non-standard extensions to C

If we want to write portable C code – code that will work with other C compilers and/or other operating systems – it’s important to specify what C standard we’re wanting to adhere to (in this case, C11), and to request that the compiler strictly adhere to that standard.

When we invoke gcc or clang with the arguments “-std=c11 -pedantic”, this disables many compiler-specific extensions. For example: the C standard says it’s impermissible to declare a zero-length array (e.g. int myarray[0]), but by default, gcc will let you do so without warning. In general, disabling compiler-specific extensions is a good thing: it ensures we don’t accidentally use gcc-only features, and makes our code more portable to other compilers.

One reason some people don’t add those arguments is because (as we saw above) doing so may make their programs stop compiling. But this is because they haven’t been sufficiently careful about distinguishing between functions that are part of the C standard library, and functions which are extensions to C provided by their compiler, or are specific to the operating system they happen to be compiling on.

For example, fopen is part of the C standard library; if you run man fopen, you’ll see it’s declared in the stdio.h header file. (And if you look under the “Conforming to” heading in the man page, you’ll see it says “POSIX.1-2001, POSIX.1-2008, C89, C99”: fopen is part of the C89 (and later) versions of the C standard.)

On the other hand, strncasecmp is not part of the C standard library: it was originally introduced by BSD (the “Berkeley Software Distribution”), a previously popular flavour of Unix, and is now provided by the C library on most Unix-like systems. It is part of the POSIX standard for Unix-like operating systems. (If you look under the “Conforming to” heading in the man page, you’ll see it says 4.4BSD, POSIX.1-2001, POSIX.1-2008.)

Using -std=c11 -pedantic encourages you to be more explicit about what compiler- or OS-specific functions you’re using. strncasecmp is usually only found on Unix-like operating systems. It isn’t available, for instance, when compiling on Windows with the MSVC compiler; if you want similar functionality, you need the _strnicmp function.

Sometimes when using a function from a standard other than the C standards, your compiler will require you to specify exactly what extensions and standards you want to enable. For instance, man strdup (rather obliquely) tells you that adding

#define _POSIX_C_SOURCE 200809L

to your C code is one way of making the strdup function available.

Note that you should put the above #define before any #includes: the #define is called a “feature-test macro”, and it’s acting as a sort of signal to the compiler, telling it what parts of any later-appearing header files to process, and what to ignore.
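As a small illustration of the ordering, the following sketch defines the feature-test macro first and only then includes the headers that declare strdup and strncasecmp. The helper functions has_prefix_ignore_case and copy_string are hypothetical names invented for this example; they are not part of Dnstracer.

```c
#define _POSIX_C_SOURCE 200809L /* must appear before any #include */

#include <assert.h>
#include <stdlib.h>
#include <string.h>  /* strlen (standard C), strdup (POSIX) */
#include <strings.h> /* strncasecmp (POSIX) */

/* Return 1 if 's' starts with 'prefix', ignoring case; otherwise 0. */
int has_prefix_ignore_case(const char *s, const char *prefix) {
  assert(s != NULL && prefix != NULL);
  return strncasecmp(s, prefix, strlen(prefix)) == 0;
}

/* Return a heap-allocated copy of 's', or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
char *copy_string(const char *s) {
  assert(s != NULL);
  return strdup(s);
}
```

If the #define is moved below the #include lines and the file is compiled with -std=c11 -pedantic, the declarations of strdup and strncasecmp are likely to be hidden again, and the compiler should warn about implicitly declared functions.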

Using -std=c11 -pedantic doesn’t guarantee your code conforms with the C standard (though it does help). Even with those flags enabled, it’s still quite possible to write non-conforming programs. As the gcc manual says:

Some users try to use -Wpedantic to check programs for strict ISO C conformance. They soon find that it does not do quite what they want: it finds some non-ISO practices, but not all – only those for which ISO C requires a diagnostic, and some others for which diagnostics have been added.

From a security point of view, it’s easier to audit code that’s explicit about what libraries it’s using, than code which leaves that implicit; so specifying a C standard and -pedantic is usually desirable.

Project tip

Failing to use non-standard functions correctly has been a frequent source of lost marks in the unit project in previous years. If you use them, make sure you understand how to do so correctly.

2.3.2 Static analysis

We’ll identify some problems with Dnstracer using Flawfinder – read “How does Flawfinder Work?”, here: https://dwheeler.com/flawfinder/#how_work. Flawfinder is a linter or static analysis tool that checks for known problematic code (e.g. code that calls unsafe functions like strcpy and strcat). Install Flawfinder by running the command sudo apt install flawfinder in your development environment, and then try using it by running:

$ flawfinder *.c

The output typically includes a message at the end with tips on what Flawfinder’s output means. (You may get a somewhat different message at the time this lab is run.)

$ flawfinder *.c
# ... many lines omitted ...
Not every hit is necessarily a security vulnerability.
You can inhibit a report by adding a comment in this form:
// flawfinder: ignore
Make *sure* it's a false positive!
You can use the option --neverignore to show these.

In general, you’ll see a lot of output. This is a common problem when applying static analysis tools to an existing codebase for the first time. It’s usually preferable to start coding with static analysers already enabled – but sometimes we inherit legacy code and simply have to make do. Commercial static analysis companies will often provide technical experts to your organisation, as part of the process of setting the tool up, who can help configure the tool to as nearly as possible show only the warnings you want. But we are using a free and open source tool, and will have to do this job ourselves. Any good static analysis tool should allow us to ignore particular bits of code that would be marked problematic, either temporarily, or because we can prove to our satisfaction that the code is safe.

The output of flawfinder is not especially convenient for browsing.

It’s often more convenient to integrate the output with the editor or IDE (integrated development environment) you’re using.

You might like to see if you can find a way of integrating the output of flawfinder into your preferred editor or IDE (e.g. VS Code or Vim).

In flawfinder’s output, see if you can find mentions of “rr_types”. You should be able to identify that Flawfinder is giving us a general warning about any array with static bounds (which is, really, all arrays in C11). However, there’s another issue here – what is it?

The declared size of the array, and the number of the elements should match up; if someone changes one but not the other, that could introduce problems. We’ll add a more reliable way of checking this.
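A minimal sketch of such a check, using C11's static_assert, is given here. It uses a hypothetical four-entry table called type_names and a hypothetical constant EXPECTED_TYPE_COUNT, rather than Dnstracer's actual rr_types declaration; the same pattern applies with rr_types and 256.

```c
#include <assert.h>
#include <stddef.h>

#define EXPECTED_TYPE_COUNT 4
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical lookup table; its size comes from the initializer list. */
static const char *const type_names[] = {"zero", "one", "two", "three"};

/* Compilation fails if an entry is added or removed without the
   expected count being updated to match. */
static_assert(ARRAY_LEN(type_names) == EXPECTED_TYPE_COUNT,
              "type_names must have exactly EXPECTED_TYPE_COUNT entries");

/* Return the name for 'code', or "unknown" if it is out of range. */
const char *type_name(size_t code) {
  return code < ARRAY_LEN(type_names) ? type_names[code] : "unknown";
}
```

Because the assertion is evaluated at compile time, it costs nothing at run time, and a mismatch is reported as a compilation error rather than as a crash or silent misbehaviour later on.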

We’re now statically checking that the number of elements in rr_types (i.e., the size of the array in bytes, divided by the size of one element) is always 256.

We’ll now look at warnings from a program called clang-tidy. It can be run from the command-line (see man clang-tidy for details); try running

$ clang-tidy --checks='-clang-diagnostic-pointer-sign' --extra-arg="-DHAVE_CONFIG_H -I. -Wno-pointer-sign" dnstracer.c --

(NB: This is a “quick and dirty” way of getting clang-tidy to run; better instructions will be provided in future weeks.)

We need to give clang-tidy correct compilation arguments (like -I.), or it won’t know where the config.h header is and will mis-report errors. We also want it not to report problems with pointers being coerced from signed to unsigned or vice versa (i.e., the pointer-sign issue that gcc would otherwise warn about, and which we silence there with -Wno-pointer-sign), so we disable that check by putting a minus in front of clang-diagnostic-pointer-sign.

3 Dynamic analysis

Let’s see how Dnstracer is supposed to be used. It will tell us the chain of DNS name servers that needs to be followed to find the IP address of a host. For instance, try

$ ./dnstracer -4 -s ns.uwa.edu.au www.google.com
$ ./dnstracer -4 -s ns.uwa.edu.au www.arranstewart.io

These commands say to use a local UWA nameserver (ns.uwa.edu.au) and to follow the chain of nameservers needed to get IP addresses for two hosts (www.google.com and www.arranstewart.io). The “Google” host is fairly dull; it seems the UWA nameserver stores that IP address directly itself. The second is a little more interesting, as it requires name servers run by Hurricane Electric to be queried.

Now re-compile with gcc at the O2 optimization level, and try some specially selected input:

$ CC=gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all
$ ./dnstracer -v $(python3 -c 'print("A"*1025)')

You should see the message

*** buffer overflow detected ***: terminated
Aborted (core dumped)

This is the “denial of service” problem reported in CVE-2017-9430. A buffer overflow occurs, but gets caught by gcc’s inbuilt protections and causes the program to crash. This is better than a buffer overflow being allowed to execute unchecked, but is still a problem: in general, a user should not be able to make a program segfault or throw an exception based on data they provide. For a server program, this could be serious: if a server ran code like Dnstracer’s and allowed users to provide input via, say, a web form, one user could force the program to crash and create a denial of service for other users. (In the present case, Dnstracer isn’t a server, though, so the risk is actually minimal.)

Note that the problem doesn’t show up when compiling with clang, and only appears at the O2 optimization level (which is often applied when software is being built for distribution to users). Recall that at higher optimization levels, the compiler tends to make stronger and stronger assumptions that no Undefined Behaviour ever occurs, and this can lead to vulnerabilities.

One can analyse the dumped core file in gdb to find the problematic code.

We need to run the following to ensure core dumps work properly on Ubuntu:

$ ulimit -c unlimited
$ sudo systemctl stop apport.service
$ sudo systemctl disable apport.service

ulimit and systemctl

User accounts on Linux have limits placed on things like how many files they can have open at once, and the maximum size of the stack in programs the user runs. You can see all the limits by running ulimit -a.

Some of these are “soft” limits which an unprivileged user can change – running ulimit -c unlimited says there should be no limit on the size of core files which programs run by the user may dump. Others have “hard” limits, like the number of open files. You can run ulimit -n 2048 to change the maximum number of open files for your user to 2048, but if you try a number above that, or try running ulimit -n unlimited, you will get an error message:

bash: ulimit: open files: cannot modify limit: Operation not permitted
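The same soft and hard limits can be inspected from inside a C program. The following is a minimal sketch using the POSIX getrlimit function; the helper names `print_value` and `show_limit` are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>  // getrlimit, struct rlimit, RLIMIT_* (POSIX)

// Print one limit value, using "unlimited" for RLIM_INFINITY.
static void print_value(const char *label, rlim_t v)
{
    if (v == RLIM_INFINITY)
        printf("  %s: unlimited\n", label);
    else
        printf("  %s: %llu\n", label, (unsigned long long)v);
}

// Print the soft and hard values of one resource limit.
// Returns 0 on success, -1 if getrlimit() fails.
int show_limit(const char *name, int resource)
{
    struct rlimit lim;

    assert(name != NULL);
    if (getrlimit(resource, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }
    printf("%s\n", name);
    print_value("soft", lim.rlim_cur);  // the limit currently enforced
    print_value("hard", lim.rlim_max);  // ceiling for the soft limit
    return 0;
}
```

Calling show_limit("core file size", RLIMIT_CORE) and show_limit("open files", RLIMIT_NOFILE) from main reports the limits that ulimit -c and ulimit -n control. The numbers may not match the shell's output exactly, since the shell can display some limits in different units.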

The systemctl command is used to start and stop system services – programs which are always running in the background. In this case, we want to stop the apport service: it intercepts segfaulting programs, and tries to send information about the crash to Canonical’s servers. But we don’t want that – we want to let the program crash, and we want the default behaviour for segfaulting programs, which is to produce a memory-dump in a core file.

Run the bad input again, and you should get a message about a core dump being generated; then run gdb. The commands are as follows:

$ ./dnstracer -v $(python3 -c 'print("A"*1025)')
$ gdb -tui ./dnstracer ./core

In GDB, run the commands backtrace, then frame 7: you should see that a call to strcpy on about line 1628 is the cause.

We’ll try to find this flaw using a particular dynamic analysis technique called “fuzzing”. Static analysis analyses the static artifacts of a system (like the code source files); dynamic analysis actually runs the program. We’ll use a program called afl-fuzz[1] to find that bug for us, and identify input that will trigger it.

Fuzzers are very effective at finding code that can trigger program crashes, and afl-fuzz would normally be able to find this vulnerability (and probably many others) by itself if we just let it run for a couple of days. To speed things up, however – because in this case we already know what the vulnerability is – we’ll give the fuzzer some hints.

By default, afl-fuzz requires our program take its input from standard input (though there are ways of altering this behaviour). So to get our program working with afl-fuzz, we’ll add the following code at the start of main (search in Vim for argv to find it, or use the Tagbar pane and search for main):

    int  new_argc = 2;
    char **new_argv;
    {
    new_argv = malloc(sizeof(char*) * (new_argc + 1));

    // copy argv[0]
    size_t argv0_len = strlen(argv[0]);
    new_argv[0] = malloc(argv0_len + 1);
    strncpy(new_argv[0], argv[0], argv0_len);
    new_argv[0][argv0_len] = '\0';

    // read in argv[1] from file
    const size_t BUF_SIZE = 4096;
    char buf[BUF_SIZE];
    ssize_t res = read(0, buf, BUF_SIZE - 1);
    if (res < 0)    // read() failed: treat the input as empty
      res = 0;
    buf[res] = '\0';

    new_argv[1] = malloc(sizeof(char) * (res + 1));
    strncpy(new_argv[1], buf, res);
    new_argv[1][res] = '\0';

    // set argv[2] to NULL terminator
    new_argv[new_argc] = NULL;
    }

    argv = new_argv;
    argc = new_argc;

The aim here is to get some input from standard input, but then to ensure the input we’ve just read will work properly with the rest of the codebase (which expects to operate on arguments in argv).

So we create new versions of argv and argc (lines 1–2 of the above code), containing the data we want (obtained from standard input – line 15), and then we replace the old versions of argv and argc with our new ones (lines 28–29). If what the code is doing is not clear, try stepping through it in a debugger to see what effect each line has.
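To experiment with this logic outside Dnstracer, the same idea can be packaged as a standalone helper. The sketch below is a hypothetical refactoring, not code from Dnstracer: the name `argv_from_fd` is made up, and it adds allocation checks that the in-place snippet does not have.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>  // read, ssize_t (POSIX)

// Hypothetical helper: builds a NULL-terminated argument vector
// { copy of argv0, text read from fd, NULL }.
// Returns NULL if memory allocation fails.
char **argv_from_fd(const char *argv0, int fd)
{
    enum { BUF_SIZE = 4096 };
    char buf[BUF_SIZE];

    assert(argv0 != NULL);

    // A single read(); leave room for the terminator. If read() fails,
    // treat the input as empty rather than using a negative length.
    ssize_t res = read(fd, buf, BUF_SIZE - 1);
    if (res < 0)
        res = 0;
    buf[res] = '\0';

    // Two arguments plus the terminating NULL pointer.
    char **v = malloc(3 * sizeof *v);
    if (v == NULL)
        return NULL;
    v[0] = malloc(strlen(argv0) + 1);
    v[1] = malloc((size_t)res + 1);
    if (v[0] == NULL || v[1] == NULL) {
        free(v[0]);
        free(v[1]);
        free(v);
        return NULL;
    }
    strcpy(v[0], argv0);                 // destination sized for argv0
    memcpy(v[1], buf, (size_t)res + 1);  // includes the '\0' terminator
    v[2] = NULL;
    return v;
}
```

In main, this could be used as `char **v = argv_from_fd(argv[0], 0);` followed by assigning `argv = v; argc = 2;` when v is not NULL. Keeping the logic in a function makes it easy to feed it input from a pipe or file and check the result.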

AFL requires some sample, valid inputs to work with. Run the following:

$ mkdir -p testcase_dir
$ printf 'www.google.com' > testcase_dir/google
$ python3 -c 'print("A"*980, end="")' > testcase_dir/manyAs

Ideally, we should also allow afl-fuzz to instrument the code (i.e., insert extra instructions so it can analyze what the running code is doing) – though afl-fuzz will still work even without this step. Recompile Dnstracer by running the following:

$ CC=/usr/bin/afl-gcc CFLAGS="-pedantic -g -std=c11 -Wall -Wextra -Wno-pointer-sign -O2" ./configure
$ make clean all

Instrumenting for afl-fuzz

Some of the dynamic analysis tools we have seen (like the Google sanitizers, ASan and UBSan) are built into GCC, so to use them, we just have to supply GCC with appropriate command-line arguments (e.g. -fsanitize=address,undefined).

However, AFL-fuzz is not part of GCC, and it takes a different approach. It provides a command, afl-gcc, which behaves very similarly to normal GCC, but additionally adds in the instrumentation that AFL-fuzz needs.

So we can perform the instrumentation by specifying the option CC=/usr/bin/afl-gcc to the ./configure command: this specifies a particular compiler that we want to use.

Then, to do the fuzzing, run

$ afl-fuzz -d -i testcase_dir -o findings_dir -- ./dnstracer

A “progress” screen should shortly appear, showing what AFL-fuzz is doing – something like this:


             american fuzzy lop ++2.59d (dnstracer) [explore] {-1}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│        run time : 0 days, 0 hrs, 0 min, 18 sec      │  cycles done : 2     │
│   last new path : 0 days, 0 hrs, 0 min, 0 sec       │  total paths : 70    │
│ last uniq crash : none seen yet                     │ uniq crashes : 0     │
│  last uniq hang : none seen yet                     │   uniq hangs : 0     │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│  now processing : 66*0 (94.3%)     │    map density : 0.02% / 0.23%        │
│ paths timed out : 0 (0.00%)        │ count coverage : 1.71 bits/tuple      │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│  now trying : splice 4             │ favored paths : 26 (37.14%)           │
│ stage execs : 60/64 (93.75%)       │  new edges on : 30 (42.86%)           │
│ total execs : 58.0k                │ total crashes : 0 (0 unique)          │
│  exec speed : 3029/sec             │  total tmouts : 0 (0 unique)          │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│   bit flips : n/a, n/a, n/a                        │    levels : 8         │
│  byte flips : n/a, n/a, n/a                        │   pending : 29        │
│ arithmetics : n/a, n/a, n/a                        │  pend fav : 0         │
│  known ints : n/a, n/a, n/a                        │ own finds : 68        │
│  dictionary : n/a, n/a, n/a                        │  imported : n/a       │
│   havoc/rad : 29/29.2k, 39/28.0k, 0/0              │ stability : 100.00%   │
│   py/custom : 0/0, 0/0                             ├───────────────────────┘
│        trim : 50.13%/169, n/a                      │             [cpu:322%]
└────────────────────────────────────────────────────┘

The AFL-fuzz documentation explains the fields shown on this screen.

We’ve given afl-fuzz a very strong hint here about some valid input that’s almost invalid (testcase_dir/manyAs); but given time and proper configuration, many fuzzers will be able to identify such input for themselves.

After about a minute, afl-fuzz should report that it has found a “crash”; hit ctrl-c to stop it, and look in findings_dir/crashes for the identified bad input.

Crash files

Inside the findings_dir/crashes directory should be files containing input that will cause the program under test to crash. For instance, on one run of AFL-fuzz, a “bad input” file is produced called “findings_dir/crashes/id:000000,sig:06,src:000083,time:25801+000001,op:splice,rep:16”. The filename gives information about the crash that occurred and how the input was derived.

Since gcc’s buffer overflow protections are enabled, we should expect a crash to occur exactly when the input is long enough to overflow the buffer – at that point, gcc’s protection code detects that something has been written outside the buffer bounds, and calls abort(). So all AFL-fuzz has to do to trigger a crash is lengthen the input string enough. But AFL-fuzz monitors the code paths the program under test is executing – that’s what the “instrumentation” step is for – and can thus “learn” to explore quite complicated input structures – see this post by the main developer of AFL-fuzz, Michał Zalewski, in which AFL-fuzz “learns” how to generate valid JPEG files, just from being given the input string “hello”.
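The arithmetic behind "long enough to overflow the buffer" can be sketched as follows. This is an illustration only: the buffer size of 1025 bytes is an assumption that is consistent with the inputs used in this lab (980 characters is accepted, 1025 crashes) but is not taken from the Dnstracer source, and the names `NAME_BUF_SIZE` and `fits_with_terminator` are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// ASSUMPTION: the destination buffer holds 1025 bytes. Check the real
// declaration in dnstracer.c before relying on this number.
enum { NAME_BUF_SIZE = 1025 };

// True if s, together with its terminating '\0', fits into a buffer of
// cap bytes. strcpy() writes strlen(s) + 1 bytes, so the copy is safe
// exactly when strlen(s) + 1 <= cap, i.e. strlen(s) < cap.
bool fits_with_terminator(const char *s, size_t cap)
{
    assert(s != NULL);
    return strlen(s) < cap;
}

// Under the assumed size:
//    980 characters ->  981 bytes written, fits
//   1024 characters -> 1025 bytes written, fits exactly
//   1025 characters -> 1026 bytes written, one byte too many
```

Under this assumption, the manyAs test case sits 45 characters short of the boundary, which is why only a small amount of lengthening is needed to reach the crash.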

In general, running a fuzzer on potentially vulnerable software is a pretty “cheap” activity: one can leave a fuzzer running for several days with simple, valid input, and check at the end of that period to see what problems have been discovered.

Challenge exercise

We’ve seen how we could have detected CVE-2017-9430 in advance, by letting a fuzzer generate random inputs and attempt to crash the Dnstracer program.

Can you work out the best way of fixing the problem, once detected?

4 Further reading on fuzzing

Take a look at The Fuzzing Book (by Andreas Zeller, Rahul Gopinath, Marcel Böhme, Gordon Fraser, and Christian Holler) at https://www.fuzzingbook.org, in particular the “Introduction to Fuzzing” at https://www.fuzzingbook.org/html/Fuzzer.html.

Fuzzing doesn’t apply just to C programs; the idea behind fuzzing is to randomly generate inputs in hopes of revealing crashes or other bad behaviour by a program. The Fuzzing Book demonstrates the techniques by applying them to Python programs, but they are generally applicable to any language.
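The core loop of a purely random fuzzer is small enough to sketch in C. The code below is a toy illustration, far simpler than AFL-fuzz (it has no instrumentation, no seed corpus and no mutation of earlier inputs). All of the names are made up, and `too_long` is a hypothetical stand-in for the program under test that signals "failure" through its return value rather than by crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Small xorshift32 pseudo-random generator, so that a run can be
// reproduced from its seed. The state must never be zero.
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

// Fill buf with a random string of printable ASCII characters, of
// length 0 .. cap-1, and terminate it. Returns the length.
size_t random_input(char *buf, size_t cap, uint32_t *state)
{
    assert(buf != NULL && cap > 0 && *state != 0);
    size_t len = next_random(state) % cap;
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)(' ' + next_random(state) % 95);  // ' ' .. '~'
    buf[len] = '\0';
    return len;
}

// Hypothetical code under test: returns 0 if the input is acceptable,
// -1 if it is longer than 100 characters.
int too_long(const char *input)
{
    return strlen(input) > 100 ? -1 : 0;
}

// Run `trials` random inputs through check() and count how many of
// them it flags with a non-zero result.
size_t fuzz(int (*check)(const char *), size_t trials, uint32_t seed)
{
    char buf[256];
    uint32_t state = seed ? seed : 1u;  // avoid the all-zero state
    size_t failures = 0;

    for (size_t i = 0; i < trials; i++) {
        random_input(buf, sizeof buf, &state);
        if (check(buf) != 0)
            failures++;
    }
    return failures;
}
```

A call such as fuzz(too_long, 1000, 42) returns the number of generated inputs that the target rejected. A real fuzzer would also save each failing input so that it can be replayed later, as AFL-fuzz does in its crashes directory.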

Fuzzing has been very successful at finding security vulnerabilities in software – often much more so than writing unit tests, for instance. An issue with unit tests is that human testers can’t generate as many tests as a fuzzer can (fuzzers will often generate at least thousands per second), and often have trouble coming up with test inputs that are sufficiently “off the beaten path” of normal program execution to trigger vulnerabilities.

Fuzzers often work well with some of the dynamic sanitizers which we’ve seen gcc and clang provide. The sanitizers (such as ASan, the AddressSanitizer, and UBSan, the Undefined Behaviour sanitizer) help with making a program crash if memory-access problems or undefined behaviour are detected.

You can read more about AFL-fuzz at https://afl-1.readthedocs.io/en/latest/fuzzing.html, and if you have time, experiment with the honggfuzz fuzzer (https://github.com/google/honggfuzz) or using AFL-fuzz in combination with sanitizers.

5 Other tools

This lab introduces several static analysis tools – but these should not be the only tools you use to analyse your code. In practice, different tools will detect different possible defects, so it’s important to use a range of tools to reduce the chances of defects creeping into your code.

It’s therefore recommended your group try other static analysis tools to see what benefit they provide. Some suggested tools include:

Cppcheck

Cppcheck aims to have a low false-positive rate, and performs what is called “flow analysis” – it can detect when a construct in your code could cause problems later (or earlier) in the program.

Clang static analyzer

This is a different tool to clang-tidy – the Clang project has a number of distinct static analysis tools associated with it. The clang static analyser not only performs extensive static analysis of your code, but is capable of describing the problems using easy-to-understand diagrams produced from your code.

The simplest way to run the static analyser is usually from the command-line, with the scan-build command.

Sparse

This tool is used for analysing the Linux kernel source code – it can catch mismatched types, implicit casts, and other type-related bugs that the compiler may not warn about. It supports the use of annotations on the code which can help the compiler and analyser more clearly understand the intent of your code.

Ikos

Developed by NASA, Ikos performs what is called abstract interpretation on a program – a static analysis technique that approximates the set of all possible paths through the program, and what state the program would be in at each point.





  1. “AFL” stands for “American Fuzzy Lop”, a type of rabbit; afl-fuzz was developed by Google. Read about it further at https://github.com/google/AFL. (If you have time, you might like to try using another fuzzer, honggfuzz, by reading the documentation at https://github.com/google/honggfuzz.)