CITS3007 lab 2 (week 3) – Debugging

For this lab, from within your VM, download the lab source code in the lab-02-code.zip file. (You can do this by running, for instance, wget https://cits3007.arranstewart.io/labs/lab-02-code.zip from within the VM.) You can then extract the archive using the unzip command, and view individual files using less or vim.

This lab shows how you can use GDB (the GNU Debugger) to inspect a running program. This is important for later labs, and for the unit project. The best way of fixing bugs in your project code will be to use GDB to step through your code and pinpoint the source of those bugs. Often, you will also be able to access a debugger through your IDE or graphical editor.[1] However, it’s worth learning how to use GDB directly, as in practice, you may not always have access to an IDE or graphical editor (for instance, when debugging programs running on cloud-based virtual machines, or on embedded devices).

1. GDB basics

GDB, the GNU Debugger, lets us step through compiled C (or C++) programs and examine the values of variables in the running program.

When compiling programs we wish to debug, we need to pass the flag -g to gcc, which tells it to add debugging information. It can also be helpful to pass the -O0 option to gcc, which tells the compiler not to optimize the compiled code.[2] If gcc has heavily optimized the machine-code instructions it emits, then the CPU instructions being executed may not correspond very closely to the source code we provided, and GDB may behave in unexpected ways (for instance, appearing to skip lines or to execute them out of order).[3]

The Makefile for this lab already includes these two flags, so running make factorial in your VM is all you need to do to compile the code. (All commands from this point on in the lab are intended to be run from the command-line in your VM, in the cloned lab02 directory, unless otherwise specified.)

Compiling and Makefiles

The code provided for this lab includes a Makefile, intended to work with the GNU Make program. The Makefile contains pre-written rules for compiling all the sample programs in the lab.

Amongst other things, it ensures that when GCC is compiling code, it uses the options -std=c11 -pedantic-errors -Wall -Wextra -Wconversion (amongst others). When compiling code for this unit, you should always include these options, at a minimum.[4]

Of course, it is possible to invoke GCC “by hand” to compile our code, but GNU Make

We can even use GNU Make without having a Makefile present – it has many “built-in” rules about how to compile C and C++ programs, which mean that if we have a C source file my_program.c present, we can use the following single command to compile it:

$ make CFLAGS="-std=c11 -pedantic-errors -Wall -Wextra -Wconversion" my_program.o my_program

Here, we’ve instructed Make to build the object file my_program.o and the executable my_program, and we’ve specified compiler options that we want provided, but we’ve left it up to Make’s built-in rules to work out exactly what programs need to be invoked. For small, single-file projects, Make’s built-in rules are often all we need.

1.1. Factorial results

Read the API comments for the factorial function in factorial.c, and build the factorial program with the command make factorial.

Try executing the factorial program with various arguments from 0 to 20 (the valid range) and outside it. Does the program print the correct result? (If you’re not sure what the factorial of some number is, then Googling “factorial 10”, for example, should give you an answer.)

See if you can spot the cause of the error in factorial.c. If you can, don’t fix it yet – we’re going to use the program to experiment with debugging using GDB.

1.2. Running GDB

Launch the debugger by running

$ gdb ./factorial

You should see some welcome messages from GDB, then it will display the debugger prompt (gdb). As the welcome messages say, you can type help at this prompt to get help, but the online help is unfortunately not especially useful unless you already have some familiarity with GDB. (If you do know the first letter of a command you’re interested in, then GDB has an “autocomplete” feature – type l and then the tab key a couple of times, to see commands beginning with l.)

Some of the commands you can run from the GDB prompt include:

Try both of these commands. When you run the program, you should see it print the error message

Error: expected 1 command-line argument (an INT), but got 0

since by default, GDB runs the program with no command-line arguments. (GDB should also print a message saying that our program exited with code 01. By convention, programs on Unix-like platforms exit with a non-zero code to indicate an error.)

Set the program’s arguments by running the following command (don’t type the (gdb) prompt):

(gdb) set args 6

and then running the program again.

Now, exit the debugger by typing quit or ctrl-d, and start it again. This time, we’ll use GDB’s TUI (text-based user interface).[5]

Type ctrl-x and then the a key immediately afterward. A “window” should open in your terminal; run the list command, and you should see something like this:

The arrow keys and the pageup and pagedown keys on your keyboard should now move you around in the source listing window, and ctrl-l will refresh the display if at any point it seems to get out of sync with what you’re doing. (The ctrl-x a sequence toggles between GDB’s normal mode and TUI mode; hitting it repeatedly will take you back and forth between them.)

The break LINENUM command (b for short) will set a breakpoint in the code (and the source listing will indicate this with a “b+” in the code margin).

Run the command b 26 to set a breakpoint at line 26 (containing the statement argc--), and r to run the program.

GDB will highlight the line about to be executed. Some other useful commands:

For some additional commands and advanced features, see the Hitchikers Guide To The GDB and the GDB tutorial series here and here from RedHat. GDB “cheat sheets” are available here (PDF) and here.

Dynamic printf (dprintf) – no more stray printfs!

A common method of debugging C programs is to add printf() invocations at various points in the program to show what values program variables take on at different times. A disadvantage of this approach is that it requires you to re-compile your program, and you must remember to remove the calls to printf() from your final code.

However, GDB will let you add printf() invocations without recompiling the program using the dprintf (dynamic printf) command.

Issuing the dprintf LINENUM, FORMAT-STRING, EXPRESSION command has the effect of adding a breakpoint at LINENUM, as well as inserting a call to printf which prints the specified expression using a specified printf-style format string.

So, for example, the command dprintf myprogram.c:8, "Num elements: %d\n", n would allow you to insert printf calls that nicely display the value of the variable n at line 8 of a program.

If you’re interested in using the dprintf command, you can find a tutorial on how to use it here.

1.3. argc and argv

The first thing the factorial program does in main is execute the following statements:

argc--;
argv++;

If you’re running an instance of the factorial program, kill it with k, use set args 6 to set the command-line arguments of the program, and run it with r. (Your breakpoint at line 26 should still be showing; execute the command b 26 to set it again if you’ve accidentally exited GDB and come back in.)

Step through the program, examining the values of argc, argv, and elements of argv (like argv[0] and argv[1]) at various points in the program.

[s]tep vs [n]ext

In general, when you’re stepping through code, the command you want to use is “n” (“next”), which steps over function calls when it encounters them.

When you encounter a call to a function you’ve defined elsewhere in the program, and want to step “into” that function, then “s” (“step”) is the command to use.

If you try to invoke “s” on a function that is part of the C runtime, however, like strtol, then GDB will print an error something like this:

  (gdb) s
  __strtol (nptr=0x7fffffffe730 "6", endptr=0x7fffffffe3a0, base=10) at ../stdlib/strtol.c:105
  ../stdlib/strtol.c: No such file or directory.

Here, GDB is telling you that it can’t “step into” the code for strtol, because it can’t find the original source code for that function, nor can it find any “debugging symbols” for it. (To save disk space, C runtime libraries are normally shipped without either of those – though it is possible to install them if you wish.)

A quick fix is to type f for finish, which will finish running the current function, and so should get you back to the C code the function was called from.

You’ll get a similar error if you try to “step into” the errno variable. errno isn’t a library function, but is a global symbol defined in the C runtime, and thus causes the same sort of error messages if you try to step “into” it.

The takeaway here is: usually, you can only “step into” functions that you’ve defined, and only when you compiled your code using the “-g” option which causes gcc to include debugging symbols.

What is the effect of the two statements we listed above? Why would we use them?

1.4. strtol

In the file factorial.c, we use the function strtol to convert the program’s first command-line argument into a long, despite the fact that the factorial function only takes an int, and we then cast the long into an int.

However, C11 has a function atoi, which converts strings to ints, so it seems we could have used that. Read the documentation for atoi and strtol, and summarize what the differences are. Why might we prefer strtol over atoi?

1.5. Diagnosing and fixing the factorial bug

Kill the factorial program, set a breakpoint somewhere in the factorial function (e.g. line 18), and use the run and/or continue commands to get to your breakpoint.

Step through execution of the factorial function, and examine the values of the local variables (using either print or info locals). What is the bug in factorial? Fix it.

Recommendation – keep lab notes

It’s recommended you keep online notes of useful commands, coding best practices, useful links etc. you come across in the unit, as a reminder to yourself of what we’ve covered. You could keep a Word or text document, if you like, using Google Docs, but another option is to store your notes in a “Gist” – a single text file versioned by GitHub.

If you have a GitHub account and are logged in, then click on the “+” symbol in the top right of any GitHub page, and select “New gist”. Give your gist a description (e.g. “My CITS3007 notes”) and a filename (e.g. “notes.md”). Then click “Create secret gist” (or “public”, if you wish to make it public).

Gists support formatting your file using Markdown – for instance, use asterisks (“*”) to surround words intended to be italic, and start paragraphs which should be part of a list with a hyphen and space (“- ”). Clicking the “Preview” tab will show you what your notes look like converted to HTML.

2. Segmentation faults

The file segfault.c contains the following code:


  #include <stdlib.h>
  #include <stdio.h>

  int main(void) {
    char *buf;
    buf = malloc(1<<31); // allocate a large buffer
    printf("type some text and hit 'return':\n");
    fgets(buf, 1024, stdin); // read 1024 chars into buf
    printf("\n%s\n\n", buf); // print what was entered
    free(buf);
    return 0;
  }

Compile the segfault program by running make segfault. (You should see several warnings from GCC when you compile the program – they should give you some clues about what the potential problems with this program are.)

For this program, the behaviour intended by the developer was that it should accept a line of input from the user, and echo the line back.

Run the program with ./segfault, and enter some text – what behaviour do you see?

You should see that the program produces a segmentation fault. A segmentation fault is caused when the CPU detects that a program has attempted to access memory which it is not permitted to access. Technically speaking, the program has invoked undefined behaviour, which means that the program is not a valid C program at all, and the C standard provides no guarantees about what the program might do. With the particular compiler version we’re using, though, and on the particular platform we are compiling for, we can reliably predict that a segmentation fault will occur.

Now try running the program using GDB. (Hint: you can get GDB to start in TUI mode by running gdb -tui ./segfault.) Start GDB and run the program with the run command, and enter some text. Once the segfault occurs, run the backtrace command to see the current stack trace.

You should see something like

#1  0x00007ffff7e2a96c in __GI__IO_getline (fp=fp@entry=0x7ffff7f93980 <_IO_2_1_stdin_>, buf=buf@entry=0x0,
    n=n@entry=1023, delim=delim@entry=10, extract_delim=extract_delim@entry=1) at iogetline.c:34
#2  0x00007ffff7e296ca in _IO_fgets (buf=0x0, n=1024, fp=0x7ffff7f93980 <_IO_2_1_stdin_>) at iofgets.c:53
#3  0x0000555555555209 in main () at segfault.c:9

Each stack frame shows the values of the arguments to the function called for that frame. Do any of them look suspicious?

Try printing the value of buf (with the command print buf) before and after it has been allocated, and see what result you get.

Try using the print command to see how the “bitwise left shift” (“<<”) operator works.

Try p 1 << 2, p 1 << 10, and a few other values, then try p 1 << 31. What result do you get? Why might this occur? (Hint: read the cppreference.com page on arithmetic operators, in particular the section on “overflow”: https://en.cppreference.com/w/cpp/language/operator_arithmetic. Also try the ptype command for the various values you typed above, to see what their type is.) How can the program be fixed?

3. C refresher no. 2

On Moodle, you will find an unassessed quiz entitled “C refresher no. 2”. It’s recommended you complete this (either now, or in your own time) to check your knowledge of C control flow structures and data types.

 

  1. For instance, Eclipse and VS Code will provide a graphical interface to GDB.

  2. Passing flags like -O1, -O2 and -O3 to gcc tells it to spend longer compiling the code, in order to apply increasingly advanced optimizations; see the documentation for gcc’s optimization options for more details.

  3. On the other hand, sometimes the behaviour we’re trying to debug might only appear when optimizations are enabled. In such a case, we will likely have to debug our optimized binary, and simply accept that sometimes, the code being executed differs from what we see in the source file.

  4. The -std=c11 option instructs the compiler to use the C11 standard. The -pedantic-errors option instructs the compiler to disallow GCC-specific extensions to that standard, and to report an error if any of them are used. (By default, GCC tends to be fairly “generous”, and attempts to compile programs that use extensions, even when we’ve specified we want to use the C11 standard.) The other options (-Wall -Wextra -Wconversion) enable warnings about problematic constructs in our source code.

  5. Once you have some familiarity with the GDB TUI, you might be interested in the CGDB package, which is quite similar, but provides a few extra conveniences (like always showing a split screen with code and command panes available). On Ubuntu, CGDB can be installed with sudo apt-get update, then sudo apt install cgdb, and can then be invoked with cgdb my-prog (to debug the program my-prog).