CS50 - Section 9

Section Information

Most of the material in these section notes is a precursor to things that we'll be doing soon, and not preparation for the midterm. Don't panic if a lot of this stuff looks new- it is. Keep this stuff around as reference for next week; don't stress about it before the midterm.

Bit Manipulation in C

C is one of the relatively few higher-level programming languages that enable the programmer to explicitly manipulate bits (in fact, some people argue that C is not a high-level language at all, simply because it provides this facility).

It turns out that the ability to fiddle with individual bits can be a very handy feature, especially when designing data structures that use a minimal amount of memory. It also makes dealing with UNIX system calls much easier, since many of them have arguments that actually are created by mashing together patterns of bits (see the man pages for open(2) or chmod(2)

Bitwise or.

Bitwise and.

Bitwise xor (exclusive or).

Bitwise not (bit negation).

Shift left- slide the bits ``up'' by a specified number of positions.

Shift right- slide the bits ``down'' by a specified number of positions.

According to K+R, what gets shifted into the top bits during a right shift is compiler-dependent and may be either 0 or the previous top bit (but not random bits!) if being shifted is of a ``signed'' type (i.e. int short etc). If the thing is unsigned then the bits shifted in are always 0.

For example, if operating on 8-bit chars:

All of the operators (except the unary operator ~) can also be used in conjunction with the assignment operator, i.e.



For the next few days, I'm sure that you'll all be preoccupied with the midterm on Monday. However, soon we'll all be back to the fun of programming, so I thought I'd add a few notes about debugging.

Of course, the best debugging strategy of all is to not put any bugs into the code in the first place, by choosing correct algorithms, and then implementing them in a clear and simple manner. Unfortunately, this is sometimes easier said than done, and real programs are rarely entirely free of bugs.

So, how can bugs be isolated and removed? In general, the actual techniques of debugging are hard to pin down. After some practice, however, they become second nature.

Here are a few debugging tips and strategies that might help you on your way:

Learn gdb.

gdb, the debugger of choice for our systems, is a valuable tool. It is silly not to learn how to use it.

Understand what the code is supposed to do.

For gross errors, it's pretty clear that something is wrong- if the program dumps core, for example, you can probably feel pretty confident that there's a bug in it. However, if the program simply produces output that you believe is wrong, it's a more subtle issue. Unless you understand exactly what the program is supposed to do, it's hard to tell whether it's doing it or not!

Read the comments.

With any luck, the comments left behind by yourself or some other programmer will shed some light on what algorithms are used by the program, as well as other important details such as the underlying assumptions. If the comments describe an algorithm that is incorrect, then chances are excellent that the code is also incorrect. Remember: if you fix the code, fix the comments too!

Of course, sometimes the comments are correct, but the programmer failed to properly translate their algorithms into C (or else the comments bear no relationship to the code). If this is the case, something is almost always wrong. Look carefully at code that seems to contradict the comments.

Watch out for "boundary" conditions.

Most people get the general cases right, but the "boundary conditions" (i.e. what happens at the edges of a game board, or the end of a linked list, or when the input is not what the program expected) are often hiding places for bugs that jump up and bite you at embarassing intervals. Check the special cases with special care.

Program defensively.

Are there conditions that can't possibly happen? Maybe you shouldn't assume that they never do, at least without checking.

The ticking time bomb.

A very common sort of error occurs when there is a mismatch between the assumptions made by different functions that call each other. For example, imagine that you're writing a linked list routine that contains the following code:

	llist_t		*foo	= (whatever);

		/* foo is a valid llist_t *	*/
	llist_mangle (foo);

	foo = llist_head (foo);
Is this code OK? We're sure that foo is a valid list cell pointer when we call llist_mangle, but is it safe to pass to llist_head on the next line? We don't know. For all we know, llist_mangle is a function that takes a pointer to a list cell and fills it with garbage, free's it, or generally mangles it. Perhaps it's not a valid argument to llist_head at all- we need to know more about what llist_mangle does before we can trust it.

Note that if foo really is garbage after the call to llist_mangle, it's llist_head that is going to crash. You could stare at llist_head for days without getting any closer to understanding why, unless you understand the surrounding conditions.

The same problem is true any changes you might make to fix bugs- make sure they don't violate the assumptions of some other part of the program!

Find the minimal case.

Once you know that there's a bug, try to find the smallest possible set of inputs that trigger it. Narrow down the possibilities. Eliminate as many places that the bug might be hiding by not using them. For example, in the previous bullet, try testing llist_head without llist_mangle. Does it always work? Does it only fail if you do call llist_mangle, and never otherwise?

Similarly, when you start testing your program to see if there are any bugs, try testing in the same way- start with simple tests, and don't go on until they all work.

Please bring any errors or inconsistencies in this document (other than the bit about the bears) to the attention of the author.

Dan Ellard - (ellard@deas.harvard.edu)