There were a few things about the example which made it particularly hard:
In order to avoid this sort of problem, use unambiguous language, and when necessary clarify what you mean. When there is any ambiguity, you can assume that the computer will always do the wrong thing. (You can't rely on common sense, because computers don't have any.)
Algorithms should be well organized to accomplish specific, identifiable tasks on their way to the final goal.
It turns out that it does matter. In fact, the study of efficient algorithms is one of the most important areas of computer science. There are several reasons why this is so:
Note that the ``size'' of the input may actually represent different things, depending on context. When talking about the number of steps required to do something to an array, n usually refers to the number of elements in the array, while when talking about how many steps it takes to multiply two numbers, n may be the number of digits in each of the numbers, and so on.
When dealing with big-O notation, only the order or dominant term in the function is considered important, since this term will dominate the behavior of the function as n gets larger. In addition, all constant factors are disregarded. For example, in the following table, f(n) is assumed to give the exact number of steps required to solve a problem of size n:
1. f(n) = 10 * n^2 + 100 * n + 10        O(f(n)) = n^2
2. f(n) = 0.0001 * n^2 + 10000 * n^1.9   O(f(n)) = n^2

In the first example, the n^2 (n squared) term will dominate in the long run, and the fact that this is multiplied by 10 is not important to the order.
In the second example, even though the 10000 * n^1.9 will contribute a larger portion of the total than the square term unless n is very large, nevertheless the order is n^2 again because that is the term that will dominate as n becomes more and more gigantic.
Note that the order of an algorithm does not specify how long it will take to run (except in a general way, as n becomes very large). In this example, even though the two algorithms have the same order, the first will be faster for small n, while the second will be faster once n becomes large. Similarly, consider the following table:
1. f(n) = n^2          O(f(n)) = n^2
2. f(n) = 100000 * n   O(f(n)) = n

The O(n) algorithm is clearly much faster when n is larger- but how much larger? In this case, the two are equal when n is 100,000-- for any n smaller than 100,000, the "slower" algorithm will get the job done more quickly.
Cases like this are not uncommon (sometimes the "better" algorithm will have a big constant out front that will make it inefficient for small n) but in general, algorithms with smaller orders are more desirable.
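The crossover in the second table can be checked directly. These two helper functions (the names f1 and f2 are ours, for this sketch only) are just the table's formulas written out:

```c
/* The two cost functions from the table above, written out so that
 * the crossover point can be verified directly. */
long long f1 (long long n) { return n * n; }          /* the O(n^2) algorithm */
long long f2 (long long n) { return 100000LL * n; }   /* the O(n) algorithm   */
```

Below the crossover, f1 really does do less work; above it, f2 wins, and keeps winning by a wider and wider margin.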
These terms have more precise and complete definitions in theoretical computer science, but this should hold us for now.
You can think of rand() as somehow containing a very long list of numbers, and every time you call rand() it just gives you the next number. srand() seeds the random number generator, which is analogous to telling rand() where to start reading in the list. This list is always the same, so if you give srand() the same seed again, rand() will start in the same place again. If you don't use srand() (or Randomize(), or some other function that calls srand()), then rand() always starts at the ``beginning'' of the list, and you will always get the same sequence of random numbers.
Generally, it is only appropriate to call srand (or Randomize()) once per program- there's no need to call it again, once you've selected a seed.
One way to generate a seed that will change from run to run is to use the current time as the seed. (This method is described in detail in the lecture notes.) You might also want a way to let the user set the seed to an explicit value, so that they can get the same random numbers whenever they want, but can also change the seed (so they're not always starting at the same place).
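A minimal sketch of both options; the helper name choose_seed is made up for this example, not part of any library:

```c
#include <stdlib.h>
#include <time.h>

/* choose_seed is a hypothetical helper: seed from the clock to get a
 * different sequence on every run, or use an explicit value supplied
 * by the user so that a run can be reproduced exactly. */
unsigned int choose_seed (int use_clock, unsigned int user_seed)
{
    if (use_clock) {
        return ((unsigned int) time (NULL));
    }
    return (user_seed);
}
```

Calling srand (choose_seed (0, 42)) before using rand () gives the same sequence every time, while srand (choose_seed (1, 0)) gives a fresh sequence each run.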
So, let's think for a moment about how this technique actually works. Imagine that you had a fair coin, and you flipped it a large number of times- you'd expect that, if you flipped it enough times, it would come up heads about half the time. We expect it to come up heads half the time because it is a fair coin (but of course we're not surprised at all if it comes up a few more heads than tails, or vice versa).
However, this also works in the other direction: if we flip a coin a large number of times, and it comes up heads about half the time, then we soon start to expect that it is a fair coin! Similarly, if we flip it a large number of times and it only comes up heads 1/4 of the time, then we will soon start to think that the coin is unfair, and in fact is three times more likely to come up tails than heads. Just as we expect a certain distribution of heads and tails for a coin whose "fairness" we know, given a certain distribution of heads and tails, we can start to form an estimate of the fairness of the coin.
Of course, we need to flip the coin a large number of times before we can start to have a lot of confidence in our estimate. For example, if you flip the coin once, it will either be heads or tails, and your estimate based on the distribution so far will be that the coin is absolutely unfair in one direction or the other! However, after many coin flips, the distribution of actual events will, in all likelihood, give us a pretty good idea what the true bias on the coin is.
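This estimation is easy to simulate with rand(). The function below is a sketch; expressing the bias in sixteenths is just a convenience for the example:

```c
#include <stdlib.h>

/* Flip a simulated coin `flips` times.  The coin comes up heads with
 * probability heads_in_16 / 16.  The return value is the observed
 * fraction of heads, which approaches the true bias as flips grows.
 * Dividing rand() by (RAND_MAX / 16) + 1 takes the high-order bits,
 * giving 16 (nearly) equal buckets. */
double estimate_bias (int heads_in_16, long flips)
{
    long heads = 0;
    long i;

    for (i = 0; i < flips; i++) {
        if ((rand () / ((RAND_MAX / 16) + 1)) < heads_in_16) {
            heads++;
        }
    }
    return ((double) heads / (double) flips);
}
```

With a few hundred thousand flips, the observed fraction for a fair coin (heads_in_16 = 8) lands very close to 0.5, and for a coin biased 1/4 toward heads (heads_in_16 = 4), very close to 0.25.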
A stack is a data structure that has several operations associated with it:
For example, if you define an array variable inside a function, there's no telling what sort of random garbage might show up there- so often you'll use a loop to stroll through the array, filling it with zeros or some other appropriate initial value. The same sort of thing happens with stacks (or any other data structure) in C- you can tell the C compiler to set aside memory to use as a stack (much as you told it to set aside memory for an array) but you can't tell it what values to put into that memory (just as you couldn't tell it what values to put into the array). Therefore, in C (and similar languages), it is appropriate to write an initialization routine that does whatever is necessary to "clean up" the stack and reinitialize it to being an empty stack.
If the data structure that you are using to implement the stack has a fixed maximum number of elements that it can contain at the same time (i.e. the maximum number of pushes, less the number of pops, since the last initialization), then if the application tries to do another push, an error occurs. This situation is often called stack overflow.
If the stack is empty (i.e. it contains no values), then it is considered an error (for the purposes of this assignment, at least) to try to pop another value off of the stack. This situation is often called stack underflow.
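A minimal sketch of these ideas in C, using a fixed-size array. The names and the error-code convention here are illustrative, not necessarily the assignment's:

```c
#define STACK_MAX 100

/* A minimal fixed-size stack of ints. */
typedef struct intstack {
    int items [STACK_MAX];
    int top;               /* number of values currently on the stack */
} intstack_t;

/* "Clean up" whatever garbage is in the memory: an empty stack is
 * simply one whose count of values is zero. */
void stack_init (intstack_t *s)
{
    s->top = 0;
}

int stack_push (intstack_t *s, int value)
{
    if (s->top >= STACK_MAX) {
        return (-1);       /* stack overflow */
    }
    s->items [s->top++] = value;
    return (0);
}

int stack_pop (intstack_t *s, int *value)
{
    if (s->top == 0) {
        return (-1);       /* stack underflow */
    }
    *value = s->items [--s->top];
    return (0);
}
```

Note that stack_init doesn't need to touch the array at all- the garbage above top is simply never looked at.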
Of course, if you know the underlying implementation of the data structure used for the stack, then peek and rotate are usually very simple. You might find it interesting to think about how you could implement stack_peek and stack_rotate in terms of the other functions.
Also of interest might be some code that implements some binary tree functions. These are different from the functions in the lecture notes; I thought it might be interesting to see another approach to the same problems, so I wrote these routines based on slightly different algorithms.
The last function, tree_display, can be very useful for debugging binary tree structures.
Each recursive algorithm has two cases:
Note that a recursive algorithm may have more than one base case, but they are all lumped together under this heading.
Note that the recursive case must move towards the base, by reducing the ``size'' of the problem. If it doesn't, then the base case will never be reached, and the algorithm will not terminate.
Note again that a recursive algorithm may have more than one recursive case (i.e. it may recurse more than once- binary tree traversal algorithms are a good example of this).
For example, consider the problem of doing an inorder traversal of an ordered binary tree. This can be done using purely iterative constructs, but is far from trivial to code- give it a try, if you like! (hint- a stack or two or some linked lists will come in handy.)
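By contrast, the recursive version is only a few lines. This sketch assumes a node structure of the usual shape (the type and field names here are guesses, not the lecture notes' names), and appends the visited values to an output array:

```c
#include <stddef.h>

/* A node in an ordered binary tree; names are assumptions for this sketch. */
typedef struct tnode {
    int value;
    struct tnode *left;
    struct tnode *right;
} tnode_t;

/* Inorder traversal: visit the left subtree, then the node itself,
 * then the right subtree.  Values are appended to out; *count is the
 * number of values written so far. */
void tree_inorder (tnode_t *node, int *out, int *count)
{
    if (node == NULL) {                      /* base case: empty tree */
        return;
    }
    tree_inorder (node->left, out, count);   /* first recursive case */
    out [(*count)++] = node->value;
    tree_inorder (node->right, out, count);  /* second recursive case */
}
```

For an ordered binary tree, this visits the values in sorted order- and note the two recursive cases, as promised above.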
As another example, consider the quicksort algorithm. When it was first invented (by C. A. R. Hoare, I believe), programming languages that supported recursion were not widely used, so it was initially described and implemented using only iterative constructs. Unfortunately, it is quite complicated to correctly state the quicksort algorithm without recursion. In fact, the original implementations were so complex and hard to follow that there was some debate over whether or not they actually worked! Once the algorithm was restated in a recursive form, however, it was quickly proven to be correct.
/* The NULL test must come first! */
if (cell == NULL) {
    return (NULL);
} else if (cell->value == value) {
    return (cell);
}
return (llist_lookup (cell->next, value));
llist_t *llist_lookup (llist_t *cell, int value)
{
    if (cell == NULL) {
        return (NULL);
    } else if (cell->value == value) {
        return (cell);
    } else {
        return (llist_lookup (cell->next, value));
    }
}
Now, let's try an algorithm that puts together a few more concepts, playing with both recursion and pointer arithmetic: an algorithm for finding the sum of the elements in an array of integers. For reference, here's an iterative solution:
int array_sum (int length, int *array)
{
    int i;
    int sum = 0;

    for (i = 0; i < length; i++) {
        sum += array [i];
    }
    return (sum);
}
As an aside, unrelated to recursion, note the use of pointer notation to represent the array. By using pointer notation instead of array notation, we can eliminate the requirement that we specify the length of the array as part of its type, but instead can pass this as a separate parameter. Therefore, this function will work properly with arrays of any length, not just some fixed length.
Note: this iterative algorithm is perfectly adequate; it would be unnecessary (and generally unwise) to recode this recursively- we're just doing this for the mental exercise, not because it's the best way. To restate this recursively, we need to identify the base case and the recursive case (or cases).
For this problem, we have a choice of base cases, depending on what assumptions we dare to make about the array. If we know that the array will always be at least one element long, then the base case can be that the sum of the elements of an array of length 1 is the value of the first element. However, let's be more general, and allow the user to enter arrays with length zero. So, our base case will be that the sum of an array of zero elements is zero. This translates into the following C:
if (length == 0) {
    return (0);
}
Now, we need a recursive case. Here's one: the sum of the elements is the sum of the first element in the array and the sum of the rest of the elements. This translates into the following C:
return (array [0] + array_sum (length - 1, array + 1));
Note the address arithmetic used here: array+1 is the address of the "rest" of the array past the first element.
Putting the pieces together:
int array_sum (int length, int *array)
{
    if (length == 0) {
        return (0);
    } else {
        return (array [0] + array_sum (length - 1, array + 1));
    }
}
This algorithm is correct, but has a small flaw- we're not programming defensively. If someone calls this function with a negative length, the base case will never occur, and we'll recurse infinitely. Of course, nobody should call this function with a negative length, but we're not doing anything to stop them from doing so. There are two things we can do, however: we could make length an unsigned integer, so that it cannot ever be negative, or we could change the comparison of length == 0 to length <= 0.
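With the second fix applied, the whole function becomes:

```c
/* array_sum, with the comparison changed to length <= 0 so that a
 * bogus negative length falls into the base case instead of
 * recursing forever. */
int array_sum (int length, int *array)
{
    if (length <= 0) {
        return (0);
    } else {
        return (array [0] + array_sum (length - 1, array + 1));
    }
}
```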
Examples of an algorithm with this order include linear search, and searching through a linked list.
Examples of an algorithm with this order are binary search, or searching through a balanced binary tree.
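As a concrete instance, here is a sketch of binary search over a sorted array of ints. Each probe halves the region left to search, which is where the logarithm comes from:

```c
/* Binary search in a sorted array of ints.  Returns the index of
 * value, or -1 if it is not present. */
int binary_search (int *array, int length, int value)
{
    int lo = 0;
    int hi = length - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* midpoint of the live region */
        if (array [mid] == value) {
            return (mid);
        } else if (array [mid] < value) {
            lo = mid + 1;               /* discard the lower half */
        } else {
            hi = mid - 1;               /* discard the upper half */
        }
    }
    return (-1);
}
```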
Quicksort and mergesort are examples of this.
That sounds confusing; let's try an example. Imagine that you have a list of people's names that you want to sort into alphabetical order (considering both first and last names). Unfortunately, the first and last names are stored in different parts of the database, and for whatever reason it's not possible to consider both of them at the same time. So, what can you do? The answer is to sort the list twice, using a stable sort. The first sort is by first names, and the second sort is by last names.
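The two-pass trick can be sketched with insertion sort, which is stable. The record layout and function names here are made up for the example:

```c
#include <string.h>

/* A hypothetical name record for this example. */
typedef struct name {
    const char *first;
    const char *last;
} name_t;

/* Stable insertion sort: equal elements (as seen by cmp) are never
 * reordered relative to each other. */
void stable_sort (name_t *names, int n,
                  int (*cmp) (const name_t *, const name_t *))
{
    int i;
    for (i = 1; i < n; i++) {
        name_t key = names [i];
        int j = i - 1;
        while (j >= 0 && cmp (&names [j], &key) > 0) {
            names [j + 1] = names [j];
            j--;
        }
        names [j + 1] = key;
    }
}

int by_first (const name_t *a, const name_t *b) { return strcmp (a->first, b->first); }
int by_last  (const name_t *a, const name_t *b) { return strcmp (a->last, b->last); }
```

Sorting first with by_first and then with by_last leaves the list ordered by last name, with ties on the last name still in first-name order- exactly the alphabetical order we wanted.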
As an example, in an ordered binary tree, each node partitions its descendants- all of the descendants that are less than or equal to the node are to the left, while all of the greater descendants are to the right.
However, it does have some redeeming qualities- it is trivial to add an optimization to detect presorted data, and it is easy to make stable. Even better, it can be implemented very efficiently on a parallel computer. Still, nobody loves it.
Selection sort is also easy to make stable.
In the expected case, quick sort (although O(n ln n) like merge sort) is generally faster than merge sort. However, in the worst case (i.e. when fate conspires to make the algorithm always choose a bad pivot value) it can become O(n^2) just like bubble or selection sort.
Note that there are several partitioning algorithms that can be used (and an uncountable number of different algorithms for choosing the pivot element). However, the basic algorithm behind quick sort is always the same- partition the data by some pivot value, and then recursively quick sort each partition. Note that the partitioning algorithm given in the lecture notes is not stable, but there are partitioning algorithms that are.
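To make that basic shape concrete, here is a minimal sketch with a deliberately naive pivot choice (the last element) and a simple partitioning pass- this is not the lecture notes' version, and this particular partition is not stable:

```c
/* Quicksort, in its basic shape: partition around a pivot, then
 * recursively sort each partition.  Taking the last element as the
 * pivot is exactly the naive choice that can go quadratic on
 * already-sorted input. */
void quick_sort (int *array, int length)
{
    int pivot, split, i;

    if (length <= 1) {          /* base case: nothing left to sort */
        return;
    }
    pivot = array [length - 1];
    split = 0;                  /* elements < pivot end up before split */
    for (i = 0; i < length - 1; i++) {
        if (array [i] < pivot) {
            int tmp = array [i];
            array [i] = array [split];
            array [split] = tmp;
            split++;
        }
    }
    /* Put the pivot between the two partitions. */
    array [length - 1] = array [split];
    array [split] = pivot;

    quick_sort (array, split);                            /* left partition  */
    quick_sort (array + split + 1, length - split - 1);   /* right partition */
}
```

Note the address arithmetic again: array + split + 1 is the right partition, just past the pivot.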
As an aside, there are methods of choosing the pivot element that ensure that quicksort does not ever go quadratic. These algorithms are somewhat esoteric and a relatively recent discovery, however, so we'll always assume that quicksort can go quadratic in the worst case. If you're interested in knowing how this can be done, feel free to ask.
Dan Ellard - (ellard@deas.harvard.edu)