As we have already seen, one method for representing a 1-dimensional array is as a linked list. We can extend this representation to deal with 2-dimensional arrays by treating each 2-dimensional array as a 1-dimensional array of rows, where each row is in turn a 1-dimensional array. Each ``row'' array can be implemented as an ordered linked list (where the order is determined by the column of each element). If an element does not appear in the linked list, then it is assumed to have the default value.
Consider the 4-by-4 array illustrated in figure 3.1. This array consists of 12 zero elements, and 4 non-zero elements. Compared to most sparse arrays, this has a relatively high proportion of non-zero elements, but it will serve to illustrate the principles involved.
Figure 3.1: An ordinary C array.
An illustration of a sparse array that represents the same array as the array in figure 3.1 is shown in figure 3.2.
Figure 3.2: A simple sparse array.
The method of accessing elements in a sparse array implemented this way is very simple: find the row list for the row of the element, and search it for the elements in the correct column. If the element is not present, then we simply assume that the value of that element is zero.
Depending on how we access the elements of the array, it may be more appropriate to organize the array by row or by column. For example, if we represent our sparse array with an array of rows (where each row is a linked list), then determining whether or not a particular row is entirely empty is an O(1) operation- either the pointer in the row array is NULL, or it isn't. Determining whether or not a particular column is empty, however, requires checking each row for an element in that column, a potentially O(n) operation.
Of course, it is possible to represent the array using both methods: an array of row lists, and an array of column lists. An example of this is illustrated in figure 3.3, which once again represents the same array as in figure 3.2.
Figure 3.3:
A sparse array that uses both row and column lists.
Deciding which representation to use is an important issue. Having both row and column lists makes it easy to traverse the array in both row and column order (and as we'll see in chapter 4, this can be quite useful), but it makes other operations more difficult. For example, if row lists are stored, then exchanging two rows of an array is trivial (simply exchanging the pointers to the first element in each row list is sufficient). Storing both row and column lists makes swapping rows more difficult, however, since exchanging two rows may defeat the ordering of the columns.
As always, knowing how the array will be used is essential to picking the optimal representation.
If most of the rows of the sparse array do not contain any elements, then the array of pointers to row lists is itself a sparse array, and can also be implemented as a linked list. An illustration of this is shown in figure 3.4, which represents the same arrays as figure 3.2, but using a linked list to represent the row array.
Figure 3.4: A sparse array where the array of
rows is replaced by a linked list.
From an implementation perspective, adding this is slightly more clumsy than it might appear, because the linked list that represents the row array is a linked list of linked lists, while the linked list that represents each row itself is a linked list of whatever type of element the array actually represents. This means that (at least in a language like C), the linked list routines we use for the rows can't be used for the row array, because of this type mismatch.
If the number of rows that actually contain any elements is small, then this approach can offer large space savings- but if many of the rows are occupied, then as much space can be consumed by this representation as the array. In addition, the access time is considerably slower, because we've replaced an O(1) array lookup with a (potentially) O(n) linked list traversal.