Next: How Space-Efficient Are Sparse Up: Sparse Arrays Previous: Linked List Representation

An Implementation

The code in sa.c (with header file sa.h) provides a relatively simple and straightforward implementation of sparse arrays. (The entire source for this library is available online, and is not included here.)

Note that the representation that sa uses includes both row and column lists; it could be somewhat simpler if only rows or columns were used.

Also, as an extra feature, the default value of the elements in the array is not assumed to be zero. Instead, this value can be specified when the array is created.

  1  /*
  2   * An "sa_cell_t" is a simple singly-linked list
  3   * structure.  Each cell represents an element of
  4   * the array, and each list represents a single
  5   * row or column of the array.
  6   */
  7  
  8  typedef struct _sa_cell_t {
  9          struct _sa_cell_t *next; /* pointer to next.    */
 10          int     index;          /* index in row or col. */
 11          int     value;          /* value of the elem.   */
 12  } sa_cell_t;
 13  
 14  /*
 15   * An sa_t implements a sparse array.  Note that
 16   * this implementation includes both row and column
 17   * lists.
 18   */
 19   
 20  typedef struct  {
 21          int     num_rows;       /* # of rows in array.  */
 22          int     num_cols;       /* # of columns.        */
 23          sa_cell_t **rows;       /* array of row lists.  */
 24          sa_cell_t **cols;       /* array of col lists.  */
 25          int     default_val;    /* default elem value.  */
 26  } sa_t;
 27

An Optimization: Taking Advantage of Order

Consider the problem of displaying the contents of a sparse array by printing them on the screen. The obvious algorithm is to print the array in the ordinary manner, entirely ignoring the fact that the array is sparse. An implementation of this algorithm is given by function saPrint, shown below.

  1  void            saPrint (sa_t *sa)
  2  {
  3          int             i, j;
  4  
  5          for (i = 0; i < sa->num_rows; i++) {
  6                  for (j = 0; j < sa->num_cols; j++) {
  7                          printf ("%4d ", saGet (sa, i, j));
  8                  }
  9                  printf ("\n");
 10          }
 11  }

Ideally, the cost of printing an n-by-n array is , since every element in the array must be printed, and there are elements. Unfortunately, this method is inefficient because of the cost of the saGet function (which is used to fetch the value of each element in the array). In the worst case, saGet is O(n), so saPrint has a worst-case running time of .

However, we can take advantage of the ordered property of the lists in order to speed this up quite a bit. Since the order we wish to print the elements in is the same as the order that they appear in the lists, there is no reason to start over at the head of the list (as saGet must do) each time we need to access an element. Instead, we can keep track of the location of the last element that we visited, and begin our search there.

Alternatively, we can take even more advantage of the fact that the row lists are kept in order. We can simply traverse each row list, looking ahead to see where the next non-zero element is and printing out the appropriate number of zeros between. This principle is illustrated in saPrintFast.

  1  void            saPrintFast (sa_t *sa)
  2  {
  3          int             i, j;
  4          sa_cell_t       *cursor;
  5  
  6          for (i = 0; i < sa->num_rows; i++) {
  7                  cursor = sa->rows [i];
  8                  for (j = 0; j < sa->num_cols; j++) {
  9                          if ((cursor == NULL) || (cursor->index > j)) {
 10                                  printf ("%4d", sa->default_val);
 11                          }
 12                          else { 
 13                                  printf ("%4d", cursor->value);
 14                                  cursor = cursor->next;
 15                          }
 16                          printf (" ");
 17                  }
 18  
 19                  printf ("\n");
 20          }
 21  
 22          return ;
 23  }

Cursors

In saPrintFast, cursor is used to keep track of where we are in the list. In general, a cursor is a way to keep track of an ``interesting'' location in a data structure, so that this location doesn't need to be recomputed again. In this case, we keep track of where we are in the list so we don't have to traverse to the same spot again.

The notion of a cursor is also mentioned in Weiss, section 3.2.8, in a somewhat different context- but the idea is the same. Instead of searching the CursorSpace array for the index of the unused element, the index of an available element (if there are any) is always stored in the first location of the array.

Next: How Space-Efficient Are Sparse Up: Sparse Arrays Previous: Linked List Representation

Dan Ellard
Mon Jul 21 22:30:59 EDT 1997