As implied earlier, there are tricks that can be used to speed up matrix multiplication if the matrices are known to have particular properties.
One kind of matrix that can be multiplied quickly is a sparse matrix. A sparse matrix, like a sparse array, is a matrix where most of the elements are the same value. In most cases, the common value is zero, but this is not necessarily the case. In this section, we'll begin by exploring how sparse matrices that are mostly zero can be multiplied quickly, but at the end we will see how the principles for handling these matrices can also be used to speed up operations on other kinds of sparse matrices.
A simple-minded approach to sparse matrix multiplication is to blindly use algorithm 4.2 again, but instead of using the ordinary array operations to access the matrix elements, use the sparse array (or multilist) operations to access them:
```c
void mm_mul_sa (sa_t *A, sa_t *B, sa_t *C)
{
    int i, j, k;
    int sum;

    for (i = 0; i < A->num_rows; i++) {
        for (j = 0; j < B->num_cols; j++) {
            sum = 0;
            for (k = 0; k < A->num_cols; k++) {
                sum += saGet (A, i, k) * saGet (B, k, j);
            }
            saSet (C, i, j, sum);
        }
    }
}
```
This is a step in the wrong direction, however: since saGet and saSet are generally O(n) operations themselves, and the triple loop performs O(n^3) of these accesses, the resulting algorithm is O(n^4) in the worst case.
However, there is a way to speed this up considerably by changing the inner loop (in this code, the k loop) to take advantage of two key facts: only the nonzero elements of each row and column are actually stored, and those elements are kept in order by index.
This algorithm is identical to algorithm 4.2, except for the inner block, which is changed as follows:
```c
void mm_mul_sa_fast (sa_t *A, sa_t *B, sa_t *C)
{
    int i, j, sum;
    sa_cell_t *row_p, *col_p;

    for (i = 0; i < A->num_rows; i++) {
        for (j = 0; j < B->num_cols; j++) {
            row_p = A->rows [i];
            col_p = B->cols [j];
            sum = 0;

            while ((row_p != NULL) && (col_p != NULL)) {
                if (row_p->index == col_p->index) {
                    sum += row_p->value * col_p->value;
                    row_p = row_p->next;
                    col_p = col_p->next;
                }
                else if (row_p->index < col_p->index)
                    row_p = row_p->next;
                else
                    col_p = col_p->next;
            }
            saSet (C, i, j, sum);
        }
    }
}
```
The inner loop of algorithm 4.3 avoids trying each possible k (as algorithm 4.2 did) by moving down both the row and column lists in tandem, searching for elements in the row list that have the same index as elements in the column list. Since the lists are kept in order by index, we know that we can do this in one scan through the lists. Each iteration through the loop moves forward at least one position in one of the lists, so the loop terminates after at most R + C iterations (where R is the number of elements in the row list and C the number of elements in the column list).
Implementing the algorithm this way (i.e. writing this function in a way that depends on so many details of the sa_t structure) is clearly another abuse of software engineering principles. See section 3.5 for more discussion.
As promised, we will now extend the techniques we have developed for multiplying sparse matrices where the common value is zero to handle sparse matrices where the common value is some other number.
Let A be a sparse matrix where most of the elements have the value v, and B be a sparse matrix where most of the elements have a value of zero. Our goal is to compute AB efficiently.
A can be written as the sum of two matrices: one where all of the elements are v, and the other the difference between this matrix and A. We'll denote the matrix that results from subtracting v from every element of A as A_v, and the matrix that consists entirely of v's as V, so that

A = V + A_v
This is useful, because matrix multiplication is distributive:

(X + Y)Z = XZ + YZ
This property follows from the definition of matrix multiplication; the proof is left as an exercise.
However, matrix multiplication is not commutative: in general, AB != BA. Therefore, it is very important to get the order right when we do multiplication!
Therefore, AB = (V + A_v)B = VB + A_vB.
At this point, it seems like we're making more work for ourselves, because we are now doing two matrix multiplications and then adding the results together. However, with a little more work we can simplify this calculation enormously.
Since A_v and B are both sparse matrices with a common value of zero, the product A_vB can be computed efficiently using the algorithm described in the previous section.
Since every row in V is identical, every row in the product VB must also be identical. In fact, each element in the result is v multiplied by the sum of the elements in the corresponding column of B. Therefore, the product is the matrix whose element in row i and column j is

v * (B[1][j] + B[2][j] + ... + B[n][j])

for every row i.
The sum for computing each value of the first row takes O(n), but once the first row is finished, each other row is just a copy of the first. Therefore, this multiplication is O(n^2).
Computing the order of the entire calculation depends upon the "sparseness" of A and B. If they are very sparse, and the sparse array multiplication algorithm from the previous section is modified to take full advantage of this (see the exercises for more information), then the multiplication of A_v and B can be done faster than O(n^2), so that the O(n^2) addition at the end of the calculation dominates the order of the calculation. If A and B are not so sparse, then this multiplication can be O(n^3).