Sparse arrays is a mathematical abstraction that completely ignores the implementation details. Formally, they are any matrix that has "many" zeros (or null) values. The practical problem is that most useful optimizations around sparse arrays require closely matching implementation details against the problem to be solved. With sparse arrays, implementation details are killers.

For instance, suppose the standard solution is adopted. The sparse array will be organized as an array of linked lists representing the rows, with each row containing another linked list that contains the individual data values. What happens if you want to do a matrix multiply? A matrix multiply requires a column by row lookup and a row by column lookup. One will be an O(1) lookup, and the other will be a O(n^2) lookup. This makes a full matrix multiply an O(n^5) operation, and memory is the least of everyone's worries.

To optimize the code, it is necessary to look closely at how the matrix will be built and used. However, as soon as that starts happening, the matrix multiply decomposes into a bunch of specialized matrix operations. At this point, the abstraction starts falling apart.

For example:

a) Assume the multiplication involves a diagonal matrix. Then the optimum solution is to store the diagonal matrix as a 1xn matrix, and specialize the matrix code. This was the favoured approach from numerical methods in C and Fortran.

b) Assume the multiplication involves a tridiagonal matrix. Then the optimum solution is to store the tridiagonal matrix as a 3xn matrix, and specialize the matrix code. Again, see numerical methods in C and Fortran, or just about any good matrix library.

c) Assume the matrix operation involves a "control-systems" style matrix. One populated row, followed by a diagonal series of rows with one or two elements. The optimum solution is to develop specialized code. For most control systems problems, this matrix never changes.

d) Control systems often have a compact matrix representation involving a series of matrix multiplies. However, if the matrix multiplies are analysed, they become a much simpler sequence of equations that can often be executed in O(n^2) time instead of the longer O(n^3) time of the matrix multiplies. As such, develop specialized code. Both MatLab and Mathematica have functions where numerical operations can be broken down into there constituent formulas and saved as "C" code.

e) Assume we really need to frequently multiply a truly sparse array. Then build two sets of linked lists, one organized by row/column and another organized by column/row. Then both the row and column lookups can be done as an O(1) operation. The matrix multiply is a O(n^3) operation.

f) Just because the inputs to a matrix operation are sparse, doesn't mean the output array is sparse. I'm thinking of Singular Value Decomposition, some matrix multiplies, matrix inverses, matrix pseudo-inverses, and covariance matrices. Also, some matrices that appear in Quantum physics. In this case, matrix operations need to be further specialized to deal with creating non-sparse matrices from sparse-matrices. Additionally, some matrices may need to be rounded to sparse, even though they may be fully populated, like some covariance matrices.

In the end, sparse matrices are simply a descriptive term for a bunch of application-specific optimizations. Sparse matrices devolve into numerical optimizations that no-one cares about unless they are looking at an application that requires the specific numerical optimization. I'm not surprised high-school CS coders don't "understand" them.