6. Methods for Large-Scale Problems
Johns Hopkins University Press
Chapter 6 Methods for Large-Scale Problems

Large-scale problems are those where the number of variables is such that direct methods, like the ones we have described in earlier chapters, cannot be applied because of storage limitations, accuracy restrictions or computational cost. Usually, large-scale problems have special structure, such as sparseness, which favors the use of special techniques. We already saw that Givens rotations are preferable to Householder transformations in this case. There are many iterative methods tailored to specific problems and applications in this area; here we choose to survey only the most important algorithms and concepts related to iterative methods for linear LSQ problems. We finally discuss a block approach that can be combined with either direct or iterative methods and can handle fairly large problems in a distributed network of computers. The reader is referred to [20] for a general survey of methods.

6.1 Iterative versus direct methods

If the LSQ problem is large, the question arises whether iterative or direct methods should be used. A problem with, for example, half a million equations and n = O(10^5) parameters to be determined, such as is common in solving inverse problems, needs conservatively O(n^3) operations if one uses a direct method. On a modern computer (circa 2011), performing 10^10 floating-point operations per second, that calculation would require about 28 hours of CPU time, although hardware is getting cheaper and faster all the time. Codes that take advantage of multicore hardware can also speed up the process.

On the other hand, iterative methods compute an approximation to the solution and then improve on it until the error is small enough. For many large problems, iterative methods are favorable alternatives to direct methods because they can (approximately) solve the problem faster and with far smaller demands on memory.
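The 28-hour figure is simple arithmetic on the counts quoted above; a back-of-envelope check:

```python
# Back-of-envelope check of the cost estimate in the text (figures as stated there).
n = 10**5              # number of parameters
flops = n**3           # conservative O(n^3) operation count for a direct method
rate = 10**10          # floating-point operations per second (circa-2011 machine)
hours = flops / rate / 3600
print(round(hours, 1))  # prints 27.8, i.e., about 28 hours of CPU time
```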
Typically, iterative methods do not require the matrix to be explicitly stored; instead they use subroutines that compute the necessary matrix-vector multiplications. However, if the normal equations are sparse, a direct method may still be a good option. The advantages/disadvantages of direct and iterative methods can be summarized as follows:

• Speed
Direct methods: the number of necessary steps is known, but fill-in (i.e., creation of new nonzero elements) may make the process prohibitively expensive. Taking advantage of sparseness to minimize fill-in is an art, and it requires special algorithms.
Iterative methods: at best there is an estimate of the number of iterations and the cost per iteration, but matrix structures are preserved.

• Storage
Direct methods: sometimes efficient compressed storage formats can be used, but retrieval may be costly.
Iterative methods: compressed storage formats can be chosen, subject to the same proviso as above.

• Robustness
Direct methods: there is a direct method applicable to every problem, and the solution can always be attained.
Iterative methods: may require many iterations to converge.

• Formulation
Direct methods: solve the normal equations (this may destroy sparseness) or the overdetermined system itself.
Iterative methods: solve the normal equations A^T(A x − b) = 0, the overdetermined system, or the augmented matrix form.

A compromise between the two approaches is preconditioned iterative methods, which use an approximate factorization of A. Also, block iterative methods that are easily parallelizable are an interesting alternative, discussed at the end of the chapter. For the sake of completeness, the classical iterative methods will be surveyed first, and then some of the more efficient Krylov subspace methods will be explained.
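The matrix-free point above can be made concrete: an iterative solver only needs the products A v and A^T w, supplied as callables, so neither A^T A nor any factorization is ever formed. The following sketch illustrates this with a small dense test matrix and a plain steepest-descent loop on the normal equations; the matrix, the loop, and all names here are illustrative, not from the text.

```python
import numpy as np

# Illustrative data: a small dense matrix stands in for a large sparse one.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

def matvec(v):                     # v -> A v
    return A @ v

def rmatvec(w):                    # w -> A^T w
    return A.T @ w

# Steepest descent for min ||A x - b||_2, touching A only via the callables.
x = np.zeros(3)
for _ in range(2000):
    g = rmatvec(matvec(x) - b)     # gradient A^T (A x - b)
    if g @ g < 1e-30:              # stop once the gradient is negligible
        break
    Ag = matvec(g)
    x = x - (g @ g) / (Ag @ Ag) * g   # exact line-search step length
```

A library routine such as a Krylov solver would consume the same two callables; the point is only that the solver's interface is a pair of matrix-vector products, not a stored matrix.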
6.2 Classical stationary methods

The basic idea behind the definition of the classical iterative methods is to split the matrix so that the resulting system can easily be solved at each iteration. We consider first the application to the normal equations. Recall that if A is a full-rank matrix, then the normal-equations matrix A^T A is symmetric and positive definite. The general formulation is given below:

• Split A^T A into convenient matrices, A^T A = Q − (Q − A^T A), where Q is easily invertible.
• The iteration is now defined by choosing a starting vector x^(0) and then solving at each step: Q x^(k+1) = (Q − A^T A) x^(k) + A^T b.
• The convergence properties depend on the spectral radius of the iteration matrix I − Q^{−1}(A^T A).

The most common methods are
• Richardson: Q = I.
• Jacobi: Q = main diagonal of A^T A.
• Gauss-Seidel: Q...
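The general formulation above can be sketched in a few lines for the Jacobi choice Q = diag(A^T A); the small test problem below is illustrative, not from the text.

```python
import numpy as np

# Illustrative full-rank problem; A^T A is then symmetric positive definite.
A = np.array([[2.0, 0.1],
              [0.1, 2.0],
              [0.3, 0.2],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 0.5, -1.0])

M = A.T @ A                     # normal-equations matrix A^T A
d = A.T @ b
Q = np.diag(np.diag(M))         # Jacobi splitting: Q = main diagonal of A^T A

# Convergence is governed by the spectral radius of I - Q^{-1} A^T A.
G = np.eye(2) - np.linalg.solve(Q, M)
rho = max(abs(np.linalg.eigvals(G)))
print(rho < 1)                  # prints True: the iteration converges here

x = np.zeros(2)                 # starting vector x^(0)
for _ in range(200):
    # Solve Q x^(k+1) = (Q - A^T A) x^(k) + A^T b at each step.
    x = np.linalg.solve(Q, (Q - M) @ x + d)
```

Since Q is diagonal, each step costs only a matrix-vector product and a division per component, which is the appeal of such splittings.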