Axel Ruhe 02/01/04

    Scientific works and experiences

My field is numerical linear algebra, with an emphasis on numerical computation of eigenvalues and some detours towards parameter estimation and approximation. Let me describe the different parts in essentially chronological order of when I took them up; numbers refer to the publication list in the enclosed CV.

1. Jacobi algorithms and matrix approximations

My very first attempts [2-4] were developments of the venerable Jacobi algorithm for eigenvalues. I built on works by H H Goldstine and Patricia Eberlein. At the time it was believed that Jacobi could be generalized into a good alternative for nonsymmetric matrices, but it soon became apparent that this was not the case, and the QR algorithm took center stage. It might be interesting to note that Jacobi has kept reappearing in new contexts. I thought that [21], published in 1980, would be the very last word, but soon thereafter Jacobi was reawakened as an algorithm for systolic arrays and massively parallel computers. Later on, in 1986, the ever-present Gene Golub put me on the track of applying the Jacobi algorithm for normal matrices studied in [2] to finding the closest normal matrix to an arbitrary matrix, see [35]! I was very proud of [35] and had great expectations of further developments, but did not publish any more on this. Other people, like Moody Chu, found that continuation methods can be used to much the same effect as Jacobi for tasks like pole assignment, and Demmel and Veselic showed that Jacobi gave a way to construct eigenvalue algorithms with high relative accuracy.
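For readers unfamiliar with the method, the classical symmetric Jacobi iteration that these works departed from can be sketched in a few lines. The following is a minimal illustration in plain Python, not code from any of the cited papers: it repeatedly applies plane rotations J^T A J, each chosen to zero one off-diagonal pair, until the off-diagonal mass is negligible and the diagonal holds the eigenvalues.

```python
import math

def jacobi_eigenvalues(A, sweeps=20, tol=1e-12):
    """Cyclic Jacobi for a symmetric matrix A (list of lists).
    Each plane rotation zeroes the (p, q) off-diagonal entry."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(sweeps):
        off = math.sqrt(sum(A[i][j] ** 2 for i in range(n)
                            for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < tol:
                    continue
                # Rotation angle chosen so the new A[p][q] is zero
                theta = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # update columns p and q (A <- A J)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q (A <- J^T A)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))
```

On the tridiagonal test matrix with diagonal 2 and off-diagonal 1, the iteration recovers the known eigenvalues 2 and 2 ± sqrt(2) after a few sweeps.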

2. Ill conditioned eigenvalues and the Jordan normal form

The main part of my thesis [5-7] was built on some works of Vera N Kublanovskaya. It brought her staircase algorithm for computing the Jordan normal form of a matrix to the West and turned it into a practical numerical algorithm. The main advantage of this QR-like algorithm, compared to the inverse iteration approach taken earlier by J Varah, was that it could handle derogatory matrices (with several linearly independent eigenvectors for one eigenvalue) as well as merely defective ones (matrices with one chain of principal vectors). I used perturbation theory, as described by Kato, to see how one multiple eigenvalue splits into several groups when the matrix is perturbed, and managed to get the algorithm to behave in just that way. One difficulty remained: to determine which of the approximate eigenvalues to group together, and where. My student Bo Kågström took up the work, and we published a joint paper [20]. Later Paul Van Dooren came in with experience from the control theory field, and then the trouble of deciding where to put the multiple eigenvalue disappeared; in most practical cases zero is the only interesting point. Nowadays the staircase algorithm is standard, and Kågström, together with J Demmel, A Edelman and E Elmroth (my academic grandchild), has made numerous studies of how to compute the Jordan and Kronecker canonical forms.

3. Nonlinear eigenvalues

When I moved to Umeå in 1970, I got numerous students, both diploma workers and graduate students, and they started out in different directions. One direction was nonlinear eigenvalues, which resulted in [9], where problems that are nonlinear in the eigenvalue parameter, but still linear in the vector, are treated. The linearization algorithm that I found here is rather neat and gives a global algorithm, in the symmetric case using inertia counts for the eigenvalues. I took up nonlinear eigenvalues again after moving to Göteborg in 1983, and now my visitor Y F Zhou [32] and doctoral student J Huitfeldt [36] started out to work on fully nonlinear, or bifurcation, problems. The linearization gave a new way of switching branches when following the solution curve of a nonlinear problem, and Huitfeldt's work has received some recognition. A group here is now taking up that work again.
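To illustrate what linearization means for a problem that is nonlinear in the eigenvalue parameter, here is the textbook companion-form linearization of a monic quadratic eigenproblem, sketched in plain Python. This is the classical construction, not necessarily the particular variant developed in [9]: the quadratic problem (lam^2 I + lam C + K) x = 0 in dimension n is turned into an ordinary linear eigenproblem of dimension 2n.

```python
def companion_linearization(K, C):
    """Linearize the monic quadratic eigenproblem
    (lam^2 I + lam C + K) x = 0 into the single matrix
    L = [[0, I], [-K, -C]], whose eigenvalues are the lam's
    and whose eigenvectors have the form [x; lam x]."""
    n = len(K)
    L = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        L[i][n + i] = 1.0                  # upper-right identity block
        for j in range(n):
            L[n + i][j] = -K[i][j]         # lower-left block: -K
            L[n + i][n + j] = -C[i][j]     # lower-right block: -C
    return L

def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]
```

As a check, take the diagonal quadratic with component polynomials lam^2 - 3 lam + 2 (roots 1, 2) and lam^2 - 5 lam + 6 (roots 2, 3): the pair lam = 1, x = (1, 0) is an eigenpair of the quadratic problem, and the stacked vector [x; lam x] is an eigenvector of the companion matrix with the same eigenvalue.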

4. Large sparse eigenvalue problems, Spectral Transformation Lanczos

Another direction I started in Umeå was large sparse eigenvalue problems. The works [10-14] and the survey [16] are attempts to adapt to eigenvalues different approaches that had been successful for linear systems. The Lanczos algorithm had by then come into vogue after the thesis of C C Paige and the works of, among others, the Golub and Parlett groups. Block Lanczos had been introduced as a way to get multiple eigenvalues, and I introduced a band Lanczos variant [17]. With respect to rounding errors, it is more accurate than block Lanczos. Even though most of the literature in the meantime has concerned block Lanczos, the band variant has now been suggested by R Freund and coworkers as a way to compute reduced-order models of multiple-input multiple-output systems.

The real success came with my next graduate student, Thomas Ericsson, who implemented Spectral Transformation Lanczos [26]. This joint paper is still my most cited, and several software packages were constructed along the lines of our work. Originally the code was developed to handle vibrations of pipes at ASEA Atom; later our code was included in the SAAB variant of the ASKA finite element package and used for flutter and vibration analysis in the Gripen project. J Lewis at Boeing and coworkers included a block version of STLM in the NASTRAN package, and much later I put a spectral transformation Arnoldi into the PDE toolbox of Matlab [43] and FEMLAB, which were developed by COMSOL, a spin-off from KTH in Stockholm.
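The idea behind the spectral transformation can be illustrated compactly: shift-invert maps an eigenvalue lam of A to 1/(lam - sigma), so eigenvalues close to the shift sigma become dominant and are found quickly. The sketch below is a deliberate simplification: it uses power iteration on a diagonal test operator in place of Lanczos and a sparse factorization, purely to keep the illustration short.

```python
def shift_invert_power(diag_A, sigma, iters=50):
    """Spectral transformation illustrated with power iteration:
    eigenvalues lam of the diagonal operator A map to 1/(lam - sigma),
    so the eigenvalue of A closest to sigma becomes the dominant one.
    (Power iteration stands in for Lanczos to keep the sketch short.)"""
    v = [1.0] * len(diag_A)
    for _ in range(iters):
        # Apply (A - sigma I)^{-1}; trivial here since A is diagonal
        w = [vi / (a - sigma) for vi, a in zip(v, diag_A)]
        norm = sum(wi * wi for wi in w) ** 0.5
        v = [wi / norm for wi in w]
    # Rayleigh quotient of (A - sigma I)^{-1}, then back-transform
    theta = sum(vi * vi / (a - sigma) for vi, a in zip(v, diag_A))
    return sigma + 1.0 / theta
```

With eigenvalues 1, 4, 10, 25 and a shift sigma = 4.3, the iteration homes in on the eigenvalue 4, the one closest to the shift, in a handful of steps; this selective, rapid convergence near the shift is what made the approach so effective for vibration problems.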

We have just finished the book Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide [50] for SIAM. It was written by a large international collaboration; I wrote a few sections and was one of the editors. During its preparation we encountered quite a few unexplored areas in the matrix computation field.

5. Rational Krylov

After 1992, I have put quite a lot of effort into Rational Krylov [38-42, 46-48], a further development of spectral transformation Lanczos or Arnoldi in which several shifts are used in one run. My starting point was to find a way to use the vectors produced during Rayleigh quotient iteration to find approximations to the next eigenvalue after the first had converged. Then I was excited by the possibility of getting a long Krylov sequence by simultaneous application of several shifts on a parallel computer.

6. Reduced order models

Later it turned out that Rational Krylov is an excellent tool for computing reduced-order models of linear dynamic systems that are accurate over a wide frequency range. Such systems are common in electrical engineering; see the joint work [48] with my student Daniel Skoogh. It looks like Rational Krylov is catching on in these circles.


7. Nonlinear least squares

Let us go back to Umeå and follow another track, algorithms for parameter estimation, which started for me as a collaboration with my friend Per Åke Wedin. The accelerated Gauss-Newton algorithms [18] did not get the attention that I expected, but our joint study of separable least squares [19] has been quite widely used.
I brought ideas from Cantor and Evans, as well as Sven Åke Gustafson's struggles with the moment problem, into the practically important and theoretically challenging problem of approximating a data series by a positive sum of exponentials [24,25]. The convex cone used here is really nice, and as a result I got to referee many papers on numerical inversion of the Laplace transform and similar issues.
I also found that the convex cone could be used to understand the complex formation problem of inorganic chemistry, treated by the pioneering LETAGROP code of L G Sillén and used extensively by the Ingri group in Umeå. My students K Holmström and L Andersson and I wrote a joint paper [33], but then our paths diverged: Holmström went to Västerås and software development, and Andersson to the study of relativity theory on manifolds.

Another work in the computational statistics direction, on principal components with missing observations [15], did not even get published, but later my Umeå neighbor Svante Wold approached me and we wrote a joint paper [29]. Wold had a new idea he called partial least squares (PLS) that he was pushing in all directions, but I found that it is just the same as performing a few steps of the Golub-Kahan bidiagonalization Lanczos algorithm for the singular value decomposition (SVD). Statisticians have been rather guarded in their esteem of these kinds of algorithms, and there are better ways to handle collinearity in measurements.
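The algorithm behind that observation is short enough to sketch. The following plain-Python version runs k steps of Golub-Kahan bidiagonalization on a small dense matrix; it is illustrative only (no sparse operators, no reorthogonalization), and in exact arithmetic the subspaces it builds are the ones a PLS iteration works with.

```python
def golub_kahan(A, b, k):
    """k steps of Golub-Kahan bidiagonalization of A started from b.
    Returns the left/right Lanczos vectors U, V and the bidiagonal
    entries: alphas on the diagonal, betas below it."""
    m, n = len(A), len(A[0])
    At = [[A[i][j] for i in range(m)] for j in range(n)]  # transpose

    def mv(M, x):
        return [sum(r * xi for r, xi in zip(row, x)) for row in M]

    def nrm(x):
        return sum(xi * xi for xi in x) ** 0.5

    beta = nrm(b)
    u = [bi / beta for bi in b]            # beta_1 u_1 = b
    w = mv(At, u)
    alpha = nrm(w)
    v = [wi / alpha for wi in w]           # alpha_1 v_1 = A^T u_1
    U, V, alphas, betas = [u], [v], [alpha], []
    for _ in range(k - 1):
        w = [x - alpha * ui for x, ui in zip(mv(A, v), u)]  # A v - alpha u
        beta = nrm(w)
        u = [wi / beta for wi in w]
        w = [x - beta * vi for x, vi in zip(mv(At, u), v)]  # A^T u - beta v
        alpha = nrm(w)
        v = [wi / alpha for wi in w]
        U.append(u)
        V.append(v)
        alphas.append(alpha)
        betas.append(beta)
    return U, V, alphas, betas
```

The short recurrence produces, in exact arithmetic, orthonormal left and right vectors, which is what makes the connection to the SVD immediate.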

8. Information Retrieval

My most promising project just now is to use a few steps of Golub-Kahan bidiagonalization for information retrieval; see the report [49], written with my latest graduate student Katarina Blom. This is still in the early stages, but our results came as a surprise to such an authority as Dianne O'Leary, and we are eager to see if it can be put to use in a practical situation. A special conference, CIR00, was held in Raleigh, North Carolina, in October 2000; see our invited talk [52]!