‹Kopfzeile›
Hier klicken, um Master-Textformat zu bearbeiten
Zweite Ebene
Dritte Ebene
Vierte Ebene
Fünfte Ebene
‹Datum/Uhrzeit›
‹Fußzeile›
‹Nr.›
To report about the birth of the cg-method is for me almost like remembering my first love and I am doing it with great pleasure. I am very grateful to the organizers of this symposium that they have invited me to assume this task for a wider public so that my shortcomings in keeping up with the newest developments in mathematics hopefully will not be too visible. My appreciation is all the greater because already long ago I have deserted research in numerical analysis, in which Prof. Stiefel, one of the inventors of the cg-method, introduced me at the beginning of the 1950s, in favor of  science policy. This meant that instead of doing research I tried to procure research money and to create favorable conditions for such activities. Taking into account my limited mathematical gifts I think that this way I have been able to advance science a little more than if  I had continued in my academic career. But I should not talk above all about myself so let us start with the presentation which I have prepared using the New Learning Technologies in order to show you that I have at least kept up with the developments of computer technology whose shortcomings at its beginnings after the second world war caused me some frustrations.
Following the rules for a good presentation I begin with a disposition
Although my talk should be understandable also for the nonspecialist I think that I have to try to explain briefly what is the cg-method. The best time for this is probably at the beginning when you are not yet too tired of my speech. The famous German mathematician David Hilbert once said. "A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street."  At his time women apparently ere ot supposed to be encountered on the street. Since the cg-method is theoretically fully clarified for at least fifty years, I am encouraged to undertake its popularization. Of the different ways in which it can be presented I shall use the one Stiefel developed at the IAM. Hestenes derivation which he established independently is based more on geometrical considerations and will not be given here. Then I want to mention briefly some of the fields in which this method is applied.
Since this symposium is also meant to honor the brilliant inventors of the method short biographies of Stiefel and Hestenes will follow.
Based on my personal memories complemented by what I have found in some documents I shall show and comment next some drawings and pictures of the persons who participated in the birth of the method at the two institutes. At the end I shall wrap up  my presentation with a few general remarks and acknowledgements for the help I received when preparing it.
First let us recall what is a system of linear equations
This can be written in an abbreviated form where for the cg-method A has the following properties:
The matrices which we were confronted with were somewhat special since they originated from elliptic partial differential equations which appear in the theory of elasticity describing the static behavior under load e.g. of  plates. The asterisk means that the original array is transposed i.e. rows become columns and columns rows.
For solving such a system  one starts out with a first guess.
The notion of residual vector is then introduced. It gives to  a certain extent a measure for how far one is from the solution of the problem. This initial guess is then step by step improved, which leads to a new approximation vector.
In the relaxation methods the residual vectors play an important role since they are used for finding better approximations. This notion was introduced by Sir Richard Southwell, a British engineering professor of the university of Oxford who invented the point relaxation method. He harbored a rather shortsighted view about systematic relaxation methods adapted to computers: (citations from articles by D. Young and Garrett Birkhoff in “History of Scientific Computing”)
,,Any attempt to mechanize relaxation methods would be a waste of time.“ He did not believe that Computers could compete with human intuition and ingenuity in applying relaxation methods. He emphasized ,,the freedom left to the [human] computer, to decide the nature of his next step“ in relaxation methods.”
L. Fox‘s described Southwell‘s relaxation method:
The success of the method (and it was successful even with the meager computing equipment then available) depended significantly on the ability of the human eye and brain to pick out quickly the largest of a sequence of numbers or a cluster of such numbers, to recognize patterns of numbers, and to forecast the overall effects of relaxation operations. In fact, it was rather like a game of chess.
Based on his experiences Stiefel did not share this opinion. Already at the start of his endeavors in numerical analysis he wrote in lecture notes the following:
Stiefel called this method initially „n-Schrittverfahren“ and the  n vectors p0 .., pn-1.weight vectors. They are conjugate, (pi* Apk) = 0 i k. According to the annual report of IAM for the "Studienjahr 1950/51" Stiefel elaborated this method during that period (October 1950 to July 1951). Late in June or early in July 1951, M. Hestenes with the cooperation of J. B. Rosser, G. Forsythe and L. Paige of INA devised what he called a Conjugate Gradient Method for solving systems of linear equations. He formulated three versions of this routine. Stiefel was invited about that time to give a talk on solving systems of linear equations at an INA symposium which took place August 23-25, 1951 . The  librarian gave Stiefel on his arrival at INA a paper describing Hestenes' work. According to Hestenes' recollection of the events shortly thereafter Stiefel came to his office with this paper in hand and said, "Look! This is my talk." Stiefel had invented the same algorithm from a different point of view which I have given here. He looked upon it as a relaxation method whereas Hestenes viewed it as a gradient routine on conjugate subspaces, a topic on which he had worked already in 1936 when he developed an algorithm for constructing a set of mutually conjugate directions in Euclidian space for the purpose of studying quadric surfaces. Because they had devised the same routine independently at about the same time, Stiefel and Hestenes decided to write a joint paper describing the routine and its properties.
In his invited address at the 1954 International Mathematical Congress in Amsterdam, Stiefel had concluded that: ,,Among all scalar iteration algorithms, the method of conjugate gradients is the best strategy,“ In the beginning the method attracted a lot of interest. In the 1960s some unsuccessful applications on large problems due to an unsophisticated use almost caused its passing into oblivion. Work by John Reid in the early 1970s renewed attention to the algorithm. As the rich program of this symposium shows it has since then been the subject of many research efforts. Furthermore it is nowadays together with incomplete Cholesky as preconditioner  a standard algorithm for solving linear systems involving large, sparse,  positive definite matrices in universities and industry. Its importance has been recognized  about two years ago by selecting Krylov Subspace methods (CG is the mother of all of them)  as one of the "Top 10 Algorithms of the (twentieth) Century" by Computing in Science & Engineering (CS&E) magazine, a joint publication of the IEEE Computer Society and the American Institute of Physics.
This is not a complete list of all the fields in which applications occurred, but it shows already its importance in modern technology.
Stiefel was a very broadly interested creative mathematician. He began his career as a topologist and made notable contributions in this field. In 1948 he changed the direction of his research completely and got interested in computers and numerical analysis. It must not have been an easy change because he was well aware of the disdain  many contemporary outstanding pure mathematicians harbored for applied mathematics which they called "dirty mathematics". In this context belongs the following passage in his report about his first trip to the US which took place from 10/18/1948 – 3/12/11949: (translated into English): "Remarkable is the effacing of the boundary between pure and applied mathematics, between mathematician and engineer. The rigorous division between professors of pure and those of applied mathematics, as it was earlier usual in German universities, is abandoned. Leading is the type of the pure mathematician but interested in the applications. The needs of computer technology have resulted everywhere in the training of young people who possess a thorough knowledge of  math and electronics at the same time and who are capable to adapt a given problem mathematically optimally to the machine."
He gave a permanent place to computer sciences at the ETHZ by creating and directing the IAM for 30 years.
Besides his comparatively short political career he devoted considerable time to serving in the Swiss army where he advanced to the rank of colonel. In this position he commanded the meteorological services of the artillery.
Stiefel‘s Father taught drawing at the cantonal high school in Zurich. In 1938 to 1940 I was his pupil there. Since I obtained the best result of my class in a shooting competition which since centuries is organised for boys in their teens every September in Zurich, he gave me this etching as a present.
In all of these areas Stiefel made truly original and fundamental contributions. In fact, even as a newcomer to a field he was able to find a solution to some important basic problem, and in retrospect, his solution was simple and surprising at the same time.
With respect to scientific computation, period 3 is the most important, but periods 4 and 5 must not be overlooked. The paramount contribution to numerical linear algebra is of course the conjugate gradient algorithm introduced in the joint paper with M. R. Hestenes and further investigated in a series of papers . However, one should also mention Stiefels' promotion of the use of variational principles for deriving the linear system from the physical problem. With this approach he put difference methods on a common basis with the finite-element method.
"Ambros Speisers'( with Rutishauser his first collaborator at IAM) judgment in his article “The Early Years of the Institute of Applied Mathematics”
Stiefel had a remarkable intuitive insight of what was to become important in the future, an insight that he retained throughout his life. Without his foresight, the Institute( for Applied Mathematics) would not have been founded: None of his colleagues supported him, they followed, at a safe distance, the curious events that were occurring in room 16d. There is no doubt in my mind that Stiefel must be regarded as one of the outstanding figures at ETH in the 50s and 60s of this century."
Stiefel was not only an ingenious researcher but also a gifted and lucid teacher. This picture has been taken in the 1950s although it is not exactly known on which occasion.
Similarly to Stiefel Hestenes had a brilliant university career combined with many activities outside academic institutions. He worked at INA from 1949 until 1954.
Hestenes played an essential role in building the UCLA mathematics department and had an outstanding research carrier. During his tenure at Chicago and UCLA, he supervised the thesis research of 34 students.
His best known works include many publications on the problem of Bolza, a famous paper on quadratic forms in Hilbert space and the conjugate gradient paper with Stiefel. Hestenes remained scientifically active until his death in 1991, concentrating in his later years on the method of multipliers and continuing to write and publish.
The smiling face in this picture betrays Hestenes' charming personality. "motivating and building confidence in students accounted for a large part of his daily activity"
The computation of the deformation of this dam under the pressure from the water of one of the largest storage lakes in Switzerland was one of the first outside orders which IAM received after its foundation in 1948. Mathematically it involved the solution of a boundary value problem with a two dimensional elliptic partial differential equation which could only be computed by using relaxation methods. At the start when the programmable computer Z4 was not yet available and only electromechanical  calculators existed at the institute, Southwells' point relaxation method was used. Stiefel engaged for this a former German naval officer, Dr. F. Krantz, but he worked on the tedious computations also personally. Often he followed the recommendation of a famous instructor in the Swiss army  "If the 24 hours of the day are not sufficient for achieving your work then use also the night" and computed during a whole night. Several times I  joined him in these strenuous nightly exercises. I think that this rather dull and tiring effort also motivated him to find a better method for solving the system of linear equations. In support of this I found recently that David M. Young, one of the developers of the successive overrelaxation method, stated in his article “A historical review of relaxation methods” (published in “A history of scientific computing”): my propensity for making numerical errors was so strong that I knew that I would never be able to solve significant problems except by machines and this determined him to continue his PhD thesis on relaxation methods suitable for the use with computers.
The IAM had concluded a contract with the Federal Aircraft Factory which developed a deltawinged jetfighter at the time for the Swiss army, to solve this problem. It lead to a system of 106 linear equations. At the end of the common article by Hestenes and Stiefel it is mentioned as the largest problem solved by the cg-method so far and that the computations had been realized on the Zuse computer at ETHZ (with a sufficiently accurate answer obtained in 90 iterations). This machine had only a mechanical memory for 64 numbers with 6 digits so the corresponding matrix and vectors had to be stored externally by punching holes in old movie films. These had to be mounted on and removed from the reading units as the computation advanced so there was a constant interaction between man and machine.
Heinz Rutishauser (1918-1970) produced several outstanding theories and methods in numerical analysis and programming and was Stiefels' most competent mathematical interlocutor at IAM
Title page of my PhD thesis which I finished in 1953 (published in 1954) while I worked as an applied mathematician for the „Flug- und Fahrzeugwerke Altenrhein“ dealing with the flutter problems of a jetfighter which the company developed for the Swiss army. The English translation of its title is:The application of the cg-method and its modifications for the solution of linear boundary value problems.
Despite the fact that the method has become a standard topic in numerical  textbooks,  this  original paper is  still  widely read and cited. Apparently it remains the best reference on this algorithm. It's contents our so extensive that it contains the roots of the essential advances in this area up to nowadays: e.g. preconditioning to accelerate convergence of the algorithm, variations that can be used to solve systems of equations involving nonsymmetrical matrices, and evaluation of computational variants in order to choose the most numerically stable one.
Cornelius Lanczos (1893-1974) can be considered so to speak as the third inventor of the cg-method. He was a very original physicist and mathematician of Jewish Hungarian descent who had a profound impact on the foundations of twentieth century science. His papers cover a vast array of disciplines, including general relativity, quantum mechanics, scientific computation, applied mathematics and numerical analysis. Lanczos was indeed one of the twentieth century's most versatile and innovative scientific minds, and many of Lanczos' ideas are still of interest to present-day research in physics and applied mathematics. During the period when this picture was taken Lanczos wrote up his version of the cg-method which he called a method of "Minimized Iterations." In addition he devised a special iteration for solving large systems of linear equations.
George E. Forsythe (1917 – 1972) was an outstanding senior member of INA at the time. He joined with Barkley Rosser, Cornelius Lanczos and Magnus Hestenes -to name only the most prominent ones - in a seminar organized in 1949 at INA in which methods for solving systems of linear equations and finding eigenvalues and eigenvectors of matrices were studied. Hestenes' development of the cg-method  is in some way a result of the discussions in this seminar. Forsythe contribution was the lucid classification of the then known methods for solving systems of linear equations.
Public transportation in Los Angeles was rather poor at that time. If you walked around in residential areas, you risked to be checked by a police patrol. So the acquisition of a car was almost a necessity and enlarged considerably the possibilities to find a convenient and reasonably priced accommodation. Stiefel did not know how to drive a car when he visited INA. So  he had first to learn it which he did in a short time.
This picture taken in the winter 1951/52 in Los Angeles shows that we did not only work  but also enjoyed the mild climate of Southern California
With his newly acquired second hand car Stiefel, his wife and occasionally I as a chauffeur drove around and visited many of the impressive tourist sights on the West Coast and even in Arizona and New Mexico.
My first contact with Stiefel occurred in 1943 when I was a 17 year old high school pupil. In November 1942 the then new art journal "Du" published an article about a way of cutting up a cube found by the prolific Swiss inventor Paul Schatz. I was so fascinated by this body which could be turned inside out that I constructed a cardboard model of it. My Hungarian mother who was then the physician of the Stiefel family showed this model to him who borrowed it in order to display it in his seminar on topology. I met him then for the first time personally when I got it back.
The successful history of the cg-method shows us that we have to be careful when interpreting the nowadays frequently used statement: "scientific knowledge has an ever decreasing half life." It does not mean that all scientific knowledge will become obsolete or even wrong in a few years after its detection. Some inventions, especially in mathematics as shown by the cg-method, remain valid for an unlimited time. In closing my talk I should like to thank Walter Gander and Martin Gutknecht and their collaborators who have greatly and patiently helped me to prepare the material in particular some of the photos which I have used in this presentation and made useful comments on its first draft.
This Russian applied mathematician was the inventor of the Krylov subspaces which are spanned  by conjugate vectors. The cg-method is also called a Krylov subspace iteration.