Hier klicken, um Master-Textformat zu
bearbeiten
Zweite Ebene
Dritte Ebene
Vierte Ebene
Fünfte Ebene
To report
about the birth of the cg-method is for me almost like remembering my first love and I am doing it with great
pleasure. I am very grateful to the
organizers of this symposium that they have
invited me to assume this task for a wider public so that my shortcomings in keeping up with the newest developments
in mathematics hopefully will not be too
visible. My appreciation is all the
greater because already long ago I have deserted research in numerical analysis, in which Prof. Stiefel, one of
the inventors of the cg-method, introduced
me at the beginning of the 1950s, in favor
of science policy. This meant that
instead of doing research I tried to
procure research money and to create favorable conditions for such activities. Taking into account my limited mathematical gifts I think that this way I have been
able to advance science a little more than
if I had continued in my academic career. But I should not talk above all about
myself so let us start with the
presentation which I have prepared using the New Learning Technologies in order to show you that I have at least kept up with the developments of computer
technology whose shortcomings at its
beginnings after the second world war caused
me some frustrations.
Following
the rules for a good presentation I begin with a disposition
Although my
talk should be understandable also for the nonspecialist I think that I have to try to explain briefly what is the
cg-method. The best time for this is
probably at the beginning when you are not yet too tired of my speech. The famous German mathematician David Hilbert once
said. "A mathematical theory is not
to be considered complete until you have made it so clear that you can explain it to the first man whom you
meet on the street." At his time women apparently ere ot supposed to be
encountered on the street. Since the
cg-method is theoretically fully clarified for at least fifty years, I am encouraged to undertake its popularization.
Of the different ways in which it can be presented I
shall use the one Stiefel developed at the
IAM. Hestenes derivation which he established independently is based more on geometrical considerations and will
not be given here. Then I want to mention
briefly some of the fields in which this method is applied.
Since this symposium is also meant to honor the
brilliant inventors of the method short
biographies of Stiefel and Hestenes will follow.
Based on my personal memories complemented by what I
have found in some documents I shall show
and comment next some drawings and pictures of the persons who participated in the birth of the method at
the two institutes.
At the end I shall
wrap up my presentation with a few
general remarks and acknowledgements for
the help I received when preparing it.
First let
us recall what is a system of linear equations
This can be
written in an abbreviated form where for the cg-method A has the following properties:
The
matrices which we were confronted with were somewhat special since they originated from elliptic partial
differential equations which appear in the
theory of elasticity describing the static
behavior under load e.g. of plates.
The asterisk means that the original array
is transposed i.e. rows become columns and columns
rows.
For solving such a system one starts out with a first guess.
The notion of residual vector is then introduced.
It gives to a certain extent a measure for how far one is from the
solution of the problem.
This initial guess is then step by step improved, which
leads to a new approximation vector.
In the
relaxation methods the residual vectors play an important role since they are used for finding better
approximations. This notion was introduced
by Sir Richard Southwell, a British engineering
professor of the university of Oxford who invented the point relaxation method. He harbored a
rather shortsighted view about systematic relaxation methods adapted to computers:
(citations
from articles by D. Young and Garrett Birkhoff in “History of
Scientific Computing”)
,,Any attempt to mechanize relaxation methods would be a waste
of
time.“ He did not believe that Computers could compete with human
intuition and ingenuity in applying relaxation methods. He emphasized
,,the freedom left to the [human] computer, to decide the nature of his next step“ in
relaxation methods.”
L. Fox‘s described Southwell‘s relaxation method:
The success of the method (and it was successful even with the meager
computing equipment then available) depended significantly on the ability of
the human eye and brain to pick out quickly the largest of a
sequence of numbers or a cluster of such numbers, to recognize patterns
of numbers, and to forecast the overall effects of relaxation
operations. In fact, it was rather like a game of chess.
Based on
his experiences Stiefel did not share this opinion. Already at the start of his endeavors in numerical
analysis he wrote in lecture notes the
following:
Stiefel called
this method initially „n-Schrittverfahren“ and the n vectors p0 .., pn-1.weight
vectors. They are conjugate, (pi* Apk) = 0 i ≠ k.
According to the annual report of
IAM for the "Studienjahr 1950/51" Stiefel elaborated this method
during that
period (October 1950 to July 1951). Late in June or early in July 1951, M. Hestenes with the cooperation of J. B. Rosser, G.
Forsythe and L. Paige of INA devised
what he called a Conjugate Gradient Method for solving systems of linear equations. He formulated three versions of this
routine. Stiefel was invited about that time to give a talk on solving systems of linear equations
at an INA symposium
which took place August 23-25, 1951 . The librarian gave
Stiefel on his arrival at
INA a paper describing Hestenes' work. According to Hestenes' recollection of the events shortly thereafter
Stiefel came to his office with this paper in hand and said, "Look! This is my talk."
Stiefel had invented the same algorithm from a different point of view which I have given here.
He looked upon it as a
relaxation method whereas Hestenes viewed it as a gradient routine on conjugate subspaces, a topic on which he had
worked already in 1936 when he developed an algorithm for constructing a set of mutually
conjugate directions in Euclidian
space for the purpose of studying quadric surfaces. Because they had devised the same routine independently at about
the same time, Stiefel and Hestenes
decided to write a joint paper describing the routine and its properties.
In his invited address at the 1954 International Mathematical Congress in Amsterdam, Stiefel had concluded that:
,,Among all scalar
iteration algorithms, the method of conjugate gradients is the best strategy,“
In the beginning the method attracted a lot of
interest. In the 1960s some
unsuccessful applications on large problems due to an unsophisticated use almost caused its passing into
oblivion. Work by John
Reid in the early 1970s renewed attention to the algorithm. As the rich program of this symposium
shows it has since then
been the subject of many research efforts. Furthermore it is nowadays together with incomplete Cholesky as preconditioner
a standard algorithm for solving linear systems involving large, sparse, positive definite matrices in universities and industry.
Its importance has been recognized about two
years ago by selecting
Krylov Subspace methods (CG is the mother of all of them) as one of the "Top 10 Algorithms
of the (twentieth) Century"
by Computing in Science & Engineering (CS&E) magazine, a joint publication of the IEEE Computer
Society and the
American Institute of Physics.
This is not a complete list of all the fields in which
applications occurred, but it shows already its importance in modern technology.
Stiefel was a very broadly
interested creative mathematician. He began his career as a topologist and made
notable contributions in this field. In 1948 he changed the direction of his
research completely and got interested in computers and numerical
analysis. It must not have been an easy change because he was well aware of the
disdain many contemporary outstanding pure
mathematicians harbored for applied mathematics which they called "dirty
mathematics". In this context belongs the following passage in his report
about his first trip to the US which took place from 10/18/1948 –
3/12/11949: (translated into English): "Remarkable is the
effacing of the boundary between pure and applied mathematics,
between mathematician and engineer. The rigorous division between
professors of pure and those of applied mathematics, as
it was earlier usual in German universities, is abandoned. Leading is the type
of the pure mathematician but interested in the applications. The
needs of computer technology have resulted everywhere in
the training of young people who possess a thorough knowledge of math and electronics at the same time and
who are capable to adapt a given problem mathematically optimally to the machine."
He
gave a permanent place to computer sciences at the ETHZ by creating and directing
the IAM for 30 years.
Besides his comparatively short political career
he devoted considerable time to serving in the Swiss army where
he advanced to the rank of colonel. In this position he commanded the
meteorological services of the artillery.
Stiefel‘s
Father taught drawing at the cantonal high school in Zurich. In 1938 to 1940 I was his pupil there. Since I
obtained the best result of my class in a
shooting competition which since centuries
is organised for boys in their teens every September in Zurich, he gave me this etching as a present.
In all of these areas Stiefel made truly original
and fundamental contributions. In fact, even as a newcomer to a field he was
able to find a solution to some important basic problem, and in retrospect,
his solution was simple and surprising at the same time.
With respect to scientific computation, period 3
is the most important, but periods 4 and 5 must not be overlooked. The
paramount contribution to numerical linear algebra is of course the conjugate
gradient algorithm introduced in the joint paper with M. R. Hestenes and
further investigated in a series of papers . However, one should also mention
Stiefels' promotion of the use of variational principles for deriving the
linear system from the physical problem. With this approach he put difference
methods on a common basis with the finite-element method.
"Ambros Speisers'( with Rutishauser his first collaborator
at IAM) judgment in his article “The Early Years of the
Institute of Applied Mathematics”
Stiefel had a remarkable
intuitive insight of what was to become important in
the future, an insight that he retained throughout his life. Without his foresight,
the Institute( for Applied Mathematics) would not have been founded:
None of his colleagues supported him, they followed, at a safe distance,
the curious events that were occurring in room 16d. There is no doubt
in my mind that Stiefel must be regarded as one of the outstanding figures
at ETH in the 50s and 60s of this century."
Stiefel was not only an ingenious researcher but also a gifted
and lucid teacher. This picture has
been taken in the 1950s although it is not exactly known on which occasion.
Similarly
to Stiefel Hestenes had a brilliant university career combined with many activities outside academic
institutions. He worked at INA from 1949
until 1954.
Hestenes played an essential role in building the UCLA mathematics department and had an outstanding
research carrier.
During his tenure at Chicago and UCLA, he supervised the thesis research of 34 students.
His best known works include many publications on
the problem of Bolza,
a famous paper on quadratic forms in Hilbert space and the conjugate gradient paper with Stiefel.
Hestenes remained scientifically
active until his death in 1991, concentrating in his later years on the method of multipliers and
continuing to write and
publish.
The smiling
face in this picture betrays Hestenes' charming personality. "motivating and building confidence in students accounted for a large part of his daily activity"
The
computation of the deformation of this dam under the pressure from the water of one of the largest storage lakes in
Switzerland was one of the first outside
orders which IAM received after its foundation in 1948. Mathematically it involved the solution of a boundary value problem
with a two dimensional elliptic partial
differential equation which could only be computed by using relaxation methods. At the start when the programmable
computer Z4 was not yet available and only
electromechanical calculators existed
at the institute, Southwells' point
relaxation method was used. Stiefel engaged for this a former German naval officer, Dr. F. Krantz, but he worked on
the tedious computations also personally.
Often he followed the recommendation of a famous instructor in the Swiss army
"If the 24 hours of the day are not sufficient for achieving your work then use also the night" and computed during
a whole night. Several times I joined him in these strenuous nightly
exercises. I think that this rather dull and
tiring effort also motivated him to find a better method for solving the system of linear equations. In support of this I found
recently that David M. Young, one of the
developers of the successive overrelaxation method, stated in his article “A historical review of relaxation methods”
(published in “A history of scientific
computing”): „ my propensity for making numerical errors was so strong
that I knew that I would never be able to solve significant problems except
by machines “ and this determined him to continue his PhD thesis on relaxation methods suitable for the use with computers.
The IAM had concluded a contract with the Federal Aircraft Factory which developed a deltawinged jetfighter
at the time for the Swiss
army, to solve this problem.
It lead to a system of 106 linear equations. At the end of the common article by Hestenes and Stiefel it is
mentioned as the largest
problem solved by the cg-method so far and that the computations had been realized on the Zuse
computer at ETHZ (with a
sufficiently accurate answer obtained in 90 iterations). This machine had only a mechanical memory for 64
numbers with 6
digits so the corresponding matrix and vectors had to be stored externally by punching holes in old movie
films. These had to be
mounted on and removed from the reading units as the computation advanced so there was a constant
interaction between
man and machine.
Heinz Rutishauser (1918-1970) produced several outstanding theories and
methods in numerical analysis and programming and was Stiefels' most competent
mathematical interlocutor at IAM
Title page
of my PhD thesis which I finished in 1953 (published in 1954) while I worked as an applied mathematician for
the „Flug- und Fahrzeugwerke Altenrhein“
dealing with the flutter problems of a
jetfighter which the company developed for the Swiss army. The English translation of its title is:The application of the cg-method and its modifications for the solution
of linear boundary value problems.
Despite the fact that the method has become a standard topic in numerical
textbooks, this original paper is still widely read and cited. Apparently it remains the best reference on this
algorithm. It's contents our so extensive
that it contains the roots of the essential
advances in this area up to nowadays: e.g. preconditioning
to accelerate convergence of the algorithm, variations
that can be used to solve systems of equations involving nonsymmetrical matrices, and evaluation of
computational variants in order to choose the
most numerically stable one.
Cornelius Lanczos
(1893-1974) can be considered so to speak as the third inventor of the
cg-method. He was a very original physicist and mathematician of
Jewish Hungarian descent who had a profound impact on the foundations of
twentieth century science. His papers cover a vast array of
disciplines, including general relativity, quantum mechanics, scientific
computation, applied mathematics and numerical analysis. Lanczos
was indeed one of the twentieth century's most versatile and
innovative scientific minds, and many of Lanczos' ideas
are still of interest to present-day research in physics and applied
mathematics.
During the period when this picture was taken
Lanczos wrote up his version of the cg-method which he called a
method of "Minimized Iterations." In addition he
devised a special iteration for solving large systems of linear equations.
George E.
Forsythe (1917 – 1972) was an outstanding senior member of INA at the time. He joined with Barkley
Rosser, Cornelius Lanczos and Magnus
Hestenes -to name only the most prominent
ones - in a seminar organized in 1949 at INA in which methods for solving systems of linear equations and
finding eigenvalues and eigenvectors of
matrices were studied. Hestenes' development
of the cg-method is in some way a
result of the discussions in this seminar.
Forsythe contribution was the lucid classification
of the then known methods for solving systems of linear equations.
Public
transportation in Los Angeles was rather poor at that time. If you walked around in residential areas, you risked
to be checked by a police patrol. So the
acquisition of a car was almost a
necessity and enlarged considerably the possibilities to find a convenient and reasonably priced accommodation. Stiefel
did not know how to drive a car when he
visited INA. So he had first to learn it which he did in a short time.
This
picture taken in the winter 1951/52 in Los Angeles shows that we did not only work but also enjoyed the mild climate of Southern California
With his
newly acquired second hand car Stiefel, his wife and occasionally I as a chauffeur drove around and visited
many of the impressive tourist sights on
the West Coast and even in Arizona and New
Mexico.
My first
contact with Stiefel occurred in 1943 when I was a 17 year old high school pupil. In November 1942 the then
new art journal "Du" published
an article about a way of cutting up a cube
found by the prolific Swiss inventor Paul Schatz. I was so fascinated by this body which could be turned inside
out that I constructed a cardboard model
of it. My Hungarian mother who was then
the physician of the Stiefel family showed this model to him who borrowed it in order to display it in his
seminar on topology. I met him then for
the first time personally when I got it back.
The
successful history of the cg-method shows us that we have to be careful when interpreting the nowadays frequently
used statement: "scientific knowledge
has an ever decreasing half life." It
does not mean that all scientific knowledge will become obsolete or even wrong in a few years after its detection. Some
inventions, especially in mathematics as
shown by the cg-method, remain valid for
an unlimited time.
In closing my talk
I should like to thank Walter Gander and Martin
Gutknecht and their collaborators who have greatly and patiently helped me to prepare the material in
particular some of the photos which I have
used in this presentation and made useful comments
on its first draft.
This
Russian applied mathematician was the inventor of the Krylov subspaces which are spanned by conjugate vectors. The cg-method is also called a Krylov subspace iteration.