General Information 

One of the most basic problems in vision is to use images to recognize that a particular object or event that we’ve never seen before belongs to a particular class of objects or events. To do this we must have a rich notion of what an object class is, one that captures what its members have in common. For example, chairs vary tremendously in their shape and material properties. How do we look at a chair we’ve never seen before and identify it as a chair? Accounting for this variation in recognition is largely an unsolved problem. In this course we will survey a number of approaches to representing and recognizing objects. We will draw inspiration from work in philosophy, psychology, linguistics, and mathematics. However, our primary focus will be more concrete: learning the algorithms and analytic tools that have been applied in visual object classification.
First, we will study approaches based on the idea that objects can be described by a set of necessary and sufficient image properties. This has taken the form of invariant representations. We will study the advantages and limitations of geometric and photometric invariants. Second, we will consider classification approaches based on powerful similarity measures. These include methods of shape comparison, deformable template matching, and lighting-insensitive matching. Third, we will consider methods that attempt to represent the images of classes of objects using subspaces. This includes approaches based on PCA, linear combinations, and manifold representations of classes. Fourth, we will look at approaches to building generative models of classes, including the use of hidden Markov models and pattern theory. Finally, we will consider the idea of building classifiers directly, without explicit representations of the class, using methods such as support vector machines, Winnow, and naïve Bayes.
The class will alternate between lectures teaching the basic mathematical and algorithmic techniques of these methods, and student-led discussion of vision research papers that apply these techniques. It will be essential for students to have a solid understanding of basic topics in math, such as linear algebra, probability and statistics, and calculus. It will also be useful to have some knowledge of computer vision, image processing, functional analysis, stochastic processes, or geometry. In general, the more math a student knows, the easier the course will be.
Here is my current plan for the workload of the class. This may change prior to the first day of class.
1) Reports. There are 15 classes scheduled for the presentation of papers. Prior to each of these classes, students must turn in a one-page summary and critique of one of the papers to be discussed. Late papers will not be accepted, since the goal of these reports is to get you to think about papers before we discuss them. However, each student need only turn in reports for 12 of these classes. 20% of grade
2) Presentation. Students will be assigned in pairs to present (usually) two papers in one class. This will be a substantial part of the grade. Presentations should be well prepared. If enrollment is low enough, students may be expected to do this twice. 20% of grade
3) Midterm and Final. These will be based on material from the lectures. 40% of grade
4) Project. Students will choose one of the following. 20% of grade
a) Write a detailed paper, approximately five pages in length, proposing research that extends or adapts one of the approaches discussed in class. You may choose to base this on the papers you have presented.
b) Programming project: student will implement a technique discussed in class, and apply it to some real data. This is not meant to be a research project, but something closer to an extended problem set.
The schedule below is probably overly ambitious, so expect that we won't get to a couple of these classes. Pairs of students will lead the discussion of papers, as indicated. There will probably not be enough students to lead all these classes, so I will lead discussion for any extras. Classes October 14 and 16 will have to be rescheduled, due to the International Conference on Computer Vision.
Class  Presenters  Topic  Background Reading
1. 9/2  Jacobs  Introduction (view as web page).  
2. 9/4  Jacobs  Paper Presentation: (view as web page).
Students must review (a) or (b).
a. Women, Fire and Dangerous Things by Lakoff, Chapters 1 and 2. On reserve.
b. S. Laurence and E. Margolis, ``Concepts and Cognitive Science'', in Concepts edited by E. Margolis and S. Laurence, MIT Press, 1999. On reserve.
c. L. Wittgenstein, Philosophical Investigations, sections 65-78. On reserve.

3. 9/9  Jacobs  Lecture: Affine and projective geometry and invariants.
Introduction to Projective Geometry, C.R. Wylie, McGraw-Hill Book Co., 1970.
Y. Lamdan, J. T. Schwartz, and H. J. Wolfson. Affine invariant model-based object recognition. IEEE Journal of Robotics and Automation, 6:578-589, 1990.
I. Weiss. Geometric Invariants and Object Recognition. Intl. J. Computer Vision, 10:207-231, 1993.
J. Burns, R. Weiss, and E. Riseman, ``The Non-Existence of General-Case View-Invariants'', in Geometric Invariance for Computer Vision, edited by J. Mundy and A. Zisserman, MIT Press, 1992.
Moses, Y. and Ullman, S. (1992). ``Limitations of non model-based recognition schemes''. In Sandini, G., editor, Proc. 2nd European Conf. on Computer Vision, Lecture Notes in Computer Science, volume 588, pages 820-828. Springer Verlag.
J. Mundy and A. Zisserman, Appendix – Projective Geometry for Machine Vision, in Geometric Invariance for Computer Vision, edited by J. Mundy and A. Zisserman, MIT Press, 1992.
``In Search of Illumination Invariants,'' IEEE Conference on Computer Vision and Pattern Recognition, pp. 254-261, (June 2000). H. Chen, P. Belhumeur, and D. Jacobs.
4. 9/11  Jacobs  Lecture: Geometric invariance (conclusion) and photometric invariance. (view as web page).  
5. 9/16  Jacobs  Presentation, David Jacobs: Classification with Invariants
b. Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2):115-147. On reserve.
Jepson, A., W. Richards, and D. Knill, Modal structure and reliable inference, in ``Perception as Bayesian Inference,'' eds. D. Knill and W. Richards, Cambridge Univ. Press, 1996, pp. 63-92.
Biederman has many papers on this topic, including:
Biederman, I. & Gerhardstein, P. (1993). Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology: Human Perception and Performance, 19, 1162-1182.
Biederman, I. & Gerhardstein, P. (1995). Viewpoint-dependent mechanisms in visual object recognition: reply to Tarr and Bulthoff. Journal of Experimental Psychology: Human Perception and Performance, 21(6), 1506-1514.
D. Jacobs. ``What Makes Viewpoint Invariant Properties Perceptually Salient,'' Journal of the Optical Society of America A, in press.
6. 9/18  Rao  Class canceled so students may attend a lecture by C. R. Rao: Has Statistics a Future? If so, in what form?, 3:30-4:30, 1524 Van Munching Hall, the Howard Frank Auditorium.
7. 9/23  Jacobs  Lecture: Linear subspaces – geometry & PCA  Duda, Hart and Stork, pp. 114-117. On reserve in library.
Shimon Ullman and Ronen Basri, Recognition by Linear Combinations of Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(10):992-1006, 1991. Available at: http://www.wisdom.weizmann.ac.il/~ronen/publications.html
D. Jacobs, "Matching 3D Models to 2D Images," the International Journal of Computer Vision, 21(1/2):123-153, January, 1997. On reserve.
Turk, M. & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3, 71-86.

8. 9/25  Jacobs  Lecture: Linear subspaces – photometry
Shashua. On photometric issues to feature-based object recognition. Int. J. Computer Vision, 21:99-122, 1997.
``Lambertian Reflectance and Linear Subspaces,'' IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(2):218-233, (2003). R. Basri and D. Jacobs. Available at: http://www.wisdom.weizmann.ac.il/~ronen/publications.html

9. 9/30  Nanda, Sen  Presentations: Linear Representations of Classes.
T.F. Cootes and C.J. Taylor, "Statistical models of appearance for medical image analysis and computer vision", Proc. SPIE Medical Imaging 2001. Presentation.
“Face Recognition Based on 3D Shape Estimation from Single Images”, by Blanz and Vetter, CGFTR 2, October 2002, University of Freiburg. There is a version in PAMI 2003 available on line, I'll add a link. Presentation. (view as web page).
``Statistical models of appearance for computer vision'' by Cootes and Taylor.
Lohmann, G.P. 1983. Eigenshape analysis of microfossils: a general morphometric procedure for describing changes in shape. Mathematical Geology 15:659-672.
10. 9/30, 4:45-6:00 "Optional"  Jacobs  Presentation: Prototypes and Natural Categories. (view as web page).
Posner, M.I., & Keele, S.W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353-363.
E. Rosch, C. Mervis, W. Gray, D. Johnson, and P. Boyes-Braem, ``Basic Objects in Natural Categories'', Cognitive Psychology, 8:382-439. On reserve.
S. Ullman, High-level Vision, MIT Press, 1996, Chapter 6.
Ronen Basri, Recognition by Prototypes, International Journal of Computer Vision, 19(2):147-168, 1996.

11. 10/2  Cuntoor, Raykor  Presentations: Nonlinear subspaces
S. Edelman, ``Representation is Representation of Similarities'', Behavioral and Brain Sciences. Presentation (view as web page).
Joshua B. Tenenbaum, Vin de Silva, John C. Langford, ``A Global Geometric Framework for Nonlinear Dimensionality Reduction'', Science.
Sam T. Roweis, Lawrence K. Saul, Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science. Presentation (view as web page).
H. Sebastian Seung and Daniel D. Lee, The Manifold Ways of Perception, Science.
H. Murase and S.K. Nayar. Visual Learning and Recognition of 3D Objects from Appearance. International Journal of Computer Vision, vol. 14, no. 1, Jan 1995, pp 5-24.
Ronen Basri, Dan Roth, and David Jacobs, Clustering Appearances of 3D Objects, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Santa Barbara: 414-420, 1998.
Cutzu, F., and S. Edelman, Representation of object similarity in human vision: psychophysics and a computational model, Vision Research 38:2227-2257, 1998.
12. 10/7  Eaton, Mativo  Presentations: The psychology of similarity and view-based recognition.
E. Goldmeier, Similarity in Visually Perceived Forms, International Universities Press, Psychological Issues, Volume VIII, Number 1, Monograph 29. Chapters: 1-5. On reserve.
H. Bulthoff, S. Edelman, and M. Tarr, ``How Are Three-Dimensional Objects Represented in the Brain?'' MIT AI Memo #1479.
Check out Mike Tarr's class page for many relevant references.
Poggio, T. and Edelman, S., A network that learns to recognize three-dimensional objects. Nature, 343:263-266.
13. 10/9  Jacobs  Presentations, mixed with lecture: Shape and Nature. Students must review (a). (a) D’Arcy Thompson, On Growth and Form, Dover Books, 1992, Chapters 1 and 17. On reserve. (b) A guide to Tree Identification (just skim this). On reserve.

Morphometric Tools for Landmark Data, by Bookstein.
D. G. Kendall. A survey of the statistical theory of shape. Statistical Science, 4(2):87-120, 1989.
Shape and Shape Theory, by Kendall, Barden, Carne and Le.
Statistical Shape Analysis, by I. L. Dryden and Kanti V. Mardia.
10/14  No Class, ICCV  
10/16  No Class, ICCV  
14. 10/21  Jacobs  Lecture: Morphometrics. (view as web page).  See previous class. 
15. 10/23  Yi (?), Tahmoush  Presentations: Morphometrics and Recognition.
Serge Belongie, Jitendra Malik and Jan Puzicha, Shape Matching and Object Recognition Using Shape Contexts, PAMI, 24(4):509-522, April 2002.


16. 10/28  Jacobs  Lecture: Fourier descriptors and wavelets. (view as web page)
A Wavelet Tour of Signal Processing, by Mallat.
“Introduction and Overview of Fourier Descriptors”, by Lestrel.
17. 10/30  Aggarwal, Nath  Presentations: Wavelet-based texture classification.
J. S. De Bonet and P. Viola, Texture Recognition Using a Nonparametric Multi-Scale Statistical Model, CVPR ’98. Available at: http://www.debonet.com/Research/Publications/ Presentation (view as web page).
J. Portilla and E. P. Simoncelli. A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients. Int'l Journal of Computer Vision. 40(1):49-71, October, 2000. Presentation

18. 11/4  Ran, Jacobs  Presentations: Wavelet-based representations for object classification.
M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Wurtz, W. Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers 1993, 42(3):300-311. Available at: http://citeseer.nj.nec.com/lades93distortion.html
C. Schmid & R. Mohr (1997) Local Grayvalue Invariants for Image Retrieval, IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(5), 530-535.
19. 11/6  Jacobs  Lecture: Hidden Markov models (view as web page).
Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition."

20. 11/11  Ho, Lee  Presentations: HMMs for classification
J. Yamato, J. Ohya, and K. Ishii, “Recognizing Human Action in Time-Sequential Images Using Hidden Markov Model,” CVPR ’92, pages 379-385.
J. Li, A. Najmi, and R. Gray, ``Image Classification by a Two Dimensional Hidden Markov Model,'' IEEE Transactions on Signal Processing, February 2000. Presentation as pdf.

21. 11/13  Jacobs  Lecture: Generative models of objects. Markov Random Fields. Gibbs Energy. Gibbs sampling.  S. Geman and D. Geman. "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721-741, 1984.
U. Grenander, Y. Chow, and D. M. Keenan. "Hands. A Pattern Theoretic Study of Biological Shapes", Springer Verlag, New York, 1991.
22. 11/18  Jacobs  Lecture: Skeletons and Parts. (See previous lecture notes)  
23. 11/20  Mihalcik, Tran  Presentations: Parts
K. Siddiqi, A. Shokoufandeh, S. J. Dickinson & S. W. Zucker. Shock Graphs and Shape Matching. International Journal of Computer Vision, 35(1), 13-32, 1999. Presentation (view as web page).
Pedro F. Felzenszwalb and Daniel P. Huttenlocher.

24. 11/25  Ling, Gordon  Presentations:
S. Geman, D. Potter, and Z. Chi. Composition systems. Quarterly of Applied Mathematics, LX, 2002, 707-736. Presentation
S. Zhu, Embedding Gestalt Laws in Markov Random Fields - A theory for shape modeling and perceptual organization.

25 & 26. 12/2 & 12/4  Jacobs  Lecture: Linear separators, naive Bayes, perceptrons, SVMs, boosting, Winnow. (as web page).  C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition.
``Learning Quickly When Irrelevant Attributes Abound: A New Pattern Classification, Duda, Hart and Stork. 
27. 12/11  Fails, Shirdhonkar  Presentations: Linear Classifiers
P. Viola and M. Jones. Robust real-time object detection. Technical Report 2001/01, Compaq CRL, February 2001.
H. Schneiderman and T. Kanade, "Object Detection Using the Statistics of Parts", International Journal of Computer Vision. Presentation (view as web page).

12/16  Review Session. A.V. Williams, 4424, 3:30-5:00