CMSC828J Advanced Topics in Information Processing: Approaches to Representing and Recognizing Objects

General Information

 

Class Time: Tue, Thu 3:30-4:45

Room: CSI 3118

Course Info: See below

Text: Readings available on reserve in the CS library and on the web.  See below.

Personnel

 

Instructor

Name: David Jacobs

Email: djacobs at cs dot umd dot edu

Office: AVW 4421

Office hours: Tue 11:00-12:00, Wed 3:30-4:30, or by appt.

 

Description

One of the most basic problems in vision is to use images to recognize that an object or event we’ve never seen before belongs to a particular class of objects or events.  To do this we must have a rich notion of what a class of objects is, one that captures what its members have in common.  For example, chairs vary tremendously in their shape and material properties.  How do we look at a chair we’ve never seen before and identify it as a chair?  Accounting for this variation in recognition is largely an unsolved problem.  In this course we will survey a number of approaches to representing and recognizing objects.  We will draw inspiration from work in philosophy, psychology, linguistics, and mathematics.  However, our primary focus will be more concrete: to learn the algorithms and analytic tools that have been applied in visual object classification.

First, we will study approaches based on the idea that objects can be described by a set of necessary and sufficient image properties.  This has taken the form of invariant representations.  We will study the advantages and limitations of geometric and photometric invariants.  Second, we will consider classification approaches based on powerful similarity measures.  These include methods of shape comparison, deformable template matching, and lighting insensitive matching.  Third, we will consider methods that attempt to represent the images of classes of objects using subspaces.  This includes approaches based on PCA, linear combinations, and manifold representations of classes.  Fourth, we will look at approaches to building generative models of classes, including the use of hidden Markov models and pattern theory.  Finally, we will consider the idea of building classifiers directly, without explicit representations of the class, using methods such as support vector machines, Winnow, and naïve Bayes.

The class will alternate between lectures teaching the basic mathematical and algorithmic techniques of these methods, and student-led discussion of vision research papers that apply these techniques.  It will be essential for students to have a solid understanding of basic topics in math, such as linear algebra, probability and statistics, and calculus.  It will also be useful to have some knowledge of computer vision, image processing, functional analysis, stochastic processes, or geometry.  In general, the more math a student knows, the easier the course will be.

Requirements

Here is my current plan for the workload of the class.  This may change prior to the first day of class.

1) Reports.  There are 15 classes scheduled for the presentation of papers.  Prior to each of these classes, students must turn in a one-page summary and critique of one of the papers to be discussed.  Late reports will not be accepted, since the goal of these reports is to get you to think about the papers before we discuss them.  However, each student need only turn in reports for 12 of these classes.  20% of grade

2) Presentation.  Students will be assigned in pairs to present (usually) two papers in one class.  This will be a substantial part of the grade.  Presentations should be well prepared.  If enrollment is low enough, students may be expected to do this twice.  20% of grade

3) Midterm and Final.  These will be based on material from the lectures.  40% of grade

4) Project.  Students will choose one:  20% of grade

     a) Write a detailed paper, approximately five pages in length, proposing research that extends or adapts one of the approaches discussed in class.  You may choose to base this on the papers you have presented.

     b) Programming project: student will implement a technique discussed in class, and apply it to some real data.  This is not meant to be a research project, but something closer to an extended problem set. 

Class Schedule

The schedule below is probably overly ambitious, so expect that we won't get to a couple of these classes.  Pairs of students will lead the discussion of papers, as indicated.  There will probably not be enough students to lead all these classes, so I will lead discussion for any extras.  Classes October 14 and 16 will have to be rescheduled, due to the International Conference on Computer Vision.  

Class

Presenters Topic Background Reading
1. 9/2 Jacobs Introduction (view as web page).
2. 9/4 Jacobs Paper Presentation:  (view as web page).

Students must review (a) or (b).

a. Women, Fire and Dangerous Things by Lakoff, Chapters 1 and 2.  On reserve.

b. S. Laurence and E. Margolis, ``Concepts and Cognitive Science'', in Concepts edited by E. Margolis and S. Laurence, MIT Press, 1999.  On reserve.

c. L. Wittgenstein, Philosophical Investigations, sections 65-78.  On reserve.

3. 9/9 Jacobs Lecture: Affine and projective geometry and invariants.  (view as web page).

Introduction to Projective Geometry, C.R. Wylie, McGraw-Hill Book Co.,  1970.

Y. Lamdan, J. T. Schwartz, and H. J. Wolfson. Affine invariant model-based object recognition. IEEE Journal of Robotics and Automation, 6:578--589, 1990

I. Weiss. Geometric Invariants and Object Recognition. Intl. J. Computer Vision, 10:207--231, 1993

J. Burns, R. Weiss, and E. Riseman, ``The Non-Existence of General-Case View-Invariants'', in Geometric Invariance for Computer Vision, edited by J. Mundy and A. Zisserman, MIT Press, 1992.

Moses, Y. and Ullman, S. (1992). ``Limitations of non model-based recognition schemes''. In Sandini, G., editor, Proc. 2nd European Conf. on Computer Vision, Lecture Notes in Computer Science, volume 588, pages 820--828. Springer Verlag.

J. Mundy and A. Zisserman, Appendix – Projective Geometry for Machine Vision, in Geometric Invariance for Computer Vision, edited by J. Mundy and A. Zisserman, MIT Press, 1992.

``In Search of Illumination Invariants,'' IEEE Conference on Computer Vision and Pattern Recognition, pp. 254--261, (June 2000).  H. Chen, P. Belhumeur, and D. Jacobs.

4. 9/11 Jacobs Lecture: Geometric invariance (conclusion) and photometric invariance. (view as web page).
5. 9/16 Jacobs Presentation, David Jacobs: Classification with Invariants

(view as web page).

a. D. Lowe: Three-Dimensional Object Recognition from Single Two-Dimensional Images.  Artificial Intelligence, 1987.

b. Biederman, I. (1987). Recognition--by--components: A theory of human image understanding. Psychological Review, 94(2):115--147.  On reserve.

Jepson, A., W. Richards, and D. Knill, Modal structure and reliable inference, in ``Perception as Bayesian Inference,'' eds. D. Knill and W. Richards, Cambridge Univ. Press, 1996, pp. 63-92.
Description: This is Chapter 2 from ``Perception as Bayesian Inference.''

Biederman has many papers on this topic, including:

Biederman, I. and Gerhardstein, P. (1993). Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology: Human Perception and Performance, 19, 1162-1182.

Biederman, I. and Gerhardstein, P. (1995). Viewpoint-dependent mechanisms in visual object recognition: reply to Tarr and Bulthoff. Journal of Experimental Psychology: Human Perception and Performance, 21(6), 1506-1514.

D. Jacobs.  ``What Makes Viewpoint Invariant Properties Perceptually Salient,'' Journal of the Optical Society of America A, in press.  

6. 9/18 Rao Class canceled so students may attend a lecture by C. R. Rao: Has Statistics a Future?  If so, in what form?, 3:30-4:30, 1524 Van Munching Hall, the Howard Frank Auditorium.
7. 9/23 Jacobs Lecture: Linear subspaces – geometry & PCA

(view as web page).

Duda, Hart and Stork, pp. 114-117.  On reserve in library.

Shimon Ullman and Ronen Basri, Recognition by Linear Combinations of Models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(10): 992-1006, 1991.  Available at: http://www.wisdom.weizmann.ac.il/~ronen/publications.html

D. Jacobs, "Matching 3-D Models to 2-D Images," International Journal of Computer Vision, 21(1/2):123--153, January, 1997.  On reserve.

Turk, M. & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3, 71-86

 

8. 9/25 Jacobs Lecture: Linear subspaces – photometry

(view as web page).

Shashua. On photometric issues to feature-based object recognition. Int. J. Computer Vision, 21:99--122, 1997.

``Lambertian Reflectance and Linear Subspaces,'' IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(2):218-233, (2003).  R. Basri and D. Jacobs.  Available at: http://www.wisdom.weizmann.ac.il/~ronen/publications.html

 

9. 9/30 Nanda, Sen Presentations: Linear Representations of Classes.

T.F. Cootes and C.J. Taylor, "Statistical models of appearance for medical image analysis and computer vision", Proc. SPIE Medical Imaging 2001.  Presentation.  

“Face Recognition Based on 3D Shape Estimation from Single Images”, by Blanz and Vetter, CGF-TR 2, October 2002, University of Freiburg.  There is a version in PAMI 2003 available online; I'll add a link.  Presentation.  (view as web page).

``Statistical models of appearance for computer vision'' by Cootes and  Taylor

Lohmann, G.P. 1983.  Eigenshape analysis of microfossils: a general morphometric procedure for describing changes in shape.  Mathematical Geology 15:659-672.

10. 9/30, 4:45-6:00 ("Optional") Jacobs Presentation: Prototypes and Natural Categories.  (view as web page).

Posner, M.I., & Keele, S.W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353-363.   Available at: http://step.psy.cmu.edu/scripts/Memory/PosnerKeele1971.html

E. Rosch, C. Mervis, W. Gray, D. Johnson, and P. Boyes-Braem, ``Basic Objects in Natural Categories'', Cognitive Psychology, 8:382--439.  On reserve.

S. Ullman, High-level Vision, MIT Press, 1996, Chapter 6.

Ronen Basri, Recognition by Prototypes, International Journal of Computer Vision, 19(2): 147-168, 1996.

 

11. 10/2 Cuntoor, Raykor Presentations: Non-linear subspaces

S. Edelman, ``Representation is Representation of Similarities'', Behavioral and Brain Sciences.  Presentation (view as web page).

Joshua B. Tenenbaum, Vin de Silva, John C. Langford, ``A Global Geometric Framework for Nonlinear Dimensionality Reduction'', Science.

Sam T. Roweis, Lawrence K. Saul, Nonlinear Dimensionality Reduction by Locally Linear Embedding, Science.  Presentation  (view as web page).

H. Sebastian Seung and Daniel D. Lee, The Manifold Ways of Perception, Science.

H. Murase and S.K. Nayar. Visual Learning and Recognition of 3D Objects from Appearance. International Journal of Computer Vision, vol. 14, no. 1, Jan 1995, pp 5-24.

Ronen Basri, Dan Roth, and David Jacobs, Clustering Appearances of 3D Objects, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Santa Barbara: 414-420, 1998 

Cutzu, F., and S. Edelman, Representation of object similarity in human vision: psychophysics and a computational model, Vision Research 38:2227-2257, 1998

12. 10/7 Eaton, Mativo Presentations: The psychology of similarity and view-based recognition.

E. Goldmeier, Similarity in Visually Perceived Forms, International Universities Press, Psychological Issues, Volume VIII, Number 1, Monograph 29.  Chapters: 1-5.  On reserve.

H. Bulthoff, S. Edelman, and M. Tarr, ``How Are Three-Dimensional Objects Represented in the Brain?'' MIT AI Memo #1479.

Check out Mike Tarr's class page for many relevant references.

Poggio, T. and Edelman, S., A network that learns to recognize three-dimensional objects.  Nature, 343:263-266.

Liu Z, Knill D C, and Kersten D. Object classification for human and ideal observers. Vision Research, 35:549--568, 1995
13. 10/9 Jacobs Presentations, mixed with lecture: Shape and Nature.  Students must review (a).

(a) D'Arcy Thompson, On Growth and Form, Dover Books, 1992, Chapters 1 and 17.  On reserve.

 (b) A guide to Tree Identification (just skim this).  On reserve.

 

Morphometric tools for Landmark data, by Bookstein

D. G. Kendall. A survey of the statistical theory of shape. Statistical Science, 4(2):87--120, 1989.

Shape and Shape Theory, by Kendall, Barden, Carne and Le

Statistical Shape Analysis by I. L. Dryden and Kanti V. Mardia

Geometric Morphometrics: Ten Years of Progress Following the ‘Revolution’, by Dean C. Adams, F. James Rohlf, and Dennis E. Slice.
10/14 No Class, ICCV
10/16 No Class, ICCV
14. 10/21 Jacobs Lecture: Morphometrics.  (view as web page). See previous class.
15. 10/23 Yi (?), Tahmoush Presentations: Morphometrics and Recognition.

Serge Belongie, Jitendra Malik, and Jan Puzicha, Shape Matching and Object Recognition Using Shape Contexts, PAMI, 24(4):509-522, April 2002.

A. Rangarajan, H. Chui, and F. Bookstein, "The softassign Procrustes matching algorithm," in Proceedings of ICIAP '97, Lecture Notes in Computer Science, vol. 1310, pp. 29--42, Springer-Verlag, 1997.  Presentation (view as web page).
16. 10/28 Jacobs Lecture: Fourier descriptors and wavelets. (view as web page)

A Wavelet Tour of Signal Processing, by Mallat.

 “Introduction and Overview of Fourier Descriptors”, by Lestrel.

17. 10/30 Aggarwal, Nath Presentations: Wavelet-based texture classification.

J. S. De Bonet and P. Viola, Texture Recognition Using a Non-parametric Multi-Scale Statistical Model, CVPR ’98.  Available at: http://www.debonet.com/Research/Publications/

Presentation (view as web page).

J Portilla and E P Simoncelli. A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients. Int'l Journal of Computer Vision. 40(1):49-71, October, 2000.  Presentation
18. 11/4 Ran, Jacobs Presentations: Wavelet-based representations for object classification.

M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. von der Malsburg, R.P. Wurtz, W. Konen. Distortion Invariant Object Recognition in the Dynamic Link Architecture. IEEE Transactions on Computers, 42(3):300-311, 1993. Available at: http://citeseer.nj.nec.com/lades93distortion.html  Presentation

C. Schmid & R. Mohr (1997) Local Grayvalue Invariants for Image Retrieval, IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(5), 530-535.
19. 11/6 Jacobs Lecture: Hidden Markov models (view as web page).

Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition."

R. Dugad and U. Desai, Technical Report: SPANN-96.1, IIT Bombay.  A Tutorial on Hidden Markov Models.  Available at: http://www.ling.gu.se/~leifg/stat02/doc/newhmmtut.pdf
20. 11/11 Ho, Lee Presentations: HMMs for classification

J. Yamato, J. Ohya, and K. Ishii, “Recognizing Human Action in Time-Sequential Images Using Hidden Markov Model,” CVPR ’92, pages 379-385.

J. Li, A. Najmi, and R. Gray, ``Image Classification by a Two Dimensional Hidden Markov Model,'' IEEE Transactions on Signal Processing, February 2000.

Presentation as pdf.

21. 11/13 Jacobs Lecture: Generative models of objects.  Markov Random Fields. Gibbs Energy.  Gibbs sampling.

Presentation (as web page)

S. Geman and D. Geman. "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images", IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721--741, 1984.

U. Grenander, Y. Chow, and D. M. Keenan. "Hands. A Pattern Theoretic Study of Biological Shapes", Springer Verlag, New York, 1991.

22. 11/18 Jacobs Lecture: Skeletons and Parts. (See previous lecture notes)
23. 11/20 Mihalcik, Tran Presentations: Parts

K. Siddiqi, A. Shokoufandeh, S. J. Dickinson & S. W. Zucker.  Shock Graphs and Shape Matching. International Journal of Computer Vision, 35(1), 13-32, 1999.  Presentation (view as web page).

Pedro F. Felzenszwalb and Daniel P. Huttenlocher, Pictorial Structures for Object Recognition.

24. 11/25 Ling, Gordon Presentations:

S. Geman, D. Potter, and Z. Chi.  Composition systems.  Quarterly of Applied Mathematics, LX, 2002, 707-736.  Presentation

S. Zhu, Embedding Gestalt Laws in Markov Random Fields -- A theory for shape modeling and perceptual organization.  IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 21, No. 11, pp. 1170-1187, Nov, 1999.  Presentation (view as web page).

25 & 26.  12/2 & 12/4 Jacobs Lecture: Linear separators, naive Bayes, perceptrons, SVMs, boosting, Winnow.  (as web page). C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition.

``Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm,'' Machine Learning 2: 285--318, 1988, N. Littlestone.  To access, click "Journal Contents", then click "Issue 4, April 1988".

Pattern Classification, Duda, Hart and Stork.

27. 12/11 Fails, Shirdhonkar Presentations: Linear Classifiers

P. Viola and M. Jones. Robust real-time object detection. Technical Report 2001/01, Compaq CRL, February 2001

H. Schneiderman and T. Kanade, "Object Detection Using the Statistics of Parts", International Journal of Computer Vision.  Presentation (view as web page).

12/16 Review Session.  A.V. Williams, 4424, 3:30-5:00