CMSC828J Advanced Topics in Information Processing: Image Segmentation

General Information



Class Time  

Tue, Thu 12:30-1:45


CSI 3120

Course Info

See below


Readings available on reserve in CS library and on web.  See below 






David Jacobs



djacobs at cs dot umd dot edu



AVW 4421


Office hours

Wed. 3:00-4:00 or by appt.



Midterm available here. 


Image Segmentation is the process of dividing images up into meaningful subsets that correspond to surfaces or objects.  This is a central problem in vision, because recognition and reconstruction often rely on this information.  In this class we will survey a variety of different approaches to image segmentation.

What differentiates image segmentation from other clustering problems is that images have a natural 2d neighborhood structure.  As a consequence, many segmentation algorithms can be thought of as diffusing information about image similarity among nearby pixels.  We will begin by discussing diffusion processes, including anisotropic diffusion processes, which do exactly that.  We will then discuss approaches that diffuse probabilistic information by assuming Markov models of image probabilities.  These methods include Markov random fields, belief propagation, and linear relaxation labeling.  Other segmentation methods that rely on the natural graph structure of the image include normalized cut approaches to image segmentation, algebraic multigrid methods, and other graph algorithms such as shortest path methods for finding image segmentations.  Finally, we will discuss methods for applying more generic clustering techniques, such as E-M, to capitalize on the neighborhood structure of images.  Along the way, we will consider segmentation methods that rely on texture and motion cues.   The goal of the class will be to familiarize students with current research approaches to image segmentation, while at the same time teaching the theoretical foundations underlying this work.

The class will try to emphasize interactive discussions as much as possible.  We will alternate between lecture/discussions of the basic mathematical and algorithmic techniques behind segmentation methods, and discussion of research papers that apply these techniques.  Students will be required to prepare for and help lead discussions.  They will also do group projects in which they implement and test an approach to segmentation.  There will also be a midterm and final exam covering the basic computational techniques we’ve learned.  Students will find it important to have some prior knowledge of vision, mathematical sophistication, and familiarity with topics such as calculus, linear algebra, probability and statistics.


Here is my current plan for the workload of the class.  This may change during the first two weeks, as the number of students settles down.

1) Reports.  There will be about 15 classes in which we discuss research papers.  Prior to each of these classes, students must turn in a one page summary and critique of one of the papers to be discussed.  I prefer if you turn in a hardcopy to me at the start of class.  This should contain one paragraph summarizing the paper to be discussed, and one paragraph critiquing the paper.  The summary should focus on what you think is most important.  The critique should explain whether you think the paper is worthwhile, and why.  In addition, you should list two questions concerning: 1) Something you didn't understand in the paper.  2) Something about the paper you would like to discuss.  Late papers will not be accepted, since the goal of these reports is to get you to think about papers before we discuss them.  However, each student can skip three of these reports.  20% of grade

2) Paper expert.  Students will be assigned in pairs to become experts on some of the papers we will discuss.  Students should be prepared to help lead discussion, to answer our questions about these papers, and to provide interesting and insightful extra comments culled from the related literature.  This will be a substantial part of the grade.  Students will be expected to do this at least twice.  If we have too many students for effective discussion, we may resort to having the experts give more formal paper presentations.  20% of grade.  If the class is too crowded to allow everyone to be an expert twice, the percentage of the grade this counts for may be reduced.

3) Problem set, Midterm, Final.  These will be based on material from the lectures.  40% of grade

4) Project.  Student will choose one:  20% of grade

     a) Write a detailed, paper, approximately five pages in length, proposing research that extends or adapts one of the approaches discussed in class.  You may choose to base this on the papers for which you served as expert. 

     b) Programming project: in teams, student will implement a technique discussed in class, and apply it to some real data.  This is not meant to be a research project, but something closer to an extended problem set. 

Class Schedule

This schedule should be considered more of a guideline than a rigid plan.  In many cases I have scheduled an entire class to discuss a single paper, but I expect that we should be able to move a bit quicker than that.  So some extra topics are listed at the end, which we will discuss if time permits.




Background Reading

1. 8/31




2. 9/2


Lecture: Perceptual grouping in human vision

Organization in Vision by Gaetano Kanizsa, Prager Publishers, 1979.

3. 9/7


Paper discussion.

Students must write a one page review encompassing both papers below:

a. Kanizsa, G., "Subjective Contours" Sci. Am. 234 (1976) 48-52.  (On reserve in CS library, 3rd floor of AV Williams.)

b.Julesz, B., ``Subjective Contours in Early Vision and Beyond,'' DIMACS Workshop on Partitioning Data Sets, edited by Cox, I., Hansen, P., and Julesz, B., 1995.  (On reserve).





Lecture: Fourier Transforms

This material is covered in many standard techniques.  You might look at:A Wavelet Tour of Signal Processing, by Mallat for this and material on wavelets. Chapters 2 and 3 are on the Fourier Transform (on reserve in CS library).

5. 9/14


Lecture: Fourier Transforms





6. 9/21


Lecture: Diffusion Processes 

R. Ghez, Diffusion Phenomena .  John Wiley and Sons, 2001, chapter 1.  On reserve in the CS library.  You're responsible for this material.

7. 9/28





Paper Discussion

"A review of nonlinear diffusion filtering," by Joachim Weickert.  In Scale-Space Theory in Computer Vision, Lecture Notes in Computer Science, Vol. 1252, Springer, Berlin, pp. 3-28, 1997.

"Efficient and reliable schemes for nonlinear diffusion filtering", by J. Weickert, B.M. ter Haar Romeny, and M.A. Viergever, IEEE Trans. Imag. Proc., 1998.

8. 9/30




Paper Discussion

N. Sochen, R. Kimmel and R. Malladi,
``A General Framework for Low Level Vision",
IEEE Trans. in Image Processing, Special Issue on Geometry Driven Diffusion, 7 (1998) 310-318.

N. Sochen ,
``Affine Invariant Flows via the Beltrami Framework'',
Journal of Mathematical Imaging and Vision, 20:133-145, 2004.

See Sochen and Kimmel's web pages for other related material.

9. 10/5


Paper Discussion

Williams, L.R. and D.W. Jacobs, Local Parallel Computation of Stochastic Completion Fields, Neural Computation 9(4), pp. 859-881, 1997.

Williams, L.R. and D.W. Jacobs, Stochastic Completion Fields: A Neural Model of Illusory Contour Shape and Salience, Neural Computation 9(4), pp. 837-858, 1997.

D. Mumford, ``Elastica and Computer Vision'', Algebraic Geometry and its Applications, edited by Chandrajit Bajaj, New York, Springer-Verlag, 1994.  On reserve in CS library.

G. Guy and G.G. Medioni. Inferring global perceptual contours from local features. International Journal of Computer Vision, 20(1/2):113--133, 1996.

10. 10/5


Lecture: Markov models, belief propagation, Markov Random Fields, Relaxation Labeling.

Excerpt from Computer Vision by Ballard and Brown, pp. 408-430.  Available at Computer Science Library.

Textbooks on Stochastic Processes, such as Chapters 3 and 4 of An Introduction to Stochastic Modeling, by Taylor and Karlin, 3rd Edition, Academic Press, 1998.

Jordan M.I. and Weiss Y. Graphical models: probabilistic inference In Arbib, M. (ed): Handbook of Neural Networks and Brain Theory. 2nd edition. MIT Press

J. Yedidia,W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. International Joint Conference on Artificial Intelligence (IJCAI 2001) Distinguished Papers Track, 2001.

Check Lise Getoor's references for the Graphical Models Reading Group

11. 10/7


Lecture: Markov models, belief propagation, Markov Random Fields, Relaxation Labeling (continued).

Markov Random Field Modeling in Image Analysis (Computer Science Workbench) by Stan Z. Li (excerpt on reserve in CS library).

12. 10/12&14





Paper Discussion:

Williams, L.R. and K.K. Thornber, A Comparison of Measures for Detecting Natural Shapes in Cluttered Backgrounds, Intl. Journal of Computer Vision 34 (2/3), pp. 81-96, 1999.

A. Shashua and S. Ullman. Structural saliency: The detection of globallly salient structures using a locally connected network. In International Conference on Computer Vision, pages 321--327, 1988.

P. Parent and S. W. Zucker. Trace inference, curvature consistency, and curve detection. In IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 11, August 1989.

T. D. Alter and Ronen Basri, Extracting Salient Curves from Images: An Analysis of the Saliency Network, International Journal of Computer Vision, 27(1): 51-69, 1998.

Correctness of local probability propagation in graphical models with loops. Weiss Y.
Neural Computation 12 (1-41) 2000




Paper Discussion:

S.Geman and D.Geman. "Stochastic relaxation, gibbs distributions, and the bayesian restoration of images",  IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721--741, 1984.

B.S. Manjunath and R. Chellappa. Unsupervised texture segmentation using markov random field models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:478--482, 1991.

Fast Approximate Energy Minimization via Graph Cuts. Yuri Boykov, Olga Veksler, Ramin Zabih.
In IEEE transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, 2001.

J. Sun, H.-Y. Shum, and N.-N. Zheng. Stereo matching using belief propagation. In A. Heyden, G. Sparr, M. Nielsen, and P. Johansen, editors, Computer Vision - ECCV 2002, number 2351 in Lecture Notes in Computer Science, pages 510–524, 2002.






Spectral Methods:

Paper Discussion

Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence , 22(8):888-905, August 2000.

D. Spielman and S. Teng.  Spectral partitioning works: planar graphs and finite element meshs.  In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, 1996.

On Clusterings : Good, bad and spectral R. Kannan, S. Vempala and A. Vetta, in Proceedings of the Symposium on Foundations of Computer Science (2000). Final version to appear in Journal of the ACM.

15. 11/2



Paper Discussion

C.W. Gear. multibody grouping from motion images. IJCV, 29(2):133--150, 1998. (Available in CS library).

J.P. Costeira and T. Kanade. A multibody factorization method for independently moving objects. IJCV, 29(3):159--179, 1998. (Available in CS library).

Recent papers by Rene Vidal and collaborators on multi-body segmentation.


Sen (Ng)


Liu (Maila)

Huang (Verman)

Yi /Bilgic(Weiss)

Paper Discussion.  We will each read one of the papers below in detail, and discuss them all.

On Spectral Clustering: Analysis and an algorithm, Andrew Y. Ng, Michael Jordan, and Yair Weiss. In NIPS 14,, 2002

Deepak Verma and Marina Meila  "Comparison of Spectral Clustering Methods" (submitted).

Learning Segmentation with Random Walk Marina Maila and Jianbo Shi Neural Information Processing Systems, NIPS, 2001

Segmentation using eigenvectors: a unifying view. Weiss Y.  Proceedings IEEE International Conference on Computer Vision p. 975-982 (1999)


17. 11/11


Lecture: Shortest path algorithms for finding boundaries.

Intelligent Scissors for Image Composition, by Eric Mortensen and William Barrett, SIGGRAPH '95.

D. Geiger, A. Gupta, L.A. Costa, and J. Vlontzos, "Dynamic programming for detecting, tracking, and matching deformable contours", IEEE Trans. PAMI, vol. PAMI-17, no. 3, pp. 294--302, Mar. 1995.

M. Kass, A. Witkin and D. Terzopoulos : Snakes: active contour models. Int. J. of Comp. Vision, 1, 321-331 (1988).

18. 11/16




Paper Discussion.

Globally Optimal Regions and Boundaries as Minimum Ratio Cycles, Ian H. Jermyn and Hiroshi Ishikawa. IEEE Transactions on Pattern Analysis and Machine Intelligence (Special Section on Graph Algorithms and Computer Vision), Vol. 23, No. 10, pp. 1075–1088, October 2001.


19. 11/16&11/18


Lecture: Algebraic multigrid

An Introduction to Algebraic Multigrid, by Klaus Stuben. Appendix A in Multigrid, by U. Trottenberg, C. Oosterlee and A. Schuller, Academic Press, 2001.

20. 11/18





Paper Discussion:

Eitan Sharon, Achi Brandt, and Ronen Basri, Fast Multiscale Image Segmentation, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), South Carolina, Vol. I: 70-77, 2000.


21. 11/23


Lecture: Wavelets

 There are many texts on wavelets available.  I have made use of the following:

A Wavelet Tour of Signal Processing, by Stephane Mallat, Academic Press, 1998.

Ten Lectures on Wavelets, Ingrid Daubechies, SIAM, 1992.

23. 11/30





Paper Discussion.  We will discuss extensions of two previously discussed works to handle texture.  You should read both papers and write a summary covering both.

M. Galun, E. Sharon, R. Basri, A. Brandt, Texture Segmentation by Multiscale Aggregation of Filter Responses and Shape Elements, Proceedings IEEE International Conference on Computer Vision,  716-723, Nice ,France, 2003.

Jitendra Malik, Serge Belongie, Thomas Leung, and Jianbo Shi. Contour and texture analysis for image segmentation. International Journal of Computer Vision, 2000.

Psychological Review 95, 15--48. Willliams, D. and B. Julesz (1992). Filters versus textons in human and machine texture discrimination. In H. Wechsler (Ed.), Neural networks for perception, Volume 1, pp. 145--175. Academic Press Inc.

24. 11/30


Lecture: Level Sets

J. A. Sethian, Level Set Methods and Fast Marching Methods.  Cambridge monographs on applied and computational methods.  1996.  On reserve in CS library.

Level Set Methods in Image Science Richard Tsai and Stanley Osher

25. 12/2


Lecture: Level Sets (continued)


26. 12/7





Paper Discussion.

Active contour without edges

Chan, T.F.; Vese, L.A., IEEE Transactions on Image Processing, 10 (2), Feb. 2001, pp. 266 -277


27. 12/7 &12/9


Lecture: E-M

Forsyth and Ponce, Computer Vision A Modern Approach, Chapter 16


E-M tutorial by Yair Weiss

28. 12/9






Paper Discussion: A. Jepson and M. Black, Mixture models for optical flow, Tech. Report, Res. in Biol. and Comp. Vision, Dept. of Comp. Sci., Univ. of Toronto, RBCV-TR-93-44, 1993

 Multibody Factorization with Uncertainty and Missing data using the EM algorithm

Amit Gruber and Yair Weiss
To appear in the International Conference on Computer Vision and Pattern Recognition (CVPR) 2004.

Final: 12/17


Final Exam.  Friday, Dec. 17th at 1:30-3:30.