CMSC828J Advanced Topics in Information Processing: Image Segmentation

General Information



Class Time  

Mon, Wed. 3:30-4:45


CSI 1122

Course Info

See below


Readings available from instructor and on web.  See below 






David Jacobs



djacobs at cs



AVW 4421


Office hours

Mon. 2:00-3:00 or by appt.



Study guide for Final

Papers are now complete for discussion on March 7.

Rubric for presentations posted

The midterm was posted 3/26 (see below). 

On 3/27 I updated the midterm with two minor clarifications.

On Monday, April 2, office hours will be 9:30-10:30, and after class.



Image Segmentation is the process of dividing images up into meaningful subsets that correspond to surfaces or objects.  This is a central problem in vision, because recognition and reconstruction often rely on this information.  In this class we will survey a variety of different approaches to image segmentation.

What differentiates image segmentation from other clustering problems is that images have a natural 2D neighborhood structure. As a consequence, many segmentation algorithms can be thought of as diffusing information about image similarity among nearby pixels. We will begin by discussing diffusion processes, including anisotropic diffusion processes, which do exactly that. At the same time, we will discuss other local operations, such as edge detectors, that make judgements about image boundaries based on this information. We will then discuss approaches that diffuse probabilistic information by assuming Markov models of image probabilities. These methods include Markov random fields, belief propagation, and linear relaxation labeling. Other segmentation methods that rely on the natural graph structure of the image include normalized cut approaches to image segmentation, algebraic multigrid methods, and other graph algorithms such as shortest path methods for finding image segmentations. Finally, we will discuss methods for applying more generic clustering techniques, such as E-M, to capitalize on the neighborhood structure of images. Along the way, we will consider segmentation methods that rely on texture, color, and motion cues. The goal of the class will be to familiarize students with current research approaches to image segmentation, while at the same time teaching the theoretical foundations underlying this work. A secondary goal will be to introduce many concepts that are fundamental in low-level vision (e.g., filtering, edge detection, color and texture analysis).
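To make the idea of diffusing similarity information among nearby pixels concrete, here is a minimal sketch of nonlinear diffusion in the style of Perona and Malik, written with NumPy. It is an illustration only, not the course's reference implementation: the function name `perona_malik`, the parameters `n_iter`, `kappa` and `step`, and the choice of an exponential conductance are all assumptions made for this example.

```python
import numpy as np


def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Explicit Perona-Malik style diffusion on a 2-D grayscale image.

    kappa sets the contrast above which a difference is treated as an
    edge; step is the time step (at most 0.25 for this 4-neighbor scheme).
    """
    u = np.asarray(image, dtype=float).copy()
    for _ in range(n_iter):
        # Differences to the four nearest neighbors. The border is
        # replicated, so no intensity flows across the image boundary.
        padded = np.pad(u, 1, mode="edge")
        diffs = (
            padded[:-2, 1:-1] - u,  # north
            padded[2:, 1:-1] - u,   # south
            padded[1:-1, :-2] - u,  # west
            padded[1:-1, 2:] - u,   # east
        )
        # Conductance is near 1 for small differences and falls toward 0
        # across strong edges, so smoothing slows down at boundaries.
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
        u += step * flux
    return u
```

With a constant conductance this reduces to ordinary (linear) diffusion, which blurs edges along with noise. The contrast-dependent conductance is what lets regions smooth internally while their boundaries stay sharp.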

The class will consist of lectures on basic material, and discussion and reading of work that is more current and/or speculative.  Students will be required to prepare for and help lead some of the discussions.  Students who are not leading discussions will still be expected to read papers to prepare for them.  They will also do problem sets or projects in which they implement and test an approach to segmentation, or propose novel work on segmentation.  There will also be a midterm and a final exam covering the basic computational techniques we’ve learned.  Students will find it important to have some prior knowledge of vision, mathematical sophistication, and familiarity with topics such as calculus, linear algebra, probability and statistics.


Here is my current plan for the workload of the class.  This may change during the first two weeks, as the number of students settles down.

1) Reports.  There will be (probably) 8 classes in which we discuss research papers.  Prior to each of these classes, students must turn in a one page report on the papers to be discussed.  Prior to class, I will pose one or more questions, one of which you should answer in your report.  I prefer that you turn in a hard copy to me at the start of class.  Late papers will not be accepted, since the goal of these reports is to get you to think about papers before we discuss them.  However, students need not turn in a report on days when they are presenting a paper, and may also skip one additional report.  10% of grade.

2)  Presentations.  Each student will be in charge of presenting an in-class summary of one paper, and leading a discussion about that paper.  On discussion days, three papers will be presented.  The three students presenting are encouraged to work together and coordinate their presentations.  See the posted rubric.  15% of grade.

3) Midterm, Final.  These will be based on material from the lectures, and background reading for the lectures.  50% of grade

4) Problem Set/Project.  Students will choose one.  25% of grade.

     a) Three problem sets will be assigned, requiring implementations of three of the algorithms discussed in class.

     b) Programming/research project: This is meant to be a more open-ended project for students interested in research in image segmentation.  It should involve implementation of existing or novel algorithms for segmentation, and experiments on a real-world data set.

Problem Sets

Please hand in your solution to the problem sets, including: 1) A document, with pictures when appropriate, describing your results; 2) Your code.  I would prefer to receive your code by email in a zip file, and a hardcopy of the document, but I'll accept everything by email.

If problem sets are late, I will deduct 10% for each day they are late.  Problem sets will not be accepted after the class following the due date.

Problem Set

Supplementary Material



Problem Set 1







Problem Set 2   4/4/12 4/18/12

 Problem Set 3

 Test Image

Results: 1 2 3 4

Uncompressed Results: 1 2 3 4

4/26/12 5/9/12

Class Schedule

This schedule should be considered more of a guideline than a rigid plan.





Background Reading

1. 1/25




2. 1/30


Perceptual grouping in human vision

Vision Science, by Stephen Palmer, Chapter 6.  



You're responsible for this material.  See also:


Subjective Contours in Early Vision and Beyond, by Bela Julesz.

Kanizsa, G., "Subjective Contours" Sci. Am. 234 (1976) 48-52. 

3. 2/1



Fourier Transforms (1)

This material is covered in many standard texts.  You might look at A Wavelet Tour of Signal Processing, by Mallat, for this and for material on wavelets.  Chapters 2 and 3 are on the Fourier Transform.


I also like the discussion in Elementary Functional Analysis by Shilov (This is part of the Dover Classics series, so there is a cheap paperback edition).


Some of this material is discussed in Forsyth and Ponce, Chapter 7.

4. 2/6


Fourier Transforms (2)

5. 2/8


Diffusion Processes 

R. Ghez, Diffusion Phenomena.  John Wiley and Sons, 2001, chapter 1. You're responsible for this material.

6. 2/13


Edge Detection

Forsyth and Ponce Chapter 8

7. 2/15


Non-linear Diffusion

"A review of nonlinear diffusion filtering," by Joachim Weickert.  In Scale-Space Theory in Computer Vision, Lecture Notes in Computer Science, Vol. 1252, Springer, Berlin, pp. 3-28, 1997.

See also Weickert's book: Anisotropic Diffusion in Image Processing

8. 2/20

Alex, Abhishek and Kaustav

Bilateral Filtering and Non-local Means.


Write a one page (or less) paper answering one of the following questions:

1) Based on all methods reviewed in both papers, does bilateral filtering still seem like a good filtering method?  Why or why not?

2) If you had to pick a single reason why NL-means works well, what is it?  Please explain.  What is the biggest disadvantage of NL-means relative to other methods?

Michael Elad.  On the Origin of the Bilateral Filter and ways to improve it.  IEEE Trans. on Image Processing, 2002.


A. Buades, B. Coll and J.M. Morel.  A Review of Image Denoising Algorithms, with a New One.  SIAM Journal on Multiscale Modeling and Simulation, 2005.


See also:


Carlo Tomasi and Roberto Manduchi.  Bilateral Filtering for Gray and Color Images. ICCV 1998.


Sylvain Paris and Fredo Durand.  A fast approximation of the bilateral filter using a signal processing approach.  ECCV 2006

9. 2/22


Contours, Dynamic Programming and Markov Processes

D. Geiger, A. Gupta, L.A. Costa, and J. Vlontzos, "Dynamic programming for detecting, tracking, and matching deformable contours", IEEE Trans. PAMI, vol. PAMI-17, no. 3, pp. 294--302, Mar. 1995.

Intelligent Scissors for Image Composition, by Eric Mortensen and William Barrett, SIGGRAPH '95.

Williams, L.R. and K.K. Thornber, A Comparison of Measures for Detecting Natural Shapes in Cluttered Backgrounds, Intl. Journal of Computer Vision 34 (2/3), pp. 81-96, 1999.

A. Shashua and S. Ullman. Structural saliency: The detection of globally salient structures using a locally connected network. In International Conference on Computer Vision, pages 321--327, 1988.

10. 2/27


Markov Random Fields  Notes in text

Also see Boykov, Veksler and Zabih.

Markov Random Field Modeling in Image Analysis (Computer Science Workbench) by Stan Z. Li.


Fast Approximate Energy Minimization via Graph Cuts, by Boykov, Veksler, and Zabih.

S. Geman and D. Geman. "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images",  IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721--741, 1984.

11. 2/29


Conditional Random Fields

C. Sutton and A. McCallum.  An introduction to conditional random fields for relational learning. In An Introduction to Statistical Relational Learning, edited by Getoor and Taskar. 

X. Ren, C. Fowlkes, and J. Malik.  Learning probabilistic models for contour completion in natural images. 

Kumar, S., and Hebert, M. (2006). Discriminative random fields. International Journal of Computer Vision, 68(2).

See also:

X. He, R. Zemel, and M. Carreira-Perpinan.  Multiscale conditional random fields for image labeling, CVPR 2004.

12. 3/5


Graph Cuts  See Boykov and Jolly

Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D images.
Yuri Boykov and Marie-Pierre Jolly. In International Conference on Computer Vision, (ICCV), vol. I, pp. 105-112, 2001.


GrabCut - Interactive Foreground Extraction using Iterated Graph Cuts
Carsten Rother, Vladimir Kolmogorov and Andrew Blake.
In ACM Transactions on Graphics (SIGGRAPH), August 2004.  

13. 3/7




CRFs and MRFs and Graph-based methods

Kumar, S., and Hebert, M. (2006). Discriminative random fields. International Journal of Computer Vision, 68(2).

Felzenszwalb and Huttenlocher.   Efficient Graph-based image segmentation.  IJCV 2004.

Felzenszwalb and Veksler.  Tiered Scene Labeling with Dynamic Programming.  CVPR 2010.

14. 3/12


Normalized Cut.

Forsyth and Ponce, Section 14.5

Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905, August 2000.

Luxburg. A Tutorial on Spectral Clustering.

Agarwal, et al. Beyond Pairwise Clustering.  

15. 3/14


Distribution modeling: E-M, Mean shift, Mixtures of Gaussians

Forsyth and Ponce, Computer Vision A Modern Approach, Chapter 16


E-M tutorial by Yair Weiss

16. 3/26


Level Sets

J. A. Sethian, Level Set Methods and Fast Marching Methods.  Cambridge Monographs on Applied and Computational Mathematics.  1996.  On reserve in CS library.

Level Set Methods in Image Science, by Richard Tsai and Stanley Osher

17. 3/28


Level Sets (continued)

 D. Mumford and J. Shah.  Optimal approximations by piecewise smooth functions and associated variational problems.  Comm on Pure and Applied Math, 1989. (excerpts)


Active contours without edges

Chan, T.F.; Vese, L.A., IEEE Transactions on Image Processing, 10 (2), Feb. 2001, pp. 266 -277

18. 4/2


Wavelets and wavelet shrinkage

There are many texts on wavelets available.  I have made use of the following:

A Wavelet Tour of Signal Processing, by Stephane Mallat, Academic Press, 1998.

Ten Lectures on Wavelets, Ingrid Daubechies, SIAM, 1992.

D. Donoho, I. Johnstone.  Ideal spatial adaptation by wavelet shrinkage.  Biometrika, 1994.

19. 4/4

No class today


An Introduction to Algebraic Multigrid, by Klaus Stuben. Appendix A in Multigrid, by U. Trottenberg, C. Oosterlee and A. Schuller, Academic Press, 2001.

20. 4/9



 Forsyth and Ponce, Chapter 9

21. 4/11

Discussion:  Fan, Victoria, Guangxiao


Discussion: three of these papers

M. Galun, E. Sharon, R. Basri, A. Brandt, Texture Segmentation by Multiscale Aggregation of Filter Responses and Shape Elements, Proceedings IEEE International Conference on Computer Vision,  716-723, Nice, France, 2003.

Jitendra Malik, Serge Belongie, Thomas Leung, and Jianbo Shi. Contour and texture analysis for image segmentation. International Journal of Computer Vision, 2000.

Shotton, Winn, Rother and Criminisi

Textonboost: Joint appearance, shape and context modeling for multiclass object recognition and segmentation.  ECCV 2006.

Yang, Wright, Ma and Sastry.  Unsupervised segmentation of natural images via lossy data compression.  CVIU 2008.

22. 4/16


Motion Segmentation (we didn't get to optical flow)

A. Jepson and M. Black, Mixture models for optical flow, Tech. Report, Res. in Biol. and Comp. Vision, Dept. of Comp. Sci., Univ. of Toronto, RBCV-TR-93-44, 1993


C.W. Gear, Multibody grouping from motion images. IJCV, 1998. (Available from me)
J.P. Costeira and T. Kanade.  A multibody factorization method for independently moving objects.  IJCV 1998.

23. 4/18


Riemannian manifolds -- We didn't make it to this topic this year.

Pennec, Fillard and Ayache.  A Riemannian Framework for Tensor Computing.  IJCV 2005.


Sochen, Kimmel, and Malladi.  A general framework for low-level vision.  IEEE Trans. on Image Processing, 1998.

24. 4/23





Martin, Fowlkes, and Malik.  Learning to Detect Natural Image Boundaries Using Local Brightness, Color and Texture Cues.


Dollar, P., Tu, Z., Belongie, S.: Supervised learning of edges and object boundaries. In: CVPR (2006) 1964-1971.


Sharon Alpert, Meirav Galun, Boaz Nadler, and Ronen Basri, “Detecting faint curved edges in noisy images,” European Conf. on Computer Vision (ECCV-10), Crete, Greece, 2010

25. 4/25





Y. Chai, V. Lempitsky, A. Zisserman, BiCoS: A Bi-level Co-Segmentation Method for Image Classification, ICCV 2011.  


Winn and Shotton.  The layout consistent random field for recognizing and segmenting partially occluded objects.  CVPR 2006.


Vicente, Kolmogorov, and Rother.  Cosegmentation Revisited: Models and Optimization.  ECCV 2010.

26. 4/30


Steven (Xi)



Object detection with discriminatively trained part-based models. Felzenszwalb, Girshick,  McAllester and Ramanan.  PAMI 2010

Bagon and Galun, A unified multiscale framework for discrete energy minimization (available from me)

Fast Motion Deblurring, by Cho and Lee


27. 5/2



Clustering by passing messages between data points, by Frey and Dueck, Science.

28. 5/7

Jianyu (Leo)


A. Levin and Y. Weiss, Learning to Combine Bottom-Up and Top-Down Segmentation, IJCV 2009.

Decomposing a Scene Into Geometric and Semantically Consistent Regions, by Stephen Gould, Richard Fulton, and Daphne Koller

Segmentation of Brain MR Images Through a Hidden Markov Random Field Model and the Expectation-Maximization Algorithm, by Yongyue Zhang, Michael Brady, and Stephen Smith.  IEEE Trans. on Medical Imaging, 2001.


29.  5/9






Expert Knowledge-guided segmentation system for brain MRI, by Pitiot, Delingette, Thompson and Ayache

Geodesic active contours, by Vicent Caselles, Ron Kimmel, and Guillermo Sapiro

Hoiem, Rother and Winn, 3D LayoutCRF for multi-view object class recognition and segmentation.

FINAL 5/15

1:30 PM, in Classroom




Student Honor Code

The University of Maryland, College Park has a nationally recognized Code of Academic Integrity, administered by the Student Honor Council. This Code sets standards for academic integrity at Maryland for all undergraduate and graduate students. As a student you are responsible for upholding these standards for this course. It is very important for you to be aware of the consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the Code of Academic Integrity or the Student Honor Council, please visit the Student Honor Council's website. To further exhibit your commitment to academic integrity, remember to sign the Honor Pledge on all examinations and assignments: "I pledge on my honor that I have not given or received any unauthorized assistance on this examination (assignment)."