Syllabus

CMSC 426
Image Processing (Computer Vision)
David Jacobs
Spring 2003

 

Overview

Unfortunately, the official name and course description are a bit inaccurate.  

In CMSC 426 we will study the basics of computer vision: the process of using images to learn about the world being viewed.  The course focuses on three basic topics.  The first is finding the boundaries of objects in images.  Usually, the visual properties of adjacent objects, such as their lightness or texture, are different, so we must understand how to locate changes in these properties.  Then we must learn how to integrate these local cues into interpretations of regions of images.  The second topic involves recovering the intrinsic properties of the world from one or more images.  This includes understanding how to recover the reflectance properties of an object from its appearance in an image.  It also involves using the appearance of a scene from different viewpoints to determine the depth of objects in the scene.  The third topic involves recognizing the identity of objects in a scene.  To do this, we must account for the fact that viewpoint and lighting affect the appearance of an object.  Even more challenging, different instances of the same class of object may differ somewhat; for example, different chairs may have different shapes or be made of different materials.  We will try to describe in detail some of the basic techniques used to solve these problems, and give some idea of the nature of ongoing research in these areas.

Background

Students will need prior knowledge of programming, algorithms, and math.  Programs will be written in Matlab, but we will not assume prior knowledge of this language.  However, students should feel comfortable writing programs in a new language after a brief tutorial.  They should also be familiar with the basics of algorithms.  Computer vision is more mathematical than most computer science courses.  We will talk about problems in geometry (how do we relate the 3D world to its 2D projection), functional analysis (since images are 2D functions), optimization (since we often combine visual evidence to find an optimal solution), and linear algebra (since points and lines are naturally represented and manipulated using matrices).  We assume no specific mathematical knowledge beyond calculus.  However, parts of the class will probably be tough if you haven't seen linear algebra before.  In general, the more math you know, the easier it will be to pick up the new concepts you see in the class.

Text

It is strongly recommended that you buy the text: Computer Vision: A Modern Approach, David Forsyth and Jean Ponce, Prentice Hall, 2003.

Course Work

There will be 6-8 problem sets assigned during the semester.  These will include some programming assignments to be done in Matlab, and some pen-and-paper exercises.  A problem set will typically include a mix of both types of assignments.  Homework is due at the start of class.  Problems due on Tuesday will be subject to a late penalty of 10% for every 24 hours they are late, and may not be turned in more than 72 hours late.  Problems due on Thursday will be subject to a late penalty of 10% if turned in up to 24 hours late.  They will receive a 30% penalty if turned in after that but by 11am the next Monday, and may not be turned in later than that.

There will be three exams: two quizzes and a comprehensive final.  Tentative weights: homework 30%, quizzes 30% in total, final exam 40%.
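
As a rough illustration of how the tentative weights combine, here is a minimal sketch.  It is written in Python rather than Matlab purely for illustration, and it assumes each component is reported on a 0-100 scale.  The function name and the example scores are hypothetical, and extra-credit points are deliberately left out, since they are considered separately.

```python
# Tentative weights as stated in the syllabus; they are subject to change.
WEIGHTS = {"homework": 0.30, "quizzes": 0.30, "final": 0.40}


def course_score(scores):
    """Return the weighted average of component scores (each on a 0-100 scale).

    Assumption: `scores` holds one combined score per component, e.g. the
    two quizzes have already been averaged into a single "quizzes" value.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student, for illustration only.
example = {"homework": 90.0, "quizzes": 80.0, "final": 70.0}
print(course_score(example))
```

For the hypothetical scores above, the weighted total works out to 0.30 x 90 + 0.30 x 80 + 0.40 x 70 = 79.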

Homework assignments are to be written up neatly and clearly, and programming assignments must be clear and well-documented. Programs should be written in Matlab.  A full paper copy of all of the homework must be turned in.  In addition, we will ask you to email a copy of all Matlab code to the TA.

Some homeworks and projects may have a special challenge problem. Points from the challenge problems are extra credit. This means that I do not consider these points until after the final course cutoffs have been set. 

Academic Honesty

All class work is to be done independently. You are allowed to discuss class material, homework problems, and general solution strategies with your classmates. When it comes to formulating, writing, or programming solutions, you must work alone. If you make use of other sources in coming up with your answers, you must cite these sources clearly (papers or books in the literature, friends or classmates, information downloaded from the web, whatever).

It is best to try to solve problems on your own, since problem solving is an important component of the course. But I will not deduct points if you make use of outside help, provided that you cite your sources clearly. Representing other people's work as your own, however, is plagiarism and is in violation of university policies. Instances of academic dishonesty will be dealt with harshly, and usually result in a hearing in front of a student honor council and a grade of XF.  (Note: this and other course policies are taken from those of Prof. David Mount.)

Course Outline

This is a rough outline of the course.  The specific topics discussed and the timing of the quizzes are subject to change.

  1. Introduction to computer vision.
  2. Cameras
    1. Projection and Photometry
  3. Matlab introduction & Linear Algebra Refresher
  4. Intensities – 2.1D vision.
    1. Boundary detection (edges)
      i. Linear filtering
      ii. Discontinuities (edge detection)
      iii. SNAKEs (finding the bounding contours of objects)
    2. Perceptual grouping
    3. Boundary detection (Regions).

QUIZ

    4. Lightness constancy
    5. Color
    6. Texture
    7. Features (corners, lines).
    8. Perceptual Grouping
  5. 3D Vision
    1. Stereo
    2. Structure-from-Motion
      i. Flow
      ii. 2 frame perspective structure-from-motion
      iii. Multi-frame affine structure-from-motion
  6. Application: Image-based Rendering

QUIZ

  7. Shading
    1. Photometric stereo for Lambertian
    2. Recognition as linear combinations of intensities.
  8. Recognition
    1. Template matching
    2. Pose determination
    3. Invariance
  9. Classification
  10. Approaches to Vision

FINAL EXAM