Project 1 - Correspondences and Mosaic.

Posted on Thu, February 27. The project must be submitted to the Teaching Assistant by Thursday, March 13 (11:59pm), by electronic mail to guerra@cs.umd.edu with the subject "CMSC 733: P1". You must submit a readme file and your source (Matlab .m) files. Your source files must include three files named "A.m", "B.m", and "C.m", which implement parts A, B, and C of the project, respectively. Each should read the given input and display the expected output. You must tar all of your files into a single file named p1.tar.

Overview.

The goal of this project is (A) to find corresponding points in images of the same scene, (B) to create a mosaic that consists of one larger image of the scene, and (C) to use the mosaicing technique to remove features from a video. In part A, the expected output is the first image (in alphanumerical order) with the disparity vectors for the point matches found. The outputs for parts B and C are the mosaic images. You must use the Image Processing Toolbox of Matlab to implement your project. The toolbox provides functions for reading and writing image files, edge detection, correlation and other matrix manipulation, and solving linear systems.

Part A.

The problem of feature correspondence (the stereo matching or stereo disparity problem) can be stated as finding pairs of features in two or more perspective images such that each pair corresponds to the projections of the same scene point. Stereo matching algorithms based on the similarity of features are divided into two groups according to their matching primitives: feature-based matching techniques attempt to establish a correspondence between sparse sets of image features, while area-based matching techniques apply to all of the image points. In the feature-based approach [ZDFL96], the image pair is first processed by an operator that extracts features that remain stable when the viewpoint changes. Features are detected independently in both images. The matching process is then applied to the attributes associated with the detected features.

The extraction of feature points has two basic approaches: detecting edges as a chain code, or detecting corners based on gradients and curvatures of the image intensity surface. In the first approach, the chain code, which consists of edge elements computed by an edge detection algorithm (e.g. the Canny edge detector), is searched for points of maximum curvature [AB86, DF90, MY86]. In the second approach, corners are detected directly from the image gradients; the Harris corner detector [HS88] is a version of the Plessey corner detector [No88].
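
As an illustration of the gradient-based approach, the following is a minimal sketch of a Harris-style corner response computed on a synthetic image. The Sobel kernel, the 5 x 5 box window (standing in for a Gaussian), the constant k = 0.04, and the 50% threshold are assumptions chosen for the example, not values prescribed by this project.

```matlab
% Sketch of a Harris-style corner response on a synthetic image.
% Assumed for illustration: Sobel gradients, 5x5 box window, k = 0.04.
I = zeros(40, 40);
I(11:30, 11:30) = 1;                  % bright square on a dark background

sx = [-1 0 1; -2 0 2; -1 0 1] / 8;    % Sobel kernel (horizontal derivative)
Ix = conv2(I, sx,  'same');
Iy = conv2(I, sx', 'same');

w   = ones(5) / 25;                   % averaging window
Sxx = conv2(Ix .^ 2,  w, 'same');     % entries of the local gradient matrix M
Syy = conv2(Iy .^ 2,  w, 'same');
Sxy = conv2(Ix .* Iy, w, 'same');

k = 0.04;
R = (Sxx .* Syy - Sxy .^ 2) - k * (Sxx + Syy) .^ 2;   % det(M) - k * trace(M)^2

% R is positive near corners, negative along straight edges, and zero in
% flat regions. Keep the strongest responses as candidate feature points.
[rows, cols] = find(R > 0.5 * max(R(:)));
```

On a real image you would normally also keep only local maxima of R, so that each corner yields a single feature point.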

A similarity measure for two points pL and pR is a function over two windows of size (2n+1) x (2m+1) centered at the points; it estimates how likely the two points are to correspond. The absolute intensity differences (AIS) [Ka94], the sum of squared differences (SSD), and the normalized cross correlation are instances of similarity measures. Let ML and MR be the sets of point features in the left and right images, respectively. A strength matrix is obtained by computing the similarity function between each feature point in ML and every feature point in MR. The winner-take-all (WTA) strategy [PMF85, RHZ76, ZLM81] matches the pairs of features whose strength is the maximum of both its row and its column in the strength matrix.
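
The strength matrix and the winner-take-all rule can be sketched as follows, using normalized cross correlation as the similarity measure. The images are synthetic (the right image is the left image shifted by five columns), the feature points are hand-picked rather than detected, and the variable names are illustrative only.

```matlab
% Sketch: NCC strength matrix and winner-take-all matching.
% Assumed for illustration: synthetic images, hand-picked features, n = m = 3.
rng(0);
IL = rand(60, 80);
IR = circshift(IL, [0 -5]);            % right image: left image moved 5 columns left
ML = [20 30; 35 50; 45 25];            % [row col] features in the left image
MR = [45 20; 20 25; 35 45];            % the same points after the shift, reordered
n = 3; m = 3;                          % window size is (2n+1) x (2m+1)

S = zeros(size(ML, 1), size(MR, 1));   % strength matrix
for i = 1:size(ML, 1)
    a = IL(ML(i,1)-n:ML(i,1)+n, ML(i,2)-m:ML(i,2)+m);
    a = a - mean(a(:));
    for j = 1:size(MR, 1)
        b = IR(MR(j,1)-n:MR(j,1)+n, MR(j,2)-m:MR(j,2)+m);
        b = b - mean(b(:));
        S(i, j) = sum(a(:) .* b(:)) / sqrt(sum(a(:) .^ 2) * sum(b(:) .^ 2));
    end
end

[~, bestR] = max(S, [], 2);            % best column for each row
[~, bestL] = max(S, [], 1);            % best row for each column
matches = [];
for i = 1:size(S, 1)
    if bestL(bestR(i)) == i            % maximum of both its row and its column
        matches(end+1, :) = [i, bestR(i)];
    end
end
disparity = MR(matches(:, 2), :) - ML(matches(:, 1), :);   % [d_row d_col] per match
```

Requiring the maximum in both the row and the column discards one-sided matches, at the cost of leaving some features unmatched.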

Part B.

The scene is represented by five pictures taken by one camera. The camera remained in the same position except for a rotation. Each file is in the .pgm file format. If two cameras (or one camera at two different times) take a picture of a planar scene, the images are related by a general linear transformation [FA97], no matter what the relative positions and orientations of the cameras are. The same linear transformation also relates two views of any scene (not necessarily planar) when both views are taken from the same position but in different directions. The mosaic creation problem is the following: given a number of different images, all taken by a camera that remained in the same place but was rotated in different directions, create a single large image (mosaic) that contains all the original images, warped appropriately. The mosaic maps all the images onto the same coordinate system.
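
In homogeneous pixel coordinates, this transformation can be written as a 3 x 3 matrix. The notation below (H with entries h_ij, the scale factor lambda, unprimed coordinates for the first image and primed coordinates for the second) is assumed here for illustration and is not taken from [FA97]. Fixing the last entry to 1 is one common normalization; it leaves eight free parameters.

```latex
\lambda \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
=
\begin{pmatrix}
h_{11} & h_{12} & h_{13} \\
h_{21} & h_{22} & h_{23} \\
h_{31} & h_{32} & 1
\end{pmatrix}
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix},
\qquad
x = \frac{h_{11} x' + h_{12} y' + h_{13}}{h_{31} x' + h_{32} y' + 1},
\quad
y = \frac{h_{21} x' + h_{22} y' + h_{23}}{h_{31} x' + h_{32} y' + 1}.
```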

Initially, the mosaicing process works with two images at a time. Automatically find corresponding points in both images, that is, points that are projections of the same scene point. To perform this task, you may use your Part A implementation. Record the pixel coordinates of each such point in both images. Use the point correspondences to solve for the transformation that maps points in the second image to points in the first. Each correspondence contributes two equations, so you will need at least four points in order to solve a linear system for the eight unknowns that define the transformation matrix. Merge the two images by creating a new image that contains the first image together with the result of applying the transformation to the points of the second image. Note that a transformation with eight unknowns is more general than an affine transformation, which has only six. You may have to do some interpolation, as the transformed points are not likely to fall exactly on integer pixel coordinates. The full mosaic is obtained by performing this process for each pair of images.
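
The two steps above (solving for the eight unknowns, then warping with interpolation) can be sketched as follows. This is one possible formulation under stated assumptions: the last matrix entry is fixed to 1, the correspondences are synthetic and noise-free, and names such as Htrue, p1, and p2 are illustrative only.

```matlab
% Sketch: estimate the 3x3 transformation H (last entry fixed to 1) that maps
% image-2 points to image-1 points, then warp image 2 into the image-1 frame.
% Assumed for illustration: synthetic, noise-free correspondences.
Htrue = [1.1 0.05 12; -0.03 0.95 -7; 1e-4 2e-4 1];
p2 = [10 10; 200 15; 190 140; 20 150; 100 80];       % [x y] points in image 2
q  = Htrue * [p2, ones(size(p2, 1), 1)]';
p1 = (q(1:2, :) ./ q([3 3], :))';                    % matching [x y] in image 1

N = size(p1, 1);
A = zeros(2 * N, 8);
b = zeros(2 * N, 1);
for i = 1:N
    x2 = p2(i, 1); y2 = p2(i, 2);
    x1 = p1(i, 1); y1 = p1(i, 2);
    % Two linear equations per correspondence.
    A(2*i-1, :) = [x2 y2 1 0 0 0 -x1*x2 -x1*y2];
    A(2*i,   :) = [0 0 0 x2 y2 1 -y1*x2 -y1*y2];
    b(2*i-1) = x1;
    b(2*i)   = y1;
end
h = A \ b;                                           % least squares when N > 4
H = [h(1) h(2) h(3); h(4) h(5) h(6); h(7) h(8) 1];

% Inverse warping: for each output pixel (in image-1 coordinates), find its
% source location in image 2 and interpolate bilinearly.
I2 = rand(160, 220);                                 % stand-in for the second image
[X1, Y1] = meshgrid(1:260, 1:170);                   % output grid in the image-1 frame
s  = H \ [X1(:)'; Y1(:)'; ones(1, numel(X1))];
X2 = reshape(s(1, :) ./ s(3, :), size(X1));
Y2 = reshape(s(2, :) ./ s(3, :), size(X1));
W  = interp2(I2, X2, Y2, 'linear', NaN);             % NaN where image 2 has no data
```

Warping backwards from the output grid avoids holes in the result. The NaN entries of W mark pixels that image 2 does not cover, which is where the first image would be used when merging.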

Part C.

In the video, a planar poster is obscured by several wooden dowels, which form a fence in front of it. Create a mosaic of the poster, filling in the pieces hidden behind the dowels from other views. The video is taken by a constantly translating camera. First, compute the normal/optical flow between two images in order to identify the fence. Once the fence is identified, replace its pixels with blanks (white pixels). Then use the mosaic process of Part B to fill in the blanks in each frame of the video.
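
As a starting point for the flow computation, here is a minimal sketch of normal flow between two frames, derived from the brightness constancy equation Ix*u + Iy*v + It = 0. The frames are synthetic, and the gradient threshold and speed threshold are hypothetical values chosen for the example; how flow magnitude separates the fence from the poster in the actual video is left for you to determine.

```matlab
% Sketch: normal flow between two frames.
% Assumed for illustration: synthetic frames and hand-set thresholds.
[X, Y] = meshgrid(1:64, 1:48);
f1 = sin(X / 4);                      % smooth vertical stripes
f2 = sin((X - 1) / 4);                % same pattern moved 1 pixel to the right

[Ix, Iy] = gradient((f1 + f2) / 2);   % spatial derivatives of the averaged frames
It = f2 - f1;                         % temporal derivative
g  = sqrt(Ix .^ 2 + Iy .^ 2);         % gradient magnitude

valid = g > 0.05;                     % normal flow is unreliable where the gradient vanishes
un = zeros(size(f1));
un(valid) = -It(valid) ./ g(valid);   % signed speed along the gradient direction

vx = un .* Ix ./ max(g, eps);         % components of the normal-flow vector
vy = un .* Iy ./ max(g, eps);

moving = valid & abs(un) > 0.5;       % hypothetical speed threshold for a motion mask
```

Normal flow gives only the component of motion along the image gradient, so it recovers the full motion here only because the stripes move perpendicular to their edges.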

Submission Files.

You must submit a readme file named readme.txt. The readme file must list and describe all files in your tar file, and should also contain a description of your implementations. You must also submit your source (Matlab .m) files. All files must be in a single tar file (p1.tar) and sent to guerra@cs.umd.edu.

References

[AB86] H. Asada and M. Brady. The curvature primal sketch, IEEE Trans. Pattern Analysis and Machine Intelligence, Volume 8, pages 2-14, 1986.
[DF90] R. Deriche and O. Faugeras. 2D-curves matching using high curvature points: Applications to stereovision, In Proceedings of the 10th International Conference on Pattern Recognition, Volume 1, pages 240-242, 1990.
[FA97] C. Fermuller and Y. Aloimonos. On the geometry of visual correspondence, International Journal of Computer Vision, Volume 21, Number 3, pages 223-247, 1997.
[HS88] C. Harris and M. Stephens. A combined corner and edge detector, In Proceedings of the Alvey Conference, pages 189-192, 1988.
[Ka94] T. Kanade. Development of a video-rate stereo machine, In Proceedings of ARPA Image Understanding Workshop, pages 549-558, 1994.
[MY86] G. Medioni and Y. Yasumoto. Corner detection and curve representation using cubic B-splines, In Proceedings of the International Conference on Robotics and Automation, pages 764-769, 1986.
[No88] J. Noble. Finding corners, Image and Vision Computing, Volume 6, pages 121-128, 1988.
[PMF85] S. Pollard, J. Mayhew, and J. Frisby. PMF: A stereo correspondence algorithm using a disparity gradient limit, Perception, Volume 14, pages 449-470, 1985.
[RHZ76] A. Rosenfeld, R. Hummel, and S. Zucker. Scene labeling by relaxation operations, IEEE Trans. Systems, Man, and Cybernetics, Volume 6, Number 6, pages 420-433, 1976.
[ZDFL96] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong. A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry, Artificial Intelligence Journal, Volume 78, pages 87-119, 1996.
[ZLM81] S. Zucker, Y. Leclerc, and J. Mohammed. Continuous relaxation and local maxima selection: Conditions for equivalence, IEEE Trans. Pattern Analysis and Machine Intelligence, Volume 3, pages 117-127, 1981.
