CfAR WEEKLY SEMINAR

Mondays at 2:30pm in AVW 2120 (or as notified)
Fall, 2005/Spring, 2006

Organizers: Ramani Duraiswami and David Jacobs

 

  Schedule


Dates

Titles

Speakers

9/12/05

Intro to vision at Maryland

Prof. Ramani Duraiswami, David Jacobs, Rama Chellappa, and Yiannis Aloimonos

9/20/05

(Tue, 10am)

The Hybrid Imaging Approach

Dr. Yoav Schechner

9/28/05

 

 

10/03/05

Detecting Rotational Symmetries (V. Shiv Naga Prasad and Larry S. Davis); Closely Coupled Object Detection and Segmentation (Liang Zhao, Larry S. Davis); Face Recognition in the Presence of Multiple Illumination Sources (Gaurav Aggarwal, Rama Chellappa)

V. Shiv Naga Prasad, Dr. Liang Zhao, and Gaurav Aggarwal (CFAR)

10/10/05

On-Line Density-Based Appearance Modeling for Object Tracking, Deformation Invariant Image Matching, Object Recognition in High Clutter Images Using Line Features, Robust Point Matching for Two-Dimensional Nonrigid Shapes, On the Equivalence of Common Approaches to Lighting Insensitive Recognition, Fast Multiple Object Tracking via a Hierarchical Particle Filter, An Algebraic Approach to Surface Reconstruction from Gradient Fields

Bohyung Han, Haibin Ling, Philip David, Yefeng Zheng, Margarita Osadchy, Changjiang Yang, Amit Agrawal

10/17/05

No seminar, ICCV

 

10/26/05

(Wed, 11 am)

 

Prof. Kristin Dana (Rutgers University)

10/31/05

Revisiting the Image Brightness Constraint

Venu Madhav Govindu

11/28/05

Mapping land cover and land cover change using pattern recognition algorithms: status and challenges

Chengquan Huang

TBD

 

Prof. Misha Kazhdan (asst. prof. in graphics at Johns Hopkins)

Friday, February 3/06, 11:00 AM

Spectral Methods for Regularization in Learning Theory

 

Alessandro Verri

Universita' di Genova

 

Friday, February 10/06, 2:00 PM

Linear ordering of Objects Using Graph 1-Factor

Gopi Meenakshisundaram

University of California Irvine

Monday, February 13/06, 2:00 PM

Video visualization - Beyond pixels and frames

 

Yaron Caspi

The Weizmann Institute

Monday, March 13/06, 2:00 PM

Autocalibration, Crowd Segmentation and Person Reidentification - An Industry View at Challenges in Visual Surveillance

Peter Tu and Nils Krahnstoever

Visualization & Computer Vision Lab

GE Global Research

Niskayuna, New York

Friday, March 17/06, 11:00 AM

Improving Audio Source Localization by Learning the Precedence Effect

Kevin W. Wilson

MIT

Monday, April 3, 11:00 AM

Automatic Sales Lead Generation from Web Data

Raghu Krishnapuram

IBM India Research Lab

New Delhi, INDIA

 

TBD

TBD

Venu Govindaraju

Monday, April 10, 11:00 AM

Gradient domain methods for recovering shape, reflectance and illumination from images

Amit Agrawal

University of Maryland

Monday, April 17, 11:00 AM

Passive Vision, the Joy of Sitting Still

Robert Pless

Computer Science and Engineering

Washington University

St. Louis, MO

Friday, April 21, 1 PM

Modeling Age Progression in Young Faces

Narayanan Ramanathan

University of Maryland

Monday, April 24

On Unlocking Mysteries of Past Civilizations as Challenging New Problems in Computer Vision and Pattern Recognition

David B. Cooper

Professor of Engineering

Brown University

Friday, April 28

A Joint Model of Illumination and Shape for Visual Tracking

Amit Kale

Center for Visualization and Virtual Environments

University of Kentucky

Monday, May 1

The Fundamental Matrix in Human Action Recognition

Dr. Mubarak Shah
Computer Vision Lab
School of  Computer Science
University of Central Florida
http://www.cs.ucf.edu/~vision/

Friday, May 5, 2:00 PM

Non-linear dimensionality reduction

 

Alfred Hero

University of Michigan

Monday, May 8, 11:00 AM

Informational Intelligence

Dr. Stefan Jaeger

Institute for Advanced Computer Studies

Language and Media Processing Laboratory

University of Maryland

Friday, May 12, 1:00 PM

Interpolation artifacts in sub-pixel variational image processing

 

Gustavo K. Rohde

Naval Research Laboratory
Washington, DC

TBD

TBD

Sameer Shirdhonkar

 

 

 

 

 


  Abstracts


09/20/05

The Hybrid Imaging Approach

Speaker

Dr. Yoav Schechner,

Dept. of Electrical Eng. Technion - Israel Inst. Technology, Haifa

http://www.ee.technion.ac.il/~yoav/

Abstract

Computer vision typically regards images as given entities to be processed. However, richer information can be extracted by modifying and analyzing the imaging process itself. This modification includes the sensor or the illumination, in conjunction with carefully tailored algorithms. This hybridization exploits the advantages of both the sensor and the algorithmic components of a vision system. We describe our recent results in this approach, which apply to the full observation setup: illumination of the object, scattering media between the object and the camera, optical phenomena in the camera, and multi-sensor computational processing. In particular, the talk shows new results in the development of multiplexing for enhanced imaging under varying illumination directions. We then describe denoising that is tailored to vision in scattering media. In addition, we describe a method for blindly estimating simultaneous spatio-temporal inconsistencies of sensors (gain, vignetting, radiometric response). Finally, we explore audio-visual interaction, whereby a vision algorithm using a sparsity prior uniquely pinpoints the pixels that correspond to sound sources, with high definition.


10/03/05

CFAR ICCV papers

Speaker

V. Shiv Naga Prasad, Dr. Liang Zhao, and Gaurav Aggarwal (CFAR)

Abstract

Detecting Rotational Symmetries

V. Shiv Naga Prasad and Larry S. Davis

Abstract:

We present an algorithm for detecting multiple rotational symmetries in natural images. Given an image, its gradient magnitude field is computed, and information from the gradients is spread using a diffusion process in the form of a Gradient Vector Flow (GVF) field. We construct a graph whose nodes correspond to pixels in the image, connecting points that are likely to be rotated versions of one another. The $n$-cycles present in the graph are made to vote for $C_n$ symmetries, their votes being weighted by the errors in transformation between GVF in the neighborhood of the voting points, and the irregularity of the $n$-sided polygons formed by the voters.  The votes are accumulated at the centroids of possible rotational symmetries, generating a confidence map for each order of symmetry. We tested the method with several natural images.

Closely Coupled Object Detection and Segmentation

Liang Zhao, Larry S. Davis
Abstract

We propose a closely coupled object detection and segmentation algorithm for enhancing both processes in a cooperative and iterative manner. Figure-ground segmentation reduces the effect of background clutter on template matching; the matched template provides shape constraints on segmentation. More precisely, we estimate the probability of each pixel belonging to the foreground by a weighted sum of the estimates based on shape and color alone. The weight on the shape-based estimate is related to the probability that a familiar object is present and is updated dynamically so that we enforce shape constraints only where the object is present. Experiments on detecting people in images of cluttered scenes demonstrate that the proposed algorithm improves both segmentation and detection. More accurate object boundaries are extracted; higher object detection rates and lower false alarm rates are achieved than when performing the two processes separately or sequentially.
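The weighted combination of shape-based and color-based estimates can be sketched as follows. This is only an illustration under assumptions that are not in the abstract: the weight is treated as a convex mixing coefficient in [0, 1], and the function and parameter names are hypothetical.

```python
def foreground_probability(p_shape, p_color, w_shape):
    """Blend per-pixel foreground estimates from shape and color.

    p_shape, p_color: lists of per-pixel probabilities in [0, 1].
    w_shape: weight on the shape-based estimate. Assumed here to lie in
             [0, 1]; the abstract only says it is related to the probability
             that a familiar object is present.
    """
    if not 0.0 <= w_shape <= 1.0:
        raise ValueError("w_shape must be in [0, 1]")
    # Convex combination: the color estimate receives the remaining weight.
    return [w_shape * s + (1.0 - w_shape) * c for s, c in zip(p_shape, p_color)]
```

With `w_shape = 0.0` the result reduces to the color-only estimate, which mirrors the idea of enforcing shape constraints only where an object is likely to be present.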

Face Recognition in the Presence of Multiple Illumination Sources

Gaurav Aggarwal, Rama Chellappa.

Abstract:

Most existing face recognition algorithms work well for controlled images but are quite susceptible to changes in illumination and pose. This has led to the rise of analysis-by-synthesis approaches due to their inherent potential to handle these external factors. Though these approaches work quite well, most of them assume that the face is illuminated by a single light source, which is usually not true in realistic conditions. In this paper, we propose an algorithm to recognize faces illuminated by arbitrarily placed, multiple light sources. The algorithm does not need to know the number of light sources and works extremely well even while recognizing faces illuminated by different numbers of light sources. Results using this algorithm are reported on multiple-illumination datasets generated from PIE [10] and Yale Face Database B [5]. We also highlight the importance of the hard non-linearity in Lambert’s law, which is often ignored, probably to linearize the estimation process.


10/10/05

CFAR ICCV papers

Speaker

Bohyung Han, Haibin Ling, Philip David, Yefeng Zheng, Margarita Osadchy, Changjiang Yang, Amit Agrawal (CfAR)

Abstract

On-Line Density-Based Appearance Modeling for Object Tracking

Bohyung Han, Larry Davis

Object tracking is a challenging problem in real-time computer vision due to variations of lighting condition, pose, scale, and view-point over time. However, it is exceptionally difficult to model appearance with respect to all of those variations in advance; instead, on-line update algorithms are employed to adapt to these changes.

We present a new on-line appearance modeling technique which is based on sequential density approximation. This technique provides accurate and compact representations using Gaussian mixtures, in which the number of Gaussians is automatically determined. This procedure is performed in linear time at each time step, which we prove by amortized analysis. Features for each pixel and rectangular region are modeled together by the proposed sequential density approximation algorithm, and the target model is updated in scale robustly. We show the performance of our method by simulations and tracking in natural videos.

 

Deformation Invariant Image Matching

Haibin Ling and David Jacobs

We propose a novel framework to build descriptors of local intensity that are invariant to general deformations. In this framework, an image is embedded as a 2D surface in 3D space, with intensity weighted relative to distance in $x$-$y$. We show that as this weight increases, geodesic distances on the embedded surface are less affected by image deformations. In the limit, distances are deformation invariant. We use geodesic sampling to get neighborhood samples for interest points, then use a geodesic-intensity histogram (GIH) as a deformation invariant local descriptor. In addition to its invariance, the new descriptor automatically finds its support region. This means it can safely gather information from a large neighborhood to improve discriminability. Furthermore, we propose a matching method for this descriptor that is invariant to affine lighting changes. We have tested this new descriptor on interest point matching for two data sets, one with synthetic deformation and lighting change, another with real non-affine deformations. Our method shows promising matching results compared to several other approaches.
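A one-dimensional sketch of why distances become deformation invariant in the limit. It assumes the embedding scales the spatial coordinate by (1 - alpha) and the intensity by alpha, which is one way to realize the weighting described above; the function names, the sample signal, and the warp are hypothetical.

```python
import math

def curve_length(xs, intensities, alpha):
    """Length of the embedded curve ((1 - alpha) * x, alpha * I(x))."""
    total = 0.0
    for i in range(1, len(xs)):
        dx = (1.0 - alpha) * (xs[i] - xs[i - 1])
        di = alpha * (intensities[i] - intensities[i - 1])
        total += math.hypot(dx, di)
    return total

def signal(x):
    """Hypothetical 1D intensity profile."""
    return math.sin(3.0 * x)

N = 2000
XS = [i / N for i in range(N + 1)]        # sample positions on [0, 1]
WARP = [x ** 2 for x in XS]               # a monotonic deformation of [0, 1]
ORIGINAL = [signal(x) for x in XS]        # I(x)
DEFORMED = [signal(w) for w in WARP]      # I(g(x)): same intensities, moved

def length_gap(alpha):
    """Difference in end-to-end curve length before and after deformation."""
    return abs(curve_length(XS, ORIGINAL, alpha)
               - curve_length(XS, DEFORMED, alpha))

for alpha in (0.5, 1.0):
    print(alpha, length_gap(alpha))
```

At `alpha = 1.0` only intensity changes contribute to the length, so the gap should vanish up to discretization error, while at `alpha = 0.5` the two lengths differ noticeably.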

Posters

Object Recognition in High Clutter Images Using Line Features

Philip David and Daniel DeMenthon

We present an object recognition algorithm that uses model and image line features to locate complex objects in high clutter environments.

Finding correspondences between model and image features is the main challenge in most object recognition systems. In our approach, corresponding line features are determined by a three-stage process. The first stage generates a large number of approximate pose hypotheses from correspondences of one or two lines in the model and image. Next, the pose hypotheses from the previous stage are quickly ranked by comparing local image neighborhoods to the corresponding local model neighborhoods. Fast nearest neighbor and range search algorithms are used to implement a distance measure that is unaffected by clutter and partial occlusion.

The ranking of pose hypotheses is invariant to changes in image scale, orientation, and partially invariant to affine distortion. Finally, a robust pose estimation algorithm is applied for refinement and verification, starting from the few best approximate poses produced by the previous stages. Experiments on real images demonstrate robust recognition of partially occluded objects in very high clutter environments.

Robust Point Matching for Two-Dimensional Nonrigid Shapes

Yefeng Zheng and David Doermann

Recently, nonrigid shape matching has received more and more attention. For nonrigid shapes, most neighboring points cannot move independently under deformation due to physical constraints.

Furthermore, the rough structure of a shape should be preserved under a deformation; otherwise even people cannot recognize the shape. Therefore, though the absolute distance between two points may change significantly, the neighborhood of a point is well preserved in general. Based on this observation, we formulate point matching as a graph matching problem. Each point is a node in the graph, and two nodes are connected by an edge if their Euclidean distance is less than a threshold. The optimal match between two graphs is the one that maximizes the number of matched edges. The shape context distance is used to initialize the graph matching, followed by relaxation labeling to refine the match. Nonrigid deformation is overcome by bringing one shape closer to the other in each iteration using deformation parameters estimated from the current point correspondence. Experiments on real and synthesized data demonstrate the effectiveness of our approach: it outperforms the shape context and TPS-RPM algorithms under nonrigid deformation and noise on a public data set.
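The graph construction and the matched-edge objective can be sketched as below. This only evaluates the objective for a given correspondence; the shape context initialization, relaxation labeling, and deformation estimation mentioned in the abstract are not shown. All function names, the threshold value, and the sample points are hypothetical.

```python
import math
from itertools import combinations

def neighborhood_edges(points, threshold):
    """Edges (i, j), i < j, between points closer than `threshold`."""
    return {
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) < threshold
    }

def matched_edge_count(model, target, correspondence, threshold):
    """Count model edges whose endpoints map onto an edge of the target graph.

    correspondence[i] is the index in `target` matched to model point i.
    """
    model_edges = neighborhood_edges(model, threshold)
    target_edges = neighborhood_edges(target, threshold)
    count = 0
    for i, j in model_edges:
        a, b = sorted((correspondence[i], correspondence[j]))
        if (a, b) in target_edges:
            count += 1
    return count

# A small shape and a mildly deformed copy of it.
MODEL = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
TARGET = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.2), (6.0, 1.0)]
```

For these points and a threshold of 1.5, the identity correspondence `[0, 1, 2, 3]` preserves both neighborhood edges, whereas `[0, 2, 1, 3]` preserves only one, so the objective prefers the former.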

On the Equivalence of Common Approaches to Lighting Insensitive Recognition

Margarita Osadchy, David Jacobs, Michael Lindenbaum

Lighting variation is commonly handled by methods invariant to additive and multiplicative changes in image intensity. It has been demonstrated that comparing images using the direction of the gradient can produce broader insensitivity to changes in lighting conditions, even for 3D scenes. We analyze two common approaches to image comparison that are invariant: normalized correlation using small correlation windows, and comparison based on a large set of oriented difference of Gaussian filters. We show analytically that these methods calculate a monotonic (cosine) function of the gradient direction difference and hence are equivalent to the direction of gradient method. Our analysis is supported with experiments on both synthetic and real scenes.

Fast Multiple Object Tracking via a Hierarchical Particle Filter

Changjiang Yang, Ramani Duraiswami, Larry Davis

An Algebraic Approach to Surface Reconstruction from Gradient Fields

Amit Agrawal, Rama Chellappa, Ramesh Raskar.


 

 

10/31/05

Revisiting the Image Brightness Constraint

Speaker

Venu Madhav Govindu

Abstract

I will present a principled approach to using the image brightness constraint for optical flow algorithms. Using a simple noise model, a probabilistic representation for optical flow will be derived. It will be shown that this representation subsumes existing approaches to flow modeling and explains the behavior of some conventional approaches. Modified algorithms will be developed for a stratification of smoothness assumptions, namely constant, affine and smooth flow.


Friday, February 3, 11:00 AM

Spectral Methods for Regularization in Learning Theory

Speaker

Alessandro Verri

Universita' di Genova

Abstract

In this talk we show that a large class of regularization methods designed for solving ill-posed inverse problems gives rise to consistent learning algorithms. The intuition behind our approach is that, by looking at regularization from a filter function perspective, filtering out undesired components of the target function ensures stability with respect to the random sampling thereby inducing good generalization properties. We present a formal derivation of the methods under study by recalling that learning can be written as the inversion of a linear embedding equation given a stochastic discretization. Consistency as well as finite sample bounds are derived for both regression and classification.

(joint work with Lorenzo Rosasco and Ernesto De Vito)

Friday, February 10, 2:00 PM

Linear ordering of Objects Using Graph 1-Factor

Speaker

Gopi Meenakshisundaram

University of California Irvine

Abstract

Since we are destined to live with the RAM model of computing for the foreseeable future, optimal linear ordering of elements to improve cache coherency and performance of out-of-core algorithms becomes crucial. While ordering the elements, the access pattern has to be taken into account, which in turn is application dependent. Assuming that, for each pair of elements, we have a probability estimate of the second element being accessed after the first, we propose a solution to the problem of linear ordering of elements using a 1-factor graph partitioning algorithm.

 

The versatility of this algorithm is shown by its application to various problems in computer graphics including cache-coherent triangle ordering (also called stripification), simplification, compression, efficient back-face culling, quadrilateral mesh stripification, and tetrahedral mesh stripification. In simplicial complex realization of manifold spaces, the algorithm can be extended to generate space-filling curves. The graph abstraction of the problem makes the solution seamlessly extendable to elements in higher dimensions including higher dimensional databases and nodes of the hierarchical partitioning of the objects like quadtrees and octrees in computer graphics.

 

 

 

 

 

Monday, February 13, 2:00 PM

Video visualization - Beyond pixels and frames

Speaker

Yaron Caspi

The Weizmann Institute

Abstract

Video data is represented by pixels and frames. This restricts the way it is captured, accessed and visualized. On one hand, visual information is distributed across all frames, and therefore, in order to depict the visual information, the entire video sequence must be viewed sequentially, frame by frame. On the other hand, important visual information is lost by the limited frame rate. Similarly in the spatial domain, sensor and optics limit the capturing process, while huge redundancy prevents an efficient visualization of information. In this talk I will show how to exceed both limitations of capturing devices and of visual displays. In particular, I will show how fusion of information from multiple sources allows us to exceed temporal and spatial limitations, and how visualization of video data can benefit from importance ranking. I will describe a process that depicts the essence of video or animation, by embedding high dimensional data in low dimensional Euclidean space. I will also show how super-pixels (in contrast to pixels) contribute to the exploitation of temporal redundancy for the task of spatial segmentation of regions with high importance.

 

 

Monday, March 13, 2:00 PM

Autocalibration, Crowd Segmentation and Person Reidentification - An Industry View at Challenges in Visual Surveillance

Speakers

Peter Tu and Nils Krahnstoever

Visualization & Computer Vision Lab

GE Global Research

Niskayuna, New York

Abstract

The Visualization and Computer Vision Group at GE Global Research serves a large number of GE businesses in areas such as medical image processing, industrial inspection, and intelligent video systems.  We will give a brief overview of our group and will then focus on recent security related projects. We will motivate specific research needs that originate from our collaboration with GE Security. The objective of the ongoing research is to extend the functionality and robustness of the intelligent video product line currently offered by GE. The technical presentation will include recent work on crowd segmentation, person-reidentification and auto-calibration. The crowd segmentation work will focus on a parts-based approach that is able to segment crowds using a version of Expectation Maximization. A key feature of this approach is that the number of people in the crowd need not be known in advance. Our person-reidentification work focuses on the ability to fit deformable models to images and generate stable signatures based on color and edge information. Finally, the talk will present an approach to reliably and automatically perform metric camera calibration from detections and tracks of people and show how metric calibration can be utilized for various detection, tracking and crowd segmentation tasks.

 

 

Friday, March 17, 11:00 AM

Improving Audio Source Localization by Learning the Precedence Effect

Speaker

Kevin W. Wilson

MIT

Abstract

Speech source localization in reverberant environments has proved difficult for microphone array systems.  One reason for this is the failure of most techniques to properly model localization error for nonstationary signals in reverberant environments. In contrast, the human auditory system can robustly localize nonstationary sources, such as speech, in reverberant environments. Insight into the human auditory system's "error model" is provided by the precedence effect, in which people localize sources based largely on cues from sound onsets and which has been hypothesized to improve localization performance in reverberant environments. Inspired by the precedence effect, we consider the problem of learning a mapping from reverberated signal spectrograms to localization precision.

 

We find this mapping using ridge regression, and using the learned mappings in the generalized cross-correlation framework, we demonstrate improved localization performance. Additionally, the resulting mappings exhibit behavior consistent with the psychoacoustics of the precedence effect.
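For readers unfamiliar with ridge regression, a minimal generic sketch follows. It solves the regularized normal equations for hypothetical feature vectors and scalar targets; it does not reproduce the talk's spectrogram features or its localization-precision targets, and the function name is an assumption.

```python
def ridge_fit(rows, targets, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination.

    rows: list of feature vectors; targets: list of scalars; lam: penalty >= 0.
    No intercept term is fitted in this sketch.
    """
    d = len(rows[0])
    # Augmented normal-equation matrix [X^T X + lam * I | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
          for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(d)]
    # Forward elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, d):
            factor = a[r][col] / a[col][col]
            for c in range(col, d + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    w = [0.0] * d
    for i in reversed(range(d)):
        w[i] = (a[i][d] - sum(a[i][j] * w[j] for j in range(i + 1, d))) / a[i][i]
    return w
```

With `lam = 0` this reduces to ordinary least squares; increasing `lam` shrinks the weights toward zero, trading some bias for stability when features are noisy or correlated.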

 

 

Monday, April 3, 11:00 AM

Automatic Sales Lead Generation from Web Data

Speaker

Raghu Krishnapuram

IBM India Research Lab

New Delhi, INDIA

Abstract

The World Wide Web has grown into an "information-mesh", with most important facts being reported through Web sites. Several newspapers, press releases, trade journals, business magazines and other related sources are on-line. These sources could be used to identify prospective buyers of products and services automatically. In this talk, we present a system called ETAP (Electronic Trigger Alert Program) that extracts "trigger events" from Web data that help in identifying prospective customers. Trigger events are events of corporate relevance and are indicative of the propensity of companies to purchase new products. Examples of trigger events are "change in management", "revenue growth" and "mergers & acquisitions". We pose the problem of trigger event extraction as a classification problem and present methods to generate the training data required to learn the classifiers automatically. We also propose a method of feature abstraction that uses named entity recognition to solve the problem of data sparsity. Our experiments show the effectiveness of the method.

 

Biographical sketch:

 

Raghu Krishnapuram received his Ph.D. degree in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, in 1987. From 1987 to 1997, he was on the faculty of the Department of Computer Engineering and Computer Science at the University of Missouri, Columbia. From 1997 to 2000, Dr. Krishnapuram was a Full Professor at the Department of Mathematical and Computer Sciences, Colorado School of Mines, Golden, Colorado. Since then, he has been at IBM India Research Lab, New Delhi. Dr. Krishnapuram's research encompasses many aspects of Web mining, information retrieval, e-commerce, fuzzy set theory, neural networks, pattern recognition, computer vision, and image processing. He has published over 160 papers in journals and conferences in these areas. Dr. Krishnapuram is an IEEE Fellow, and a co-author (with J. Bezdek, J. Keller and N. Pal) of the book "Fuzzy Models and Algorithms for Pattern Recognition and Image Processing".

 

 

Monday, April 10, 11:00 AM

Gradient domain methods for recovering shape, reflectance and illumination from images

Speaker

Amit Agrawal

University of Maryland

Abstract

Classical approaches for recovering shape such as Photometric Stereo and Shape from Shading require surface reconstruction from the estimated gradient field, which is usually non-integrable. Most of the previous approaches lack the property of local error confinement and cannot handle outliers. We analyze the space of all possible solutions for surface reconstruction from gradient fields and present a general framework for obtaining meaningful solutions in this space. We derive several new algorithms using our framework that give feature preserving reconstructions in the presence of noise and outliers.

Traditionally, edge suppression is achieved by setting the image gradients to zero using thresholds. We present an approach for removing edges in an image using another image taken under different illumination conditions by (a) gradient projection and (b) gradient field transformation using cross-projection tensors. We show results on several applications such as recovering intrinsic images (reflectance/illumination maps), recovering the foreground layer, removing shadows from color images and removing glass reflections.

 

 

Monday, April 17, 11:00 AM

Passive Vision, the Joy of Sitting Still

Speaker

Robert Pless

Computer Science and Engineering

Washington University

St. Louis, MO

Abstract

Many classical vision algorithms mimic the structure and function of the human visual system, a strategy which has successfully driven research into stereo and structure from motion based algorithms. However, for problems such as surveillance, tracking, anomaly detection and scene segmentation, the lessons of the human visual system are not so clear. For these problems, significant advantages are possible in a "Passive Vision" paradigm that advocates collecting statistical representations of scene variation from a single viewpoint over very long time periods. This talk motivates this approach by providing a collection of examples where very simple statistics, which can be easily kept over very long time periods, dramatically simplify scene interpretation problems including segmentation and feature attribution, and offer orders of magnitude performance improvement for tracking algorithms.

 

 

 

Friday, April 21, 1:00 PM

Modeling Age Progression in Young Faces

Speaker

Narayanan Ramanathan

University of Maryland

Abstract

We propose a craniofacial growth model that characterizes growth related shape variations observed in human faces during formative years. The model draws inspiration from the 'revised' cardioidal strain transformation model proposed in psychophysical studies related to craniofacial growth. The model takes into account anthropometric evidence collected on facial growth and hence is in accordance with the observed growth patterns in human faces across years. We characterize facial growth by means of growth parameters defined over facial landmarks often used in anthropometric studies. We illustrate how the age-based anthropometric constraints on facial proportions translate into linear and non-linear constraints on facial growth parameters and propose methods to compute the optimal growth parameters. The proposed craniofacial growth model can be used to predict one's appearance across years and to perform face recognition across age progression. This is demonstrated on a database of age-separated face images of individuals under 18 years of age.

 

 

Monday, April 24, 11:00 AM

On Unlocking Mysteries of Past Civilizations as Challenging New Problems in Computer Vision and Pattern Recognition

Speaker

David B. Cooper

Professor of Engineering

Brown University

Abstract

Archaeological excavation-site analysis and digital preservation of national heritage are two fields using data and having needs that are wonderful for geometry-based applications of computer vision and pattern recognition. In this talk I will briefly illustrate a number of projects – past, present, and future – that we have been involved in, and I will go into extensive detail on our work on the automatic reconstruction of ceramic pot representations from 3D noisy dense-data measurements of their damaged fragments (i.e., their pot sherds). Archaeological site analysis by computer is a wonderful playground for developing new computer-vision/pattern-recognition methodologies for analyzing, inferring, and manipulating 2D and 3D geometric structure from noisy data. The attraction is that the geometry involved is highly varied, there is an enormous number of fragments available, little work has been done in using sophisticated shape analysis for working with the data, and many problems amenable to solution have yet to be formulated.

 Ceramic pot fragments are among the most numerous finds at archaeological sites and also among the most useful for making inferences about the history and use of the site.   Typically, a ceramic pot breaks into 15 to 30 pieces, and these pot sherds might be collected into piles, each of the order of 200 sherds, which contain incomplete sets of sherds from a number of pots.  Since it takes a skilled technician anywhere from a few hours to a few days to reconstruct one such pot, very few pots are reconstructed.  In this talk, I formulate the problem of  automatic pot model estimation by computer based on 3D dense-data laser-scan measurements of the sherd outer surfaces (typically 3,000 to more than 10,000 points per sherd) as a problem in statistical learning of 3D freeform geometry based on noisy measurement data of  unorganized fragments.  What makes the problem difficult is that sherds may be missing, they may be small thus containing little surface shape information, they are chipped and they may be eroded. 

 

 

Friday, April 28, 1:00 PM

A Joint Model of Illumination and Shape for Visual Tracking

Speaker

Amit Kale

Center for Visualization and Virtual Environments

University of Kentucky

Abstract

Visual tracking involves generating an inference about the motion of an object from measured image locations in a video sequence. In this talk I will present a unified framework that incorporates shape and illumination in the context of visual tracking. First, we introduce a multiplicative, low dimensional model of illumination that is defined by a linear combination of a set of smoothly changing basis functions. Secondly, we show that a small number of centroids in this new space can be used to represent the illumination conditions existing in the scene. These centroids can be learned from ground truth and are shown to generalize well to other objects of the same class for the scene. Finally we show how this illumination model can be combined with shape in a probabilistic sampling framework. Results of the joint shape-illumination model will be demonstrated in the context of vehicle and face tracking in challenging conditions.

 

 

Monday, May 1, 11:00 AM

The Fundamental Matrix in Human Action Recognition

Speaker

Dr. Mubarak Shah
Computer Vision Lab
School of  Computer Science
University