
|
About me ...
I am a doctoral candidate in the Department of Computer Science at the University of Maryland, College Park.
I got my Bachelor's from the Department of Computer Science and Engineering at the Indian Institute of Technology, Madras (India).
My research interests are in Computer Vision, Pattern Recognition, Image Processing and related areas.
My PhD advisor is Dr. Rama Chellappa.
My thesis focusses on illumination-invariant matching of objects.
In addition, I have done research on large scale indexing of biometrics, shape matching and indexing, cohort analysis for biometric
matching, video-based face recognition, tracking and 3D object matching.
I was a long-term research intern at IBM T J Watson Research Center from February 2006 to October 2006.
At IBM, I worked with Dr. Nalini Ratha and Dr. Ruud Bolle, who are part of the Exploratory Computer Vision Group.
I am planning to graduate in Fall 2007. I am looking for a full-time position in a research laboratory.
If you are a prospective employer, please take a look at my CV and drop me a line at gaurav AT cs DOT umd DOT edu.
|
- Soma Biswas, Gaurav Aggarwal and Rama Chellappa. Robust Estimation of Albedo for Illumination-invariant Matching and Shape
Recovery. In Proceedings of the Eleventh IEEE International Conference on Computer Vision (ICCV), Rio, Brazil, October, 2007.
pdf
Abstract
In this paper, we propose a non-stationary stochastic filtering
framework for the task of albedo estimation from a
single image. There are several approaches in the literature for
albedo estimation, but few include the errors in estimates of
surface normals and light source directions to improve the
albedo estimate. The proposed approach effectively utilizes
the error statistics of surface normals and illumination direction
for robust estimation of albedo. The albedo estimate
obtained is further used to generate albedo-free normalized
images for recovering the shape of an object. Illustrations
and experiments are provided to show the efficacy of the approach
and its application to illumination-invariant matching
and shape recovery.
- Gaurav Aggarwal, Soma Biswas and Rama Chellappa. Symmetric Shapes are Hardly Ever Ambiguous. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June, 2007.
pdf
Abstract
Given any two images taken under different illumination conditions, there always exists a physically realizable object which is consistent with both the images even if the lighting
in each scene is constrained to be a known point light source at infinity [10]. In this paper, we show that images are much less ambiguous for the class of bilaterally symmetric Lambertian
objects. In fact, the set of such objects can be partitioned into equivalence classes such that it is always possible to distinguish between two objects belonging to different
equivalence classes using just one image per object. The conditions required for two objects to belong to the same equivalence class are very restrictive, thereby leading to the
conclusion that images of symmetric objects are hardly ambiguous. The observation leads to an illumination-invariant matching algorithm to compare images of bilaterally symmetric
Lambertian objects. Experiments on real data are performed to show the implications of the theoretical result even when the symmetry and Lambertian assumptions are
not strictly satisfied.
- Soma Biswas, Gaurav Aggarwal, and Rama Chellappa. Efficient Indexing For Articulation Invariant Shape
Matching And Retrieval. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June, 2007.
pdf
Abstract
Most shape matching methods are either fast but too simplistic to give the desired performance or promising as far as performance is concerned but computationally demanding.
In this paper, we present a very simple and efficient approach that not only performs almost as well as many state-of-the-art techniques but also scales up to large databases.
In the proposed approach, each shape is indexed based on a variety of simple and easily computable features which are invariant to articulations and rigid transformations. The
features characterize pairwise geometric relationships between interest points on the shape, thereby providing robustness to the approach. Shapes are retrieved using an
efficient scheme which does not involve costly operations like shape-wise alignment or establishing correspondences. Even for a moderate size database of 1000 shapes, the retrieval
process is several times faster than most techniques with similar performance. Extensive experimental results are presented to illustrate the advantages of our approach
as compared to the best in the field.
- Gaurav Aggarwal and Rama Chellappa. Learning Symmetry: A Shape from Shading Approach. The Snowbird Learning Workshop, San Juan, Puerto Rico, March, 2007.
pdf
Abstract
Given that an object or scene is symmetric, suitable geometric and/or photometric constraints can be derived which aid
in machine understanding of its images. There are several examples of the application of the symmetry assumption [4][3][1][2]
in the Computer Vision literature, but hardly any of them emphasizes evaluating the symmetry of the scene. Quite clearly,
a symmetry assumption, when forced on an asymmetric scene, may result in incorrect inferences. In this paper, we propose a
theoretical formulation to verify the symmetry of a scene or object given just one image. The symmetric points in a scene
will in general have different intensity values in an image due to asymmetric placement of illumination sources, which makes
the problem non-trivial. The Shape from Shading formulation we present models the physical process of image formation,
thereby making it possible to evaluate the symmetry of an object given just one image taken under an arbitrary illumination source.
- Rama Chellappa and Gaurav Aggarwal. Video Biometrics. Invited Paper in International Conference on Image Analysis and Processing, September, 2007.
pdf
Abstract
A strong requirement to come up with secure and user-friendly ways to authenticate and identify people, to safeguard
their rights and interests, has probably been the main guiding force behind biometrics research. Though a
vast amount of research has been done to recognize people based on still images, the problem is still far from solved
for unconstrained scenarios. This has led to an increased interest in using video for the task of biometric recognition.
Not only does video have the potential to provide more information, but it is also more suitable for recognizing people in general
surveillance scenarios. Other than the multitude of still frames, video makes it possible to characterize biometrics
based on inherent dynamics like gait, which is not possible with still images. In this paper, we discuss several recent
algorithms to illustrate the usefulness of videos to identify people based on their face and gait. A brief discussion on
remaining challenges is also included.
- Gaurav Aggarwal, Nalini K. Ratha, Ruud M. Bolle and Rama Chellappa. Cohort Analysis for Biometric Verification and Identification. Under review for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2007.
Abstract
Most biometric matching techniques make decisions based solely on a score that represents the
similarity of the query template with reference template(s) of the claimed identity. Though there have
been attempts to perform score fusion, the emphasis has mainly been on multi-classifier and template
fusion (when multiple templates per identity are available). These commonly adopted fusion techniques
rarely make use of the large number of non-matching templates in the database. The situation is no
different for identification techniques. Though the query is compared with all the enrolled entities, the
identification decision is often made based solely on either the top score or manual examination of the
top few entries. In this paper, we propose algorithms that make use of the often ignored non-matching
templates to improve the verification and identification performance. For each enrolled subject, we
identify its cohort (similar identities) based on a simple score based selection criterion. The similarity
score of a query template is estimated using its similarity not only with the claimed identity but also
with the cohort of the claimed identity. The scores are fused using two different approaches: a likelihood
ratio based normalization scheme and a Support Vector Machine (SVM)-based classifier. No a priori
knowledge about the database or matcher is required. We perform experiments on multiple biometrics
databases in face and fingerprint modalities using several different matching algorithms to show that the
proposed cohort-based algorithms significantly improve the verification and identification performance
at the expense of a few extra matches.
- Rama Chellappa and Gaurav Aggarwal. Pose and Illumination Issues in Face and Gait-based
Recognition. Advances in Biometrics: Sensors, Systems and Algorithms, Nalini Ratha and Venu
Govindaraju (Eds.), Springer (in press).
Abstract
The chapter aims at exposing the readers to the challenges involved in accounting for
pose and illumination changes for face and gait based human identification. A carefully
chosen snapshot of the research done on the problem is presented. An attempt has been
made to provide intuition, which is often found missing in such chapters. Necessary
mathematical formulations are provided wherever required for good understanding.
Face recognition, being much more mature than gait-based human identification,
is given more attention.
- S. Kevin Zhou, Gaurav Aggarwal, Rama Chellappa, and David W. Jacobs. Appearance Characterization of Linear Lambertian Objects, Generalized Photometric Stereo and Illumination-Invariant Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 29(2), 230-245, February, 2007.
pdf
Abstract
Traditional photometric stereo algorithms employ a Lambertian reflectance model with a varying
albedo field and involve the appearance of only one object. In this paper, we generalize photometric
stereo algorithms to handle all appearances of all objects in a class, in particular the human face class,
by making use of the linear Lambertian property. A linear Lambertian object is one which is linearly
spanned by a set of basis objects and has a Lambertian surface. The linear property leads to a rank
constraint and, consequently, a factorization of an observation matrix that consists of exemplar images of
different objects (e.g., faces of different subjects) under different, unknown illuminations. Integrability
and symmetry constraints are used to fully recover the subspace bases using a novel linearized algorithm
that takes the varying albedo field into account. The effectiveness of the linear Lambertian property is
further investigated by using it for the problem of illumination-invariant face recognition using just one
image. Attached shadows are incorporated in the model by a careful treatment of the inherent nonlinearity
in Lambert's law. This enables us to extend our algorithm to perform face recognition in the
presence of multiple illumination sources. Experimental results using standard data sets are presented.
- Soma Biswas, Gaurav Aggarwal and Rama Chellappa. Invariant Geometric Representation of 3D Point Clouds for Registration and Matching. In Proceedings of the Thirteenth International Conference on Image Processing (ICIP), Atlanta, GA, USA, October, 2006.
pdf
Abstract
Though implicit representations of surfaces have often been
used for various computer graphics tasks like modeling and
morphing of objects, they have rarely been used for registration
and matching of 3D point clouds. Unlike in graphics, where
the goal is precise reconstruction, we use isosurfaces to derive
a smooth and approximate representation of the underlying
point cloud which helps in generalization. Implicit surfaces
are generated using a variational interpolation technique. Implicit
function values on a set of concentric spheres around
the 3D point cloud of the object are used as features for matching.
Geometric invariance is achieved by decomposing the implicit-value-based
feature set into various spherical harmonics.
The decomposition provides a compact representation of
3D point clouds while achieving rotation invariance.
- Gaurav Aggarwal, Nalini Ratha and Ruud M. Bolle. Biometric Verification: Looking Beyond Raw Similarity Scores. In Proceedings of the IEEE Computer Society Workshop on Biometrics (CVPR), New York, USA, June, 2006.
pdf
Abstract
Most biometric verification techniques make decisions
based solely on a score that represents the similarity of the
query template with the reference template of the claimed
identity stored in the database. When multiple templates are
available, a fusion scheme can be designed using the similarities
with these templates. Combining several templates
to construct a composite template and selecting a set of useful
templates have also been reported in addition to the usual
multi-classifier fusion methods when multiple matchers are
available. These commonly adopted techniques rarely make
use of the large number of non-matching templates in the
database or training set. In this paper, we highlight the
usefulness of such a fusion scheme while focusing on the
problem of fingerprint verification. For each enrolled template,
we identify its cohorts (similar fingerprints) based on
a selection criterion. The similarity scores of the query template
with the reference template and its cohorts from the
database are used to make the final verification decision using
two approaches: a likelihood ratio based normalization
scheme and a Support Vector Machine (SVM)-based
classifier. We demonstrate the accuracy improvements using
the proposed method with no a priori knowledge about
the database or the matcher under consideration using a
publicly available database and matcher. Using our cohort
selection procedure and the trained SVM, we show that accuracy
can be significantly improved at the expense of a few
extra matches.
- S. Saha, C. Shen, C. Hsu, Gaurav Aggarwal, A. Veeraraghavan, A. Sussman and S. Bhattacharyya. Model-Based OpenMP Implementation of a 3D Facial Pose Tracking System. Workshop on Parallel
and Distributed Multimedia, ICPP Workshops, 2006.
pdf
Abstract
Most image processing applications are characterized
by computation-intensive operations, and high memory and
performance requirements. Parallelized implementations on
shared-memory systems offer an attractive solution to this class of
applications. However, we cannot thoroughly exploit the advantages
of such architectures without proper modeling and analysis
of the application. In this paper we describe our implementation
of a 3D facial pose tracking system using the OpenMP platform.
Our implementation is based on a design methodology that uses
coarse-grain dataflow graphs to model and schedule the application.
We present our modeling approach, details of the implementation
that we derived based on this modeling approach, and
associated performance results. The parallelized implementation
achieves significant speedup, and meets or exceeds the target
frame rate under various configurations.
- S. Saha, V. Kianzad, J. Schlessman, G. Aggarwal, S. S. Bhattacharyya, W. Wolf and R. Chellappa. An Architectural Level Design Methodology for Smart Camera Applications. International Journal for Embedded Systems, 2006.
pdf
Abstract
Today's embedded computing applications are characterized by
increased functionality, and hence increased design complexity
and processing requirements. The resulting design spaces are vast.
As a result, designers are typically able to evaluate only small subsets
of architectural solutions, partitionings, and mappings of the
system functionalities. A more comprehensive design space exploration
enables designers to select higher quality solutions and provides
substantial savings on the overall cost of the system.
However, such exploration is not often practiced today since there
is a lack of efficient methodologies and design tools to facilitate
the exploration.
In this paper, we propose an architectural level design methodology
that provides a means for such comprehensive design space
exploration. We develop our methodology in the context of implementing
two smart camera applications — an embedded face
detection system and a 3D facial pose tracking system. The target
platforms for this study include a reconfigurable system on chip, a
multiprocessor system, and a programmable digital signal processor
(PDSP) system. We present models for performance estimation
and validate these models with experimental values obtained from
implementing our system on different hardware and software platforms.
This modelling approach is shown to be efficient, accurate,
and intuitive for designers to work with. Using this approach, we
show how a wide range of design options can be selected that
trade off various architectural features.
- Gaurav Aggarwal and Rama Chellappa. Face Recognition in the Presence of Multiple Illumination Sources. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV), Beijing, China, Pages 1169-1176, October, 2005.
pdf
Abstract
Most existing face recognition algorithms work well for controlled images but are quite susceptible to changes in illumination and pose. This has led to the rise of analysis-by-synthesis approaches due to their inherent potential to handle these external factors. Though these approaches work quite well, most of them assume that the face is illuminated by a single light source, which is usually not true in realistic conditions. In this paper, we propose an algorithm to recognize faces illuminated by arbitrarily placed, multiple light sources. The algorithm does not need to know the number of light sources and works extremely well even while recognizing faces illuminated by different numbers of light sources. Results using this algorithm are reported on multiple-illumination datasets generated from PIE and Yale Face Database B. We also highlight the importance of the hard non-linearity in Lambert's law, which is often ignored, probably to linearize the estimation process.
- Rama Chellappa, S. Srinivasan, Gaurav Aggarwal and Ashok Veeraraghavan. Image Sequence Stabilization, Mosaicking and Super-Resolution. Handbook of Image and Video Processing, 2nd Edition, A. Bovik (Ed.), Academic Press, 2005.
Abstract
Image stabilization, mosaicking and motion super-resolution are processes operating on a temporal sequence of images of a largely static scene viewed by a moving camera. The apparent motion observed in the image can be approximated to comply with a global motion model under a variety of circumstances. A simple and efficient algorithm for recovering the global motion parameters is presented here. The 2D stabilization, mosaicking and super-resolution processes are described, and experimental results are demonstrated. The estimation of 2D and 3D motion has been studied for over two decades now, and the following bibliography provides a useful set of starting references for the interested reader.
- Gaurav Aggarwal, Soma Biswas and Rama Chellappa. UMD Experiments with FRGC data. In Proceedings of IEEE Workshop on Face Recognition Grand Challenge Experiments (CVPR), June, 2005.
pdf
Abstract
Although significant work has been done in the field of face recognition, the performance of state-of-the-art face recognition algorithms is not good enough to be effective in operational systems. Though most algorithms work well for controlled images, they are quite susceptible to changes in illumination and pose. The Face Recognition Grand Challenge (FRGC) is an effort to examine such issues to suitably guide future research in the area. This paper describes the efforts made at UMD in this direction. We present our results on several experiments suggested in FRGC. We believe that though pattern classification techniques play an extremely significant role in automatic face recognition under controlled conditions, physical modeling is required to generalize across varying situations. Accordingly, we describe a generative approach to recognize faces across varying illumination. Unlike most current methods, our method does not ignore shadows. Instead, we use them to our benefit by modeling attached shadows in our formulation.
- Gaurav Aggarwal, Ashok Veeraraghavan and Rama Chellappa. 3D Facial Pose Tracking in Uncalibrated Videos. International Conference on Pattern Recognition and Machine Intelligence (PReMI), 2005. Published in Lecture Notes in Computer Science, Volume 3776, Pages 515-520, Dec 2005.
pdf
Abstract
This paper presents a method to recover the 3D configuration of a face in each frame of a video. The 3D configuration consists of the three translational parameters and the three orientation parameters which correspond to the yaw, pitch and roll of the face. Such information is important for applications like face modeling, recognition, expression analysis, etc. which require head stabilization. The approach combines the structural advantages of geometric modeling with the statistical advantages of a particle-filter based inference. The face is modeled as the curved surface of a cylinder which is free to translate and rotate arbitrarily. The geometric modeling takes care of pose and self-occlusion while the statistical modeling handles moderate occlusion and illumination variations. Experimental results on multiple datasets are provided to show the efficacy of the approach. The insensitivity of our approach to calibration parameters (focal length) is also shown.
- Rama Chellappa, Ashok Veeraraghavan and Gaurav Aggarwal. Pattern Recognition in Video. Invited paper in International Conference on Pattern
Recognition and Machine Intelligence (PReMI), 2005. Published in Lecture Notes in Computer Science, Volume 3776, Pages 11-20, Dec 2005.
pdf
Abstract
Images constitute data that live in a very high dimensional space, typically of the order of a hundred thousand dimensions. Drawing inferences from correlated data of such high dimensions often becomes intractable. Therefore, several of these problems, like face recognition, object recognition and scene understanding, have traditionally been approached using techniques in pattern recognition. Such methods, in conjunction with methods for dimensionality reduction, have been highly popular and successful in tackling several image processing tasks. Of late, the advent of cheap, high quality video cameras has generated new interest in extending still image-based recognition methodologies to video sequences. The added temporal dimension in these videos makes problems like face and gait-based human recognition, event detection and activity recognition addressable. Our research has focussed on solving several of these problems through a pattern recognition approach. Of course, in video streams, patterns refer to both patterns in the spatial structure of image intensities around interest points and temporal patterns that arise either due to camera motion or object motion. In this paper, we discuss the applications of pattern recognition in video to problems like face and gait-based human recognition, behavior classification, activity recognition and activity based person identification.
- V. Kianzad, S. Saha, J. Schlessman, G. Aggarwal, S. S. Bhattacharyya, W. Wolf, R. Chellappa. An Architectural Level Design Methodology for Embedded Face Detection. In Proceedings of the 3rd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), Jersey City, NJ, USA, Pages 136-141, 2005.
pdf
Abstract
Face detection and recognition research has attracted great attention in recent years. Automatic face detection has great potential in a large array of application areas, including banking and security system access control, video surveillance, and multimedia information retrieval. Face detection is a computationally difficult problem, the complexity of which is exacerbated when targeted for mobile and embedded deployment. In this paper, we discuss an architectural level design methodology for implementation of an embedded face detection system on a reconfigurable system on chip. We present models for performance estimation and validate these models with experimental values obtained from implementing our system on an FPGA platform. This modeling approach is shown to be efficient, accurate, and intuitive for designers to work with. Using this approach, we present several design options that trade off various architectural features.
- Gaurav Aggarwal, Amit R. Chowdhury and Rama Chellappa. A System Identification Approach for Video-based Face Recognition. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR), Cambridge, UK, Pages 175-178, August, 2004.
pdf
Abstract
The paper poses video-to-video face recognition as a dynamical system identification and classification problem. Video-to-video means that both gallery and probe consist of videos. We model a moving face as a linear dynamical system whose appearance changes with pose. An auto-regressive and moving average (ARMA) model is used to represent such a system. The choice of ARMA model is based on its ability to take care of the change in appearance while modeling the dynamics of pose, expression, etc. Recognition is performed using the concept of subspace angles to compute distances between probe and gallery video sequences. The results obtained are very promising given the extent of pose, expression and illumination variation in the video data used for experiments.
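For readers unfamiliar with subspace angles, the idea mentioned in the last abstract above can be illustrated with a minimal sketch. This is not the computation used in the paper, which measures angles between ARMA models; it only shows the plain principal angles between the column spaces of two matrices, assuming both have full column rank. The function names `principal_angles` and `subspace_distance` are hypothetical and chosen for illustration only.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Illustrative assumption: A and B have the same number of rows and
    full column rank.
    """
    # Orthonormal bases for the two column spaces (reduced QR).
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Guard against round-off pushing values slightly outside [-1, 1].
    cosines = np.clip(cosines, -1.0, 1.0)
    return np.arccos(cosines)


def subspace_distance(A, B):
    """One simple distance: the Euclidean norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(A, B)))


if __name__ == "__main__":
    # Two planes in R^3 that share the x-axis and are otherwise orthogonal.
    xy_plane = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
    xz_plane = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
    print(principal_angles(xy_plane, xz_plane))
    print(subspace_distance(xy_plane, xy_plane))
```

A smaller distance means the two subspaces are more closely aligned; other choices, such as using only the largest angle, are equally valid ways to turn the angles into a single number.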
|
These documents are made available as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each copyright holder. These works may not be reposted without the explicit permission of the copyright holder.
|