PhD Defense: Efficient Image and Video Representations for Retrieval

Talk
Sravanthi Bondugula
Time: 05.06.2016 13:30 to 15:30
Location: AVW 3450

Image (video) retrieval is the problem of retrieving images (videos) similar to a query. Similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations, in both real-valued and binary spaces, have been proposed for faster retrieval. In this thesis, we present techniques that obtain improved input representations in both supervised and unsupervised settings for images and videos.
First, we discuss supervised hashing techniques. Supervised retrieval is the well-known problem of retrieving images of the same class as the query. In the first part, we address the practical aspect of achieving faster retrieval with binary codes, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing method that aims to explicitly map all the images of the same class to a unique binary code. We refer to the binary codes of the images as 'Semantic Binary Codes' and the unique code for all same-class images as the 'Class Binary Code'. We also propose a new class-based Hamming metric that dramatically reduces retrieval times for larger databases, since Hamming distances are computed only to the class binary codes. We also propose a deep semantic binary code model, obtained by replacing the output layer of AlexNet with the class binary codes, and show that the hashing functions learned in this way outperform the state of the art while providing fast retrieval times.
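The class-based Hamming lookup described above can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption for illustration: codes are stored as Python integers, the names `hamming`, `build_index`, and `retrieve` are hypothetical, and ranking whole classes by their distance to the query code is one plausible reading of the metric, not necessarily the thesis method.

```python
from collections import defaultdict

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as integers."""
    return bin(a ^ b).count("1")

def build_index(image_labels):
    """Group image ids by class label (assumed layout: {image_id: label})."""
    index = defaultdict(list)
    for image_id, label in image_labels.items():
        index[label].append(image_id)
    return index

def retrieve(query_code, class_codes, index):
    """Rank classes by Hamming distance between the query code and each
    class binary code, then list the images class by class.
    Only len(class_codes) distances are computed, not one per image."""
    ranked = sorted(class_codes, key=lambda c: hamming(query_code, class_codes[c]))
    return [img for c in ranked for img in index[c]]

# Toy data (hypothetical): one 6-bit code per class.
class_codes = {"cat": 0b110010, "dog": 0b001101, "car": 0b111111}
image_labels = {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "car"}
index = build_index(image_labels)

# A query whose code differs from the "cat" class code by a single bit.
results = retrieve(0b110011, class_codes, index)
```

In this sketch the cost of a query grows with the number of classes rather than the number of database images, which is the intuition behind the reduced retrieval times claimed for larger databases.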
In the second part, we address the problem of supervised retrieval by taking into account the relationships between classes. For a given query image, we want to retrieve images that preserve the relative order, i.e., we want to retrieve all same-class images first, and then images of related classes before images of unrelated classes. We learn such relationship-aware binary codes by minimizing the discrepancy between the inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from previous supervised binary encoding schemes in that it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related-class retrieval results and show significant gains over the state of the art.
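As a rough illustration of this objective, the sketch below scores a set of binary codes by how closely their pairwise inner products track the similarity between class embeddings. The specific choices are assumptions, not details from the abstract: codes take values in {-1, +1}, the inner product is divided by the code length, class similarity is the cosine between output embeddings, and the mismatch is measured with a squared error. The function names and toy embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two class embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def code_similarity(b1, b2):
    """Inner product of two {-1, +1} codes, scaled to lie in [-1, 1]."""
    return sum(x * y for x, y in zip(b1, b2)) / len(b1)

def relationship_loss(codes, labels, class_embeddings):
    """Sum of squared gaps between code similarity and class similarity
    over all image pairs (assumed form of the objective)."""
    loss = 0.0
    for i in range(len(codes)):
        for j in range(i + 1, len(codes)):
            target = cosine(class_embeddings[labels[i]], class_embeddings[labels[j]])
            loss += (code_similarity(codes[i], codes[j]) - target) ** 2
    return loss

# Toy output embeddings (hypothetical): "tiger" is close to "cat", "car" is not.
class_embeddings = {"cat": [1.0, 0.0], "tiger": [0.8, 0.6], "car": [-1.0, 0.0]}
labels = ["cat", "tiger", "car"]

# Codes that respect the class relationships versus codes that do not.
good_codes = [[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1]]
bad_codes = [[1, 1, 1, 1], [-1, -1, -1, -1], [1, 1, 1, -1]]

good_loss = relationship_loss(good_codes, labels, class_embeddings)
bad_loss = relationship_loss(bad_codes, labels, class_embeddings)
```

Codes whose Hamming neighborhoods mirror the class relationships yield a lower value of this loss than codes that place related classes far apart, which is the ordering property the paragraph above asks for.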
In the third part, we discuss an unsupervised technique for compressing high-dimensional vectors to high-dimensional binary codes. In this approach, we deviate from traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. We overcome the computational challenges of directly applying the spherical hashing algorithm and present a practical hierarchical model that uses divide-and-conquer techniques to compress such high-dimensional vectors. We show that our proposed high-dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods at higher compression ratios.
In the last part of the thesis, we propose a retrieval-based solution to the zero-shot event classification problem, a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute the similarity between the query event and each video in the concept space, and videos similar to the query event are classified as belonging to the event. We further show that using features from other modalities significantly boosts performance.
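The concept-space matching used for zero-shot event classification can be sketched as follows. This is only an illustration under stated assumptions: the concept list, the detector scores, and the event vector are made-up toy values, cosine similarity stands in for whatever similarity measure the thesis uses, and the names `cosine` and `rank_videos` are hypothetical.

```python
import math

# Hypothetical concept vocabulary; each vector position is one concept score.
CONCEPTS = ["dog", "ball", "grass", "kitchen"]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_videos(event_vector, video_vectors):
    """Order video ids from most to least similar to the event in concept space."""
    scored = [(cosine(event_vector, vec), vid) for vid, vec in video_vectors.items()]
    return [vid for _, vid in sorted(scored, reverse=True)]

# Query event described by concept relevance (assumed), e.g. "playing fetch".
event_vector = [1.0, 1.0, 0.5, 0.0]

# Concept detector outputs per video (assumed toy scores).
video_vectors = {
    "v1": [0.9, 0.8, 0.6, 0.1],
    "v2": [0.1, 0.0, 0.1, 0.9],
    "v3": [0.7, 0.1, 0.8, 0.0],
}

ranking = rank_videos(event_vector, video_vectors)
```

Because the event is described only through concepts, no training videos of the event itself are needed; the top-ranked videos are the ones that would be assigned to the event, for example by keeping those above a chosen similarity threshold.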
Examining Committee:
Chair: Dr. Larry S. Davis
Dean’s rep: Dr. Louiqa Raschid
Members: Dr. Ramani Duraiswami
Dr. Dana Nau
Dr. V.S. Subrahmanian