PhD Proposal: Deep Learning for Image Geo-Localization and Its Application to Media Forensics

Talk
Bor-Chun Chen
Time: 10.31.2018 16:00 to 18:00
Location: AVW 4172

Abstract:

An image paired with its location information can enable many applications, such as personalization and location-based services and marketing, and it can also serve as forensic evidence. However, there are several major challenges. First, it is extremely challenging to infer location from an image, because many images contain only a few cues about their location, and these cues are often ambiguous. Second, even if location information is contained in the metadata, it can easily be tampered with. Third, if a set of images taken at a known location is readily available, can we use them to verify, with higher accuracy, other query images that are claimed to have been taken at the same location?

To address these challenges, we first propose an image localization framework for extracting fine-grained location information (i.e., business venues) from images. Our framework uses information available from social media websites such as Instagram and Yelp to extract a set of location-related concepts. Using these concepts with a multimodal recognition model, we were able to extract location information based on the image content.

Second, in the case where metadata is available, we propose a multi-task learning model to verify its authenticity by detecting discrepancies between the image content and its metadata. Our model first detects meteorological properties such as weather condition, sun angle, and temperature from the image content and then compares them with information from an online weather database. To facilitate the training and evaluation of our model, we create a large-scale outdoor dataset labeled with meteorological properties.

Third, we address the event verification problem by designing a convolutional neural network configuration specifically targeted at image localization. The proposed network uses a bilinear pooling layer and an attention module to extract detailed location information from the image content.

Finally, we describe some future research directions, including generating synthetic training data using generative adversarial networks and extending the approach from images to video.

Examining Committee:

Chair: Dr. Larry S. Davis
Dept. rep: Dr. Thomas Goldstein
Members: Dr. David W. Jacobs