PhD Proposal: Exploiting High Level Knowledge for Incrementally Understanding Scenes and Videos

Talk

Varun Nagaraja

Time: 05.13.2015 11:00 to 12:30

Location: AVW 4424

URL: https://talks.cs.umd.edu/talks/1032

Structure in scenes and videos has been used extensively to improve performance on tasks such as object detection and event detection. While structure provides high-level knowledge for filtering noise out of low-level detections, it can also guide the incremental understanding of scenes and videos. The advantage of incremental understanding is that it limits the computation time and resources spent on detection tasks. In the first part of my work, I propose a technique for incrementally constructing Markov networks to perform event detection in basketball videos. In the second part, I propose a technique for incrementally searching regions in an image to detect objects of a query class.
To detect events in a structured scenario like a basketball game, the rules of the game can be applied to remove false-positive events hypothesized by a low-level event detector. Typically, the high-level semantic analysis involves constructing a Markov network over the low-level detections to encode the relationships between them. In complex higher-order networks (e.g., Markov Logic Networks), each low-level detection can take part in many relationships, and the network size grows rapidly with the number of detections. I propose a feedback-based incremental technique to keep the network small. The network is initialized with detections above a high confidence threshold; then, based on the high-level semantics in the initial network, relevant detections are incrementally selected from the remaining ones that fall below the threshold.
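The selection loop described above can be sketched roughly as follows. This is a minimal illustration, not the technique from the talk: it builds no Markov network and runs no inference, and the `Detection` fields, the `related` table of event types, the time `window`, and the `is_relevant` test are all hypothetical stand-ins for the high-level semantics that would drive the feedback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    event: str         # hypothetical event label, e.g. "shot" or "rebound"
    time: int          # frame index or timestamp
    confidence: float  # score from the low-level detector


def is_relevant(candidate, network, related, window):
    """Hypothetical relevance test: the candidate's event type is related to
    an event already in the network and the two are close in time."""
    return any(
        candidate.event in related.get(d.event, ())
        and abs(candidate.time - d.time) <= window
        for d in network
    )


def build_incrementally(detections, related, threshold=0.8, window=5):
    # Initialize with the high-confidence detections only.
    network = [d for d in detections if d.confidence >= threshold]
    remaining = [d for d in detections if d.confidence < threshold]

    # Feedback loop: keep pulling in low-confidence detections that the
    # current network makes relevant, until nothing new qualifies.
    changed = True
    while changed:
        changed = False
        for d in list(remaining):
            if is_relevant(d, network, related, window):
                network.append(d)
                remaining.remove(d)
                changed = True
    return network
```

In this toy version, a low-confidence "rebound" detected shortly after a high-confidence "shot" would be added to the network, while an isolated low-confidence detection would be left out, so the network stays smaller than one built over every detection.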
When the goal is to locate an object of a particular class, a passive computer vision system would process every region in the image only to output a single small region. Instead, we can use the structure in the scene to search for objects without processing the entire image. I propose a search technique that processes image regions sequentially, so that the regions more likely to contain an object of the query class are explored earlier. The problem is framed as a Markov decision process, and an imitation learning algorithm is used to learn a search strategy. Since structure in the scene is essential for an intelligent search, the technique is illustrated on indoor scene images, which contain both unary structure information (depth, height) and spatial context between objects in the scene.
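As a rough illustration of sequential search, the sketch below explores regions in order of a score and stops as soon as a detector fires. It is an assumption-laden toy: `score` stands in for a learned search policy, `detect` for an expensive per-region classifier, and `budget` for a cap on processed regions; none of these names come from the talk, and no Markov decision process or imitation learning is implemented here.

```python
def sequential_search(regions, score, detect, budget=None):
    """Explore regions in order of a policy score until the detector fires.

    regions: list of region identifiers
    score(region, explored): hypothetical policy score; higher is explored sooner
    detect(region): hypothetical expensive classifier returning True or False
    budget: optional cap on the number of regions processed
    """
    explored = []
    unexplored = list(regions)
    budget = len(unexplored) if budget is None else budget

    while unexplored and len(explored) < budget:
        # The score may depend on what has been seen so far, which is how
        # spatial context from explored regions could steer the search.
        best = max(unexplored, key=lambda r: score(r, explored))
        unexplored.remove(best)
        explored.append(best)
        if detect(best):
            return best, explored
    return None, explored
```

The length of the returned `explored` list is the number of regions actually processed, which is the quantity an incremental search aims to keep small relative to processing the whole image.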
Examining Committee:
Committee Chair: Dr. Larry S. Davis
Dept's Representative: Dr. Thomas Goldstein
Committee Member(s): Dr. David Jacobs
