PhD Defense: Feedback for Vision
IRB IRB-4105
This thesis explores the use of feedback in image understanding, action understanding, and video monitoring. It introduces Mid-Vision Feedback (MVF), a mechanism that modulates perception based on high-level categorical expectations, improving accuracy and contextual consistency in object classification. This approach is extended to action understanding through Sub-Action Modulation (SAM), which incorporates context into action interpretation by hierarchically grouping action primitives. SAM demonstrates superior performance over various video understanding architectures, improving both action recognition and action anticipation accuracy. Additionally, a configurable perception pipeline architecture, the Image Surveillance Assistant (ISA), is presented to aid watchstanders in video surveillance tasks by integrating human-specified expectations into the perceptual loop. Lastly, taking inspiration from contextual contrasting in MVF, a learning formulation for separating motion from context is proposed, which improves action recognition and anticipation accuracy across multiple datasets.
Examining Committee
Chair:
Dr. John Aloimonos
Dean's Representative:
Dr. Shihab Shamma
Members:
Dr. Cornelia Fermüller
Dr. Dinesh Manocha
Dr. Ramani Duraiswami