Semi-supervised Active Learning for Video Action Detection


AAAI 2024

University of Central Florida

Overview of our proposed approach: Attention masks M_var and M_grad are computed for temporal coherence and gradient smoothness using the spatio-temporal localization.

Abstract

In this work, we focus on label-efficient learning for video action detection. We develop a novel semi-supervised active learning approach that uses both labeled and unlabeled data, together with informative sample selection, for action detection. Video action detection requires spatio-temporal localization in addition to classification, which poses challenges both for selecting informative samples in active learning and for generating pseudo labels in semi-supervised learning (SSL). First, we propose NoiseAug, a simple augmentation strategy that effectively selects informative samples for video action detection. Next, we propose fft-attention, a novel technique based on high-pass filtering that enables effective use of pseudo labels for SSL in video action detection by emphasizing the relevant activity region within a video. We evaluate the proposed approach on three benchmark datasets: UCF101-24, JHMDB-21, and YouTube-VOS. First, we demonstrate its effectiveness on video action detection, where it outperforms prior work in semi-supervised and weakly-supervised learning, as well as several baseline approaches, on both UCF101-24 and JHMDB-21. Next, we show its effectiveness on YouTube-VOS for video object segmentation, demonstrating that it generalizes to other dense prediction tasks in videos.
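
The abstract describes fft-attention only as "based on high-pass filtering"; this page does not give its exact formulation. As a rough illustration of the underlying idea, the sketch below applies a generic FFT high-pass filter to a single grayscale frame and normalizes the response into a [0, 1] map. The function name `high_pass_attention`, the `cutoff` parameter, and the normalization are assumptions made for this example and should not be read as the authors' implementation.

```python
import numpy as np


def high_pass_attention(frame, cutoff=0.1):
    """Illustrative high-pass map for a 2-D grayscale frame.

    `cutoff` (hypothetical parameter) is the radius, in cycles per pixel,
    below which frequency components are zeroed out.
    """
    h, w = frame.shape
    # Move the zero-frequency (DC) component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(frame))

    # Frequency coordinates for every spectrum entry, shifted the same way.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)

    # Keep only components above the cutoff (this removes DC and slow variations).
    mask = radius > cutoff
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))

    # Magnitude of the filtered frame, scaled into [0, 1].
    response = np.abs(filtered)
    return response / (response.max() + 1e-8)
```

On a frame that is flat except for one bright square, such a map responds most strongly along the square's edges and stays close to zero in the uniform background, which is the intuition behind using high-frequency content to emphasize an activity region.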

Results

Qualitative Analysis

BibTeX

@inproceedings{singh2024semi,
      title={Semi-supervised Active Learning for Video Action Detection},
      author={Singh, Ayush and Rana, Aayush J and Kumar, Akash and Vyas, Shruti and Rawat, Yogesh Singh},
      booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
      volume={38},
      number={5},
      pages={4891--4899},
      year={2024}
}