|
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding
Akash Kumar, Zsolt Kira, Yogesh Singh Rawat
Under review
First adaptation of a foundation model to a dense multimodal video detection task without any labels. Context-aware, self-paced progressive scene learning approach.
|
|
Stable Mean Teacher for Semi-Supervised Video Action Detection
Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat
AAAI Conference on Artificial Intelligence (AAAI), 2025
paper / code
Learns from mistakes on the labeled set and transfers that learning to pseudo-labels from the unlabeled set to enhance spatio-temporal localization.
Class-agnostic spatio-temporal refinement module and temporal coherency constraint for better spatio-temporal localization.
|
|
Semi-supervised Active Learning for Video Action Detection
Ayush Singh, Aayush J Rana, Akash Kumar, Shruti Vyas, Yogesh Singh Rawat
AAAI Conference on Artificial Intelligence (AAAI), 2024
paper / code
High-pass filtering for enhanced pseudo-labels to improve spatio-temporal localization. Simple sample augmentation strategy for informative sample selection.
|
|
End-to-End Semi-Supervised Learning for Video Action Detection
Akash Kumar, Yogesh Singh Rawat
Computer Vision and Pattern Recognition Conference (CVPR), 2022
paper / code
First end-to-end semi-supervised approach for video action detection. Short-term and long-term smoothness constraints to exploit spatio-temporal coherency.
|
|
Reviewer, IEEE Transactions on Image Processing
Reviewer, CVPR 2023, 2024, 2025
Reviewer, ICLR 2023, 2024, 2025
Reviewer, ECCV 2022, 2024
Reviewer, ICCV 2023
Reviewer, NeurIPS 2023, 2024
|
|