Research Interests
My research lies broadly at the intersection of vision and language. Specifically, I am interested in grounding language in images and videos, which entails associating language phrases with visual concepts. Such visual-linguistic associations encompass objects, actions, and their relations, and are crucial to rich image and video understanding.
Publications
Google Scholar · Semantic Scholar
Vision-Language Pre-training Generalization: From Image-Text Pairs to Diverse Vision-Text Tasks
Arka Sadhu, Ram Nevatia
WACV 2024
Paper
Unaligned Video-Text Pre-training using Iterative Alignment
Arka Sadhu, Licheng Yu, Animesh Sinha, Yu Chen, Ram Nevatia, Ning Zhang
Paper
Gradient-based Memory Editing for Task-Free Continual Learning
Xisen Jin, Arka Sadhu, Junyi Du, Xiang Ren
NeurIPS 2021
arXiv · Code
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi
CVPR 2021
arXiv · Code · Website
Video Question Answering with Phrases via Semantic Roles
Arka Sadhu, Kan Chen, Ram Nevatia
NAACL 2021
arXiv · Code
Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction
Zhaoheng Zheng, Arka Sadhu, Ram Nevatia
ICIP 2021
arXiv
Utilizing Every Image Object for Semi-supervised Phrase Grounding
Haidong Zhu, Arka Sadhu, Zhaoheng Zheng, Ram Nevatia
WACV 2021
arXiv
Visually Grounded Continual Learning of Compositional Phrases
Xisen Jin, Junyi Du, Arka Sadhu, Ram Nevatia, Xiang Ren
EMNLP 2020
arXiv · Code · Website
Video Object Grounding using Semantic Roles in Language Description
Arka Sadhu, Kan Chen, Ram Nevatia
CVPR 2020
arXiv · Code
Zero-Shot Grounding of Objects from Natural Language Queries
Arka Sadhu, Kan Chen, Ram Nevatia
ICCV 2019 (Oral)
arXiv · Code