Research Interests

My research lies broadly at the intersection of vision and language. Specifically, I am interested in grounding language in images and videos, which entails associating language phrases with visual concepts. Such visual-linguistic associations encompass objects, actions, and their relations, and are crucial to rich image and video understanding.

Publications

Google Scholar Semantic Scholar

  1. Visual Semantic Role Labeling for Video Understanding
    Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi
    CVPR 2021
    ArXiv Code Website

  2. Video Question Answering with Phrases via Semantic Roles
    Arka Sadhu, Kan Chen, Ram Nevatia
    NAACL 2021
    ArXiv Code

  3. Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction
    Zhaoheng Zheng, Arka Sadhu, Ram Nevatia
    ICIP 2021
    ArXiv

  4. Utilizing Every Image Object for Semi-supervised Phrase Grounding
    Haidong Zhu, Arka Sadhu, Zhaoheng Zheng, Ram Nevatia
    WACV 2021
    ArXiv

  5. Visually Grounded Continual Learning of Compositional Phrases
    Xisen Jin, Junyi Du, Arka Sadhu, Ram Nevatia, Xiang Ren
    EMNLP 2020
    ArXiv Code Website

  6. Video Object Grounding using Semantic Roles in Language Description
    Arka Sadhu, Kan Chen, Ram Nevatia
    CVPR 2020
    ArXiv Code

  7. Zero-Shot Grounding of Objects from Natural Language Queries
    Arka Sadhu, Kan Chen, Ram Nevatia
    ICCV 2019 (Oral)
    ArXiv Code