Dheeraj Varghese

I’m a PhD candidate at the VIS Lab, supervised by Cees Snoek. I work on generalist multimodal foundation models as part of the Horizon Europe ELLIOT project. My research focuses on designing unified architectures that combine modalities into a shared space, aiming for models that generalize well, adapt efficiently, and assist meaningfully across a wide range of tasks.

Previously, I worked on combining discrete diffusion and autoregression for multilingual image generation with Mohammad M. Derakhshani, and explored curriculum learning in vision-language models under the supervision of Yuki Asano.

At my core, I’m an applied engineer, enthusiastic about recreating intelligence that serves as a tool and makes tasks easier for the human user. Sample efficiency in learning, blurring the context window, and unified representation spaces all capture my attention at the moment.

news

Nov 19, 2025	Presented NeoBabel on Day 2 of the Cohere Connect Conference ⚡
Aug 19, 2025	Served as a reviewer for the ICCV LIMIT Workshop 2025.
Jul 18, 2025	Two of my works, NeoBabel and TaxonomiGQA, are out! 🎉
Mar 11, 2025	Co-organized a hackathon for the First Workshop on Structure & Generalization in Multimodal Language Understanding (SAGE-MLU 2025)
Mar 21, 2024	Will be a Teaching Assistant for Natural Language Processing at VU!

selected publications

Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It

Yulu Qin*, Dheeraj Varghese*, Adam Dahlgren Lindström, Lucia Donatelli, Kanishka Misra, and Najoung Kim

NeurIPS, 2025

@inproceedings{qin2025visionandlanguagetraininghelpsdeploy,
  title = {Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It},
  author = {Qin*, Yulu and Varghese*, Dheeraj and Lindström, Adam Dahlgren and Donatelli, Lucia and Misra, Kanishka and Kim, Najoung},
  year = {2025},
  booktitle = {NeurIPS}
}

NeoBabel: An Inclusive Multilingual Open Tower for Visual Generation

Mohammad Mahdi Derakhshani*, Dheeraj Varghese*, Marzieh Fadaee, and Cees G. M. Snoek

In EurIPS 2025 Workshop on Principles of Generative Modeling (PriGM), 2025

@inproceedings{anonymous2025neobabel,
  title = {NeoBabel: An Inclusive Multilingual Open Tower for Visual Generation},
  author = {Derakhshani*, Mohammad Mahdi and Varghese*, Dheeraj and Fadaee, Marzieh and Snoek, Cees GM},
  booktitle = {EurIPS 2025 Workshop on Principles of Generative Modeling (PriGM)},
  year = {2025},
  url = {https://openreview.net/forum?id=tYanZu9DUG},
}