
Hi, I'm Farhan, an incoming PhD student at KSoC, UofU. I currently work as a lecturer at IUT-CSE and as a part-time research intern at Yaana Technologies. Prior to that, I obtained my Bachelor's degree from IUT-CSE.

Currently, I am exploring domain adaptation and continual learning for Vision-Language Models (VLMs), specifically their application in computer use. Broadly speaking, my research interests span:

  • Continual & Lifelong learning for VLMs
  • Agent-centric Benchmarks
  • Multimodal Adaptation & Transfer Learning

News and Updates

Research Highlights

Visual Robustness Benchmark for Visual Question Answering (VQA)

Md Farhan Ishmam*, Ishmam Tashdeed*, Talukder Asir Saadat*, Md Hamjajul Ashmafee, Abu Raihan Mostofa Kamal, Md Azam Hossain

  • Can VQA models maintain their performance in real-world scenarios, especially when exposed to realistic visual corruptions, e.g., noise and blur?
  • How can we evaluate the robustness of VQA models when subjected to such common corruptions?
  • We propose a large-scale benchmark and robustness evaluation metrics to assess VQA models before deployment.
  • We quantify aspects of the performance drop, e.g., its rate, range, and mean, and find that models with higher accuracy are not necessarily more robust.
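The evaluation idea above can be sketched as follows: corrupt the model's input images at increasing severity and measure how far accuracy falls from the clean baseline. Note that `gaussian_noise`, the severity scale, and `accuracy_drop` are illustrative stand-ins, not the benchmark's actual corruption functions or metrics.

```python
import numpy as np

def gaussian_noise(image, severity):
    # Add zero-mean Gaussian noise; higher severity -> larger std (assumed scale).
    std = [0.02, 0.05, 0.1, 0.2, 0.3][severity - 1]
    noisy = image + np.random.default_rng(0).normal(0.0, std, image.shape)
    return np.clip(noisy, 0.0, 1.0)

def accuracy_drop(clean_acc, corrupted_accs):
    # One illustrative robustness metric: mean accuracy drop across severities.
    return clean_acc - float(np.mean(corrupted_accs))

# Toy numbers: VQA accuracy on clean inputs vs. five corruption severities.
clean = 0.70
corrupted = [0.68, 0.65, 0.60, 0.52, 0.45]
print(round(accuracy_drop(clean, corrupted), 3))  # prints 0.12
```

A metric like this makes the paper's observation testable: two models with the same clean accuracy can have very different drops under corruption.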

BanglaTLit: A Benchmark Dataset for Back-Transliteration of Romanized Bangla

Md Fahim*, Fariha Tanjim Shifat*, Md Farhan Ishmam*, Deeparghya Dutta Barua, Fabiha Haider, Md Sakib Ul Rahman Sourove, Farhad Alam Bhuiyan

  • How can we enhance the representation of Romanized Bangla for automatic back-transliteration using seq2seq models?
  • We propose a large-scale pre-training corpus and Bangla back-transliteration datasets for fully fine-tuning language encoders and seq2seq models.
  • We aggregate representations from transliterated Bangla encoders with seq2seq models (Dual Encoders → Aggregation → Decoder architecture) to achieve SOTA on Bangla back-transliteration.
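The dual-encoder flow above can be sketched in miniature: two encoders each map the input sequence to a vector, an aggregation step fuses the two vectors, and a decoder consumes the fused representation. Everything here is an illustrative assumption, with toy mean-pooled embeddings in place of pretrained transformer encoders and elementwise averaging in place of the paper's actual aggregation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(token_ids, embedding):
    # Toy encoder: look up embeddings and mean-pool over the sequence
    # (stand-in for a pretrained transliterated-Bangla or seq2seq encoder).
    return embedding[token_ids].mean(axis=0)

def aggregate(h_translit, h_seq2seq):
    # Aggregation step: simple elementwise average of the two encoder
    # representations (an assumed fusion, not the paper's exact method).
    return (h_translit + h_seq2seq) / 2.0

def decode(h, out_embedding, steps=3):
    # Toy decoder: score output tokens by similarity to the fused vector.
    return [int(np.argmax(out_embedding @ h)) for _ in range(steps)]

vocab, dim = 100, 32
E_translit = rng.normal(size=(vocab, dim))  # hypothetical encoder embeddings
E_seq2seq = rng.normal(size=(vocab, dim))
E_out = rng.normal(size=(vocab, dim))

tokens = [5, 17, 42]  # toy ids for a Romanized Bangla input sequence
h = aggregate(encode(tokens, E_translit), encode(tokens, E_seq2seq))
print(decode(h, E_out))  # toy Bangla output token ids
```

The design point the bullet makes is that the decoder sees a representation informed by both encoders, rather than by the seq2seq encoder alone.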

From Image to Language: A Critical Analysis of Visual Question Answering (VQA) Approaches, Challenges, and Opportunities

Md Farhan Ishmam, Md Sakib Hossain Shovon, Muhammad Firoz Mridha, Nilanjan Dey

  • Comprehensive survey on VQA datasets, methods, metrics, challenges, and research opportunities.
  • New taxonomy that systematically categorizes VQA literature and multimodal learning tasks.
  • Novel real-world applications of VQA in domains such as assistive technology, education, and healthcare.