About me

I build large language models end to end (from data curation and pretraining to fine-tuning and evaluation) with a focus on code generation in settings the field tends to overlook.

I am completing my Ph.D. in Computer Science at George Mason University, advised by Dr. Marcos Zampieri and Dr. Antonios Anastasopoulos. My dissertation, Exploring and Adapting Code LLMs for Underrepresented Domains, asks a recurring question: what happens to code models once you step outside English, outside Python, and outside the benchmarks everyone already optimizes for?

In Fall 2026 I join the University of Notre Dame as a Provost's Postdoctoral Fellow in Computer Science & Engineering, working with Dr. Joanna C. S. Santos on safety-by-construction guardrails and multilingual program synthesis for Code LLMs. Along the way I have released open benchmarks, corpora, and models, including mHumanEval, TigerLLM, and MojoBench, that the community can build on.

Open to collaboration

I am always glad to talk with people working on Code LLMs, multilingual NLP, or LLM safety. If you would like to collaborate, the fastest way to reach me is email.

Research interests

Code LLMs & program synthesis

Adapting code models to low-resource programming languages and underexplored domains.

LREC 2026, ACL 2025, NAACL 2025, IEEE BigData 2024

Multilingual & low-resource NLP

Evaluation and modeling across natural languages, with a focus on Bangla and code-mixed text.

NAACL 2025, EACL 2026, BLP @ EMNLP 2023

LLM safety & AI in CS education

Guardrails for code assistants and the role of LLMs in introductory computing.

EACL 2026, SIGCSE-TS 2025, JIIS 2025

Benchmarks & datasets

Large, openly released resources that make under-tested settings measurable.

mHumanEval, CSEPrompts, MentalHelp
Selected publications
2026
TigerCoder: A Novel Suite of LLMs for Code Generation in Bangla
Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri
LREC 2026
2026
CodeGuard: Improving LLM Guardrails in CS Education
Nishat Raihan, Noah Erdachew, Jayoti Devi, Joanna C. S. Santos, Marcos Zampieri
Findings of EACL 2026
2025
TigerLLM: A Family of Bangla Large Language Models
Nishat Raihan, Marcos Zampieri
ACL 2025 (Short Papers)
2025
mHumanEval: A Multilingual Benchmark to Evaluate LLMs for Code Generation
Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri
NAACL 2025 (Main, Long Papers)