Research output

Publications

My name appears in bold. A superscript ^m marks mentees I co-authored with. The colored pill on each paper gives its area, matching the tags on my CV.

40+ publications

650+ citations

h-index: 14

i10-index: 19

Google Scholar | Full CV (PDF)

Journal articles

J2

On the Performance of Large Language Models on Introductory Programming Assignments

Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Mohammed Latif Siddiq, Christian Newman, Tharindu Ranasinghe, Joanna C. S. Santos, Marcos Zampieri

Journal of Intelligent Information Systems, Springer (2025)

Code LLMs

J1

An Android-Based Useful Text Extraction Framework Using Image and Natural Language Processing

Rafsanjany Kushol, Imamul Ahsan, Md. Nishat Raihan

International Journal of Computer Theory and Engineering (IJCTE), vol. 10, no. 3, pp. 77–83 (2018)

Vision

Conference papers

C12

TigerCoder: A Novel Suite of LLMs for Code Generation in Bangla

Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri

Language Resources and Evaluation Conference (LREC 2026)

Code LLMs

C11

CodeGuard: Improving LLM Guardrails in CS Education

Nishat Raihan, Noah Erdachew^m, Jayoti Devi^m, Joanna C. S. Santos, Marcos Zampieri

Findings of the 19th Conference of the European Chapter of the ACL (EACL 2026)

AI Safety

Introduces a taxonomy, an 8K-prompt dataset, and PromptShield (0.93 F1), cutting harmful code completions by 30–65% while preserving legitimate coding assistance.

C10

TigerLLM: A Family of Bangla Large Language Models

Nishat Raihan, Marcos Zampieri

63rd Annual Meeting of the ACL (ACL 2025), Short Papers

NLP

C9

mHumanEval: A Multilingual Benchmark to Evaluate Large Language Models for Code Generation

Nishat Raihan, Antonios Anastasopoulos, Marcos Zampieri

NAACL 2025, Main, Long Papers

Code LLMs

C8

MojoBench: Language Modeling and Benchmarks for Mojo

Nishat Raihan, Joanna C. S. Santos, Marcos Zampieri

Findings of NAACL 2025

Code LLMs

C7

Large Language Models in Computer Science Education: A Systematic Literature Review

Nishat Raihan, Mohammed Latif Siddiq, Joanna C. S. Santos, Marcos Zampieri

56th ACM Technical Symposium on Computer Science Education (SIGCSE-TS 2025)

CS Education

C6

Code LLMs: A Taxonomy-based Survey

Nishat Raihan, Christian Newman, Marcos Zampieri

2024 IEEE International Conference on Big Data (IEEE BigData 2024)

Code LLMs

C5

CSEPrompts: A Benchmark of Introductory Computer Science Prompts

Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Christian Newman, Tharindu Ranasinghe, Marcos Zampieri

27th International Symposium on Methodologies for Intelligent Systems (ISMIS 2024)

CS Education

C4

MentalHelp: A Multi-Task Dataset for Mental Health in Social Media

Md Nishat Raihan, Sadiya Sayara Chowdhury Puspo^m, Shafkat Farabi, Ana-Maria Bucur, Tharindu Ranasinghe, Marcos Zampieri

LREC-COLING 2024

NLP

C3

An Experimental Analysis on the Sensitivity of the Most Widely Used Edge Detection Methods to Different Noise Types

M. Raihan, N. Ulfat, N. Saqib

International Conference on Computing Advancements (ICCA 2022)

Vision

C2

A Novel Approach to Classify Electrocardiogram Signals Using Deep Neural Networks

T. Ahmed, A. Rahman, T. M. Chowdhury, R. Kushol, M. N. Raihan

2nd International Conference on Computer and Information Sciences (ICCIS 2020), IEEE

Time Series

C1

A Complete Bangla Optical Character Recognition System: An Effective Approach

T. Ahmed, M. N. Raihan, R. Kushol, M. S. Salekin

22nd International Conference on Computer and Information Technology (ICCIT 2019), IEEE

Vision

Workshop & shared-task papers

W18

Temporal Text Classification with Large Language Models

Nishat Raihan, Marcos Zampieri

6th International Conference on NLP for the Digital Humanities (NLP4DH 2026)

NLP

W17

Large Language Models for Mental Health: A Multilingual Evaluation

Nishat Raihan, Sadiya Sayara Chowdhury Puspo, Ana-Maria Bucur, Stevie Chancellor, Marcos Zampieri

2nd Workshop on Language Models for Low-Resource Languages (LoResLM @ EACL 2026)

NLP

W16

Overview of BLP-2025 Task 2: Code Generation in Bangla

Nishat Raihan, Mohammad Anas Jawad, Md Mezbaur Rahman, Noshin Ulfat, Pranav Gupta, Mehrab Mustafy Rahman, Shubhra Kanti Karmakar, Marcos Zampieri

2nd Workshop on Bangla Language Processing (BLP @ IJCNLP-AACL 2025)

Code LLMs

W15

Py-holmes: Causal Testing for Deep Neural Networks in Python

Wren McQueary, Sadia Afrin Mim, Md. Nishat Raihan, Justin Smith, Brittany Johnson

Companion Proceedings of FSE 2024

ML Testing

W14

The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline

Matthew Shardlow et al. (incl. M. N. Raihan)

19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

NLP

W13

An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset Using the MultiLS Framework

Matthew Shardlow et al. (incl. M. N. Raihan)

3rd Workshop on Tools and Resources for People with Reading Difficulties (READI @ LREC-COLING 2024)

NLP

W12

EmoMix-3L: A Code-Mixed Dataset for Bangla-English-Hindi Emotion Detection

Nishat Raihan et al.

WILDRE Workshop, LREC-COLING 2024

NLP

W11

MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thoughts

Md Nishat Raihan, Dhiman Goswami, Al Nahian Bin Emran, Sadiya Sayara Chowdhury Puspo, Amrita Ganguly, Marcos Zampieri

18th International Workshop on Semantic Evaluation (SemEval @ NAACL 2024)

NLP

W10

MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness

D. Goswami, S. S. C. Puspo, M. N. Raihan, A. N. B. Emran, A. Ganguly, M. Zampieri

18th International Workshop on Semantic Evaluation (SemEval @ NAACL 2024)

NLP

W9

MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection

S. S. C. Puspo, M. N. Raihan, D. Goswami, A. N. B. Emran, A. Ganguly, O. Uzuner

18th International Workshop on Semantic Evaluation (SemEval @ NAACL 2024)

AI Safety

W8

MasonPerplexity at ClimateActivism 2024: Advanced Ensemble Techniques and Data Augmentation for Stance and Hate Event Identification

A. Ganguly, S. S. C. Puspo, D. Goswami, M. N. Raihan

7th CASE Workshop (CASE @ EACL 2024)

NLP

W7

MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles

A. Ganguly, A. N. B. Emran, S. S. C. Puspo, M. N. Raihan, D. Goswami, M. Zampieri

7th CASE Workshop (CASE @ EACL 2024)

NLP

W6

MasonTigers at LT-EDI-2024: An Ensemble Approach towards Detecting Homophobia and Transphobia in Social Media Comments

D. Goswami, S. S. C. Puspo, M. N. Raihan, A. N. B. Emran

4th LT-EDI Workshop (LT-EDI @ EACL 2024)

NLP

W5

Offensive Language Identification in Transliterated and Code-Mixed Bangla

Best Paper

Md Nishat Raihan, Umma Hani Tanmoy, Anika Binte Islam, Kai North, Tharindu Ranasinghe, Antonios Anastasopoulos, Marcos Zampieri

1st Workshop on Bangla Language Processing (BLP @ EMNLP 2023), Singapore

NLP

W4

nlpBDpatriots at BLP-2023 Task 1: A Two-Step Classification for Violence Inciting Text Detection in Bangla

Md Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Marcos Zampieri

1st Workshop on Bangla Language Processing (BLP @ EMNLP 2023), Singapore

NLP

W3

nlpBDpatriots at BLP-2023 Task 2: A Transfer Learning Approach to Bangla Sentiment Analysis

Dhiman Goswami, Md Nishat Raihan, Sadiya Sayara Chowdhury Puspo, Marcos Zampieri

1st Workshop on Bangla Language Processing (BLP @ EMNLP 2023), Singapore

NLP

W2

OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification

Dhiman Goswami, Md Nishat Raihan, Antara Mahmud, Antonios Anastasopoulos, Marcos Zampieri

1st Workshop in South East Asian Language Processing (SEALP @ AACL 2023), Bali

NLP

W1

SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis

Md Nishat Raihan, Dhiman Goswami, Antara Mahmud, Antonios Anastasopoulos, Marcos Zampieri

11th International Workshop on NLP for Social Media (AACL 2023), Bali

NLP

Preprints & technical reports

P6

A Taxonomy of Programming Languages for Code Generation

Nishat Raihan, Christian Newman, Marcos Zampieri

arXiv:2604.00239 (2026)

Code LLMs

P5

Multi-SaLLM: A Multilingual Security Assessment of Generated Code

Mohammed Latif Siddiq, Noshin Ulfat, Nishat Raihan, Joanna C. S. Santos, Marcos Zampieri

Preprint (2025)

AI Safety

P4

Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi

Md Nishat Raihan, Dhiman Goswami, Antara Mahmud

arXiv:2309.10272 (2023)

NLP

P3

Determining the Optimal Number of Clusters for Time Series Datasets with Symbolic Pattern Forest

Md Nishat Raihan

arXiv:2310.00820 (2023)

Time Series

P2

Contrast Enhancement of Medical X-Ray Image Using Morphological Operators with Optimal Structuring Element

R. Kushol, M. Raihan, M. S. Salekin, A. B. M. Rahman

arXiv:1905.08545 (2019)

Vision

P1

An Effective Navigation System Combining Object Detection and Obstacle Detection Based on Depth Information for the Visually Impaired

Md Raihan, Hossain Mohammad Seym

B.Sc. Thesis, Dept. of Computer Science and Engineering (2018)

Vision