GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO

dc.contributor.author: Roy Dipta, Shubhashis
dc.contributor.author: Mahbub, Khairul
dc.contributor.author: Najjar, Nadia
dc.date.accessioned: 2026-02-12T16:43:44Z
dc.date.issued: 2026-01-11
dc.description.abstract: We present a Bengali mathematical reasoning model called GanitLLM (named after the Bangla word for mathematics, "Ganit"), together with a new difficulty-aware Bengali math corpus and a curriculum-based GRPO pipeline. Bengali is one of the world's most widely spoken languages, yet existing LLMs either reason in English and then translate, or simply fail on multi-step Bengali math, in part because reinforcement learning recipes are tuned for high-resource languages and collapse under reward sparsity in low-resource settings. To address this, we construct Ganit, a rigorously filtered and decontaminated Bengali math dataset with automatic difficulty tags derived from the pass@k of a strong evaluator model. Building on this dataset, we propose Curriculum-GRPO, which combines multi-stage training (SFT + GRPO) with difficulty-aware sampling and verifiable rewards for format, numerical correctness, and Bengali reasoning. On Bn-MGSM and Bn-MSVAMP, GanitLLM-4B improves over its Qwen3-4B base by +8 and +7 accuracy points, respectively, while increasing the percentage of Bengali reasoning tokens from 14% to over 88% and reducing average solution length from 943 to 193 words.
dc.description.uri: http://arxiv.org/abs/2601.06767
dc.format.extent: 15 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m24hiy-kehw
dc.identifier.uri: https://doi.org/10.48550/arXiv.2601.06767
dc.identifier.uri: http://hdl.handle.net/11603/41846
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Student Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Computer Science - Machine Learning
dc.subject: Computer Science - Computation and Language
dc.subject: Computer Science - Artificial Intelligence
dc.subject: UMBC Interactive Robotics and Language Lab
dc.title: GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-9176-1782

Files

Original bundle

Name: 2601.06767v1.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format