BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects

dc.contributor.author: Hasan, Jakir
dc.contributor.author: Roy Dipta, Shubhashis
dc.date.accessioned: 2025-11-21T00:30:23Z
dc.date.issued: 2025-10-07
dc.description.abstract: Real-time speech assistants are becoming increasingly popular for improving accessibility to information. Bengali, a low-resource language with high regional dialectal diversity, has seen limited progress in developing such systems. Existing systems are not optimized for real-time use and focus only on standard Bengali. In this work, we present BanglaTalk, the first real-time speech assistance system for Bengali regional dialects. BanglaTalk follows a client-server architecture and uses the Real-time Transport Protocol (RTP) to ensure low-latency communication. To address dialectal variation, we introduce a dialect-aware ASR system, BRDialect, developed by fine-tuning the IndicWav2Vec model on ten Bengali regional dialects. It outperforms the baseline ASR models by 12.41-33.98% on the RegSpeech12 dataset. Furthermore, BanglaTalk can operate at a low bandwidth of 24 kbps while maintaining an average end-to-end delay of 4.9 seconds. Low bandwidth usage and minimal end-to-end delay make the system both cost-effective and interactive for real-time use cases, enabling inclusive and accessible speech technology for the diverse community of Bengali speakers.
dc.description.uri: http://arxiv.org/abs/2510.06188
dc.format.extent: 15 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2irrz-zqey
dc.identifier.uri: https://doi.org/10.48550/arXiv.2510.06188
dc.identifier.uri: http://hdl.handle.net/11603/40880
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.rights: This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
dc.subject: Computer Science - Machine Learning
dc.subject: Computer Science - Artificial Intelligence
dc.subject: Computer Science - Computation and Language
dc.subject: UMBC Interactive Robotics and Language Lab
dc.title: BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects
dc.type: Text
dcterms.creator: https://orcid.org/0000-0002-9176-1782

Files

Original bundle

Name: 251006188v1.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format