BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects
| dc.contributor.author | Hasan, Jakir | |
| dc.contributor.author | Roy Dipta, Shubhashis | |
| dc.date.accessioned | 2025-11-21T00:30:23Z | |
| dc.date.issued | 2025-10-07 | |
| dc.description.abstract | Real-time speech assistants are becoming increasingly popular for ensuring improved accessibility to information. Bengali, being a low-resource language with high regional dialectal diversity, has seen limited progress in developing such systems. Existing systems are not optimized for real-time use and focus only on standard Bengali. In this work, we present BanglaTalk, the first real-time speech assistance system for Bengali regional dialects. BanglaTalk follows a client-server architecture and uses the Real-time Transport Protocol (RTP) to ensure low-latency communication. To address dialectal variation, we introduce a dialect-aware ASR system, BRDialect, developed by fine-tuning the IndicWav2Vec model on ten Bengali regional dialects. It outperforms the baseline ASR models by 12.41-33.98% on the RegSpeech12 dataset. Furthermore, BanglaTalk can operate at a low bandwidth of 24 kbps while maintaining an average end-to-end delay of 4.9 seconds. Low bandwidth usage and minimal end-to-end delay make the system both cost-effective and interactive for real-time use cases, enabling inclusive and accessible speech technology for the diverse community of Bengali speakers. | |
| dc.description.uri | http://arxiv.org/abs/2510.06188 | |
| dc.format.extent | 15 pages | |
| dc.genre | journal articles | |
| dc.genre | preprints | |
| dc.identifier | doi:10.13016/m2irrz-zqey | |
| dc.identifier.uri | https://doi.org/10.48550/arXiv.2510.06188 | |
| dc.identifier.uri | http://hdl.handle.net/11603/40880 | |
| dc.language.iso | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department | |
| dc.relation.ispartof | UMBC Student Collection | |
| dc.rights | This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author. | |
| dc.subject | Computer Science - Machine Learning | |
| dc.subject | Computer Science - Artificial Intelligence | |
| dc.subject | Computer Science - Computation and Language | |
| dc.subject | UMBC Interactive Robotics and Language Lab | |
| dc.title | BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects | |
| dc.type | Text | |
| dcterms.creator | https://orcid.org/0000-0002-9176-1782 |
