LingVarBench: Benchmarking LLMs on Entity Recognitions and Linguistic Verbalization Patterns in Phone-Call Transcripts

dc.contributor.author: Mohammadi, Ali
dc.contributor.author: Paldhe, Manas
dc.contributor.author: Chhabra, Amit
dc.contributor.author: Son, Youngseo
dc.contributor.author: Seshagiri, Vishal
dc.date.accessioned: 2026-02-12T16:44:16Z
dc.date.issued: 2025-01-18
dc.description: EACL 2026, Rabat, Morocco, March 24-29, 2026
dc.description.abstract: We study structured entity extraction from phone-call transcripts in customer-support and healthcare settings, where annotation is costly, and data access is limited by privacy and consent. Existing methods degrade under disfluencies, interruptions, and speaker overlap, yet large real-call corpora are rarely shareable. We introduce LingVarBench, a benchmark and semantic synthetic data generation pipeline that generates linguistically varied training data via (1) LLM-sampled entity values, (2) curated linguistic verbalization patterns covering diverse disfluencies and entity-specific readout styles, and (3) a value-transcript consistency filter. Using this dataset, DSPy's SIMBA automatically synthesizes and optimizes extraction prompts, reducing manual prompt engineering and targeting robustness to verbal variation. On real customer transcripts, prompts optimized solely on LingVarBench outperform zero-shot baselines and match or closely approach human-tuned prompts for structured entities such as ZIP code, date of birth, and name (F1 approximately 94-95 percent). For subjective questionnaire items, optimized prompts substantially improve over zero-shot performance and approach human-tuned prompts. LingVarBench offers a practical and cost-efficient path to deployment in a direct-answer setting, with real annotations later enabling additional refinement.
dc.description.sponsorship: Finally, we thank Infinitus System, Inc. for providing access to the real-world production transcripts used for evaluation, as well as for the financial support and computational resources that enabled this research.
dc.description.uri: https://arxiv.org/abs/2508.15801
dc.format.extent: 17 pages
dc.genre: conference papers and proceedings
dc.genre: postprints
dc.identifier: doi:10.13016/m23sjb-d3hx
dc.identifier.uri: https://doi.org/10.48550/arXiv.2508.15801
dc.identifier.uri: http://hdl.handle.net/11603/41873
dc.language.iso: en
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: UMBC KAI2 Knowledge-infused AI and Inference lab
dc.title: LingVarBench: Benchmarking LLMs on Entity Recognitions and Linguistic Verbalization Patterns in Phone-Call Transcripts
dc.type: Text

Files

Original bundle

Name: 2508.15801v2.pdf
Size: 1.46 MB
Format: Adobe Portable Document Format