Blue Sky: Expert-in-the-Loop Representation Learning Framework for Audio Anti-Spoofing: Multimodal, Multilingual, Multi-speaker, Multi-attack (4M) Scenarios
Date
2025
Citation of Original Publication
Khanjani, Zahra, Vandana P. Janeja, Christine Mallinson, and Sanjay Purushotham. “Blue Sky: Expert-in-the-Loop Representation Learning Framework for Audio Anti-Spoofing: Multimodal, Multilingual, Multi-Speaker, Multi-Attack (4M) Scenarios.” In Proceedings of the 2025 SIAM International Conference on Data Mining (SDM), 327–30. Proceedings. Society for Industrial and Applied Mathematics, 2025. https://doi.org/10.1137/1.9781611978520.32.
Rights
© 2025 Society for Industrial and Applied Mathematics
Abstract
Audio spoofing has surged with the rise of generative artificial intelligence, posing a serious threat to online communication. Recent studies have shown promising avenues for detecting spoofed audio, specifically those that use human expert knowledge in representation learning, but more work is needed to evaluate performance across the realistic scenarios that tend to pose challenges for spoofed audio detection. In this paper, we introduce a comprehensive framework for expert-in-the-loop representation learning for audio anti-spoofing that is robust enough to address four specific challenging scenarios: Multimodal, Multilingual, Multi-speaker, and Multi-attack (4M). Preliminary results demonstrate the framework’s potential effectiveness in audio anti-spoofing.