Blue Sky: Expert-in-the-Loop Representation Learning Framework for Audio Anti-Spoofing: Multimodal, Multilingual, Multi-speaker, Multi-attack (4M) Scenarios

Khanjani, Zahra, Vandana P. Janeja, Christine Mallinson, and Sanjay Purushotham. “Blue Sky: Expert-in-the-Loop Representation Learning Framework for Audio Anti-Spoofing: Multimodal, Multilingual, Multi-Speaker, Multi-Attack (4M) Scenarios.” In Proceedings of the 2025 SIAM International Conference on Data Mining (SDM), 327–30. Proceedings. Society for Industrial and Applied Mathematics, 2025. https://doi.org/10.1137/1.9781611978520.32.

Subjects

Multilingual
UMBC Cybersecurity Institute
generative artificial intelligence
Multi-attack (4M) Scenarios
Audio spoofing
Multimodal
UMBC M
Multi-speaker

Abstract

Audio spoofing has surged with the rise of generative artificial intelligence, posing a serious threat to online communication. Recent studies have shown promising avenues in detecting spoofed audio, specifically those that use human expert knowledge in representation learning, but more work is needed to evaluate performance across various realistic scenarios that tend to pose challenges in spoofed audio detection. In this paper, we introduce a comprehensive framework for expert-in-the-loop representation learning for audio anti-spoofing that is robust enough to address four specific challenging scenarios: Multimodal, Multilingual, Multi-speaker, and Multi-attack (4M). Preliminary results demonstrate the framework’s potential effectiveness in audio anti-spoofing.
