Investigating Causal Cues: Strengthening Spoofed Audio Detection with Human-Discernible Linguistic Features

Khanjani, Zahra; Ale, Tolulope; Wang, Jianwu; Davis, Lavon; Mallinson, Christine; Janeja, Vandana

dc.contributor.author	Khanjani, Zahra
dc.contributor.author	Ale, Tolulope
dc.contributor.author	Wang, Jianwu
dc.contributor.author	Davis, Lavon
dc.contributor.author	Mallinson, Christine
dc.contributor.author	Janeja, Vandana
dc.date.accessioned	2024-10-28T14:30:44Z
dc.date.available	2024-10-28T14:30:44Z
dc.date.issued	2024-09-09
dc.description.abstract	Several types of spoofed audio, such as mimicry, replay attacks, and deepfakes, have created societal challenges to information integrity. Recently, researchers have worked with sociolinguistics experts to label spoofed audio samples with Expert Defined Linguistic Features (EDLFs) that can be discerned by the human ear: pitch, pause, word-initial and word-final release bursts of consonant stops, audible intake or outtake of breath, and overall audio quality. It has been established that several deepfake detection algorithms improve when the traditional, common features of audio data are augmented with these EDLFs. In this paper, using a hybrid dataset comprising multiple types of spoofed audio augmented with sociolinguistic annotations, we investigate causal discovery and inference between the discernible linguistic features and the label in the audio clips, comparing the findings of the causal models with the expert ground-truth validation labeling process. Our findings suggest that the causal models indicate the utility of incorporating linguistic features to help discern spoofed audio, as well as the overall need and opportunity to incorporate human knowledge into models and techniques for strengthening AI models. The causal discovery and inference can serve as a foundation for training humans to discern spoofed audio, as well as for automating EDLF labeling to improve the performance of common AI-based spoofed audio detectors.
dc.description.sponsorship	The authors would like to acknowledge support from National Science Foundation Award #2210011. The code and audio samples are available through our GitHub repository [8].
dc.description.uri	http://arxiv.org/abs/2409.06033
dc.format.extent	10 pages
dc.genre	journal articles
dc.genre	preprints
dc.identifier	doi:10.13016/m2o3ti-keat
dc.identifier.citation	Khanjani, Zahra, Tolulope Ale, Jianwu Wang, Lavon Davis, Christine Mallinson, and Vandana P. Janeja. “Investigating Causal Cues: Strengthening Spoofed Audio Detection with Human-Discernible Linguistic Features,” September 9, 2024. https://doi.org/10.48550/arXiv.2409.06033.
dc.identifier.uri	https://doi.org/10.48550/arXiv.2409.06033
dc.identifier.uri	http://hdl.handle.net/11603/36769
dc.language.iso	en
dc.relation.isAvailableAt	The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof	UMBC Faculty Collection
dc.relation.ispartof	UMBC GESTAR II
dc.relation.ispartof	UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof	UMBC Center for Real-time Distributed Sensing and Autonomy
dc.relation.ispartof	UMBC Office for the Vice President of Research
dc.relation.ispartof	UMBC Joint Center for Earth Systems Technology (JCET)
dc.relation.ispartof	UMBC Language, Literacy, and Culture Department
dc.relation.ispartof	UMBC Information Systems Department
dc.relation.ispartof	UMBC Center for Accelerated Real Time Analysis
dc.relation.ispartof	UMBC Office of Institutional Advancement
dc.relation.ispartof	UMBC Staff Collection
dc.relation.ispartof	UMBC Center for Social Science Scholarship
dc.relation.ispartof	UMBC Data Science
dc.relation.ispartof	UMBC Student Collection
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Electrical Engineering and Systems Science - Audio and Speech Processing
dc.subject	Computer Science - Sound
dc.subject	Computer Science - Computation and Language
dc.subject	UMBC Big Data Analytics Lab
dc.subject	UMBC Cybersecurity Institute
dc.title	Investigating Causal Cues: Strengthening Spoofed Audio Detection with Human-Discernible Linguistic Features
dc.type	Text
dcterms.creator	https://orcid.org/0000-0002-9933-1170
dcterms.creator	https://orcid.org/0000-0003-0130-6135

Files

Original bundle

Name: 2409.06033v1.pdf
Size: 902.21 KB
Format: Adobe Portable Document Format

Collections

UMBC Faculty Collection
UMBC Center for Accelerated Real Time Analysis
UMBC Center for Real-time Distributed Sensing and Autonomy
UMBC Center for Social Science Scholarship
UMBC Computer Science and Electrical Engineering Department
UMBC Data Science