Reliable Audio Deepfake Detection in Variable Conditions via Quantum-Kernel SVMs
| dc.contributor.author | Amin, Lisan Al | |
| dc.contributor.author | Janeja, Vandana | |
| dc.date.accessioned | 2026-01-22T16:19:24Z | |
| dc.date.issued | 2025-12-21 | |
| dc.description | International Conference on Data Mining, November 12-15, 2025, Washington, USA | |
| dc.description.abstract | Detecting synthetic speech is challenging when labeled data are scarce and recording conditions vary. Existing end-to-end deep models often overfit or fail to generalize, and while kernel methods can remain competitive, their performance depends heavily on the chosen kernel. Here, we show that using a quantum kernel in audio deepfake detection reduces false-positive rates without increasing model size. Quantum feature maps embed data into high-dimensional Hilbert spaces, enabling the use of expressive similarity measures and compact classifiers. Building on this motivation, we compare quantum-kernel SVMs (QSVMs) with classical SVMs using identical mel-spectrogram preprocessing and stratified 5-fold cross-validation across four corpora (ASVspoof 2019 LA, ASVspoof 5 (2024), ADD23, and an In-the-Wild set). QSVMs achieve consistently lower equal-error rates (EER): 0.183 vs. 0.299 on ASVspoof 5 (2024), 0.081 vs. 0.188 on ADD23, 0.346 vs. 0.399 on ASVspoof 2019, and 0.355 vs. 0.413 on the In-the-Wild set. At the EER operating point (where FPR equals FNR), these correspond to absolute false-positive-rate reductions of 0.116 (38.8%), 0.107 (56.9%), 0.053 (13.3%), and 0.058 (14.0%), respectively. We also report the consistency of results across cross-validation folds and margin-based measures of class separation, using identical settings for both models. The only modification is the kernel; the features and SVM remain unchanged, no additional trainable parameters are introduced, and the quantum kernel is computed on a conventional computer. | |
| dc.description.sponsorship | This work is funded by the National Science Foundation Award #2346473 "CIRC: DEV: Community Infrastructure for Advancing Audio Deepfake Detection" | |
| dc.description.uri | http://arxiv.org/abs/2512.18797 | |
| dc.format.extent | 9 pages | |
| dc.genre | conference papers and proceedings | |
| dc.genre | postprints | |
| dc.identifier | doi:10.13016/m276rh-xfb5 | |
| dc.identifier.uri | https://doi.org/10.48550/arXiv.2512.18797 | |
| dc.identifier.uri | http://hdl.handle.net/11603/41577 | |
| dc.language.iso | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Information Systems Department | |
| dc.relation.ispartof | UMBC Faculty Collection | |
| dc.rights | CC0 1.0 Universal | |
| dc.rights.uri | https://creativecommons.org/publicdomain/zero/1.0/ | |
| dc.subject | Computer Science - Artificial Intelligence | |
| dc.subject | Computer Science - Sound | |
| dc.subject | UMBC Cybersecurity Institute | |
| dc.subject | UMBC Multi-Data (MData) Lab | |
| dc.title | Reliable Audio Deepfake Detection in Variable Conditions via Quantum-Kernel SVMs | |
| dc.type | Text | |
| dcterms.creator | https://orcid.org/0000-0003-0130-6135 | |
| dcterms.creator | https://orcid.org/0009-0005-0549-7727 |
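
The absolute and relative false-positive-rate reductions quoted in the abstract follow directly from the reported EER pairs. The sketch below recomputes them in Python as a consistency check. It assumes only the figures stated in the abstract; the names `REPORTED_EER` and `fpr_reduction` are illustrative and do not come from the paper.

```python
# Recompute the reductions quoted in the abstract from the reported EERs.
# At the EER operating point FPR equals FNR, so the EER is used as the FPR.
# All names here are illustrative; only the numbers come from the abstract.

# corpus -> (classical SVM EER, quantum-kernel SVM EER)
REPORTED_EER = {
    "ASVspoof 5 (2024)": (0.299, 0.183),
    "ADD23": (0.188, 0.081),
    "ASVspoof 2019": (0.399, 0.346),
    "In-the-Wild": (0.413, 0.355),
}


def fpr_reduction(classical: float, quantum: float) -> tuple[float, float]:
    """Return (absolute reduction, relative reduction in percent)."""
    absolute = classical - quantum
    relative_pct = 100.0 * absolute / classical
    return round(absolute, 3), round(relative_pct, 1)


results = {
    corpus: fpr_reduction(classical, quantum)
    for corpus, (classical, quantum) in REPORTED_EER.items()
}

for corpus, (absolute, relative_pct) in results.items():
    print(f"{corpus}: absolute {absolute:.3f}, relative {relative_pct:.1f}%")
```

The relative figure is the absolute reduction divided by the classical SVM's EER, which is why the largest percentage (ADD23) does not coincide with the largest absolute reduction (ASVspoof 5).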
