EFFICIENT AUDIO SOURCE SEPARATION USING MEL-SPECTROGRAMS
| dc.contributor.advisor | Oates, James T | |
| dc.contributor.author | Weinheimer, Kurt Anthony | |
| dc.contributor.department | Computer Science and Electrical Engineering | |
| dc.contributor.program | Computer Science | |
| dc.date.accessioned | 2021-09-01T13:55:30Z | |
| dc.date.available | 2021-09-01T13:55:30Z | |
| dc.date.issued | 2020-01-20 | |
| dc.description.abstract | Audio source separation deals with extracting a source of audio from a mixture, for example vocals from a musical recording. Recent strides have been made with the release of the Open-Unmix GitHub project in September 2019, which provides new researchers with a framework to hit the ground running with state-of-the-art techniques. The base architecture uses a 3-layer bidirectional LSTM to solve a pixel-wise regression problem that estimates masks for each source's spectrogram. This thesis explores the idea of replacing the spectrograms in this process with mel-spectrograms, which have achieved marginally better results in other audio problems such as speech recognition. A novel inverse function to convert from mel-spectrogram to spectrogram is provided that runs exponentially faster than the best available function with similar accuracy. We found that the results documented by the Open-Unmix project were reproducible and that the mel-spectrogram model did not provide an improvement. | |
| dc.format | application:pdf | |
| dc.genre | theses | |
| dc.identifier | doi:10.13016/m2oles-k8fn | |
| dc.identifier.other | 12273 | |
| dc.identifier.uri | http://hdl.handle.net/11603/22851 | |
| dc.language | en | |
| dc.relation.isAvailableAt | The University of Maryland, Baltimore County (UMBC) | |
| dc.relation.ispartof | UMBC Computer Science and Electrical Engineering Department Collection | |
| dc.relation.ispartof | UMBC Theses and Dissertations Collection | |
| dc.relation.ispartof | UMBC Graduate School Collection | |
| dc.relation.ispartof | UMBC Student Collection | |
| dc.source | Original File Name: Weinheimer_umbc_0434M_12273.pdf | |
| dc.subject | Audio | |
| dc.subject | LSTM | |
| dc.subject | Mel-Spectrogram | |
| dc.subject | Separation | |
| dc.subject | Source | |
| dc.subject | Unmix | |
| dc.title | EFFICIENT AUDIO SOURCE SEPARATION USING MEL-SPECTROGRAMS | |
| dc.type | Text | |
| dcterms.accessRights | Distribution Rights granted to UMBC by the author. | |
| dcterms.accessRights | This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu |
