EFFICIENT AUDIO SOURCE SEPARATION USING MEL-SPECTROGRAMS

dc.contributor.advisorOates, James T
dc.contributor.authorWeinheimer, Kurt Anthony
dc.contributor.departmentComputer Science and Electrical Engineering
dc.contributor.programComputer Science
dc.date.accessioned2021-09-01T13:55:30Z
dc.date.available2021-09-01T13:55:30Z
dc.date.issued2020-01-20
dc.description.abstractAudio source separation deals with extracting a source of audio from a mixture, for example vocals from a musical recording. Recent strides were made with the release of the Open-Unmix GitHub project in September 2019, which gives new researchers a framework for getting started quickly with state-of-the-art techniques. The base architecture uses a three-layer bidirectional LSTM to solve a pixel-wise regression problem that estimates a mask for each source's spectrogram. This thesis explores the idea of replacing the spectrograms in this process with mel-spectrograms, which have achieved marginally better results in other audio problems such as speech recognition. A novel inverse function to convert from mel-spectrogram to spectrogram is provided that runs exponentially faster than the best available function with similar accuracy. We found that the results documented by the Open-Unmix project were reproducible and that the mel-spectrogram model did not provide an improvement.
dc.formatapplication:pdf
dc.genretheses
dc.identifierdoi:10.13016/m2oles-k8fn
dc.identifier.other12273
dc.identifier.urihttp://hdl.handle.net/11603/22851
dc.languageen
dc.relation.isAvailableAtThe University of Maryland, Baltimore County (UMBC)
dc.relation.ispartofUMBC Computer Science and Electrical Engineering Department Collection
dc.relation.ispartofUMBC Theses and Dissertations Collection
dc.relation.ispartofUMBC Graduate School Collection
dc.relation.ispartofUMBC Student Collection
dc.sourceOriginal File Name: Weinheimer_umbc_0434M_12273.pdf
dc.subjectAudio
dc.subjectLSTM
dc.subjectMel-Spectrogram
dc.subjectSeparation
dc.subjectSource
dc.subjectUnmix
dc.titleEFFICIENT AUDIO SOURCE SEPARATION USING MEL-SPECTROGRAMS
dc.typeText
dcterms.accessRightsDistribution Rights granted to UMBC by the author.
dcterms.accessRightsThis item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Files

Original bundle

Name:
Weinheimer_umbc_0434M_12273.pdf
Size:
1.65 MB
Format:
Adobe Portable Document Format

License bundle

Name:
Weinheimer-Kurt_Open.pdf
Size:
270.77 KB
Format:
Adobe Portable Document Format