An Algorithm for a Weakly Supervised Version of the Cocktail Party Problem

Author/Creator

Author/Creator ORCID

Department

Computer Science and Electrical Engineering

Program

Computer Science

Citation of Original Publication

Rights

Distribution Rights granted to UMBC by the author.
Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.

Abstract

In experimental sciences, noise can be viewed as any random variation in the original data that impedes the normal perception of a signal. From a physics perspective, both noise and sound are vibrations through a medium, so they are difficult to distinguish physically, yet the brain distinguishes between them effectively through perception. This research involves identifying noise in an audio signal and removing it to construct a clean audio signal. We address the Cocktail Party Effect (CPE) using a blind signal separation (BSS) approach, with the goal of performing BSS on audio data using less information than current methods require. The brain can channel auditory attention to one of many active stimuli and filter out everything else. This phenomenon, observed at a cocktail party where attendees focus on one conversation while disregarding all other conversations and music, is referred to as the CPE. BSS is the problem of separating a set of source signals from a set of mixed signals. The challenge is that this must be done with little or no knowledge of the individual source signals, the environment, or the exact mixing procedure. Although this is a highly underdetermined problem, working solutions can be obtained by controlling the conditions governing it. In this study, we first took a noisy signal, decomposed it using Independent Component Analysis (ICA), and manually identified the noise to reconstruct a clean audio signal. With the clean signal and the manually identified noise signal in hand, we then implemented a novel approach that combines an ICA algorithm with Multiple Instance Learning (MIL) to address the underlying Cocktail Party Problem. In this approach, the presence of noise is detected using a weak supervisory signal.
The process by which noise is mixed with the source signal, along with other characteristics of the noise signal, remains unknown. The audio signal is decomposed into independent components using ICA, but instead of categorizing these components individually, a MIL algorithm classifies them into bags. These labeled bags can then be used to reconstruct the noise-free, clean signal.
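
The first stage described above (decompose with ICA, identify the noise component, reconstruct without it) can be illustrated with a minimal sketch. This is not the author's implementation: it assumes exactly two microphones and two sources, uses a simple kurtosis-maximizing rotation search in place of whichever ICA algorithm the thesis used, and runs on a synthetic tone-plus-noise mixture. All function names, the mixing matrix `A`, and the signals are hypothetical, and the manual identification step is replaced by a correlation against the known noise, which is only possible because the demo is synthetic.

```python
import numpy as np

def whiten(X):
    """Center X (channels x samples) and transform it to identity covariance."""
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    eigvals, eigvecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return V @ Xc, V, mean

def excess_kurtosis(Y):
    """Excess kurtosis per row; rows are assumed zero-mean with unit variance."""
    return (Y ** 4).mean(axis=1) - 3.0

def ica_two_channel(X, n_angles=720):
    """Two-channel ICA: after whitening, pick the rotation whose outputs are
    the least Gaussian (largest summed absolute excess kurtosis)."""
    Z, V, mean = whiten(X)
    best_score, best_R = -np.inf, None
    # A rotation by pi/2 only swaps the components, so [0, pi/2) suffices.
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, s], [-s, c]])
        score = np.abs(excess_kurtosis(R @ Z)).sum()
        if score > best_score:
            best_score, best_R = score, R
    unmixing = best_R @ V
    return unmixing @ (X - mean), unmixing, mean

def reconstruct_without(S, unmixing, mean, noise_idx):
    """Zero the component judged to be noise and map back to the microphones."""
    S_kept = S.copy()
    S_kept[noise_idx] = 0.0
    return np.linalg.inv(unmixing) @ S_kept + mean

# Synthetic demo (assumed data): a 220 Hz tone mixed with uniform noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 220 * t)
noise = rng.uniform(-1.0, 1.0, t.size)
A = np.array([[1.0, 0.8], [0.6, 1.0]])  # hypothetical mixing matrix
X = A @ np.vstack([tone, noise])

S, unmixing, mean = ica_two_channel(X)

# Stand-in for manual identification: pick the component closest to the
# known noise. In practice the noise reference is not available.
noise_idx = int(np.argmax([abs(np.corrcoef(s, noise)[0, 1]) for s in S]))
X_clean = reconstruct_without(S, unmixing, mean, noise_idx)

corr_before = abs(np.corrcoef(X[0], tone)[0, 1])
corr_after = abs(np.corrcoef(X_clean[0], tone)[0, 1])
print(f"correlation with tone before: {corr_before:.3f}, after: {corr_after:.3f}")
```

The sketch covers only the manually supervised baseline. The MIL stage, in which bags of components receive weak labels in place of per-component labels, is not reproduced here because the abstract does not specify how the bags are formed or classified.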