Algorithms for Source Separation - with Cocktail Party Applications

Rasmus Kongsgaard Olsson

    Research output: Book/Report › Ph.D. thesis



    In this thesis, a number of possible solutions to source separation are suggested. Although they differ significantly in shape and intent, they share a heavy reliance on prior domain knowledge. Most of the developed algorithms are intended for speech applications, and hence structural features of speech have been incorporated. Single-channel separation of speech is a particularly challenging signal processing task, where the purpose is to extract a number of speech signals from a single observed mixture. I present a few methods to obtain separation, which rely on the sparsity and structure of speech in a time-frequency representation. My own contributions are based on learning dictionaries for each speaker separately and subsequently applying a concatenation of these dictionaries to separate a mixture. The sparse decompositions required for the separation are computed using nonnegative matrix factorization as well as basis pursuit. In my work on the multi-channel problem, I have focused on convolutive mixtures, which are the appropriate model in acoustic setups. We have been successful in incorporating a harmonic speech model into a larger probabilistic formulation. Furthermore, we have presented several learning schemes for the parameters of such models, more specifically the expectation-maximization (EM) algorithm and stochastic and Newton-type gradient optimization.
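The dictionary-based single-channel approach described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes magnitude spectrograms are already available as nonnegative matrices (frequency by time), uses plain multiplicative-update NMF with a Euclidean cost and no explicit sparsity penalty, and reconstructs each speaker with a soft mask. The function names `nmf` and `separate` and the synthetic data are hypothetical and chosen for illustration only.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate V by W @ H using multiplicative updates (Euclidean cost).

    If W is supplied it is held fixed and only the activations H are
    estimated; this is how a pre-learned dictionary is applied to a mixture.
    """
    rng = np.random.default_rng(seed)
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if learn_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, dictionaries, n_iter=200):
    """Split a mixture spectrogram using concatenated per-speaker dictionaries."""
    W = np.hstack(dictionaries)  # concatenation of the speaker dictionaries
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    total = W @ H + EPS
    estimates, start = [], 0
    for D in dictionaries:
        stop = start + D.shape[1]
        part = D @ H[start:stop]                 # this speaker's share of the model
        estimates.append(V_mix * part / total)   # soft mask applied to the mixture
        start = stop
    return estimates


# Synthetic stand-ins for two speakers' training spectrograms (assumption).
rng = np.random.default_rng(1)
n_freq, n_frames, rank = 16, 40, 3
V_a = rng.random((n_freq, rank)) @ rng.random((rank, n_frames))
V_b = rng.random((n_freq, rank)) @ rng.random((rank, n_frames))

D_a, _ = nmf(V_a, rank)  # dictionary learned for speaker A alone
D_b, _ = nmf(V_b, rank)  # dictionary learned for speaker B alone

V_mix = V_a + V_b
estimates = separate(V_mix, [D_a, D_b])
```

Because the soft masks sum to one, the per-speaker estimates add back up to the observed mixture. How well each estimate matches its true source depends on how distinct the learned dictionaries are, which is where sparsity constraints such as those mentioned in the abstract become relevant.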
    Original language: English
    Place of Publication: Kgs. Lyngby, Denmark
    Publication status: Published - Nov 2007
    Series: DTU Compute PHD
