
Environment-aware ideal binary mask estimation using monaural cues

    Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

    Abstract

    We present a monaural approach to speech segregation that estimates the ideal binary mask (IBM) by combining amplitude modulation spectrogram (AMS) features, pitch-based features and speech presence probability (SPP) features derived from noise statistics. To maintain a high mask estimation accuracy in the presence of various background noises, the system employs environment-specific segregation models and automatically selects the appropriate model for a given input signal. Furthermore, instead of classifying each time-frequency (T-F) unit independently, the a posteriori probabilities of speech and noise presence are evaluated by considering adjacent T-F units. The proposed system achieves high classification accuracy.
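
    As an illustration of the quantity being estimated, the sketch below computes an ideal binary mask in its usual definition: a T-F unit is labeled 1 when the local speech-to-noise ratio exceeds a local criterion, and 0 otherwise. This is not the system described in the abstract, which has to estimate the mask from the noisy mixture alone. The function names, the `lc_db` parameter and its 0 dB default, and the list-of-lists layout are assumptions made for the example.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a binary mask with 1 where the local SNR in dB exceeds lc_db.

    speech_energy and noise_energy are 2-D lists indexed as
    [frequency channel][time frame], holding the non-negative energies of
    the premixed speech and noise (hypothetical input layout).
    """
    eps = 1e-12  # guards against log10(0) and division by zero
    mask = []
    for speech_row, noise_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(speech_row, noise_row):
            snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def mask_accuracy(estimated, ideal):
    """Fraction of T-F units where an estimated mask agrees with the IBM."""
    total = sum(len(row) for row in ideal)
    correct = sum(
        e == i
        for est_row, ideal_row in zip(estimated, ideal)
        for e, i in zip(est_row, ideal_row)
    )
    return correct / total
```

    Classification accuracy, in this sketch, is simply the share of T-F units on which an estimated mask and the IBM agree, which is what `mask_accuracy` returns.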
    Original language: English
    Title of host publication: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
    Number of pages: 4
    Publisher: IEEE
    Publication date: 2013
    Publication status: Published - 2013
    Event: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics - Mohonk Mountain House, New Paltz, United States
    Duration: 20 Oct 2013 to 23 Oct 2013

    Conference

    Conference: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
    Location: Mohonk Mountain House
    Country/Territory: United States
    City: New Paltz
    Period: 20/10/2013 to 23/10/2013

    Keywords

    • Ideal binary mask estimation
    • Speech segregation
    • Background noise classification
