Environment-aware ideal binary mask estimation using monaural cues

Research output: Chapter in Book/Report/Conference proceeding · Article in proceedings · Research · peer-review

Abstract

We present a monaural approach to speech segregation that estimates the ideal binary mask (IBM) by combining amplitude modulation spectrogram (AMS) features, pitch-based features and speech presence probability (SPP) features derived from noise statistics. To maintain a high mask estimation accuracy in the presence of various background noises, the system employs environment-specific segregation models and automatically selects the appropriate model for a given input signal. Furthermore, instead of classifying each time-frequency (T-F) unit independently, the a posteriori probabilities of speech and noise presence are evaluated by considering adjacent T-F units. The proposed system achieves high classification accuracy.
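As a rough illustration of the quantity being estimated, the sketch below computes an ideal binary mask from known speech and noise power per T-F unit, using the common definition that a unit is labelled 1 when its local SNR exceeds a local criterion. It also shows one simple way of taking adjacent T-F units into account, by averaging speech posteriors over a small neighbourhood. The function names, the 0 dB threshold and the averaging window are assumptions made for illustration; the abstract does not specify the paper's actual context model, which may differ.

```python
import numpy as np


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-12):
    """Label a T-F unit 1 when its local SNR exceeds lc_db (assumed 0 dB), else 0."""
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(np.uint8)


def smooth_posteriors(p_speech, radius=1):
    """Average speech posteriors over a (2*radius+1)^2 T-F neighbourhood.

    Hypothetical stand-in for "considering adjacent T-F units";
    not the method used in the paper.
    """
    n_freq, n_time = p_speech.shape
    padded = np.pad(p_speech, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros((n_freq, n_time), dtype=float)
    for df in range(size):
        for dt in range(size):
            out += padded[df:df + n_freq, dt:dt + n_time]
    return out / size**2


# Toy 2x2 example: rows are frequency channels, columns are time frames.
speech = np.array([[4.0, 0.1], [1.0, 9.0]])
noise = np.ones((2, 2))
mask = ideal_binary_mask(speech, noise)
```

In a real system the speech and noise powers are not available separately, which is why the mask has to be estimated from features such as those listed in the abstract.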
Original language: English
Title of host publication: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Number of pages: 4
Publisher: IEEE
Publication date: 2013
DOIs
Publication status: Published - 2013
Event: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics - Mohonk Mountain House, New Paltz, New York, United States
Duration: 20 Oct 2013 to 23 Oct 2013

Conference

Conference: 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Location: Mohonk Mountain House
Country: United States
City: New Paltz, New York
Period: 20/10/2013 to 23/10/2013

Keywords

  • Ideal binary mask estimation
  • Speech segregation
  • Background noise classification
