Abstract
This study addresses the challenge of automatically
segregating a target speech signal from interfering background
noise. A computational speech segregation system is presented
which exploits logarithmically-scaled amplitude modulation
spectrogram (AMS) features to distinguish between speech and
noise activity at the level of individual time-frequency (T-F)
units. One important parameter of the segregation system is
the window duration of the analysis-synthesis stage, which determines
both the lower limit of modulation frequencies that can be
represented and the temporal acuity with which the segregation
system can manipulate individual T-F units. To clarify
the consequences of this trade-off for modulation-based speech
segregation performance, the influence of the window duration
was systematically investigated.
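
The trade-off described above can be illustrated with a minimal sketch. It rests on a simplifying assumption that is not taken from the paper: a modulation frequency counts as representable only if at least one full modulation period fits inside the analysis window, so the lower limit is roughly the reciprocal of the window duration. The function name and the example window durations are hypothetical and chosen only for illustration.

```python
def lowest_modulation_frequency_hz(window_duration_s: float) -> float:
    """Approximate lower modulation-frequency limit for a given window.

    Assumption (not from the paper): one full modulation period must
    fit inside the window, so the limit is 1 / window duration.
    """
    if window_duration_s <= 0:
        raise ValueError("window duration must be positive")
    return 1.0 / window_duration_s


# Hypothetical window durations in seconds, for illustration only.
WINDOW_DURATIONS_S = (0.008, 0.032, 0.128)

for duration_s in WINDOW_DURATIONS_S:
    limit_hz = lowest_modulation_frequency_hz(duration_s)
    # Longer windows reach lower modulation frequencies, but each
    # T-F unit then spans more time, so temporal acuity is coarser.
    print(f"{duration_s * 1000:.0f} ms window -> lower limit ~ {limit_hz:.2f} Hz")
```

Under this assumption, lengthening the window lowers the modulation-frequency limit in exact proportion to the loss of temporal resolution, which is why the window duration cannot be chosen to optimize both at once.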
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2015 |
Number of pages | 5 |
Publication date | 2015 |
Publication status | Published - 2015 |
Event | INTERSPEECH 2015: Speech beyond Speech, Dresden, Germany (6 Sept 2015 → 10 Sept 2015) |
Conference
Conference | INTERSPEECH 2015 |
---|---|
Country/Territory | Germany |
City | Dresden |
Period | 06/09/2015 → 10/09/2015 |
Keywords
- Speech segregation
- Ideal binary mask
- Amplitude modulation spectrogram features
- Temporal resolution