A method to detect the distance of a speaker from a single microphone in a room environment is proposed. Several features, related to statistical parameters of speech source excitation signals, are introduced and are shown to depend on the distance between source and receiver. Those features are used to train a pattern recognizer for distance detection. The method is tested using a database of speech recordings in four rooms with different acoustical properties. Performance is shown to be independent of the signal gain and level, but depends on the reverberation time and the characteristics of the room. Overall, the system performs well especially for close distances and for rooms with low reverberation time and it appears to be robust to small distance mismatches. Finally, a listening test is conducted in order to compare the results of the proposed method to the performance of human listeners.
|Journal||I E E E Transactions on Audio, Speech and Language Processing|
|Publication status||Published - 2011|