A novel method for estimating the distance of a sound source from binaural speech signals is proposed. The method relies on several statistical features extracted from the signals and their binaural cues. The first feature is the standard deviation of the difference between the magnitude spectra of the left and right binaural signals. In addition, an extended set of statistical features that can improve distance detection is extracted from an auditory front-end which models the peripheral processing of the human auditory system. These features are incorporated into two classification frameworks, one based on Gaussian mixture models and one on support vector machines, and the relative merits of the two frameworks are evaluated. The proposed method achieves distance detection when tested in various acoustical environments and performs well in unknown environments. Its performance is also compared to that of an existing binaural distance detection method.
IEEE Transactions on Audio, Speech and Language Processing
Published - 2013
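
The first feature described in the abstract, the standard deviation of the difference between the left and right magnitude spectra, can be illustrated with a minimal sketch. The abstract does not specify the framing, windowing, or magnitude scale, so the choices below are assumptions: a single FFT over the whole signal, magnitudes expressed in dB, and a small `eps` to avoid taking the logarithm of zero. The function name `spectral_difference_std` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def spectral_difference_std(left, right, use_db=True, eps=1e-12):
    """Standard deviation (across frequency) of the left/right
    magnitude-spectrum difference.

    Assumptions (not stated in the abstract): one FFT over the full
    signal, optional dB scaling, and eps added before the logarithm.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.ndim != 1 or left.shape != right.shape:
        raise ValueError("left and right must be 1-D arrays of equal length")

    # Magnitude spectra of the two ear signals (one-sided FFT).
    mag_left = np.abs(np.fft.rfft(left))
    mag_right = np.abs(np.fft.rfft(right))

    if use_db:
        # Assumed log scale: a pure level difference between the ears
        # becomes a constant offset and does not affect the deviation.
        mag_left = 20.0 * np.log10(mag_left + eps)
        mag_right = 20.0 * np.log10(mag_right + eps)

    # Spread of the interaural spectral difference across frequency bins.
    return float(np.std(mag_left - mag_right))
```

With these assumptions, two identical ear signals give a value of zero, and a right-ear signal that is only a scaled copy of the left-ear signal gives a value close to zero, so the feature responds to differences in spectral shape between the ears rather than to overall level. In a classifier such as the Gaussian mixture model or support vector machine frameworks named in the abstract, a value like this would be one entry of the feature vector.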