Abstract
A sound classification model is presented that classifies signals into music, noise, and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Features are derived from the pitch estimate and a pitch error measure and used in a probabilistic model with a soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 is achieved with 1 s classification windows. Furthermore, it is shown that linear inputs perform as well as quadratic inputs and that, although classification improves marginally, little is gained by increasing the window size beyond 1 s.
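The pitch extraction step named in the abstract can be illustrated with a minimal sketch of the harmonic product spectrum. This is not the authors' implementation: the function names (`harmonic_tone`, `magnitude_spectrum`, `hps_pitch`), the number of harmonics, the lower frequency bound, and the test signal are all assumptions chosen for illustration. The pitch error measure, the derived features, and the soft-max classifier are not shown. A naive DFT is used only to keep the example dependency-free; an FFT routine would normally be used instead.

```python
import cmath
import math


def harmonic_tone(f0, fs, n, amplitudes=(1.0, 0.5, 0.33, 0.25)):
    """Hypothetical test signal: a sum of sinusoids at multiples of f0."""
    return [
        sum(a * math.sin(2 * math.pi * (h + 1) * f0 * t / fs)
            for h, a in enumerate(amplitudes))
        for t in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitude of the DFT for bins 0 .. len(x)//2 - 1 (naive O(n^2) DFT)."""
    n = len(x)
    mags = []
    for k in range(n // 2):
        acc = 0j
        for t, v in enumerate(x):
            acc += v * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def hps_pitch(x, fs, num_harmonics=3, fmin=50.0):
    """Estimate pitch in Hz as the peak of the harmonic product spectrum.

    For each candidate bin k, the spectrum magnitudes at k, 2k, ..., R*k
    are multiplied together; the bin with the largest product is taken
    as the fundamental. num_harmonics (R) and fmin are assumed values.
    """
    mags = magnitude_spectrum(x)
    n = len(x)
    # Only consider bins whose R-th harmonic still lies inside the spectrum.
    max_bin = len(mags) // num_harmonics
    # Skip DC and very low bins below the assumed minimum pitch.
    min_bin = max(1, int(math.ceil(fmin * n / fs)))
    best_bin, best_val = min_bin, -1.0
    for k in range(min_bin, max_bin):
        prod = 1.0
        for r in range(1, num_harmonics + 1):
            prod *= mags[r * k]
        if prod > best_val:
            best_bin, best_val = k, prod
    return best_bin * fs / n


if __name__ == "__main__":
    fs, n = 8000, 512
    signal = harmonic_tone(250.0, fs, n)
    print(hps_pitch(signal, fs))
```

In this sketch the 250 Hz fundamental falls exactly on a DFT bin, so no windowing is needed. Real audio would typically call for a window function and a finer frequency grid.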
Original language | English |
---|---|
Title of host publication | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. |
Volume | 3 |
Publisher | IEEE |
Publication date | 2006 |
ISBN (Print) | 1-4244-0469-X |
Publication status | Published - 2006 |
Event | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing - Toulouse, France. Duration: 14 May 2006 → 19 May 2006. Conference number: 31 |
Conference
Conference | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing |
---|---|
Number | 31 |
Country/Territory | France |
City | Toulouse |
Period | 14/05/2006 → 19/05/2006 |