Abstract
We address the problem of classifying speakers from measurements of features extracted from their speech. The approach adapts biometric methods used to identify people, but it must differ for speech because speech is not stationary. We therefore propose classifying speakers by the statistical distributions of parameters that can be accurately estimated with modern signal processing techniques. The intent is to develop a speaker clustering algorithm that is independent of the transmission channel, insensitive to language variations, and retrainable with minimal data to include a new speaker. We demonstrate its effectiveness on the problem of identifying a speaker's gender and present evidence that the methods may be extended to the general problem of speaker identification.
Original language | English |
---|---|
Pages (from-to) | 170-178 |
Number of pages | 9 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 4120 |
State | Published - 2000 |
Externally published | Yes |
Event | Applications and Science of Neural Networks, Fuzzy Systems, and Evolutionary Computation III - San Diego, USA; Duration: Jul 31 2000 → Aug 1 2000 |