Multivariate speech activity detector based on the syllable rate

David C. Smith, Jeffrey Townsend, Douglas J. Nelson, Dan Richman

Research output: Contribution to journal › Conference article › peer-review

9 Scopus citations

Abstract

Computationally efficient speech extraction algorithms have significant potential economic benefit because they automate an extremely tedious manual process. Previously developed algorithms discriminate between speech and one specific other signal type, and they often fail when that non-speech signal is replaced by a different signal type. Moreover, several such signal-specific discriminators have been combined to tackle the general speech vs. non-speech discrimination problem, with predictably negative results. When the number of discriminating features is large, compression methods such as principal component analysis have been applied to reduce dimension, even though information may be lost in the process. In this paper, graphical tools are applied to determine a set of features that produces excellent speech vs. non-speech clustering. This cluster structure provides the basis for a general speech vs. non-speech discriminator, which significantly outperforms the TALKATIVE speech extraction algorithm.

Original language: English
Pages (from-to): 73-76
Number of pages: 4
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
State: Published - 1999
Externally published: Yes
Event: Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-99) - Phoenix, AZ, USA
Duration: Mar 15 1999 to Mar 19 1999
