A multivariate speech activity detector based on the syllable rate

David Smith, Jeffrey Townsend, Douglas J. Nelson

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

Computationally efficient algorithms which perform speech activity detection have significant potential economic and labor saving benefit, by automating an extremely tedious manual process. In many applications, it is desirable to extract intervals of speech which are bounded by segments of other signal types (fax/modem, music, static, dial tones, etc.). In the past, algorithms which successfully discriminate between speech and one specific other signal type have been developed. Frequently, these algorithms fail when the specific non-speech signal is replaced by a different non-speech signal. Little work has been done on combining such discriminators in order to solve the general speech vs. non-speech discrimination problem. Typically, several signal specific discriminators are blindly combined with predictable negative results. Moreover, when a large number of discriminators are involved, dimension reduction is achieved using Principal Components, which optimally compresses signal variance into the fewest number of dimensions. Unfortunately, these new coordinates are not necessarily optimal for discrimination. In this paper we apply graphical tools to determine a set of discriminators which produce excellent speech vs. non-speech clustering, thereby eliminating the guesswork in selecting good feature vectors. This cluster structure provides a basis for a general multivariate speech vs. non-speech discriminator, which compares very favorably with the TALKATIVE speech extraction algorithm.
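The abstract's remark that variance-maximizing coordinates are not necessarily optimal for discrimination can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-feature data are synthetic (one high-variance feature that carries no class information, one low-variance feature that does), and Fisher's linear discriminant stands in as a generic discriminative direction. It is not the graphical feature-selection procedure of the paper, and the function names are hypothetical.

```python
import numpy as np

def fisher_score(z, labels):
    """Squared gap between the two class means over the summed class variances."""
    a, b = z[labels == 0], z[labels == 1]
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

def first_principal_axis(X):
    """Unit vector along the direction of largest total variance (PCA)."""
    cov = np.cov(X - X.mean(axis=0), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def fisher_axis(X, labels):
    """Fisher linear discriminant direction, Sw^{-1} (m1 - m0), normalized."""
    X0, X1 = X[labels == 0], X[labels == 1]
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
    w = np.linalg.solve(Sw, X1.mean(axis=0) - X0.mean(axis=0))
    return w / np.linalg.norm(w)

# Synthetic data (assumption): class 0 stands for "non-speech", class 1 for "speech".
rng = np.random.default_rng(0)
n = 500
labels = np.repeat([0, 1], n)
noisy = rng.normal(0.0, 10.0, size=2 * n)  # large variance, same mean in both classes
informative = rng.normal(0.0, 0.5, size=2 * n) + np.where(labels == 1, 1.0, -1.0)
X = np.column_stack([noisy, informative])

pca_dir = first_principal_axis(X)
lda_dir = fisher_axis(X, labels)
pca_score = fisher_score(X @ pca_dir, labels)
lda_score = fisher_score(X @ lda_dir, labels)

print("first principal axis:", pca_dir, "separation:", pca_score)
print("Fisher axis:         ", lda_dir, "separation:", lda_score)
```

In this setup the first principal axis aligns with the noisy feature, so projecting onto it separates the classes poorly, while the Fisher axis aligns with the informative feature.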

Original language: English
Pages (from-to): 68-78
Number of pages: 11
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 3461
State: Published - 1998
Externally published: Yes
Event: Advanced Signal Processing Algorithms, Architectures, and Implementations VIII - San Diego, CA, United States
Duration: Jul 22 1998 → Jul 24 1998
