Hierarchical approach to select feature vectors for classification of text documents

Nagesh Kapalavayi, S. N. Jayaram Murthy, Gongzhu Hu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

The digital revolution that started over fifteen years ago is contributing to exponential growth in text documents, which appear in many forms such as web pages, emails, resumes, scientific reports, and digital archives. It is therefore important to develop techniques for automatic text document classification as a service to information consumers. Earlier text document classification techniques used 'keyword-based' features and related statistics to achieve good results. More recently, some of these techniques have been extended to include 'phrase-based' and 'concept-based' features to achieve better results. The majority of these techniques use a very large number of features extracted from the training set of documents. We present a hierarchical method for selecting a smaller number of quality features to improve classification efficiency.

Original language: English
Title of host publication: IEEE International Conference on Computer Systems and Applications, 2006
Publisher: IEEE Computer Society
Pages: 1180-1183
Number of pages: 4
ISBN (Print): 1424402123, 9781424402120
State: Published - 2006
Event: IEEE International Conference on Computer Systems and Applications, 2006 - Sharjah, United Arab Emirates
Duration: Mar 8 2006 to Mar 8 2006

Publication series

Name: IEEE International Conference on Computer Systems and Applications, 2006
Volume: 2006

Conference

Conference: IEEE International Conference on Computer Systems and Applications, 2006
Country/Territory: United Arab Emirates
City: Sharjah
Period: 03/8/06 to 03/8/06
