TY - GEN
T1 - Hierarchical approach to select feature vectors for classification of text documents
AU - Kapalavayi, Nagesh
AU - Jayaram Murthy, S. N.
AU - Hu, Gongzhu
PY - 2006
Y1 - 2006
N2 - The digital revolution that started over fifteen years ago is contributing to the exponential growth in text documents, which show up in many forms such as web pages, emails, resumes, scientific reports, and digital archives. It is of great importance to develop techniques for automatic text document classification as a service to information consumers. Earlier text document classification techniques used 'keyword-based' features and related statistics to achieve good results. More recently, some of these techniques have been extended to include 'phrase-based' and 'concept-based' features to achieve better results. The majority of these techniques utilize a very large number of features extracted from the training set of documents. We present a hierarchical method for selecting a smaller number of quality features to improve classification efficiency.
UR - http://www.scopus.com/inward/record.url?scp=33750828621&partnerID=8YFLogxK
U2 - 10.1109/aiccsa.2006.205241
DO - 10.1109/aiccsa.2006.205241
M3 - Conference contribution
AN - SCOPUS:33750828621
SN - 1424402123
SN - 9781424402120
T3 - IEEE International Conference on Computer Systems and Applications, 2006
SP - 1180
EP - 1183
BT - IEEE International Conference on Computer Systems and Applications, 2006
PB - IEEE Computer Society
T2 - IEEE International Conference on Computer Systems and Applications, 2006
Y2 - 8 March 2006 through 8 March 2006
ER -