TY - GEN
T1 - Evaluation of classification methods for the prediction of hospital length of stay using Medicare claims data
AU - Zikos, Dimitrios
AU - Tsiakas, Konstantinos
AU - Qudah, Fadiah
AU - Athitsos, Vassilis
AU - Makedon, Fillia
N1 - Publisher Copyright: Copyright 2014 ACM.
PY - 2014/5/27
Y1 - 2014/5/27
N2 - In this paper, we investigate the performance of a series of classification methods for the prediction of hospital Length of Stay (LOS), based on two temporally sequential clinical scenarios. We used a 2012 Medicare Provider Analysis and Review (MedPar) dataset, which contains records of Medicare beneficiaries who used inpatient hospital services. Our subset included 300,000 randomly selected cases. During preprocessing, we added new features and linked our data with external datasets using common key identifiers. In the first scenario, our goal was to predict the LOS using a subset of information that is readily available to the clinician upon patient admission, while the second scenario assumes that additional data are available (information on the patient diagnosis and clinical procedures). For our experiments, we used three different classifiers (Naïve Bayes, AdaBoost and C4.5 decision tree) with two different LOS cut-off points (4-day and 12-day hospital stay). The overall performance of our classifiers ranged from fair to very good. However, the true positive rate, that is, the correct classification of long hospital stays, was low, with the exception of Naïve Bayes, which demonstrated significantly better performance in the second scenario. Our results indicate that Naïve Bayes may be used for the prediction of in-hospital LOS. Our analysis also indicates that MedPar data combined with other data resources have the potential to provide a good basis for robust prediction analytics in hospitals.
AB - In this paper, we investigate the performance of a series of classification methods for the prediction of hospital Length of Stay (LOS), based on two temporally sequential clinical scenarios. We used a 2012 Medicare Provider Analysis and Review (MedPar) dataset, which contains records of Medicare beneficiaries who used inpatient hospital services. Our subset included 300,000 randomly selected cases. During preprocessing, we added new features and linked our data with external datasets using common key identifiers. In the first scenario, our goal was to predict the LOS using a subset of information that is readily available to the clinician upon patient admission, while the second scenario assumes that additional data are available (information on the patient diagnosis and clinical procedures). For our experiments, we used three different classifiers (Naïve Bayes, AdaBoost and C4.5 decision tree) with two different LOS cut-off points (4-day and 12-day hospital stay). The overall performance of our classifiers ranged from fair to very good. However, the true positive rate, that is, the correct classification of long hospital stays, was low, with the exception of Naïve Bayes, which demonstrated significantly better performance in the second scenario. Our results indicate that Naïve Bayes may be used for the prediction of in-hospital LOS. Our analysis also indicates that MedPar data combined with other data resources have the potential to provide a good basis for robust prediction analytics in hospitals.
KW - Decision trees
KW - Healthcare
KW - Length of stay
KW - Machine learning
KW - Naïve Bayes
KW - Prediction
UR - http://www.scopus.com/inward/record.url?scp=84939214661&partnerID=8YFLogxK
U2 - 10.1145/2674396.2674430
DO - 10.1145/2674396.2674430
M3 - Conference contribution
AN - SCOPUS:84939214661
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 7th International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2014
PB - Association for Computing Machinery
T2 - 7th ACM International Conference on Pervasive Technologies Related to Assistive Environments, PETRA 2014
Y2 - 27 May 2014 through 30 May 2014
ER -