On the statistical consistency of algorithms for binary classification under class imbalance

Aditya Krishna Menon, Harikrishna Narasimhan, Shivani Agarwal, Sanjay Chawla

Research output: Contribution to conference; Paper; peer-review

27 Scopus citations

Abstract

Class imbalance situations, where one class is rare compared to the other, arise frequently in machine learning applications. It is well known that the usual misclassification error is ill-suited for measuring performance in such settings. A wide range of performance measures have been proposed for this problem. However, despite the large number of studies on this problem, little is understood about the statistical consistency of the algorithms proposed with respect to the performance measures of interest. In this paper, we study consistency with respect to one such performance measure, namely the arithmetic mean of the true positive and true negative rates (AM), and establish that some practically popular approaches, such as applying an empirically determined threshold to a suitable class probability estimate or performing an empirically balanced form of risk minimization, are in fact consistent with respect to the AM (under mild conditions on the underlying distribution). Experimental results confirm our consistency theorems.
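As an illustration of the quantities named in the abstract, the following minimal sketch computes the AM (the arithmetic mean of the true positive and true negative rates) and applies an empirically determined threshold to class probability estimates. The function names `am_score` and `threshold_at_base_rate` are hypothetical, and thresholding at the empirical positive-class rate is an assumption made here for illustration; the abstract does not specify which empirical threshold is used.

```python
import numpy as np

def am_score(y_true, y_pred):
    """Arithmetic mean of the true positive rate and true negative rate.

    Labels are assumed to be 0 (negative) or 1 (positive), and both
    classes are assumed to be present in y_true.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == 1
    neg = ~pos
    tpr = np.mean(y_pred[pos] == 1)  # fraction of positives predicted positive
    tnr = np.mean(y_pred[neg] == 0)  # fraction of negatives predicted negative
    return 0.5 * (tpr + tnr)

def threshold_at_base_rate(y_train, prob_estimates):
    """Predict positive when the estimated class probability exceeds the
    empirical positive-class rate of the training labels (an assumed
    choice of empirical threshold, not taken from the abstract)."""
    p_hat = np.mean(np.asarray(y_train) == 1)
    return (np.asarray(prob_estimates) > p_hat).astype(int)

# Toy imbalanced sample: 9 negatives, 1 positive.
y = np.array([0] * 9 + [1])

# A classifier that always predicts the majority class is 90% accurate
# but has TPR = 0 and TNR = 1.
always_negative = np.zeros(10, dtype=int)
accuracy = np.mean(always_negative == y)
am = am_score(y, always_negative)

# Thresholding hypothetical probability estimates at the base rate (0.1).
probs = np.array([0.05] * 9 + [0.4])
preds = threshold_at_base_rate(y, probs)
```

On this toy sample the always-negative classifier scores well on accuracy yet no better than chance on the AM, which is why the misclassification error is considered ill-suited under class imbalance.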

Original language: English
Pages: 1640-1648
Number of pages: 9
State: Published - 2013
Externally published: Yes
Event: 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States
Duration: Jun 16 2013 to Jun 21 2013

Conference

Conference: 30th International Conference on Machine Learning, ICML 2013
Country/Territory: United States
City: Atlanta, GA
Period: 06/16/13 to 06/21/13
