Abstract
A method for developing generalized parametric regression models for count data is proposed and studied. The method is based on the framework of the T-geometric family of distributions. A T-geometric family consists of discrete distributions that are analogues of the continuous distributions of the random variable T. The general methodology is applied to derive several generalized regression models for count data. These regression models can fit count data that are under-dispersed, equi-dispersed, or over-dispersed. Extensions to truncated or inflated data are also addressed. To illustrate their flexibility, some new generalized T-geometric regression models are fitted to four response variables from health care data and their performance is compared. No single regression model outperforms the others for all four response variables. Thus, a researcher should evaluate different models before selecting a final regression model for a count response variable.
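
The distinction between under-, equi-, and over-dispersed counts mentioned in the abstract is commonly summarized by the ratio of the sample variance to the sample mean. The sketch below is illustrative only and is not taken from the paper: the function names, the example counts, and the tolerance `tol` are assumptions, and the classification is a rough descriptive check rather than a formal test.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


def classify_dispersion(counts, tol=0.1):
    """Label counts by comparing the dispersion index with 1.

    `tol` is a hypothetical tolerance chosen for illustration,
    not a threshold from the paper.
    """
    d = dispersion_index(counts)
    if d < 1 - tol:
        return "under-dispersed"
    if d > 1 + tol:
        return "over-dispersed"
    return "equi-dispersed"


if __name__ == "__main__":
    # Made-up counts, used only to exercise the functions.
    print(classify_dispersion([2, 2, 3, 2, 3, 2]))
    print(classify_dispersion([0, 0, 0, 10, 0, 12]))
```

A ratio near 1 is what a Poisson model implies, so values well below or above 1 suggest that a more flexible count model may be worth evaluating.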
Original language | English |
---|---|
Pages (from-to) | 367-386 |
Number of pages | 20 |
Journal | Annals of Data Science |
Volume | 8 |
Issue number | 2 |
State | Published - Jun 2021 |
Keywords
- Discrete analogue
- Generalized parametric models
- Under- and over-dispersion
- Zero-inflation