Modelling count response variables in informetric studies: Comparison among count, linear, and lognormal regression models

Isola Ajiferuke, Felix Famoye

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

The purpose of the study is to compare the performance of count regression models with that of linear and lognormal regression models in modelling count response variables in informetric studies. Count response variables identified in informetric studies include the number of authors, the number of references, the number of views, the number of downloads, and the number of citations received by an article. The number of links from and to a website is also a count. Data were collected from the United States Patent and Trademark Office (www.uspto.gov), an open access journal (www.informationr.net/ir/), Web of Science, and Maclean's magazine. The datasets were then used to compare the performance of linear and lognormal regression models with that of Poisson, negative binomial, and generalized Poisson regression models. It was found that, because of over-dispersion in most response variables, the negative binomial regression model often seems to be more appropriate for informetric datasets than the Poisson and generalized Poisson regression models. The regression analyses also showed that the linear regression model predicted some negative values for five of the nine response variables modelled and that, for all the response variables, it performed worse than both the negative binomial and lognormal regression models when either Akaike's Information Criterion (AIC) or the Bayesian Information Criterion (BIC) was used as the goodness-of-fit measure. The negative binomial regression model performed significantly better than the lognormal regression model for four of the response variables, while the lognormal regression model performed significantly better for two; there was no significant difference in the performance of the two models for the remaining three response variables.
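The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the `counts` list is hypothetical toy data, the models are intercept-only (no covariates), and the negative binomial size parameter `r` is found by a crude grid search rather than a proper optimizer. It shows only how an over-dispersion check and an AIC/BIC comparison between a Poisson and a negative binomial fit might look.

```python
import math
from statistics import mean, variance


def poisson_loglik(ys, mu):
    """Log-likelihood of counts ys under a Poisson model with mean mu."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in ys)


def negbin_loglik(ys, mu, r):
    """Log-likelihood under a negative binomial with mean mu and size r
    (variance = mu + mu**2 / r)."""
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
        for y in ys
    )


def aic(loglik, k):
    """Akaike's Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


# Hypothetical citation counts, invented purely for illustration.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 13, 40]
n = len(counts)

# For intercept-only models, the sample mean is the maximum-likelihood
# estimate of the mean under both the Poisson and the negative binomial.
mu_hat = mean(counts)

# A variance-to-mean ratio well above 1 signals over-dispersion.
dispersion_index = variance(counts) / mu_hat

ll_poisson = poisson_loglik(counts, mu_hat)

# Assumption: a coarse grid over r stands in for a numerical optimizer.
r_grid = [0.05 * i for i in range(1, 401)]
r_hat = max(r_grid, key=lambda r: negbin_loglik(counts, mu_hat, r))
ll_negbin = negbin_loglik(counts, mu_hat, r_hat)

print("dispersion index:", round(dispersion_index, 2))
print("Poisson  AIC/BIC:", aic(ll_poisson, 1), bic(ll_poisson, 1, n))
print("Neg. bin AIC/BIC:", aic(ll_negbin, 2), bic(ll_negbin, 2, n))
```

Lower AIC or BIC indicates the better-fitting model after penalizing the extra dispersion parameter. A real analysis would include covariates and would typically rely on an established regression package rather than hand-written likelihoods.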

Original language: English
Pages (from-to): 499-513
Number of pages: 15
Journal: Journal of Informetrics
Volume: 9
Issue number: 3
State: Published - Jul 1 2015

Keywords

  • Count regression models
  • Count response variable
  • Informetric studies
  • Linear regression model
  • Lognormal regression model
  • Negative binomial regression model
