Towards Improving Transparency of Count Data Regression Models for Health Impacts of Air Pollution

dc.contributor.author: Joseph, John F.
dc.contributor.author: Furl, Chad
dc.contributor.author: Sharif, Hatim O.
dc.contributor.author: Sunil, Thankam
dc.contributor.author: Macias, Charles G.
dc.date.accessioned: 2021-04-19T15:27:27Z
dc.date.available: 2021-04-19T15:27:27Z
dc.date.issued: 2021-04-09
dc.date.updated: 2021-04-19T15:27:28Z
dc.description.abstract: In studies on the health impacts of air pollution, regression analysis continues to advance far beyond the classical linear regression that many scientists first encountered in an introductory statistics course. With each new level of complexity, regression analysis may become less transparent, even to the analyst working with the data. This may be especially true in count data regression models, where the response variable (typically given the symbol y) is count data (i.e., takes on values of 0, 1, 2, …). In such models, the normal distribution (the familiar bell-shaped curve) no longer applies to the residuals (i.e., the differences between the observed values and the values predicted by the regression model). Unless care is taken to specify correctly how those residuals are distributed, the tendency to accept untrue hypotheses may be greatly increased. The aim of this paper is to present a simple histogram of predicted and observed count values (POCH), which is rarely found in the environmental literature despite being presented in authoritative statistical texts, and which can dramatically reduce the risk of accepting untrue hypotheses. POCH can also increase the transparency of count data regression models to analysts themselves and to the scientific community in general.
dc.description.department: Civil and Environmental Engineering, and Construction Management
dc.identifier: doi: 10.3390/app11083375
dc.identifier.citation: Applied Sciences 11 (8): 3375 (2021)
dc.identifier.uri: https://hdl.handle.net/20.500.12588/556
dc.rights: Attribution 4.0 United States
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: count data
dc.subject: correlation
dc.subject: regression models
dc.title: Towards Improving Transparency of Count Data Regression Models for Health Impacts of Air Pollution
dc.type: Article
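
The histogram of predicted and observed count values (POCH) described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Poisson distribution for the response, uses an intercept-only fit (every fitted mean equals the sample mean), and runs on made-up data. The names `poisson_pmf` and `poch_table` are hypothetical. For each count k, the observed frequency is the number of observations equal to k, and the predicted frequency is the sum over observations of the model probability of k.

```python
import math
from collections import Counter


def poisson_pmf(k, mu):
    """Probability of observing count k under a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def poch_table(y, mu, max_count=None):
    """Return rows (k, observed_frequency, predicted_frequency) for k = 0..max_count.

    y  : observed counts, one per observation
    mu : fitted means from a count regression, one per observation
         (assumed here to come from a Poisson model)
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    if max_count is None:
        max_count = max(y)
    observed = Counter(y)
    rows = []
    for k in range(max_count + 1):
        # Expected number of observations equal to k:
        # sum of the per-observation probabilities of k.
        predicted = sum(poisson_pmf(k, m) for m in mu)
        rows.append((k, observed.get(k, 0), predicted))
    return rows


# Hypothetical data: many zeros and a few large counts.
y = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6, 7, 7]
# Intercept-only Poisson fit: every fitted mean equals the sample mean.
mu = [sum(y) / len(y)] * len(y)

# Text version of the histogram: one row per count value.
for k, obs, pred in poch_table(y, mu):
    print(f"{k:>2}  observed {obs:>2} {'#' * obs:<8} predicted {pred:5.2f}")
```

In this made-up example the data contain six zeros, while the Poisson fit predicts fewer than two. A side-by-side comparison of this kind is what makes a misspecified distribution visible before any hypothesis test is interpreted.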

Files

Original bundle

Name: applsci-11-03375-v2.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
