A generalized Hosmer–Lemeshow goodness-of-fit test for a family of generalized linear models
TEST - Pages 1-20 - 2023
Abstract
Generalized linear models (GLMs) are very widely used, but formal goodness-of-fit (GOF) tests for the overall fit of the model seem to be in wide use only for certain classes of GLMs. We develop and apply a new goodness-of-fit test, similar to the well-known and commonly used Hosmer–Lemeshow (HL) test, that can be used with a wide variety of GLMs. The test statistic is a variant of the HL statistic, but we rigorously derive an asymptotically correct sampling distribution using methods of Stute and Zhu (Scand J Stat 29(3):535–545, 2002) and demonstrate its consistency. We compare the performance of our new test with other GOF tests for GLMs, including a naive direct application of the HL test to the Poisson problem. Our test provides competitive or comparable power in various simulation settings and we identify a situation where a naive version of the test fails to hold its size. Our generalized HL test is straightforward to implement and interpret and an R package is publicly available.
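To make the grouping idea concrete, the following is a minimal sketch of a naive HL-type statistic for a fitted count model. It is not the generalized statistic developed in the paper, and it carries none of the asymptotic corrections the abstract refers to. The function name `naive_hl_statistic`, the parameters `n_groups` and `variance`, and the choice to standardize each group by the summed variance function are all assumptions made for illustration. Observations are sorted by fitted mean, split into roughly equal groups, and the squared difference between observed and expected group totals is divided by an assumed group variance.

```python
from typing import Callable, Sequence


def naive_hl_statistic(
    y: Sequence[float],
    mu: Sequence[float],
    n_groups: int = 10,
    variance: Callable[[float], float] = lambda m: m,  # assumed Poisson: Var(Y) = mean
) -> float:
    """Naive HL-type statistic: illustrative only, not the paper's test."""
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")

    # Order observations by fitted mean, as in the classical HL construction.
    order = sorted(range(len(mu)), key=lambda i: mu[i])
    n = len(order)

    stat = 0.0
    for g in range(n_groups):
        # Split the ordered indices into groups of (nearly) equal size.
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue
        observed = sum(y[i] for i in idx)
        expected = sum(mu[i] for i in idx)
        # Assumption: group variance is the sum of per-observation variances.
        group_var = sum(variance(mu[i]) for i in idx)
        stat += (observed - expected) ** 2 / group_var
    return stat
```

For example, `naive_hl_statistic([0, 1, 2, 3], [1, 1, 2, 2], n_groups=2)` gives 0.75: the first group contributes (1 - 2)^2 / 2 and the second (5 - 4)^2 / 4. Which reference distribution is valid for a statistic like this is exactly the question the paper addresses, so this sketch deliberately stops short of computing a p-value.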
References
Agresti A (1996) An introduction to categorical data analysis. Wiley, New York
Bilder CR, Loughin TM (2014) Analysis of categorical data with R. Chapman and Hall/CRC, Boston
Blizzard L, Hosmer DW (2006) Parameter estimation and goodness-of-fit in log binomial regression. Biom J 48(1):5–22
Canary JD (2013) Grouped goodness-of-fit tests for binary regression models. PhD thesis, University of Tasmania
Canary JD, Blizzard L, Barry RP, Hosmer DW, Quinn SJ (2016) Summary goodness-of-fit statistics for binary generalized linear models with noncanonical link functions. Biom J 58(3):674–690
Cheng KF, Wu JW (1994) Testing goodness of fit for a parametric family of link functions. J Am Stat Assoc 89(426):657–664
Christensen R, Lin Y (2015) Lack-of-fit tests based on partial sums of residuals. Commun Stat Theory Methods 44(13):2862–2880
Fagerland MW, Hosmer DW (2013) A goodness-of-fit test for the proportional odds regression model. Stat Med 32(13):2235–2249
Fagerland MW, Hosmer DW (2016) Tests for goodness of fit in ordinal logistic regression models. J Stat Comput Simul 86(17):3398–3418
Fagerland MW, Hosmer DW, Bofin AM (2008) Multinomial goodness-of-fit tests for logistic regression models. Stat Med 27(21):4238–4253
González-Manteiga W, Crujeiras RM (2013) An updated review of goodness-of-fit tests for regression models. TEST 22(3):361–411
Halteman WA (1980) A goodness of fit test for binary logistic regression. Unpublished doctoral dissertation, Department of Biostatistics, University of Washington, Seattle, WA
Hosmer DW, Hjort NL (2002) Goodness-of-fit processes for logistic regression: simulation results. Stat Med 21(18):2723–2738
Hosmer DW, Lemeshow S (1980) Goodness of fit tests for the multiple logistic regression model. Commun Stat Theory Methods 9(10):1043–1069
Lin DY, Wei LJ, Ying Z (2002) Model-checking techniques based on cumulative residuals. Biometrics 58(1):1–12
Liu A, Meiring W, Wang Y (2004) Testing generalized linear models using smoothing spline methods. Stat Sin 15:235–256
Moore DS, Spruill MC (1975) Unified large-sample theory of general chi-squared statistics for tests of fit. Ann Stat 3:599–616
Pulkstenis E, Robinson TJ (2002) Two goodness-of-fit tests for logistic regression models with continuous covariates. Stat Med 21(1):79–93
Quinn SJ, Hosmer DW, Blizzard CL (2015) Goodness-of-fit statistics for log-link regression models. J Stat Comput Simul 85(12):2533–2545
Rodríguez-Campos MC, González-Manteiga W, Cao R (1998) Testing the hypothesis of a generalized linear regression model using nonparametric regression estimation. J Stat Plan Inference 67(1):99–122
Stute W, Zhu L-X (2002) Model checks for generalized linear models. Scand J Stat 29(3):535–545
Su JQ, Wei LJ (1991) A lack-of-fit test for the mean function in a generalized linear model. J Am Stat Assoc 86(414):420–426
Surjanovic N, Loughin TM (2021) Improving the Hosmer–Lemeshow goodness-of-fit test in large models with replicated trials. arXiv preprint arXiv:2102.12698
Tsiatis AA (1980) A note on a goodness-of-fit test for the logistic regression model. Biometrika 67(1):250–251
White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50(1):1–25
Xiang D, Wahba G (1995) Testing the generalized linear model null hypothesis versus ‘smooth’ alternatives. Technical Report 953, Department of Statistics, University of Wisconsin