A tutorial on selecting and interpreting predictive models for ordinal health-related outcomes

Health Services and Outcomes Research Methodology - Volume 15 - Pages 223-240 - 2015
Maria Guzman-Castillo1, Sally Brailsford2, Michelle Luke3, Honora Smith2
1Department of Public Health and Policy, University of Liverpool, Liverpool, UK
2University of Southampton, Southampton, UK
3University of Sussex, Falmer, Brighton, UK

Abstract

Ordinal variables are frequently studied in the health sciences. However, because models suited to ordinal variables are not widely disseminated, analysts often adopt other practices that result in a loss of statistical power. In this tutorial, different models from the family of logistic regression models are introduced as alternatives for handling and interpreting ordinal outcomes. The models considered are the ordinal regression model (ORM), continuation ratio model (CRM), adjacent category model (ACM), generalised ordered logit model, sequential model, multinomial logit model, partial proportional odds model, partial continuation ratio model and stereotype ordered regression model. Using the relationship between hospital length of stay in a public hospital in Mexico and patient characteristics as an example, the models were used to describe the nature of that relationship and to predict the length of stay category to which a patient is most likely to belong. After an initial analysis, the ORM, CRM and ACM proved unsuitable for our data because the parallel regression assumption was violated. The remaining models were estimated in Stata. The results suggested that the direction of the parameter estimates was similar across models, although the interpretation of the odds ratios varied from one model to another. Performance measurements indicated that the models had similar predictive performance. Therefore, when there is an interest in exploiting the ordinal nature of an outcome, there is no reason to maintain practices that ignore it, since the models discussed here proved to be computationally inexpensive and easy to estimate, analyse and interpret.
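
As a rough illustration of how a model of this family can assign a patient to the most likely outcome category, the sketch below computes category probabilities under a cumulative logit model with parallel regressions, using the common parametrisation P(Y <= j | x) = logistic(tau_j - beta.x). It is not the authors' code: the function names, thresholds, coefficients and patient covariates are all hypothetical values chosen for illustration, and the thresholds are assumed to be strictly increasing.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(thresholds, beta, x):
    """Category probabilities under a cumulative logit model.

    Assumes parallel regressions: one coefficient vector `beta` is shared
    by every cut point, and only the thresholds differ.
    P(Y <= j | x) = logistic(thresholds[j] - beta.x)
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities for each cut point; the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


def most_likely_category(probs):
    """Zero-based index of the category with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical values, not estimates from the paper.
thresholds = [-1.0, 0.5, 2.0]  # three cut points give four ordered categories
beta = [0.8, -0.3]             # illustrative coefficients
patient = [1.0, 2.0]           # illustrative covariate values

probs = category_probabilities(thresholds, beta, patient)
predicted = most_likely_category(probs)
print(probs, predicted)
```

When the parallel regression assumption does not hold, as the abstract reports for these data, a single shared coefficient vector is no longer adequate; the generalised and partial models listed above relax it by letting some or all coefficients vary across cut points.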
