When Fixed and Random Effects Mismatch: Another Case of Inflation of Evidence in Non-Maximal Models

Computational Brain & Behavior - Tập 6 Số 1 - Trang 84-101 - 2023
João Veríssimo1,2
1Center of Linguistics, School of Arts and Humanities, University of Lisbon, Alameda da Universidade, Lisbon, Portugal
2Department of Linguistics, University of Potsdam, Potsdam, Germany

Tóm tắt

AbstractMixed-effects models that include both fixed and random effects are widely used in the cognitive sciences because they are particularly suited to the analysis of clustered data. However, testing hypotheses about fixed effects in the presence of random effects is far from straightforward and a set of best practices is still lacking. In the target article, van Doorn et al. (Computational Brain & Behavior, 2022) examined how Bayesian hypothesis testing with mixed-effects models is impacted by particular model specifications. Here, I extend their work to the more complex case of multiple correlated predictors, such as a predictor of interest and a covariate. I show how non-maximal models can display ‘mismatches’ between fixed and random effects, which occur when a model includes random slopes for the effect of interest, but fails to include them for those predictors that correlate with the effect of interest. Bayesian model comparisons with synthetic data revealed that such mismatches can lead to an underestimation of random variance and to inflated Bayes factors. I provide specific recommendations for resolving mismatches of this type: fitting maximal models, eliminating correlations between predictors, and residualising the random effects. Data and code are publicly available in an OSF repository at https://osf.io/njaup.

Từ khóa


Tài liệu tham khảo

Arnqvist, G. (2020). Mixed models offer no freedom from degrees of freedom. Trends in Ecology & Evolution, 35(4), 329–335. https://doi.org/10/ggkqs5

Aust, F., & Barth, M. (2020). papaja: Create APA manuscripts with R Markdown (R Package Version 0.1.0.9942).

Avetisyan, S., Lago, S., & Vasishth, S. (2020). Does case marking affect agreement attraction in comprehension? Journal of Memory and Language, 112, 104087. https://doi.org/10/ghbtpd

Azen, R., & Budescu, D. (2009). Applications of multiple regression in psychological research. The SAGE Handbook of Quantitative Methods in Psychology (pp. 285–310). SAGE Publications Ltd. https://doi.org/10.4135/9780857020994.n13

Baayen, R. H. (2010). A real experiment is a factorial experiment? The Mental Lexicon, 5(1), 149–157. https://doi.org/10/dk585r

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. https://doi.org/10/fpb5dz

Baayen, R. H., Wurm, L. H., & Aycock, J. (2007). Lexical dynamics for low-frequency complex words: A regression study across tasks and modalities. The Mental Lexicon, 2(3), 419–463. https://doi.org/10/gkzxfg

Balling, L. W. (2008). A brief introduction to regression designs and mixed-effects modelling by a recent convert. In S. Göpferich, A. L. Jakobsen, & I. M. Mees (Eds.), Looking at eyes: Eye-tracking studies of reading and translation processing (pp. 175–192). Samfundslitteratur.

Balota, D. A., Cortese, M. J., Sergent-Marshall, S. D., Spieler, D. H., & Yap, M. J. (2004). Visual word recognition of single-syllable words. Journal of Experimental Psychology: General, 133(2), 283–316. https://doi.org/10/dx72cn

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10/gcm4wc

Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2018). Parsimonious mixed models. arXiv:1506.04967

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10/gcrnkw

Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. Wiley.

Bosch, S., Veríssimo, J., & Clahsen, H. (2019). Inflectional morphology in bilingual language processing: An age-of-acquisition study. Language Acquisition, 26(3), 339–360. https://doi.org/10/ggffzm

Brauer, M., & Curtin, J. J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychological Methods, 23(3), 389–411. https://doi.org/10/gd86gx

Brysbaert, M., Stevens, M., Mandera, P., & Keuleers, E. (2016). The impact of word prevalence on lexical decision times: Evidence from the Dutch Lexicon Project 2. Journal of Experimental Psychology: Human Perception and Performance, 42(3), 441–458. https://doi.org/10/ggb87s

Bulmer, M. (1998). Galton’s law of ancestral heredity. Heredity, 81(5), 579–585. https://doi.org/10/bgzk68

Bürkner, P.-C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1), 1–28. https://doi.org/10/gddxwp

Coupé, C. (2018). Modeling linguistic variables with regression models: Addressing non-gaussian distributions, non-independent observations, and non-linear predictors with random effects and generalized additive models for location, scale, and shape. Frontiers in Psychology, 9, 513. https://doi.org/10/gddf53

Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7–29. https://doi.org/10/5k3

Cunnings, I. (2012). An overview of mixed-effects statistical models for second language researchers. Second Language Research, 28(3), 369–382. https://doi.org/10/f35ngk

DeBruine, L. M., & Barr, D. J. (2021). Understanding mixed-effects models through data simulation. Advances in Methods and Practices in Psychological Science, 4(1), 251524592096511. https://doi.org/10/gjh9p5

Demidenko, E. (2013). Mixed models: Theory and applications with R (2nd ed.). Wiley.

Dickey, J. M., & Lientz, B. P. (1970). The weighted likelihood ratio, sharp hypotheses about chances, the order of a Markov chain. The Annals of Mathematical Statistics, 41(1), 214–226. https://doi.org/10/fc2nps

Friedman, L., & Wall, M. (2005). Graphical views of suppression and multicollinearity in multiple linear regression. The American Statistician, 59(2), 127–136. https://doi.org/10/dhhxqs

Galton, F. (1897). The average contribution of each several ancestor to the total heritage of the offspring. Proceedings of the Royal Society of London, 61(369–377), 401–413. https://doi.org/10/cw8wsv

Galton, F. (1898). A diagram of heredity. Nature, 57(1474), 293–293. https://doi.org/10/cffjkq

Harrell, F. E. (2015). Regression modeling strategies: With applications to linear models, logistic and ordinal regression, and survival analysis. Springer International Publishing. https://doi.org/10.1007/978-3-319-19425-7

Heck, D. W. (2019). A caveat on the Savage-Dickey density ratio: The case of computing Bayes factors for regression parameters. British Journal of Mathematical and Statistical Psychology, 72(2), 316–333. https://doi.org/10/gk4zsz

Heisig, J. P., & Schaeffer, M. (2019). Why you should always include a random alope for the lower-level variable involved in a cross-level interaction. European Sociological Review, 35(2), 258–279. https://doi.org/10/gf9nkc

Hoffman, L., & Rovine, M. J. (2007). Multilevel models for the experimental psychologist: Foundations and illustrative examples. Behavior Research Methods, 39(1), 101–117. https://doi.org/10/bvnt5m

Johnson, M. K., McMahon, R. P., Robinson, B. M., Harvey, A. N., Hahn, B., Leonard, C. J., Luck, S. J., & Gold, J. M. (2013). The relationship between working memory capacity and broad measures of cognitive ability in healthy adults and people with schizophrenia. Neuropsychology, 27(2), 220–229. https://doi.org/10/f4sct8

Judd, C. M., Westfall, J., & Kenny, D. A. (2012). Treating stimuli as a random factor in social psychology: A new and comprehensive solution to a pervasive but largely ignored problem. Journal of Personality and Social Psychology, 103(1), 54–69. https://doi.org/10/f33f3t

Keuleers, E., & Balota, D. A. (2015). Megastudies, crowdsourcing, and large datasets in psycholinguistics: An overview of recent developments. Quarterly Journal of Experimental Psychology, 68(8), 1457–1468. https://doi.org/10/gkzxsd

Kowal, D. R. (2021). Subset selection for linear mixed models. https://doi.org/10.48550/arXiv.2107.12890

Kruschke, J. K., & Liddell, T. M. (2018). The Bayesian new statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychonomic Bulletin & Review, 25(1), 178–206. https://doi.org/10/gc3gmn

Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian cognitive modeling: A practical course. Cambridge University Press. OCLC: ocn861318341

Lemhöfer, K., Dijkstra, T., Schriefers, H., Baayen, R. H., Grainger, J., & Zwitserlood, P. (2008). Native language influences on word recognition in a second language: A megastudy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(1), 12–31. https://doi.org/10/fj5krc

Linck, J. A., & Cunnings, I. (2015). The utility and application of mixed-effects models in second language research. Language Learning, 65(S1), 185–207. https://doi.org/10/f7c46d

Matuschek, H., Kliegl, R., Vasishth, S., Baayen, R. H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10/gcx746

McNeish, D., & Kelley, K. (2019). Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations. Psychological Methods, 24(1), 20–35. https://doi.org/10/gdnbxn

Meteyard, L., & Davies, R. A. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, 104092. https://doi.org/10/ggjqt5

Mitchell, D. C. (1984). An evaluation of subject-paced reading tasks and other methods of investigating immediate processes in reading. In D. E. Kieras & M. A. Just (Eds.), New methods in reading comprehension research. Erlbaum.

Morey, R. D., Rouder, J. N., Verhagen, J., & Wagenmakers, E.-J. (2014). Why hypothesis tests are essential for psychological science: A comment on Cumming (2014). Psychological Science, 25(6), 1289–1290. https://doi.org/10/gckf4j

Mulder, J., & Wagenmakers, E.-J. (2016). Editors’ introduction to the special issue “Bayes factors for testing hypotheses in psychological research: Practical relevance and new developments”. Journal of Mathematical Psychology, 72, 1–5. https://doi.org/10/btqx

Mulder, J., Wagenmakers, E.-J., & Marsman, M. (2020). A generalization of the Savage-Dickey density ratio for testing equality and order constrained hypotheses. The American Statistician, 1–8. https://doi.org/10/gk4zsx

Nezlek, J. B. (2008). An introduction to multilevel modeling for social and personality psychology: Multilevel analyses. Social and Personality Psychology Compass, 2(2), 842–860. https://doi.org/10/dsn35j

Oberauer, K. (2022). The Importance of random slopes in mixed models for Bayesian hypothesis testing. Psychological Science, 33(4), 648–665. https://doi.org/10.1177/09567976211046884

O’Brien, R. M. (2007). A caution regarding rules of thumb for variance inflation factors. Quality & Quantity, 41(5), 673–690. https://doi.org/10/bkrhm3

O’Brien, R. M. (2018). A consistent and general modified Venn diagram approach that provides insights into regression analysis (F. Zhou, Ed.). PLoS ONE, 13(5), e0196740. https://doi.org/10/gdkkmf

Paul, D. B. (1995). Controlling human heredity: 1865 to the present. Humanities Press.

Pedhazur, E. J. (1997). Multiple regression in behavioral research: Explanation and prediction (3rd ed). Harcourt Brace College Publishers.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed). Sage Publications

Rouder, J. N., Engelhardt, C. R., McCabe, S., & Morey, R. D. (2016). Model comparison in ANOVA. Psychonomic Bulletin & Review, 23(6), 1779–1786. https://doi.org/10/f9hfb4

Rouder, J. N., Haaf, J. M., & Vandekerckhove, J. (2018). Bayesian inference for psychology, part IV: Parameter estimation and Bayes factors. Psychonomic Bulletin & Review, 25(1), 102–113. https://doi.org/10/gc9qfx

Rouder, J. N., & Morey, R. D. (2012). Default Bayes factors for model selection in regression. Multivariate Behavioral Research, 47(6), 877–903. https://doi.org/10/ggsfx9

Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225–237. https://doi.org/10/b3hsdp

Schad, D. J., Betancourt, M., & Vasishth, S. (2021). Toward a principled Bayesian workflow in cognitive science. Psychological Methods, 26(1), 103–126. https://doi.org/10/ghbtt6

Schad, D. J., Nicenboim, B., Bürkner, P.-C., Betancourt, M., & Vasishth, S. (2021). Workflow techniques for the robust use of Bayes factors

Schielzeth, H., & Forstmeier, W. (2009). Conclusions beyond support: Overconfident estimates in mixed models. Behavioral Ecology, 20(2), 416–420. https://doi.org/10/bcwvqw

Senn, S. (2011). Francis Galton and regression to the mean. Significance, 8(3), 124–126. https://doi.org/10/gf9b83

Shantz, K. (2017). Phrase frequency, proficiency and grammaticality interact in non-native processing: Implications for theories of SLA. Second Language Research, 33(1), 91–118. https://doi.org/10/f9k8j9

Singmann, H., & Kellen, D. (2019). An introduction to mixed models for experimental psychology. In D. Spieler & E. Schumacher (Eds.), New methods in cognitive psychology (First, pp. 4–31). Routledge. https://doi.org/10.4324/9780429318405-2

Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling (2nd ed). Sage

Stanton, J. M. (2001). Galton, Pearson, and the peas: A brief history of linear regression for statistics instructors. Journal of Statistics Education, 9(3), 3. https://doi.org/10/gd82dx

Tomaschek, F., Hendrix, P., & Baayen, R. H. (2018). Strategies for addressing collinearity in multivariate linguistic data. Journal of Phonetics, 71, 249–267. https://doi.org/10/gg67xj

van Doorn, J., Aust, F., Haaf, J. M., Stefan, A. M., & Wagenmakers, E.-J. (2022). Bayes factors for mixed models. Computational Brain & Behavior. https://doi.org/10/gnrmn8

Vanhove, J. (2021). Collinearity isn’t a disease that needs curing. Meta-Psychology, 5. https://doi.org/10/gnrk22

Vasishth, S. (2006). On the proper treatment of spillover in real-time reading studies: Consequences for psycholinguistic theories. In: Proceedings of the international conference on linguistic evidence. Tübingen, Germany.

Veríssimo, J., & Clahsen, H. (2014). Variables and similarity in linguistic generalization: Evidence from inflectional classes in Portuguese. Journal of Memory and Language, 76, 61–79. https://doi.org/10/ggfgmd

Veríssimo, J., Heyer, V., Jacob, G., & Clahsen, H. (2018). Selective effects of age of acquisition on morphological priming: Evidence for a sensitive period. Language Acquisition, 25(3), 315–326. https://doi.org/10/ggffzk

Veríssimo, J., Verhaeghen, P., Goldman, N., Weinstein, M., & Ullman, M. T. (2021). Evidence that ageing yields improvements as well as declines across attention and executive functions. Nature Human Behaviour. https://doi.org/10/gmh3bj

Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: A tutorial on the Savage-Dickey method. Cognitive Psychology, 60(3), 158–189. https://doi.org/10/btbnnf

Westfall, J., & Yarkoni, T. (2016). Statistically controlling for confounding constructs is harder than you think (U. S. Tran, Ed.). PLoS ONE, 11(3), e0152719. https://doi.org/10/f8wpvb

Wurm, L. H., & Fisicaro, S. A. (2014). What residualizing predictors in regression analyses does (and what it does not do). Journal of Memory and Language, 72, 37–48. https://doi.org/10/gffnjn

Yap, M. J., Balota, D. A., Sibley, D. E., & Ratcliff, R. (2012). Individual differences in visual word recognition: Insights from the English Lexicon Project. Journal of Experimental Psychology: Human Perception and Performance, 38(1), 53–79. https://doi.org/10.1037/a0024177

Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100–1122. https://doi.org/10/gcmrmp

York, R. (2012). Residualization is not the answer: Rethinking how to address multicollinearity. Social Science Research, 41(6), 1379–1386. https://doi.org/10/gk9zbk