How Strong Is the Evidence for a Causal Reciprocal Effect? Contrasting Traditional and New Methods to Investigate the Reciprocal Effects Model of Self-Concept and Achievement

Springer Science and Business Media LLC - Volume 35 - Pages 1-45 - 2023
Nicolas Hübner¹, Wolfgang Wagner², Steffen Zitzmann², Benjamin Nagengast²,³
¹ Institute of Education, University of Tübingen, Tübingen, Germany
² Hector Research Institute of Education Sciences and Psychology, University of Tübingen, Tübingen, Germany
³ Department of Education and the Brain & Motivation Research Institute (bMRI), Korea University, Seoul, Republic of Korea

Abstract

The relationship between students’ subject-specific academic self-concept and their academic achievement is one of the most widely researched topics in educational psychology. A large proportion of this research has considered cross-lagged panel models (CLPMs), oftentimes synonymously referred to as reciprocal effects models (REMs), as the gold standard for investigating the causal relationships between the two variables and has reported evidence of a reciprocal relationship between self-concept and achievement. However, more recent methodological research has questioned the plausibility of assumptions that need to be satisfied in order to interpret results from traditional CLPMs causally. In this substantive-methodological synergy, we aimed to contrast traditional and more recently developed methods to investigate reciprocal effects of students’ academic self-concept and achievement. Specifically, we compared results from CLPMs, full-forward CLPMs (FF-CLPMs), and random intercept CLPMs (RI-CLPMs) with two weighting approaches developed to study causal effects of continuous treatment variables. To estimate these different models, we used rich longitudinal data of N = 3757 students from lower secondary schools in Germany. Results from CLPMs, FF-CLPMs, and weighting methods supported the reciprocal effects model, particularly when math self-concept and grades were considered. Results from the RI-CLPMs were less consistent. Implications from our study for the interpretation of effects from the different models and methods as well as for school motivation theory are discussed.
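To make the idea of a cross-lagged effect concrete, the following is a minimal sketch of a two-wave cross-lagged regression on simulated data. It is not the authors' analysis: the variables are treated as observed scores, there are only two waves, and the function names (`simulate_two_waves`, `cross_lagged_ols`) and all coefficient values are hypothetical choices made for illustration. Each wave-2 variable is regressed on both wave-1 variables, so the coefficient on the other variable's earlier value is the cross-lagged path.

```python
import numpy as np


def simulate_two_waves(n, seed=0):
    """Simulate self-concept (sc) and achievement (ach) at two waves.

    All coefficients below are made-up illustration values, not estimates
    from any study.
    """
    rng = np.random.default_rng(seed)
    sc1 = rng.normal(size=n)
    ach1 = 0.5 * sc1 + rng.normal(size=n)  # wave-1 variables are correlated
    # Wave 2: autoregressive (stability) path plus cross-lagged path.
    sc2 = 0.6 * sc1 + 0.2 * ach1 + rng.normal(scale=0.5, size=n)
    ach2 = 0.5 * ach1 + 0.3 * sc1 + rng.normal(scale=0.5, size=n)
    return sc1, ach1, sc2, ach2


def cross_lagged_ols(sc1, ach1, sc2, ach2):
    """Estimate stability and cross-lagged paths by ordinary least squares."""
    # Same design matrix for both equations: intercept, sc1, ach1.
    X = np.column_stack([np.ones_like(sc1), sc1, ach1])
    b_sc, *_ = np.linalg.lstsq(X, sc2, rcond=None)
    b_ach, *_ = np.linalg.lstsq(X, ach2, rcond=None)
    return {
        "sc_stability": b_sc[1],    # sc1 -> sc2
        "ach_to_sc": b_sc[2],       # ach1 -> sc2 (cross-lagged)
        "sc_to_ach": b_ach[1],      # sc1 -> ach2 (cross-lagged)
        "ach_stability": b_ach[2],  # ach1 -> ach2
    }


if __name__ == "__main__":
    estimates = cross_lagged_ols(*simulate_two_waves(20000, seed=1))
    for name, value in estimates.items():
        print(f"{name}: {value:.3f}")
```

In this simulation the data-generating model contains no unobserved confounders, so the regression recovers the cross-lagged paths. With real observational data, interpreting such coefficients causally depends on the assumptions that the abstract describes as being in question.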
