Theoretically-Consistent Cognitive Ability Test Development and Score Interpretation

A. Alexander Beaujean¹, Nicholas Benson²
¹Department of Psychology & Neuroscience, Baylor University, Waco, USA
²Department of Educational Psychology, Baylor University, Waco, USA

Abstract

Keywords

