Applied Psychological Measurement

Featured scientific publications

* Data are for reference only

The Effects of Referent Item Parameters on Differential Item Functioning Detection Using the Free Baseline Likelihood Ratio Test
Applied Psychological Measurement - Vol. 33 No. 4 - pp. 251-265 - 2009
Gabriel E. Lopez Rivas, Stephen Stark, Oleksandr S. Chernyshenko

The purpose of this simulation study is to investigate the effects of anchor subtest composition on the accuracy of item response theory (IRT) likelihood ratio (LR) differential item functioning (DIF) detection (Thissen, Steinberg, & Wainer, 1988). Here, the IRT LR test was implemented with a free baseline approach wherein a baseline model was formed by freeing all items except a referent or anchor subset and examining the changes in fit with respect to a series of models wherein 1 item at a time was constrained in addition to the referent(s). The results clearly indicated that the composition of the anchor subtest is important for accurate DIF detection. It was found that using a single highly discriminating rather than a low discriminating referent greatly enhanced the power of the procedure. Moreover, in conditions involving small DIF or smaller sample sizes or both, power appeared to improve when a group of highly discriminating referents was used. These findings have implications for applied research involving short scales and small sample sizes.

Critical Values for Yen’s Q3: Identification of Local Dependence in the Rasch Model Using Residual Correlations
Applied Psychological Measurement - Vol. 41 No. 3 - pp. 178-194 - 2017
Karl Bang Christensen, Guido Makransky, Mike Horton

The assumption of local independence is central to all item response theory (IRT) models. Violations can lead to inflated estimates of reliability and problems with construct validity. For the most widely used fit statistic Q3, there are currently no well-documented suggestions of the critical values which should be used to indicate local dependence (LD), and for this reason, a variety of arbitrary rules of thumb are used. In this study, an empirical data example and Monte Carlo simulation were used to investigate the different factors that can influence the null distribution of residual correlations, with the objective of proposing guidelines that researchers and practitioners can follow when making decisions about LD during scale development and validation. We propose that a parametric bootstrapping procedure should be implemented in each separate situation to obtain the critical value of LD applicable to the data set, and provide example critical values for a number of data structure situations. The results show that for the Q3 fit statistic, no single critical value is appropriate for all situations, as the percentiles in the empirical null distribution are influenced by the number of items, the sample size, and the number of response categories. Furthermore, the results show that LD should be considered relative to the average observed residual correlation, rather than to a uniform value, as this results in more stable percentiles for the null distribution of an adjusted fit statistic.
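
As a rough illustration of the residual correlations discussed in this abstract, the sketch below computes Q3 for simulated dichotomous Rasch data and then expresses each value relative to the average off-diagonal correlation. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, sample size, and item difficulties are hypothetical, and the generating person and item parameters are used in place of estimates.

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch success probabilities for every person-item pair."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def q3_matrix(responses, theta, b):
    """Correlations between item residuals (observed minus expected)."""
    residuals = responses - rasch_prob(theta, b)
    return np.corrcoef(residuals, rowvar=False)

def q3_relative_to_mean(q3):
    """Off-diagonal Q3 values minus their average; diagonal set to NaN."""
    off_diag = ~np.eye(q3.shape[0], dtype=bool)
    adjusted = q3 - q3[off_diag].mean()
    np.fill_diagonal(adjusted, np.nan)
    return adjusted

# Hypothetical setup: 500 persons, 6 locally independent items.
rng = np.random.default_rng(0)
theta = rng.normal(size=500)          # assumed known abilities
b = np.linspace(-1.5, 1.5, 6)         # assumed known difficulties
responses = (rng.random((500, 6)) < rasch_prob(theta, b)).astype(float)

q3 = q3_matrix(responses, theta, b)
adjusted = q3_relative_to_mean(q3)
```

In practice the parameters would be estimated from the data, and a data-specific critical value would come from repeating this calculation on data sets simulated under the fitted model, as the abstract recommends.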

Effects of Local Item Dependence on the Fit and Equating Performance of the Three-Parameter Logistic Model
Applied Psychological Measurement - Vol. 8 No. 2 - pp. 125-145 - 1984
Wendy M. Yen

Unidimensional item response theory (IRT) has become widely used in the analysis and equating of educational achievement tests. If an IRT model is true, item responses must be locally independent when the trait is held constant. This paper presents several measures of local dependence that are used in conjunction with the three-parameter logistic model in the analysis of unidimensional and two-dimensional simulated data and in the analysis of three mathematics achievement tests at Grades 3 and 6. The measures of local dependence (called Q2 and Q3) were useful for identifying subsets of items that were influenced by the same factors (simulated data) or that had similar content (real data). Item pairs with high Q2 or Q3 values tended to have similar item parameters, but most items with similar item parameters did not have high Q2 or Q3 values. Sets of locally dependent items tended to be difficult and discriminating if the items involved an accumulation of the skills involved in the easier items in the rest of the test. Locally dependent items that were independent of the other items in the test did not have unusually high or low difficulties or discriminations. Substantial unsystematic errors of equating were found from the equating of tests involving collections of different dimensions, but substantial systematic errors of equating were only found when the two tests measured quite different dimensions that were presumably taught sequentially.

Rasch Models in Latent Classes: An Integration of Two Approaches to Item Analysis
Applied Psychological Measurement - Vol. 14 No. 3 - pp. 271-282 - 1990
Jürgen Rost

A model is proposed that combines the theoretical strength of the Rasch model with the heuristic power of latent class analysis. It assumes that the Rasch model holds for all persons within a latent class, but it allows for different sets of item parameters between the latent classes. An estimation algorithm is outlined that gives conditional maximum likelihood estimates of item parameters for each class. No a priori assumption about the item order in the latent classes or the class sizes is required. Application of the model is illustrated, both for simulated data and for real data.

The Development of a Rasch-Type Loneliness Scale
Applied Psychological Measurement - Vol. 9 No. 3 - pp. 289-299 - 1985
J. de Jong-Gierveld, Frans Kamphuis

This paper describes an attempt to construct a measuring instrument for loneliness that meets the criteria of a Rasch scale. Rasch (1960, 1966) proposed a latent trait model for the unidimensional scaling of dichotomous items that does not suffer from the inadequacies of classical approaches. The resulting Rasch scale of this study, which is based on data from 1,201 employed, disabled, and jobless adults, consists of five positive and six negative items. The positive items assess feelings of belongingness, whereas the negative items apply to three separate aspects of missing relationships. The techniques for testing the assumptions underlying the Rasch model are compared with their counterparts from classical test theory, and the implications for the methodology of scale construction are discussed.

Estimation of Composite Reliability for Congeneric Measures
Applied Psychological Measurement - Vol. 21 No. 2 - pp. 173-184 - 1997
Tenko Raykov

A structural equation model is described that permits estimation of the reliability index and coefficient of a composite test for congeneric measures. The method is also helpful in exploring the factorial structure of an item set, and its use in scale reliability estimation and development is illustrated. The estimator of composite reliability it yields does not possess the general underestimation property of Cronbach's coefficient α.
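
For orientation, the reliability coefficient of a unit-weighted composite under a congeneric model is commonly written as below. This is a sketch under stated assumptions rather than a quotation from the article: the symbols $\lambda_i$ (loadings), $\theta_i$ (error variances), and $k$ (number of measures) are notation introduced here, the latent variable is assumed to have unit variance, and the errors are assumed to be uncorrelated.

```latex
% Congeneric model: each measure loads on one common factor
Y_i = \lambda_i \eta + \varepsilon_i, \qquad
\operatorname{Var}(\eta) = 1, \qquad
\operatorname{Var}(\varepsilon_i) = \theta_i, \qquad i = 1, \dots, k

% Reliability of the composite Y = Y_1 + \dots + Y_k:
% true-score variance divided by total variance
\rho_Y = \frac{\left( \sum_{i=1}^{k} \lambda_i \right)^{2}}
              {\left( \sum_{i=1}^{k} \lambda_i \right)^{2} + \sum_{i=1}^{k} \theta_i}
```

When all loadings are equal, this quantity coincides with coefficient α; with unequal loadings and uncorrelated errors, α falls below it, which is the underestimation property the abstract refers to.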

Reliability of Measurement and Power of Significance Tests Based on Differences
Applied Psychological Measurement - Vol. 17 No. 1 - pp. 1-9 - 1993
Donald W. Zimmerman, Richard H. Williams, Bruno D. Zumbo

The power of significance tests based on difference scores is indirectly influenced by the reliability of the measures from which differences are obtained. Reliability depends on the relative magnitude of true score and error score variance, but statistical power is a function of the absolute magnitude of these components. Explicit power calculations reaffirm the paradox put forward by Overall & Woodward (1975, 1976): that significance tests of differences can be powerful even if the reliability of the difference scores is 0. This anomaly arises because power is a function of observed score variance but is not a function of reliability unless either true score variance or error score variance is constant. Provided that sample size, significance level, directionality, and the alternative hypothesis associated with a significance test remain the same, power always increases when population variance decreases, independently of reliability.

The Individual Consistency of Acquiescence and Extreme Response Style in Self-Report Questionnaires
Applied Psychological Measurement - Vol. 34 No. 2 - pp. 105-121 - 2010
Bert Weijters, Maggie Geuens, Niels Schillewaert

The severity of bias in respondents’ self-reports due to acquiescence response style (ARS) and extreme response style (ERS) depends strongly on how consistent these response styles are over the course of a questionnaire. In the literature, different alternative hypotheses on response style (in)consistency circulate. Therefore, nine alternative models are derived and fitted to secondary and primary data. It is found that response styles are best modeled as a tau-equivalent factor complemented with a time-invariant autoregressive effect. This means that ARS and ERS are largely but not completely consistent over the course of a questionnaire, a finding that has important implications for response style measurement and correction.

FACTOR 9.2
Applied Psychological Measurement - Vol. 37 No. 6 - pp. 497-498 - 2013
Urbano Lorenzo‐Seva, Pere J. Ferrando

The Use of Structural Equation Models in Interpreting Regression Equations Including Suppressor and Enhancer Variables
Applied Psychological Measurement - Vol. 3 No. 1 - pp. 123-135 - 1979
Robert M. McFatter

It is shown that the usual interpretation of "suppressor" effects in a multiple regression equation assumes that the correlations among variables have been generated by a particular structural (causal) model, namely, Conger's (1974) two-factor model. A distinction is drawn between the technical definition of "suppression," which is more fittingly labelled enhancement, and suppression as the appropriate interpretation of a regression equation exhibiting enhancement when that equation has been generated by the two-factor model. It is demonstrated that a number of models can generate enhancement but cannot sensibly be interpreted in terms of the measuring, removing, or suppressing of irrelevant or invalid variance. How a regression equation is interpreted thus depends critically on the structural model deemed appropriate.

Total: 14