Understanding the Rasch Model in Item Response Theory

Explore the key concepts of the Rasch model in item response theory, including local independence, the INFIT test statistic, and solutions for handling local dependency issues. Learn how these concepts impact test reliability and individual ability estimates.

  • Rasch Model
  • Item Response Theory
  • Test Reliability
  • Local Independence
  • INFIT Statistic

Presentation Transcript


  1. Statistics is good if it is good statistics. Norman Verhelst, Zoltán Lukácsi and Martina Hulešová. 18th EALTA conference, Budapest, June 5, 2022.

  2. The two cases. Binary items; Rasch model. A model is a mini-theory: it leads to strong results if it holds. If it does not hold, then ??? The Rasch model is a statistical theory, so tests of the model are statistical. The Rasch model is the null hypothesis; we hope that the model is not rejected.

  3. Case 1: Local independence. In a population where the measured trait is constant, the correlation between any two items is zero. Example where this is at risk: a reading test with several items referring to the same passage. The probability of giving a correct answer to item i is independent of the answer given to any other item j. Example where this is at risk: a matching task.

  4. Example: matching task. Assign the correct country to the capital cities. Countries: Iceland, Portugal, Romania, Bulgaria, Estonia. Cities: Tallinn, Bucharest, Sofia, Lisbon. If one chooses 'Romania' for 'Tallinn', the answer for 'Bucharest' is wrong by necessity.

  5. Effect of ignoring local dependency: the reliability of the test is overestimated, and individual ability estimates are distorted (by up to one SD; Zenisky et al., 2002).

  6. Simple solution: treat all items within the same passage as a single polytomous item, with score = number of correct answers. IRT model: Partial Credit Model (Masters, 1982). No dependency, no loss of information (Verhelst & Verstralen, 2008). Limitations: not too many items per passage; does not work for a single passage. For the discussion: what to do if authenticity requires just one (very) long passage?

  7. Case 2: the INFIT test statistic. Theory: Wright & Masters (1982) admit that they do not master the statistical details needed to satisfactorily explain the null distribution of the infit statistic. Wu (1997): the standardized infit statistic is asymptotically (i.e. with large samples) N(0,1) distributed.

  8. The null distribution: the distribution of the test statistic given that the null hypothesis is true. Synonym: sampling distribution. Derived by mathematical statistics. Examples: F-distribution (analysis of variance), t-distribution, chi-square distribution, (standard) normal distribution. Intuitive meaning: what will happen with the test statistic if the experiment (observations, survey) is repeated independently many times?

  9. The rationale of the statistical test: we reject H0 if the test statistic (in our study) has an extreme value in the null distribution. [Figure: null distribution of the test statistic, horizontal axis from -3 to 3, with critical values marked at -1.96 and 1.96.]

  10. How to check the claim? Simulation study. Choose values for the item parameters and for the ability distribution. Do the following R (R = 1000 or 10,000) times: (a) draw a sample of n artificial students who take the test in compliance with the Rasch model (i.e. they generate a data matrix); (b) estimate the parameters from the data set; (c) compute the test statistic (infit). The R values of the statistic provide an estimate of the null distribution.
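The loop described on this slide might look roughly as follows. This is a minimal sketch under strong assumptions, not the authors' procedure: the estimation step is skipped and the generating parameters are used in its place, the unstandardized infit mean square is computed rather than the standardized statistic, and the item parameters, sample size and number of replications are hypothetical and much smaller than in the presentation. It is therefore not expected to reproduce the figures shown in the slides.

```python
import math
import random
import statistics


def rasch_prob(theta, b):
    """Rasch probability of a correct response for ability theta and item parameter b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def simulate_data(thetas, bs, rng):
    """One data matrix (persons x items) generated in compliance with the Rasch model."""
    return [[1 if rng.random() < rasch_prob(t, b) else 0 for b in bs] for t in thetas]


def infit_mean_squares(data, thetas, bs):
    """Infit mean square per item, using the TRUE parameters (assumption: no estimation)."""
    result = []
    for i, b in enumerate(bs):
        num = den = 0.0
        for row, t in zip(data, thetas):
            p = rasch_prob(t, b)
            num += (row[i] - p) ** 2   # squared residual
            den += p * (1.0 - p)       # model variance of the response
        result.append(num / den)
    return result


def null_distribution(bs, n, replications, seed=1):
    """Repeat the experiment; return one list of simulated infit values per item."""
    rng = random.Random(seed)
    draws = []
    for _ in range(replications):
        thetas = [rng.gauss(0.0, 1.0) for _ in range(n)]  # abilities from N(0,1)
        data = simulate_data(thetas, bs, rng)
        draws.append(infit_mean_squares(data, thetas, bs))
    return [list(column) for column in zip(*draws)]


bs = [-1.0, 0.0, 1.0]  # hypothetical item parameters
dist = null_distribution(bs, n=200, replications=200)
for b, values in zip(bs, dist):
    print(b, round(statistics.mean(values), 3), round(statistics.stdev(values), 3))
```

The empirical quantiles of each list in `dist` play the role of the simulated null distribution; a faithful replication would add parameter estimation and the standardization step before comparing them with N(0,1).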

  11. Item parameters and distribution of θ: θ ~ N(0,1), n = 1000, k = 31. [Figure: histogram of the item parameters in study 1; horizontal axis from -3.0 to 3.0, vertical axis from 0 to 4.]

  12. Two null distributions. [Figure: the claimed and the true (simulated) null distribution of the test statistic, horizontal axis from -3 to 3.] [Figure: the same two distributions as cumulative percentages (0 to 100).]

  13. Two null distributions (cumulative). [Figure: cumulative percentage (0 to 100) against the test statistic (-3 to 3); the values 0.51 and 0.67 are marked.] Some quantiles:

                          p5     p10    p25    p50    p75    p90    p95
      claimed [N(0,1)]   -1.64  -1.28  -0.67   0.00   0.67   1.28   1.64
      simulated          -1.23  -1.01  -0.51   0.00   0.51   1.01   1.23
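The claimed row of the quantile table can be checked directly against the standard normal distribution, and the simulated row compared with it. The sketch below uses only the Python standard library; the simulated values are copied from the transcript, and the ratio column is an added illustration rather than something reported in the presentation.

```python
from statistics import NormalDist

percent = [5, 10, 25, 50, 75, 90, 95]
# Simulated quantiles as reported in the transcript.
simulated = [-1.23, -1.01, -0.51, 0.00, 0.51, 1.01, 1.23]

# Quantiles of the claimed null distribution N(0,1).
claimed = [NormalDist().inv_cdf(p / 100) for p in percent]

for p, c, s in zip(percent, claimed, simulated):
    # The ratio is undefined at the median, where both quantiles are zero.
    ratio = s / c if abs(c) > 1e-9 else float("nan")
    print(f"p{p:<3} claimed {c:6.2f}  simulated {s:6.2f}  ratio {ratio:5.2f}")
```

A ratio well below 1 at every percentile means that the simulated distribution is narrower than N(0,1), which is what a Q-Q plot shows as points lying closer to zero than the line of equality.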

  14. Q-Q plot of the quantiles from the previous slide. [Figure: quantiles of the simulated null distribution (vertical axis) against those of the claimed null distribution N(0,1) (horizontal axis), both from -2.5 to 2.5, with the line of equality; the points P5, P10, P90 and P95 are marked.]

  15. The Q-Q plot for b = 0, n = 1000. [Figure: simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality; the point (2.05, 1.81) is marked.]

  16. n = 1000. [Figure: Q-Q plot of simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality and curves labelled -3, -2, -1 and 0; the points (-2.05, -0.61) and (2.05, 0.70) are marked.] 96% of the probability mass in the true null distribution is contained in the interval (-0.61, 0.70) instead of (-2.05, 2.05).

  17. n = 1000. [Figure, two panels: Q-Q plots of simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality. Panel "hard items": curves labelled -3, -2, -1 and 0, with the points (-2.05, -0.61) and (2.05, 0.70) marked. Panel "easy items": curves labelled +3, +2, +1 and 0.]

  18. 96% of the probability mass in the true null distribution is contained in the interval (-0.61, 0.70) instead of (-2.05, 2.05). If H0 is true, a test based on the claimed N(0,1) null distribution will reach significance far less often (almost never) than the claimed ('nominal') rate. The power of the test, i.e. the probability of discovering non-fitting items, will go down: self-deception. The real significance level will be (much) smaller than the nominal one.
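The two intervals on this slide can be put side by side with a short calculation. The sketch below only evaluates the standard normal distribution function at the endpoints quoted in the transcript; it makes no assumption about the shape of the true null distribution.

```python
from statistics import NormalDist

z = NormalDist()  # the claimed null distribution N(0,1)

# Mass that N(0,1) places in (-2.05, 2.05): the nominal 96% interval.
nominal_coverage = z.cdf(2.05) - z.cdf(-2.05)

# Mass that N(0,1) places in (-0.61, 0.70), the interval that according to
# the slide holds 96% of the true (simulated) null distribution.
lower, upper = -0.61, 0.70
claimed_mass = z.cdf(upper) - z.cdf(lower)

print(round(nominal_coverage, 3), round(claimed_mass, 3))
```

The gap between the two numbers gives a sense of how far the claimed distribution overstates the spread of the statistic for this item: under N(0,1) only about half of the mass falls in the interval that, per the simulation, actually contains 96% of it.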

  19. Conclusion (1). For locally dependent items: consider them as polytomous items (partial credit) and use the partial credit model.

  20. Conclusion (2). The infit measure has no good statistical foundation. Its null distribution is not N(0,1); in fact, it is different for different items. Its use has no statistical value. Advice: don't use statistics, or use good statistics.

  21. References
      Lee, Y.-W. (2004). Examining passage-related local item dependence (LID) and measurement construct using Q3 statistics in an EFL reading comprehension test. Language Testing, 21(1), 74-100.
      Masters, G.N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
      Verhelst, N.D. & Glas, C.A.W. (1995). The one parameter logistic model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch Models (pp. 215-237). New York: Springer-Verlag.
      Verhelst, N.D. & Verstralen, H.H.F.M. (2008). Some considerations on the partial credit model. Psicológica, 29, 229-254.
      Wright, B.D. & Masters, G.N. (1982). Rating Scale Analysis: Rasch Measurement. Chicago: Mesa Press.
      Wu, M.L. (1997). The Development and Application of a Fit Test for Use with Marginal Maximum Likelihood Estimation and Generalised Item Response Models. Unpublished master's dissertation, University of Melbourne.
      Zenisky, A.L., Hambleton, R.K. & Sireci, S.G. (2002). Identification and Evaluation of Local Item Dependencies in the Medical College Admissions Test. Journal of Educational Measurement, 39, 291-309.

  22. [Figure, two panels: Q-Q plots of simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality. Panel n = 1000: curves labelled -3, -2, -1 and 0, with the points (-2.05, -0.61) and (2.05, 0.70) marked. Panel n = 10,000: curves for b = -3, b = -2, b = -1 and b = 0.]

  23. OPLM (Verhelst & Glas, 1995). [Figure: Q-Q plot of simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality and curves labelled 0.0, -1.0, -2.0 and -3.0.]

  24. [Figure, two panels: Q-Q plots of simulated quantiles against standard normal quantiles, both axes from -2.5 to 2.5, with the line of equality and curves labelled -3, -2, -1 and 0. Panel 1: ConQuest Infit. Panel 2: Facets Infit.]

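  25. Infit and outfit mean squares (Wright & Masters, 1982). Outfit: not weighted. Infit: weighted. Here x_vi is the response of person v (v = 1, ..., n) to item i and \pi_vi is its probability under the model.

      \[ MS_{out} = \frac{1}{n} \sum_{v=1}^{n} z_{vi}^{2} = \frac{1}{n} \sum_{v=1}^{n} \frac{(x_{vi} - \pi_{vi})^{2}}{\pi_{vi}(1 - \pi_{vi})} \]

      \[ MS_{in} = \frac{\sum_{v=1}^{n} (x_{vi} - \pi_{vi})^{2}}{\sum_{v=1}^{n} \pi_{vi}(1 - \pi_{vi})} \]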
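For a single binary item, the two mean squares can be computed as in the following sketch. It assumes that the person abilities and the item parameter are given (in practice they are estimates), and the function names and toy data are hypothetical.

```python
import math


def rasch_prob(theta, b):
    """Rasch probability of a correct response for ability theta and item parameter b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def fit_mean_squares(x, thetas, b):
    """Return (outfit, infit) mean squares for one item.

    x:      0/1 responses of the n persons to the item.
    thetas: abilities of the same persons (assumed known here).
    b:      item parameter (assumed known here).
    """
    n = len(x)
    sq_resid = []   # (x_vi - pi_vi)^2
    variances = []  # pi_vi * (1 - pi_vi)
    for xv, t in zip(x, thetas):
        p = rasch_prob(t, b)
        sq_resid.append((xv - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of the squared standardized residuals.
    outfit = sum(r / w for r, w in zip(sq_resid, variances)) / n
    # Infit: sum of squared residuals divided by the sum of the variances.
    infit = sum(sq_resid) / sum(variances)
    return outfit, infit


# Toy data: four persons, one item with b = 0.
thetas = [-1.0, 0.0, 0.0, 1.0]
x = [0, 0, 1, 1]
outfit, infit = fit_mean_squares(x, thetas, b=0.0)
print(round(outfit, 3), round(infit, 3))
```

When all persons have the same probability of success, the weights are equal and the two statistics coincide; they differ most when responses from persons far from the item's location carry large standardized residuals.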

  26. The Wilson-Hilferty transformation (1931). If X \sim \chi^2_{df}, then (X/df)^{1/3} is approximately normal with

      \[ \text{mean: } 1 - \frac{2}{9\,df}, \qquad \text{SD: } \sqrt{\frac{2}{9\,df}} \]

      'Approximately' means very accurate in this case.
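The accuracy of the transformation can be checked informally by simulation. The sketch below draws chi-square variates through the gamma distribution (a chi-square variable with df degrees of freedom is a gamma variable with shape df/2 and scale 2), applies the transformation and standardizes with the mean and SD given above. The choice of df = 10, the seed and the number of draws are arbitrary.

```python
import math
import random
import statistics


def wilson_hilferty_z(x, df):
    """Standardize a chi-square value x with df degrees of freedom (Wilson-Hilferty)."""
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    return ((x / df) ** (1.0 / 3.0) - mean) / sd


rng = random.Random(2022)
df = 10

# Chi-square(df) draws via the gamma distribution: shape df/2, scale 2.
draws = [rng.gammavariate(df / 2.0, 2.0) for _ in range(20000)]
z = [wilson_hilferty_z(x, df) for x in draws]

# If the approximation is good, z should have mean close to 0 and SD close to 1.
print(round(statistics.mean(z), 3), round(statistics.stdev(z), 3))
```

This checks only the transformation itself for genuine chi-square variates; whether an infit mean square behaves like a scaled chi-square variable is a separate question, and it is the one the presentation disputes.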
