Stability of item parameter estimation in dichotomous IRT considering the number of participants

Zulfa Safina Ibrahim, Universitas Negeri Yogyakarta, Indonesia
Heri Retnawati, Universitas Negeri Yogyakarta, Indonesia
Alfred Irambona, Burundi University, Burundi
Beatriz Eugenia Orantes Pérez, El Colegio de la frontera sur (ECOSUR), Mexico

Abstract


This research concerns item response theory (IRT), which is used to evaluate the quality of a test, while item parameter estimation is needed to determine the technical properties of each test item. Studying the stability of item parameter estimation determines the minimum sample size required to obtain good item parameter estimates. The purpose of this study is to describe the effect of the number of test takers on the stability of item parameter estimation with the Bayesian expected a posteriori (EAP) method on dichotomous data. This is an exploratory descriptive study with a bootstrap approach using the EAP method. In EAP estimation, the likelihood function is modified to include prior information about the participant's ability (θ). Bootstrap samples were drawn from the original data at ten different sample sizes (100, 150, 250, 300, 500, 700, 1,000, 1,500, 2,000, and 2,500), each replicated ten times, and item parameters were estimated for every sample. The Root Mean Squared Difference (RMSD) was then calculated for each sample size across its ten replications. The results showed that the 2PL model was the best-fitting model. The RMSD values show that the number of test takers affects the stability of item parameter estimation on dichotomous data under the 2PL model. The minimum sample size that ensures stable item parameter estimates with the 2PL model is 1,000 test takers.
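
As a rough illustration of the bootstrap-and-RMSD workflow summarized above, the Python sketch below resamples a hypothetical persons-by-items matrix of 0/1 responses at the ten sample sizes reported in the abstract, refits item parameters on each replicate, and compares them with the full-sample estimates. The function fit_2pl is only a crude classical stand-in (difficulty from the inverse normal of the proportion incorrect, discrimination from the point-biserial correlation); it does not reproduce the Bayesian EAP-based 2PL calibration actually used in the study, and the simulated data in the commented example are purely illustrative.

```python
# Minimal sketch, assuming `responses` is a persons x items numpy array of 0/1 scores.
import numpy as np
from scipy.stats import norm


def fit_2pl(responses):
    """Crude classical proxy for 2PL item parameters (NOT the study's EAP
    estimation): difficulty from the inverse normal of the proportion
    incorrect, discrimination from the item-total point-biserial correlation."""
    p = responses.mean(axis=0).clip(0.01, 0.99)           # proportion correct per item
    total = responses.sum(axis=1)                          # raw total score per person
    r = np.array([np.corrcoef(responses[:, j], total)[0, 1]
                  for j in range(responses.shape[1])])
    r = np.clip(np.nan_to_num(r), -0.99, 0.99)
    b = norm.ppf(1 - p)                                    # difficulty proxy
    a = r / np.sqrt(1 - r ** 2)                            # discrimination proxy
    return a, b


def rmsd(est, ref):
    """Root Mean Squared Difference between two parameter vectors."""
    return np.sqrt(np.mean((est - ref) ** 2))


def stability_study(responses,
                    sizes=(100, 150, 250, 300, 500, 700,
                           1000, 1500, 2000, 2500),
                    replications=10, seed=1):
    """Bootstrap persons at each sample size, refit item parameters, and
    return the mean RMSD of (a, b) against the full-sample estimates."""
    rng = np.random.default_rng(seed)
    a_ref, b_ref = fit_2pl(responses)                      # full-sample reference
    results = {}
    for n in sizes:
        a_diff, b_diff = [], []
        for _ in range(replications):
            idx = rng.choice(responses.shape[0], size=n, replace=True)
            a_hat, b_hat = fit_2pl(responses[idx])
            a_diff.append(rmsd(a_hat, a_ref))
            b_diff.append(rmsd(b_hat, b_ref))
        results[n] = (float(np.mean(a_diff)), float(np.mean(b_diff)))
    return results


# Example with simulated 2PL data (hypothetical, for illustration only):
# rng = np.random.default_rng(0)
# theta = rng.normal(size=(2500, 1))
# a_true = rng.uniform(0.8, 2.0, 30)
# b_true = rng.normal(0.0, 1.0, 30)
# prob = 1 / (1 + np.exp(-a_true * (theta - b_true)))
# responses = (rng.random((2500, 30)) < prob).astype(int)
# print(stability_study(responses))
```

In the actual analysis, fit_2pl would wrap a proper 2PL calibration from an IRT package, and the mean RMSD per sample size would be inspected for the point at which it stops improving appreciably, which the study reports at 1,000 test takers.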


Keywords


stability; item parameter estimation; item response theory; EAP; bootstrapping



DOI: https://doi.org/10.21831/reid.v10i1.73055





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




