A comparison of the stability of ability parameter estimation based on the maximum likelihood and Bayesian estimation: A case study of dichotomous scoring test results

Faradila Ilena Putri; Heri Retnawati; Elena Kardanova

doi:10.21831/reid.v11i1.89463

Authors

Faradila Ilena Putri, Universitas Negeri Yogyakarta, Indonesia
Heri Retnawati, Universitas Negeri Yogyakarta, Indonesia, https://orcid.org/0000-0002-1792-5873
Elena Kardanova, National Research University, Higher School of Economics (HSE University), Russian Federation, https://orcid.org/0000-0003-2280-1258

Keywords:

ability estimation, Bayes method, maximum likelihood method, item response theory, dichotomous scoring test

Abstract

This study applies Item Response Theory (IRT) to identify the better method for estimating test takers' abilities on a test of English listening ability. It aims to (1) determine the characteristics of the test instrument measuring English listening ability, (2) determine the effect of test length on the stability of ability estimation using the maximum likelihood (ML) method, (3) determine the effect of test length on the stability of ability estimation using the Bayes method, and (4) compare the stability of ability estimates between the ML and Bayes methods. This is an exploratory descriptive study using a simulation approach. The best-fitting model is selected and used to generate data. The generated data consist of the true ability (θ) and the participants' responses; abilities are then estimated from the responses with the ML and Bayes methods over 10 replications, and the estimates are compared with the true abilities by calculating the mean square error (MSE). The method with the smaller MSE is considered more stable and therefore the better estimation method. The results show that (1) the 2PL model fits best, (2) test length affects the stability of ability estimation under the ML method, which is most stable when the test contains 46 items, (3) test length affects the stability of ability estimation under the Bayes method, which is also most stable when the test contains 46 items, and (4) the Bayes method is better and more accurate for estimating ability.
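
The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The assumptions (none of which come from the article) are: randomly drawn item parameters `a` and `b`, a standard normal ability distribution, 500 simulated test takers, a grid-search ML estimator bounded to [-4, 4], and expected a posteriori (EAP) estimation with a standard normal prior standing in for "the Bayes method". The function names are hypothetical. Only the 2PL model, the 10 replications, the MSE criterion, and the 46-item test length are taken from the abstract.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response; returns shape (len(theta), len(a))."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))

def log_lik_grid(resp, a, b, grid):
    """Log-likelihood of each response pattern at each grid point: (n_persons, n_grid)."""
    p = np.clip(p_2pl(grid, a, b), 1e-12, 1 - 1e-12)   # (n_grid, n_items)
    return resp @ np.log(p).T + (1 - resp) @ np.log(1 - p).T

def estimate_ml(resp, a, b, grid):
    """Grid-search ML estimate; the bounded grid keeps all-0 and all-1 patterns finite."""
    return grid[np.argmax(log_lik_grid(resp, a, b, grid), axis=1)]

def estimate_eap(resp, a, b, grid):
    """EAP estimate: posterior mean over the grid under an assumed N(0, 1) prior."""
    log_post = log_lik_grid(resp, a, b, grid) - 0.5 * grid**2
    w = np.exp(log_post - log_post.max(axis=1, keepdims=True))
    return (w * grid).sum(axis=1) / w.sum(axis=1)

def simulate_mse(n_items, n_persons=500, n_reps=10, seed=0):
    """Average MSE of ML and EAP ability estimates over n_reps replications."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4.0, 4.0, 161)
    a = rng.uniform(0.5, 2.0, n_items)    # assumed discrimination parameters
    b = rng.normal(0.0, 1.0, n_items)     # assumed difficulty parameters
    mse_ml, mse_eap = [], []
    for _ in range(n_reps):
        theta = rng.normal(0.0, 1.0, n_persons)            # true abilities
        resp = (rng.random((n_persons, n_items)) < p_2pl(theta, a, b)).astype(float)
        mse_ml.append(np.mean((estimate_ml(resp, a, b, grid) - theta) ** 2))
        mse_eap.append(np.mean((estimate_eap(resp, a, b, grid) - theta) ** 2))
    return float(np.mean(mse_ml)), float(np.mean(mse_eap))

if __name__ == "__main__":
    for n_items in (10, 20, 46):
        ml, eap = simulate_mse(n_items)
        print(f"{n_items:>2} items: MSE(ML) = {ml:.3f}, MSE(EAP) = {eap:.3f}")
```

Running the script prints the average MSE of both estimators for three illustrative test lengths, which is the kind of comparison the study reports. The numbers depend on the assumed item parameters and should not be read as a reproduction of the article's results.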
