A comparison of the stability of ability parameter estimation based on the maximum likelihood and Bayesian estimation: A case study of dichotomous scoring test results
DOI:
https://doi.org/10.21831/reid.v11i1.89463
Keywords:
ability estimation, Bayes method, maximum likelihood method, item response theory, dichotomous scoring test
Abstract
This research applies Item Response Theory (IRT) to identify the best method for estimating participants' abilities on a test of English listening ability. The study aims to (1) determine the characteristics of the test instrument measuring English listening ability, (2) determine the effect of test length on the stability of ability estimation using the maximum likelihood (ML) method, (3) determine the effect of test length on the stability of ability estimation using the Bayes method, and (4) compare the stability of the ability estimates obtained with ML and Bayes. This is an exploratory descriptive study using a simulation approach. The best-fitting model is selected and used to generate data, namely the true ability (θ) and the participants' responses. Ability is then estimated from the generated responses with the ML and Bayes methods over 10 replications, and the two methods are compared by calculating the mean square error (MSE). The method with the smaller MSE is considered the more stable and the better estimation method. The results show that (1) the 2PL model fits best, (2) test length affects the stability of the ability estimate under the ML method, which is most stable when the test contains 46 items, (3) test length affects the stability of the ability estimate under the Bayes method, which is also most stable when the test contains 46 items, and (4) the Bayes method is better and more accurate for estimating ability.
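The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the item parameters are drawn at random rather than taken from the English listening test, the Bayes estimator is assumed to be the expected a posteriori (EAP) estimate under a standard normal prior, ML is approximated by a grid search over θ in [-4, 4], and the test lengths other than 46 are chosen only for illustration. The function names (`p_2pl`, `estimate_ml`, `estimate_eap`, `simulate_mse`) are hypothetical.

```python
import numpy as np


def p_2pl(theta, a, b):
    """2PL probability of a correct answer, shape (len(theta), len(a))."""
    return 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))


def log_lik_grid(responses, a, b, grid):
    """Log-likelihood of each response pattern at each grid point.

    Returns an array of shape (n_persons, n_grid).
    """
    p = np.clip(p_2pl(grid, a, b), 1e-12, 1 - 1e-12)  # (n_grid, n_items)
    return responses @ np.log(p).T + (1 - responses) @ np.log(1 - p).T


def estimate_ml(responses, a, b, grid):
    """Grid-search ML estimate: the grid point with the highest likelihood."""
    return grid[np.argmax(log_lik_grid(responses, a, b, grid), axis=1)]


def estimate_eap(responses, a, b, grid):
    """EAP estimate under an assumed standard normal prior on theta."""
    log_post = log_lik_grid(responses, a, b, grid) - 0.5 * grid**2
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(log_post)
    return (weights * grid).sum(axis=1) / weights.sum(axis=1)


def simulate_mse(n_items, n_persons=500, n_reps=10, seed=0):
    """Average MSE of ML and EAP ability estimates over n_reps replications."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4.0, 4.0, 161)
    mse = {"ML": [], "EAP": []}
    for _ in range(n_reps):
        # Hypothetical item parameters; the study uses the calibrated test.
        a = rng.uniform(0.5, 2.0, n_items)   # discrimination
        b = rng.normal(0.0, 1.0, n_items)    # difficulty
        theta = rng.normal(0.0, 1.0, n_persons)  # true abilities
        u = rng.random((n_persons, n_items))
        responses = (u < p_2pl(theta, a, b)).astype(float)
        mse["ML"].append(np.mean((estimate_ml(responses, a, b, grid) - theta) ** 2))
        mse["EAP"].append(np.mean((estimate_eap(responses, a, b, grid) - theta) ** 2))
    return {method: float(np.mean(values)) for method, values in mse.items()}


if __name__ == "__main__":
    for n_items in (10, 20, 46):  # illustrative lengths; 46 is from the abstract
        print(n_items, simulate_mse(n_items))
```

Bounding the grid at ±4 keeps the ML estimate finite for all-correct or all-incorrect response patterns, for which the unbounded ML estimate does not exist. This choice affects the ML error on short tests and is one of several reasonable conventions.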
License
Copyright (c) 2025 REID (Research and Evaluation in Education)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who submit a manuscript to this journal agree that, if it is accepted for publication, copyright of the submission shall be assigned to REID (Research and Evaluation in Education). Although the journal asks for a copyright transfer, the authors retain (or are granted back) significant scholarly rights.