Individual ability on a high-stakes test: Choosing the cumulative score or the Rasch model for scoring

Muhammad Dhiyaul Khair, Brawijaya University, Indonesia
Sukaesi Marianti, Brawijaya University, Indonesia

Abstract


In a test, a method is required to estimate an individual's ability from their responses. Typically, this is done by summing the correct responses, i.e., computing a cumulative score. An alternative method is the Rasch model. The objective of this study is to determine whether an individual's position based on cumulative-score estimates remains unchanged or changes when compared with Rasch ability estimates on dichotomous responses. The study uses publicly available data from the 2018 Programme for International Student Assessment (PISA), conducted by the Organisation for Economic Co-operation and Development (OECD), and involves 317 Indonesian students. Ability was analyzed in the Mathematics and Reading domains using the cumulative score and the Rasch model with dichotomous responses. Data were analyzed using Rasch estimation, paired samples t-tests, and descriptive statistics: the cumulative-score and Rasch results were compared with paired samples t-tests, and the two sets of estimates were further compared descriptively. The results indicate that individual positions differ between cumulative-score and Rasch ability estimates. These differences stem from variation in response patterns, so even two individuals with the same cumulative score may receive different Rasch estimates.
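To make the comparison concrete, the following is a minimal Python sketch, assuming simulated dichotomous responses rather than the actual PISA data: it computes cumulative scores, estimates Rasch person abilities with a simple joint maximum likelihood loop, compares the two with a paired samples t-test, and then shows how persons with the same cumulative score can receive different Rasch estimates when, as in PISA's rotated booklet design, they answered different items. The sample size, missingness rate, rescaling step, and estimation settings are illustrative assumptions, not the authors' actual analysis pipeline.

```python
# A minimal sketch (illustrative, not the study's pipeline): compare
# cumulative scoring with Rasch ability estimation on simulated data.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
n_persons, n_items = 317, 30          # 317 mirrors the study's sample size

# Simulate responses under a Rasch model: P(correct) = logistic(theta - b).
theta_true = rng.normal(size=n_persons)
b_true = rng.normal(size=n_items)
p_true = 1 / (1 + np.exp(-(theta_true[:, None] - b_true)))
X = (rng.random(p_true.shape) < p_true).astype(float)

# Observed-response mask: ~30% missing, loosely mimicking rotated test forms.
M = (rng.random(X.shape) < 0.7).astype(float)

# Cumulative score: the number of correct responses among answered items.
score = (X * M).sum(axis=1)

# Drop all-correct/all-wrong patterns: their Rasch ML estimates are infinite.
keep = (score > 0) & (score < M.sum(axis=1))
X, M, score = X[keep], M[keep], score[keep]

# Joint maximum likelihood for person abilities and item difficulties,
# with damped Newton steps for numerical stability.
theta = np.zeros(X.shape[0])
b = np.zeros(n_items)
for _ in range(100):
    p = 1 / (1 + np.exp(-(theta[:, None] - b)))
    theta += np.clip((M * (X - p)).sum(1) / (M * p * (1 - p)).sum(1), -1, 1)
    p = 1 / (1 + np.exp(-(theta[:, None] - b)))
    b -= np.clip((M * (X - p)).sum(0) / (M * p * (1 - p)).sum(0), -1, 1)
    b -= b.mean()                     # anchor the logit scale at mean difficulty 0

# Paired samples t-test after rescaling both estimates to [0, 1] so they share
# a common scale (this rescaling is an assumption of the sketch, not the study's).
mm = lambda v: (v - v.min()) / (v.max() - v.min())
t_stat, p_val = ttest_rel(mm(score), mm(theta))
print(f"paired t = {t_stat:.2f}, p = {p_val:.3f}")

# Same cumulative score, different Rasch estimate: inspect a few score groups.
for s in np.unique(score)[5:10]:
    group = theta[score == s]
    if group.size > 1:
        print(f"score {s:.0f}: Rasch estimates span {group.min():.2f} to {group.max():.2f}")
```

With complete data on a common item set, the raw score is a sufficient statistic for Rasch ability, so positional differences of the kind reported in the study typically arise when respondents answer different item subsets or have missing responses.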


Keywords


ability; cumulative score; Rasch; dichotomous responses




DOI: https://doi.org/10.21831/pep.v28i1.71661
