Individual ability on high-stakes test: Choosing cumulative score or Rasch for scoring model

Muhammad Dhiyaul Khair, Brawijaya University, Indonesia
Sukaesi Marianti, Brawijaya University, Indonesia


In any test, a method is required to estimate an individual's ability from their responses. Typically, this is done by summing the correct responses into a cumulative (raw) score; an alternative method is the Rasch model. The objective of this study is to determine whether an individual's position based on cumulative-score estimates remains unchanged when compared with Rasch ability estimates of dichotomous responses. The study uses open data from the 2018 Programme for International Student Assessment (PISA) by the Organisation for Economic Co-operation and Development (OECD), covering 317 Indonesian students. Ability was analyzed in the Mathematics and Reading domains using both the cumulative score and the Rasch model on dichotomous responses. Data were analyzed using Rasch modeling, a paired samples t-test, and descriptive statistics: the cumulative-score and Rasch results were compared with the paired samples t-test, and the two sets of estimates were further compared using descriptive statistical analysis. The results indicate that individual positions differ between ability estimates based on the cumulative score and those based on the Rasch model. These differences are caused by variations in response patterns; therefore, even two individuals with the same cumulative score may receive different Rasch estimates.
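To make the contrast concrete, the following is a minimal sketch (not the authors' code) of how a Rasch ability estimate differs from a simple cumulative score: given dichotomous responses and assumed item difficulties, ability is found by maximizing the Rasch likelihood (here via Newton-Raphson), rather than by counting correct answers. The function name `rasch_mle` and the iteration settings are illustrative assumptions.

```python
import math

def rasch_mle(responses, difficulties, iters=50):
    """Maximum-likelihood Rasch ability (theta) for a vector of
    dichotomous responses (1 = correct), given item difficulties."""
    theta = 0.0
    n = len(responses)
    r = sum(responses)
    # Zero and perfect raw scores have no finite MLE; a common
    # heuristic is to nudge the score half a point inward.
    if r == 0 or r == n:
        r = min(max(r, 0.5), n - 0.5)
    for _ in range(iters):
        # Rasch probability of a correct response to each item
        p = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        grad = r - sum(p)                       # d log-likelihood / d theta
        info = sum(pi * (1 - pi) for pi in p)   # Fisher information
        step = grad / info                      # Newton-Raphson update
        theta += step
        if abs(step) < 1e-8:
            break
    return theta
```

For example, `rasch_mle([1, 1, 0, 0], [0.0, 0.0, 0.0, 0.0])` returns an ability of 0 logits (half the equally difficult items correct), while harder items answered correctly pull the estimate upward, which a raw count of correct answers cannot reflect.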


ability; cumulative score; Rasch; dichotomous responses






This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



ISSN 2338-6061 (online) | ISSN 2685-7111 (print)
