Analysis of the quality of test instrument and students’ accounting learning competencies at vocational school

Nur Ichsanuddin Achmad Kurniawan, Department of Educational Research and Evaluation, Graduate School of Universitas Negeri Yogyakarta, Indonesia
Sudji Munadi, Faculty of Engineering, Universitas Negeri Yogyakarta, Indonesia


This study aims to describe: (1) the characteristics of the items in the national examination try-out test for the accounting subject in the 2015/2016 academic year, under both classical test theory and modern test theory; and (2) the classification of students’ mastery of accounting. The study is exploratory. Item characteristics were analyzed using classical and modern test theory, and students’ mastery of accounting was analyzed using descriptive quantitative methods, both based on the test set for the national examination try-out in the 2015/2016 academic year. A total of 414 students took the Package A test. The results show that: (1) under the classical analysis, 11 items (27.5%) fall in the “easy” category, 22 items (55%) in the “medium” category, and 7 items (17.5%) in the “difficult” category, with a total of 19 items (47.5%) categorized as good; under the modern-theory analysis, 34 items (85%) fall in the “good” category; and (2) around 38% of the students have competencies in the medium and low categories. Most students have difficulty answering questions at higher-order thinking levels.
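As a rough illustration of the classical-test-theory step mentioned in the abstract, the sketch below computes a difficulty index for each item (the proportion of students answering it correctly) and sorts items into "easy", "medium", and "difficult". The cut-offs of 0.30 and 0.70, the function names, and the toy score matrix are assumptions made for this example; the abstract does not state which thresholds or software the authors used.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Return the proportion of correct answers per item (difficulty index p).

    `responses` is a students x items matrix of 0/1 scores.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


def classify(p: float, low: float = 0.30, high: float = 0.70) -> str:
    """Label an item by its difficulty index.

    The cut-offs are assumed conventions, not values taken from the study.
    """
    if p > high:
        return "easy"
    if p < low:
        return "difficult"
    return "medium"


# Toy 0/1 score matrix: 5 students x 3 items (made-up data, not the study's).
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

p_values = item_difficulty(scores)
labels = [classify(p) for p in p_values]

for index, (p, label) in enumerate(zip(p_values, labels), start=1):
    print(f"item {index}: p = {p:.2f} ({label})")
```

A higher index means more students answered the item correctly, so the item is easier. A Rasch (modern test theory) analysis would instead estimate item difficulty and student ability on a common scale, which this sketch does not attempt.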



Keywords: test item characteristics; accounting; learning competencies; Rasch model





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Jurnal Penelitian dan Evaluasi Pendidikan
ISSN 2338-6061 (online) | ISSN 2685-7111 (print)