The quality of teacher-made summative tests for Islamic education subject teachers in Palembang, Indonesia

Evy Ratna Kartika Waty, Universitas Sriwijaya, Indonesia
Yanti Karmila Nengsih, Universitas Sriwijaya, Indonesia
Ciptro Handrianto, Universitas Negeri Padang, Indonesia
M. Arinal Rahman, University of Szeged, Hungary


Exams are used to gauge student achievement for graduation, so their quality deserves critical evaluation. This study examined the quality of summative tests (ST) created by senior high school teachers in Palembang, Indonesia, focusing on the Islamic Education subject. The evaluation criteria were item validity, reliability, discrimination index, and distractor effectiveness. The analysis covered 800 answer sheets from 20 teachers. Results indicated that while 20 teachers achieved high reliability, two struggled with poor reliability in terms of distractor effectiveness. Respondent 11 faced challenges, with only 28% valid items and a moderate Cronbach’s alpha. Additionally, the distractor effectiveness and discrimination index were poor. These findings suggest a need for teacher training to strengthen skills in crafting and administering high-quality summative tests, so that summative assessments gauge student achievement for graduation effectively. The research offers insight into the complexities of teacher-created exams and a basis for improving educational assessment practices.
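As an illustration of two of the criteria named in the abstract, the sketch below computes Cronbach's alpha and an upper-lower discrimination index for dichotomously scored items. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the 27% group fraction, and the toy score matrix are all hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one per student; each row holds per-item scores
    (0 = wrong, 1 = right in this toy example).
    """
    k = len(scores[0])  # number of items
    # Sample variance of each item across students.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the students' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def discrimination_index(scores, item, fraction=0.27):
    """Upper-lower discrimination index D = p_upper - p_lower for one item.

    The 27% group fraction is a common convention, assumed here.
    """
    ranked = sorted(scores, key=sum, reverse=True)  # best total score first
    n = max(1, round(len(ranked) * fraction))       # size of each group
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Hypothetical toy data: 5 students x 4 items (not taken from the study).
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(toy_scores)
d_item0 = discrimination_index(toy_scores, item=0)
print(round(alpha, 2), d_item0)
```

With real data, each teacher's answer sheets would form one such matrix, and the two functions would be applied per test and per item respectively.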


Quality; Summative test; Validity; Reliability; Item difficulty level; Item discrimination index; Item distractor effectiveness







Jurnal Cakrawala Pendidikan by Lembaga Pengembangan dan Penjaminan Mutu Pendidikan UNY is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.