Differential item functioning analysis of Arabic language exams across gender, study specialization, and geographic region in senior high schools
DOI: https://doi.org/10.21831/reid.v11i1.85961
Keywords: differential item functioning (DIF), Arabic language assessment, item response theory (IRT), fairness in testing, senior high school education in Indonesia
Abstract
This study examines the fairness of Arabic language assessment instruments used in Muhammadiyah senior high schools by detecting Differential Item Functioning (DIF) in the Final Semester Summative Test (UAS) for 12th-grade students in the Special Region of Yogyakarta during the 2023/2024 academic year. Using a descriptive quantitative design, the research analyzed response data from 1,157 students across 25 schools. Data were collected through documentation of test blueprints, item sheets, answer keys, and student responses. Analysis was performed using the Lord and Generalized Lord methods within the framework of Item Response Theory (IRT), focusing on three demographic variables: gender, study specialization (science vs. social studies), and school region (Yogyakarta City, Sleman, Bantul, and Kulon Progo). The Rasch model was identified as the best-fitting model, satisfying key psychometric assumptions including unidimensionality and parameter invariance. The findings indicate that several items exhibit significant DIF across all examined variables. Eleven items showed gender-based DIF, with the majority favoring male students. Twenty-three items demonstrated DIF by study specialization, and thirty-seven items displayed DIF by school region, with students from Yogyakarta City benefiting the most. These results suggest that the test is not fully equitable and highlight the need for item revision to ensure fairness. The study contributes theoretically to the field of educational measurement and practically to the development of fairer evaluation practices in Islamic and language education settings.
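The Lord method used in the study compares item-parameter estimates between a reference and a focal group; under the Rasch model it reduces to a Wald test on the difference in item difficulties, with one degree of freedom. A minimal sketch of that statistic follows; the difficulty estimates and standard errors are hypothetical illustrations, not values from the study:

```python
def lord_chi_square_rasch(b_ref, se_ref, b_focal, se_focal):
    """Lord's Wald chi-square for one item under the Rasch (1PL) model.

    With only a difficulty parameter per item, the statistic is the squared
    difference in difficulty estimates between the reference and focal
    groups, divided by the sum of their sampling variances (df = 1).
    """
    return (b_ref - b_focal) ** 2 / (se_ref ** 2 + se_focal ** 2)

# Hypothetical calibration results for a single item in two groups:
chi2 = lord_chi_square_rasch(b_ref=0.40, se_ref=0.09, b_focal=0.85, se_focal=0.11)

# Compare against the chi-square critical value (df = 1, alpha = .05).
flagged_for_dif = chi2 > 3.841
```

An item whose statistic exceeds the critical value is flagged as exhibiting DIF between the two groups; in practice, software such as the R packages difR or mirt performs this calibration and comparison over all items at once.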
License
Copyright (c) 2025 REID (Research and Evaluation in Education)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors submitting a manuscript to this journal agree that, if it is accepted for publication, the copyright of the submission shall be assigned to REID (Research and Evaluation in Education). Although the journal requires a copyright transfer, the authors retain (or are granted back) significant scholarly rights.
The copyright transfer agreement form can be downloaded here: [REID Copyright Transfer Agreement Form]
The copyright form should be signed in its original form and sent to the Editorial Office by email to reid.ppsuny@uny.ac.id