Differential item functioning analysis of Arabic language exams across gender, study specialization, and geographic region in senior high schools

Authors

  • Anugrah Arya Bakti, Universitas Negeri Yogyakarta, Indonesia
  • Marzuki Marzuki, Universitas Negeri Yogyakarta, Indonesia
  • Zulfa Safina Ibrahim, Universitas Negeri Yogyakarta, Indonesia
  • Rugaya Tuanaya, Universitas Negeri Yogyakarta, Indonesia
  • Nur Yusra binti Yacob, Universiti Teknologi MARA, Shah Alam, Malaysia

DOI:

https://doi.org/10.21831/reid.v11i1.85961

Keywords:

Differential Item Functioning (DIF), Arabic language assessment, Item Response Theory (IRT), fairness in testing, senior high school education in Indonesia

Abstract

This study examines the fairness of the Arabic language assessment instruments used in Muhammadiyah senior high schools by testing for Differential Item Functioning (DIF) in the Final Semester Summative Test (UAS) administered to 12th-grade students in the Special Region of Yogyakarta during the 2023/2024 academic year. Using a descriptive quantitative design, the research analyzed response data from 1,157 students across 25 schools. Data were collected by documenting the test blueprints, item sheets, answer keys, and student responses. DIF was analyzed with the Lord and Generalized Lord methods within the framework of Item Response Theory (IRT), focusing on three demographic variables: gender, study specialization (science vs. social studies), and school region (Yogyakarta City, Sleman, Bantul, and Kulon Progo). The Rasch model was selected because it showed the best fit and satisfied key psychometric assumptions, including unidimensionality and parameter invariance. Significant DIF was found for every variable examined. Eleven items showed gender-based DIF, more of them favoring male students than female students. Twenty-three items showed DIF by study specialization, and thirty-seven items showed DIF by school region, with students from Yogyakarta City benefiting the most. These results suggest that the test is not fully equitable and highlight the need for item revision to ensure fairness. The study contributes theoretically to the field of educational measurement and practically to the development of fairer evaluation practices in Islamic and language education settings.
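
As a rough illustration of the kind of statistic the Lord method relies on, the sketch below computes Lord's chi-square for a single item under the Rasch model, where only the item difficulty is compared between a reference group and a focal group. It is not the study's own code or data: the function name `lord_chi2_rasch` and all numeric values are hypothetical, and the difficulty estimates are assumed to have been placed on a common scale beforehand. The Generalized Lord method extends the same idea to more than two groups (for example, the four school regions) and is not shown here.

```python
import math


def lord_chi2_rasch(b_ref, se_ref, b_foc, se_foc):
    """Lord's chi-square for one item under the Rasch model (1 degree of freedom).

    b_ref, b_foc   : item difficulty estimates in the reference and focal groups,
                     assumed to be already linked to a common scale.
    se_ref, se_foc : standard errors of those difficulty estimates.
    """
    diff = b_ref - b_foc
    # With a single item parameter, the statistic reduces to a squared
    # difference divided by the sum of the two sampling variances.
    stat = diff ** 2 / (se_ref ** 2 + se_foc ** 2)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical difficulty estimates for one item (not taken from the study).
stat, p_value = lord_chi2_rasch(b_ref=0.20, se_ref=0.10, b_foc=0.65, se_foc=0.12)
flagged = p_value < 0.05  # item flagged for DIF at the 5% level
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}, flagged = {flagged}")
```

In practice each item on the test would be checked this way, usually with a correction for multiple comparisons and a step that re-links the scales after removing flagged items; those choices are left open here.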

Published

2025-07-15

How to Cite

Bakti, A. A., Marzuki, M., Ibrahim, Z. S., Tuanaya, R., & binti Yacob, N. Y. (2025). Differential item functioning analysis of Arabic language exams across gender, study specialization, and geographic region in senior high schools. REID (Research and Evaluation in Education), 11(1). https://doi.org/10.21831/reid.v11i1.85961

Section

Articles
