Nonoh Siti Aminah, PMIPA FKIP UNS, Indonesia


Abstract: This study aims to determine: 1) the accuracy of item parameter estimates in test equating using the Item Characteristic Curve (ICC) method, and 2) the sensitivity of the linear methods, consisting of the Tucker-Levine score method and the Levine true score method applied to observed scores, and of the equipercentile methods, consisting of the Braun-Holland linear method and the chained equipercentile equating method. The data are empirical: the responses of junior high school (SMP) students who took the end-of-semester examination (Ulangan Akhir Semester) in Natural Sciences (IPA) in Semester V, the odd semester of the 2009/2010 academic year. The tests were equated using an anchor test design with an external anchor test, which serves as the link between the tests being equated. The anchor consisted of 10 physics items. Test A had 55 items, test B had 55 items, and test C had 50 items. The equating followed the group pattern, so the equated item set contained 140 items: 10 common anchor items, 45 items from test A, 45 items from test B, and 40 items from test C. The results show that: 1) item parameter estimation in test equating using the Item Characteristic Curve (ICC) method yields the formula for the item difficulty index, and 2) the order of sensitivity of the equating methods, from highest to lowest, is the Tucker-Levine method, the Levine method, the Braun-Holland linear method, and the chained equipercentile equating method.

Keywords: test equating, anchor test, external anchor test, RMSD, RMSE
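The item counts reported in the abstract can be checked with a short sketch. It assumes, as the abstract's arithmetic implies, that each form's length includes the 10 shared anchor items. The names `FORM_LENGTHS`, `ANCHOR_ITEMS`, `combined_pool_size`, and `rmsd` are illustrative and do not come from the study. The `rmsd` helper uses the common textbook definition of root mean squared difference; how the study itself applied RMSD is not stated in the abstract.

```python
import math

# Item counts per form as reported in the abstract.
FORM_LENGTHS = {"A": 55, "B": 55, "C": 50}
ANCHOR_ITEMS = 10  # common physics items shared by all forms


def combined_pool_size(form_lengths, anchor_items):
    """Count distinct items when every form contains the same anchor set.

    Assumption: each form length already includes the anchor items.
    """
    unique_per_form = {form: n - anchor_items for form, n in form_lengths.items()}
    total = anchor_items + sum(unique_per_form.values())
    return total, unique_per_form


def rmsd(equated, criterion):
    """Root mean squared difference between two equal-length score lists."""
    if len(equated) != len(criterion) or not equated:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = [(e - c) ** 2 for e, c in zip(equated, criterion)]
    return math.sqrt(sum(squared) / len(squared))


total, unique = combined_pool_size(FORM_LENGTHS, ANCHOR_ITEMS)
print(total, unique)  # 140 {'A': 45, 'B': 45, 'C': 40}
```

The printed total matches the 140 items given in the abstract: 10 anchor items plus 45, 45, and 40 form-specific items.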







DOI: https://doi.org/10.21831/pep.v16i0.1107



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Jurnal Penelitian dan Evaluasi Pendidikan. ISSN 2338-6061 (online) || ISSN 2685-7111 (print)