Simulation of low-high method in adaptive testing

Rukli Rukli, Universitas Muhammadiyah Makassar, Indonesia
Noor Azeam Atan, Universiti Teknologi Malaysia, Malaysia

Abstract


The era of disruption has substantially re-engineered the classical testing system into an adaptive testing system in which each test taker takes a unique test. However, adaptive testing systems still face obstacles in the method of presenting the test questions. This study aims to introduce the low-high adaptive tracking method with the item response theory (IRT) approach, in which the difficulty level of the questions is adapted to the test takers' abilities. The question bank contains 400 test questions. Data were analyzed using the Bilog-MG program. The range of both question difficulty and test-taker ability was set to [-3, 3]. The initial ability of each test taker is set flexibly. Test takers' responses follow three patterns: all wrong, all correct, and normal. The results show that the low-high method with the IRT approach matches the pattern of adaptive test question presentation. The low-high method requires about 17 test questions to locate a test taker's ability. Another characteristic of the low-high method is that when a test taker answers three to five consecutive questions all correctly, the ability estimate diverges in the positive direction, and when a test taker answers three to five consecutive questions all incorrectly, the ability estimate converges in the negative direction. Teachers can use the low-high method to design and assemble adaptive tests in schools, both electronically and manually.
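The low-high item-selection loop described above can be illustrated with a short simulation. The following is a minimal sketch, assuming a Rasch (1PL) response model, a fixed 17-question stopping rule, and a halving step-size update; it is an illustrative reconstruction, not the authors' Bilog-MG procedure, and the function names (prob_correct, select_item, run_adaptive_test) are hypothetical.

```python
# Minimal sketch of a low-high-style adaptive test under a Rasch (1PL) model.
# Assumptions (not from the source): halving step-size update, fixed 17-item
# stopping rule, uniformly distributed question difficulties in [-3, 3].
import math
import random

def prob_correct(theta: float, b: float) -> float:
    """Rasch (1PL) model: probability of a correct answer at ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def select_item(bank, used, theta):
    """Pick the unused question whose difficulty is closest to the current estimate."""
    return min((i for i in range(len(bank)) if i not in used),
               key=lambda i: abs(bank[i] - theta))

def run_adaptive_test(true_theta, bank, max_items=17, seed=0):
    """Low-high loop: move the estimate up after a correct answer, down after a wrong one."""
    rng = random.Random(seed)
    theta, step = 0.0, 1.0                    # flexible initial ability; step shrinks over time
    used = set()
    for _ in range(max_items):
        i = select_item(bank, used, theta)
        used.add(i)
        correct = rng.random() < prob_correct(true_theta, bank[i])
        theta += step if correct else -step   # go "high" on correct, "low" on wrong
        theta = max(-3.0, min(3.0, theta))    # keep the estimate inside [-3, 3]
        step = max(step / 2.0, 0.125)         # shrinking step so the estimate settles
    return theta

if __name__ == "__main__":
    rng = random.Random(42)
    bank = [rng.uniform(-3.0, 3.0) for _ in range(400)]   # 400-question bank
    print(f"estimated ability: {run_adaptive_test(1.2, bank):.2f}")
```

In this sketch, a run of all-correct answers drives the estimate toward the upper bound of +3 and a run of all-wrong answers toward -3, mirroring the divergence behavior for the all-correct and all-wrong response patterns noted in the abstract.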


Keywords


low-high method; adaptive testing; item response theory; item difficulty level



DOI: https://doi.org/10.21831/reid.v10i1.66922

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





ISSN 2460-6995 (Online)
