Comparison of methods for detecting anomalous behavior on large-scale computer-based exams based on response time and responses

Deni Hadiana; Bahrul Hayat; Burhanuddin Tola

doi:10.21831/reid.v6i2.31260

Authors

Deni Hadiana, Universitas Negeri Jakarta, Indonesia
Bahrul Hayat, Universitas Islam Negeri Syarif Hidayatullah Jakarta, Indonesia
Burhanuddin Tola, Universitas Negeri Jakarta, Indonesia

DOI:

https://doi.org/10.21831/reid.v6i2.31260

Keywords:

anomalous index (IA), rapid guessing (TC), threshold, reliability, validity

Abstract

This study aims to determine an anomalous index (indeks anomali, IA) that considers both response time and responses, and to compare it with response time effort (RTE), or rapid guessing (tebakan cepat, TC), at various thresholds. The data are the response times and responses of 732 examinees on a natural science test consisting of 40 multiple-choice items with four answer choices each. Response times and responses are analyzed to obtain descriptive statistics and to calculate the TC and IA indices using two threshold methods: the first method (M1) is visual identification, and the second method (M2) is based on the amount of time spent responding to each item in relation to item complexity, as proposed by Nitko. The performance of the IA and TC scores is compared in terms of validity and reliability. The coefficient alpha of the IAM1 score is 0.84 and that of the IAM2 score is 0.82; both values fulfill the reliability requirements for index determination. The IA proposed in this study correlates highly with ERP, which is commonly used to determine the magnitude of solution behavior and rapid guessing. The correlation of IAM1 with TCM1 is 0.86 and that of IAM2 with TCM2 is 0.89, which indicates a strong relationship between IA and TC. Determining the threshold time using three categories of multiple-choice items yields IA and TC distributions that are close to normal, so the resulting indices reflect natural empirical conditions.
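As an illustration of the quantities named in the abstract, the sketch below computes a threshold-based response time effort score and coefficient alpha on toy data. It assumes the common definition of response time effort as the proportion of items answered at or above a per-item time threshold; the function names, the toy response times, and the 5-second thresholds are hypothetical, and the IA itself is not reproduced because its formula is not given here.

```python
import numpy as np

def solution_behavior(rt, thresholds):
    """Flag each response: 1 if response time >= item threshold, else 0 (rapid guess)."""
    rt = np.asarray(rt, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (rt >= thresholds).astype(int)

def response_time_effort(rt, thresholds):
    """Per-examinee proportion of items answered with solution behavior."""
    return solution_behavior(rt, thresholds).mean(axis=1)

def cronbach_alpha(scores):
    """Coefficient alpha for an examinee-by-item score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var_sum = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical toy data: 4 examinees x 3 items, response times in seconds.
rt = np.array([[30, 45, 60],
               [2, 3, 50],
               [25, 4, 40],
               [1, 2, 3]])
thresholds = np.array([5, 5, 5])  # assumed per-item thresholds, not from the study

sb = solution_behavior(rt, thresholds)
rte = response_time_effort(rt, thresholds)
alpha = cronbach_alpha(sb)
print(rte, alpha)
```

In practice the thresholds would come from a method such as M1 or M2 rather than a fixed constant, and the reliability would be computed on the full 732-by-40 indicator matrix.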

