Construction of an instrument for evaluating the teaching process in higher education: Content & construct validity

Risky Setiawan, Universitas Negeri Yogyakarta, Indonesia
Wagiran Wagiran, Universitas Negeri Yogyakarta, Indonesia
Yasir Alsamiri, University of Hail, Saudi Arabia

Abstract


This study aims to establish the content validity, construct validity, and reliability of an instrument for evaluating the teaching process in higher education. This is development research applying Molenda's ADDIE model. The indicators evaluated consist of context, input, process, and product. The sample comprised eight faculties, each represented by three study programs, for a total of 1,200 students. Data analysis proceeded in three stages: a content validity test using Aiken's V method involving six expert panelists; a construct validity test using Confirmatory Factor Analysis (CFA); and quantitative descriptive analysis combined with interpretive qualitative analysis following the Miles & Huberman method. The results show that the developed evaluation instrument has good evidence of content validity, with an average Aiken's V of 0.752, which falls in the high category. The instrument developed at UNY also meets construct validity, with acceptable factor loadings (> 0.3), composite reliability above 0.7, and Cronbach's alpha above 0.6. All empirical test criteria indicate that the data fit the developed model.
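As a rough illustration of the two indices reported above, the following Python sketch computes Aiken's V for one item rated by six hypothetical panelists on a 1-5 scale, and Cronbach's alpha for a small made-up item-score matrix. The function names and all data are our own illustrative assumptions, not taken from the study.

```python
# Illustrative sketch (hypothetical data): Aiken's V and Cronbach's alpha.

def aiken_v(ratings, lo=1, c=5):
    """Aiken's V = sum(r - lo) / (n * (c - 1)), n raters, c rating categories."""
    n = len(ratings)
    s = sum(r - lo for r in ratings)
    return s / (n * (c - 1))

def cronbach_alpha(items):
    """items: one list of respondent scores per instrument item."""
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    k = len(items)
    totals = [sum(col) for col in zip(*items)]
    return k / (k - 1) * (1 - sum(var(i) for i in items) / var(totals))

# Six hypothetical panelist ratings (1-5 scale) for one item:
print(round(aiken_v([4, 4, 5, 3, 4, 4]), 3))  # -> 0.75

# Three hypothetical items scored by four respondents:
print(round(cronbach_alpha([[3, 4, 5, 4], [4, 4, 5, 3], [3, 5, 5, 4]]), 3))
```

On the thresholds the abstract reports, an item with V near 0.75 would fall in the high category, and a scale with alpha above 0.6 would be considered reliable.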

Keywords


teaching; validity; evaluation; construct; content

References


Allen, I. E., & Seaman, J. (2010). Class differences: Online education in the United States, 2010. Babson Survey Research Group.

Aman, A., Setiawan, R., Prasojo, L. D., & Mehta, K. (2021). Evaluation of hybrid learning in college using CIPP model. Jurnal Penelitian Dan Evaluasi Pendidikan, 25(2), 218–231.

Anculle-Arauco, V., Krüger-Malpartida, H., Arevalo-Flores, M., Correa-Cedeño, L., Mass, R., Hoppe, W., & Pedraz-Petrozzi, B. (2024). Content validation using Aiken methodology through expert judgment of the first Spanish version of the Eppendorf Schizophrenia Inventory (ESI) in Peru: A brief qualitative report. Spanish Journal of Psychiatry and Mental Health, 17(2), 110–113. https://doi.org/10.1016/j.rpsm.2022.11.004

Aprilia, N. H. (2021). Perbedaan tingkat disiplin belajar siswa antara yang ikut dengan yang tidak ikut ekstrakurikuler paskibra pada siswa MAN 11 Jakarta Selatan DKI Jakarta. Universitas Muhammadiyah Jakarta.

Arslan, S. S., Demir, N., & Karaduman, A. A. (2020). Turkish version of the Mastication Observation and Evaluation (MOE) instrument: A reliability and validity study in children. Dysphagia, 35(2), 328–333. https://doi.org/10.1007/s00455-019-10035-8

Bliss, D. Z., Gurvich, O. V., Hurlow, J., Cefalu, J. E., Gannon, A., Wilhems, A., Wiltzen, K. R., Gannon, E., Lee, H., Borchert, K., & Trammel, S. H. (2018). Evaluation of validity and reliability of a revised Incontinence-Associated Skin Damage Severity Instrument (IASD.D.2) by 3 groups of nursing staff. Journal of Wound, Ostomy & Continence Nursing, 45(5), 449–455. https://doi.org/10.1097/WON.0000000000000466

Boita, J., Bolejko, A., Zackrisson, S., Wallis, M. G., Ikeda, D. M., Van Ongeval, C., van Engen, R. E., Mackenzie, A., Tingberg, A., Bosmans, H., Pijnappel, R., Sechopoulos, I., & Broeders, M. (2021). Development and content validity evaluation of a candidate instrument to assess image quality in digital mammography: A mixed-method study. European Journal of Radiology, 134, 109464. https://doi.org/10.1016/j.ejrad.2020.109464

Cedzich, C., Geib, T., Grünbaum, F. A., Stahl, C., Velázquez, L., Werner, A. H., & Werner, R. F. (2018). The Topological Classification of One-Dimensional Symmetric Quantum Walks. Annales Henri Poincaré, 19(2), 325–383. https://doi.org/10.1007/s00023-017-0630-x

del Rio, A., Serrano, J., Jimenez, D., Contreras, L. M., & Alvarez, F. (2024). Multisite gaming streaming optimization over virtualized 5G environment using Deep Reinforcement Learning techniques. Computer Networks, 244, 110334. https://doi.org/10.1016/j.comnet.2024.110334

Divayana, D. G. H., Sappaile, B. I., Pujawan, I. G. N., Dibia, I. K., Artaningsih, L., Sundayana, I. M., & Sugiharni, G. A. D. (2017). An evaluation of instructional process of expert system course program by using mobile technology-based CSE-UCLA Model. International Journal of Interactive Mobile Technologies (IJIM), 11(6), 18–31. https://doi.org/10.3991/ijim.v11i6.6697

Divayana, D. G. H., Suyasa, P. W. A., & Adiarta, A. (2020). Content validity determination of the countenance-tri kaya parisudha model evaluation instruments using Lawshe's CVR formula. Journal of Physics: Conference Series, 1516(1), 012047. https://doi.org/10.1088/1742-6596/1516/1/012047

Ebert-May, D., Derting, T. L., Hodder, J., Momsen, J. L., Long, T. M., & Jardeleza, S. E. (2011). What we say is not what we do: Effective evaluation of faculty professional development programs. BioScience, 61(7), 550–558. https://doi.org/10.1525/bio.2011.61.7.9

Faddar, J., Vanhoof, J., & De Maeyer, S. (2017). School self-evaluation instruments and cognitive validity: Do items capture what they intend to? School Effectiveness and School Improvement, 28(4), 608–628. https://doi.org/10.1080/09243453.2017.1360363

Fajardo, Z. I. E., Ramírez, R. A. N., & Álvarez, M. D. G. (2020). Instrumento alternativo para la evaluación del proceso enseñanza- aprendizaje en la educación básica general. PUBLICACIONES, 50(2), 121–132. https://doi.org/10.30827/publicaciones.v50i2.13948

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. SAGE Publication.

Hayden, J., Keegan, M., Kardong-Edgren, S., & Smiley, R. A. (2014). Reliability and validity testing of the Creighton Competency Evaluation Instrument for use in the NCSBN National Simulation Study. Nursing Education Perspectives, 35(4), 244–252. https://doi.org/10.5480/13-1130.1

Huber-Carol, C., Balakrishnan, N., Nikulin, M. S., & Mesbah, M. (2012). Goodness-of-fit tests and model validity. Springer Science & Business Media.

Huda, N., Sukarmin, Y., & Dimyati, D. (2022). Multiple Intelligence-based basketball performance assessment in high school. Jurnal Penelitian Dan Evaluasi Pendidikan, 26(2), 201–216.

Jara, M., & Mellar, H. (2010). Quality enhancement for e-learning courses: The role of student feedback. Computers & Education, 54(3), 709–714. https://doi.org/10.1016/j.compedu.2009.10.016

Javaid, M., Haleem, A., Singh, R. P., & Sinha, A. K. (2024). Digital economy to improve the culture of industry 4.0: A study on features, implementation and challenges. Green Technologies and Sustainability, 2(2), 100083. https://doi.org/10.1016/j.grets.2024.100083

Kirkpatrick, D. L. (1994). Evaluating training programs: The four levels. Berrett-Koehler Publishers.

Kola Olayiwola, R., Tuomi, V., Strid, J., & Nahan-Suomela, R. (2024). Impact of total quality management on cleaning companies in Finland: A focus on organisational performance and customer satisfaction. Cleaner Logistics and Supply Chain, 10, 100139. https://doi.org/10.1016/j.clscn.2024.100139

Li, L., & Dolman, A. J. (2023). On the reliability of composite analysis: An example of wet summers in North China. Atmospheric Research, 292, 106881. https://doi.org/10.1016/j.atmosres.2023.106881

Luo, L., Ai, D., Qiao, H., Peng, C., Sun, C., Qi, Q., Jin, T., Zhou, M., & Xu, X. (2023). Evaluation of systematic frequency shift and uncertainty of an optical clock based on Bayesian hierarchical model. Optics Communications, 545, 129745. https://doi.org/10.1016/j.optcom.2023.129745

Mardapi, D. (2008). Teknik penyusunan instrumen tes dan nontes (A. Setiawan (Ed.)). Parama Publishing.

Marsh, H. W., Nagengast, B., Morin, A. J. S., Parada, R. H., Craven, R. G., & Hamilton, L. R. (2011). Construct validity of the multidimensional structure of bullying and victimization: An application of exploratory structural equation modeling. Journal of Educational Psychology, 103(3), 701–732. https://doi.org/10.1037/a0024122

Maynard, B. R., Solis, M. R., Miller, V. L., & Brendel, K. E. (2017). Mindfulness-based interventions for improving cognition, academic achievement, behavior, and socioemotional functioning of primary and secondary school students. Campbell Systematic Reviews, 13(1), 1–144. https://doi.org/10.4073/csr.2017.5

Mensch, S. M., Echteld, M. A., Evenhuis, H. M., & Rameckers, E. A. A. (2016). Construct validity and responsiveness of Movakic: An instrument for the evaluation of motor abilities in children with severe multiple disabilities. Research in Developmental Disabilities, 59, 194–201. https://doi.org/10.1016/j.ridd.2016.08.012

Mertens, D. M. (2000). Institutionalizing evaluation in the United States of America. In R. Stockmann (Ed.), Evaluationsforschung (pp. 41–56). VS Verlag für Sozialwissenschaften. https://doi.org/10.1007/978-3-322-92229-8_2

Murad, M. H., Chu, H., Wang, Z., & Lin, L. (2024). Hierarchical models that address measurement error are needed to evaluate the correlation between treatment effect and control group event rate. Journal of Clinical Epidemiology, 170, 111327. https://doi.org/10.1016/j.jclinepi.2024.111327

Pan, G., Shankararaman, V., Koh, K., & Gan, S. (2021). Students’ evaluation of teaching in the project-based learning programme: An instrument and a development process. The International Journal of Management Education, 19(2), 100501. https://doi.org/10.1016/j.ijme.2021.100501

Pandiyan, P., Saravanan, S., Usha, K., Kannadasan, R., Alsharif, M. H., & Kim, M.-K. (2023). Technological advancements toward smart energy management in smart cities. Energy Reports, 10, 648–677. https://doi.org/10.1016/j.egyr.2023.07.021

Pate, K., Powers, K., Coffman, M. J., & Morton, S. (2022). Improving self-efficacy of patients with a new ostomy with written education materials: A quality improvement project. Journal of PeriAnesthesia Nursing, 37(5), 620–625. https://doi.org/10.1016/j.jopan.2021.11.020

Qi, S., Liu, L., Kumar, B. S., & Prathik, A. (2022). An English teaching quality evaluation model based on Gaussian process machine learning. Expert Systems, 39(6). https://doi.org/10.1111/exsy.12861

Qomari, R. (1970). Pengembangan instrumen evaluasi domain afektif. INSANIA : Jurnal Pemikiran Alternatif Kependidikan, 13(1), 87–109. https://doi.org/10.24090/insania.v13i1.287

Reitz, O. E. (2014). The job embeddedness instrument: An evaluation of validity and reliability. Geriatric Nursing, 35(5), 351–356. https://doi.org/10.1016/j.gerinurse.2014.04.011

Remijn, L., Speyer, R., Groen, B. E., van Limbeek, J., & Nijhuis-van der Sanden, M. W. G. (2014). Validity and reliability of the Mastication Observation and Evaluation (MOE) instrument. Research in Developmental Disabilities, 35(7), 1551–1561. https://doi.org/10.1016/j.ridd.2014.03.035

Roldán-Merino, J., Farrés-Tarafa, M., Estrada-Masllorens, J. M., Hurtado-Pardos, B., Miguel-Ruiz, D., Nebot-Bergua, C., Insa-Calderon, E., Grané-Mascarell, N., Bande-Julian, D., Falcó-Pergueroles, A. M., Lluch-Canut, M.-T., & Casas, I. (2019). Reliability and validity study of the Spanish adaptation of the “Creighton Simulation Evaluation Instrument (C-SEI).” Nurse Education in Practice, 35, 14–20. https://doi.org/10.1016/j.nepr.2018.12.007

Saaty, T. L. (2007). The analytic hierarchy and analytic network measurement processes: Applications to decisions under Risk. European Journal of Pure and Applied Mathematics, 1(1), 122–196. https://doi.org/10.29020/nybg.ejpam.v1i1.6

Saito, T., Izawa, K. P., Omori, Y., & Watanabe, S. (2016). Functional independence and difficulty scale: Instrument development and validity evaluation. Geriatrics & Gerontology International, 16(10), 1127–1137. https://doi.org/10.1111/ggi.12605

Sánchez, D., Chala, A., Alvarez, A., Payan, C., Mendoza, T., Cleeland, C., & Sanabria, A. (2016). Psychometric validation of the M. D. Anderson Symptom Inventory–Head and neck module in the Spanish Language. Journal of Pain and Symptom Management, 51(6), 1055–1061. https://doi.org/10.1016/j.jpainsymman.2015.12.320

Santoso, S. (2017). Menguasai statistik dengan SPSS 24. Elex Media Komputindo.

Setiawan, R. (2019). A comparison of score equating conducted using Haebara and Stocking-Lord method for polytomous data in national examination of Indonesia.

Setiawan, R., Hadi, S., & Aman, A. (2024). Psychometric properties of learning environment diagnostics instrument. Journal of Education and Learning (EduLearn), 18(3), 690–698.

Setiawan, R., Mardapi, D., Aman, A., & Budi, U. (2020). Multiple intelligences-based creative curriculum: The best practice. European Journal of Educational Research, 9(2), 611–627. https://doi.org/10.12973/eu-jer.9.2.611

Sihombing, R. U., Naga, D. S., & Rahayu, W. (2019). A Rasch model measurement analysis on science literacy test of Indonesian students: Smart way to improve the learning assessment. Indonesian Journal of Educational Review, 6(1), 44–55.

Sokhanvar, Z., Salehi, K., & Sokhanvar, F. (2021). Advantages of authentic assessment for improving the learning experience and employability skills of higher education students: A systematic literature review. Studies in Educational Evaluation, 70, 101030. https://doi.org/10.1016/j.stueduc.2021.101030

Stufflebeam, D. L., & Shinkfield, A. J. (2012). Systematic evaluation: A self-instructional guide to theory and practice. Springer Science & Business Media.

Walton, M. B., Cowderoy, E., Lascelles, D., & Innes, J. F. (2013). Evaluation of construct and criterion validity for the ‘Liverpool Osteoarthritis in Dogs’ (LOAD) clinical metrology instrument and comparison to two other instruments. PLoS ONE, 8(3), e58125. https://doi.org/10.1371/journal.pone.0058125

Ward, D. S., Mazzucca, S., McWilliams, C., & Hales, D. (2015). Use of the environment and policy Evaluation and Observation as a Self-Report Instrument (EPAO-SR) to measure nutrition and physical activity environments in child care settings: validity and reliability evidence. International Journal of Behavioral Nutrition and Physical Activity, 12(1), 124. https://doi.org/10.1186/s12966-015-0287-0

Widyoko, S. E. P. (2009). Evaluasi program pembelajaran (instructional program evaluation) (pp. 1–16).

Wu, H.-Y., & Lin, H.-Y. (2012). A hybrid approach to develop an analytical model for enhancing the service quality of e-learning. Computers & Education, 58(4), 1318–1338. https://doi.org/10.1016/j.compedu.2011.12.025

Kuo, Y., Belland, B., & Kuo, Y.-T. (n.d.). Learning through blogging: Students' perspectives in collaborative blog-enhanced learning communities.

Yangari, M., & Inga, E. (2021). Educational innovation in the evaluation processes within the flipped and blended learning models. Education Sciences, 11(9), 487. https://doi.org/10.3390/educsci11090487

Zampirolli, F. A., Goya, D., Pimentel, E. P., & Kobayashi, G. (2018). Evaluation process for an introductory programming course using blended learning in engineering education. Computer Applications in Engineering Education, 26(6), 2210–2222. https://doi.org/10.1002/cae.22029




DOI: https://doi.org/10.21831/reid.v10i1.63483





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




ISSN 2460-6995 (Online)
