HOTS checker: Quick reviewing cognitive levels of learning outcomes using large language models

Dwi Soca Baskara; Hardika Hardika; Dio Lingga Purwodani; Nabil Muttaqin

doi:10.21831/jitp.v11i2.67174

Dwi Soca Baskara, Universitas Negeri Malang, Indonesia
Hardika Hardika, Universitas Negeri Malang, Indonesia
Dio Lingga Purwodani, Universitas Negeri Malang, Indonesia
Nabil Muttaqin, Universitas Negeri Malang, Indonesia

Abstract

Efficient and effective tools for assessing learning outcomes are crucial in education, yet identifying the appropriate cognitive level of a learning outcome can be challenging for educators. This study develops a tool that addresses this challenge by combining the strengths of large language models (LLMs) with Bloom's taxonomy. The tool gives educators a streamlined review process and strengthens their ability to assess learning outcomes. The research followed Pressman's prototype development model, whose stages are communication, quick plan, modeling and quick design, construction of the prototype, delivery, and feedback. Expert validation assessed the tool's accuracy, its consistency, and its potential for use by educators in real educational settings. The overall validation score was 76.92%, with the highest scores in the category covering the tool's potential, which indicates that the tool could be valuable in education. The insights from the expert validation will guide future iterations of the tool, aligning them more closely with the goal of enhancing learning outcomes in educational settings.
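
To make "cognitive level of a learning outcome" concrete, the following minimal sketch labels an outcome statement with a level of the revised Bloom's taxonomy and flags higher-order thinking skills (HOTS), taken here as the analyze, evaluate, and create levels by common convention. It is not the tool described in this study, which relies on an LLM; a simple verb lookup stands in for the model. The verb lists and the function names `cognitive_level` and `is_hots` are illustrative assumptions.

```python
from typing import Optional

# Revised Bloom's taxonomy, ordered from lowest to highest cognitive level.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

# Hypothetical, deliberately small verb lists (an assumption, not from the paper).
VERBS_BY_LEVEL = {
    "remember": {"define", "list", "recall", "name"},
    "understand": {"explain", "summarize", "describe", "classify"},
    "apply": {"use", "solve", "demonstrate", "implement"},
    "analyze": {"analyze", "compare", "differentiate", "examine"},
    "evaluate": {"evaluate", "judge", "critique", "justify"},
    "create": {"design", "compose", "construct", "develop"},
}

# Levels conventionally treated as higher-order thinking skills.
HOTS_LEVELS = {"analyze", "evaluate", "create"}


def cognitive_level(outcome: str) -> Optional[str]:
    """Return the highest Bloom level whose verb appears in the outcome, else None."""
    words = {w.strip(".,;:()").lower() for w in outcome.split()}
    found = None
    for level in BLOOM_LEVELS:  # later matches overwrite earlier, lower ones
        if words & VERBS_BY_LEVEL[level]:
            found = level
    return found


def is_hots(outcome: str) -> bool:
    """True when the detected level is analyze, evaluate, or create."""
    return cognitive_level(outcome) in HOTS_LEVELS
```

A verb lookup like this cannot handle context or unlisted verbs, which is the kind of limitation that motivates using an LLM for the review instead.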

Keywords

Learning outcomes; Large language models; Artificial intelligence; Cognitive levels; Bloom’s taxonomy; Learning design

References

Abulhul, Z. (2021). Teaching strategies for enhancing student learning. Journal of Practical Studies in Education, 2(3), 1–4. https://doi.org/10.46809/jpse.v2i3.22

Albatti, H. (2023). A review of intended learning outcomes of English lessons and learning motivation. AWEJ: Arab World English Journal, 14(2), 205–220. https://doi.org/10.24093/awej/vol14no2.15

Alhazmi, A. K., Zafar, H., & Al-Hammadi, F. (2015). Framework for integrating outcome-based assessment in online assessment: Research in progress. 2015 Science and Information Conference (SAI), 217–221. https://doi.org/10.1109/SAI.2015.7237147

Amien, M. S., & Hidayatullah, A. (2023). Assessing students’ metacognitive strategies in e-learning and their role in academic performance. Jurnal Inovasi Teknologi Pendidikan, 10(2), 158–166. https://doi.org/10.21831/jitp.v10i2.60949

Arcas, B. A. y. (2022). Do large language models understand us? Daedalus, 151(2), 183–197. https://www.jstor.org/stable/48662035

Bowman, R. F. (2022). Cornerstones of productive teaching and learning. The Clearing House: A Journal of Educational Strategies, Issues and Ideas, 95(2), 57–63. https://doi.org/10.1080/00098655.2022.2033671

Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, O., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D., & Buttery, P. (2023). On the application of large language models for language teaching and assessment technology. AIED2023 Workshop: Empowering Education with LLMs - the Next-Gen Interface and Content Generation, 1–25. https://doi.org/10.48550/arXiv.2307.08393

Chang, E., Demberg, V., & Marin, A. (2021). Jointly improving language understanding and generation with quality-weighted weak supervision of automatic labeling. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 818–829. https://doi.org/10.18653/v1/2021.eacl-main.69

Charoensap, K., & Saeheaw, T. (2022). Customer experiences identification process using Bloom taxonomy and customer knowledge management. 2022 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), 122–127. https://doi.org/10.1109/ECTIDAMTNCON53731.2022.9720405

Chen, S., Gao, S., & He, J. (2023). Evaluating factual consistency of summaries with large language models. Computation and Language, 1–12. https://doi.org/10.48550/arXiv.2305.14069

Crichton, S., & Kinsel, E. (2003). Learning plans as support for the development of learner identity: A case study in rural Western Canada. Journal of Adult and Continuing Education, 8(2), 213–226. https://doi.org/10.7227/JACE.8.2.7

Elkins, S., Kochmar, E., Cheung, J. C. K., & Serban, I. (2023). How useful are educational questions generated by large language models? AIED Late Breaking Results 2023, 536–542. https://doi.org/10.48550/arXiv.2304.06638

Gilbert, H., Sandborn, M., Schmidt, D. C., Spencer-Smith, J., & White, J. (2023). Semantic compression with large language models. 2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS), 1–8. https://doi.org/10.1109/SNAMS60348.2023.10375400

Goel, N., Deshmukh, K., Patel, B. C., & Chacko, S. (2021). Tools and rubrics for assessment of learning outcomes. In Assessment Tools for Mapping Learning Outcomes With Learning Objectives (pp. 1–44). IGI Global. https://doi.org/10.4018/978-1-7998-4784-7.ch013

Goštautaitė, D. (2019). Principal component analysis and Bloom taxonomy to personalise learning. Proceedings of EDULEARN19 Conference, 2910–2920. https://doi.org/10.21125/EDULEARN.2019.0780

Harden, R. M. (2002). Learning outcomes and instructional objectives: Is there a difference? Medical Teacher, 24(2), 151–155. https://doi.org/10.1080/0142159022020687

Hyder, I., & Bhamani, S. (2016). Bloom’s taxonomy (cognitive domain) in higher education settings: Reflection brief. Journal of Education and Educational Development, 3(2), 288–300. https://jmsnew.iobmresearch.com/index.php/joeed/article/view/198

Koh, J. H. L. (2022). Designing for designerly ways of knowing: Creating learning design futures in higher education. In Design Praxiology and Phenomenology. Springer. https://doi.org/10.1007/978-981-19-2806-2_11

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 41(4), 212–218. https://doi.org/10.1207/s15430421tip4104_2

Ma, Z., Dou, Z., Zhu, Y., Zhong, H., & Wen, J.-R. (2021). One chatbot per person: Creating personalized chatbots based on implicit user profiles. SIGIR, 1–10. https://doi.org/10.1145/3404835.3462828

Maxwell, G. S. (2021). Defining and assessing desired learning outcomes. In The Enabling Power of Assessment (pp. 1–399). Springer. https://doi.org/10.1007/978-3-030-63539-8_3

Mehany, M. S. H. M., & Gebken, R. (2021). Assessing the importance and cognition level of ACCE’s student learning outcomes: Industry, educator, and student perceptions. International Journal of Construction Education and Research, 17(4), 333–351. https://doi.org/10.1080/15578771.2020.1777487

Mulcare, D. M., & Shwedel, A. (2017). Transforming Bloom’s taxonomy into classroom practice: A practical yet comprehensive approach to promote critical reading and student participation. Journal of Political Science Education, 13(2), 121–137. https://doi.org/10.1080/15512169.2016.1211017

Muse, H., Bulathwela, S., & Yilmaz, E. (2023). Pre-training with scientific text improves educational question generation (student abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 16288–16289. https://doi.org/10.1609/aaai.v37i13.27004

Olsen, D. R., Keith, J., & Rosenberg, J. (2019). The quest for increased learning: Systematically aligning and assessing learning outcomes. 2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI), 314–319. https://doi.org/10.1109/IIAI-AAI.2019.00070

Pallagani, V., Muppasani, B., Murugesan, K., Rossi, F., Srivastava, B., Horesh, L., Fabiano, F., & Loreggia, A. (2023). Understanding the capabilities of large language models for automated planning. Computer Science, 1–12. https://doi.org/10.48550/arXiv.2305.16151

Park, H. W., Grover, I., Spaulding, S., Gomez, L., & Breazeal, C. (2019). A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. Proceedings of the AAAI Conference on Artificial Intelligence, 687–694. https://doi.org/10.1609/aaai.v33i01.3301687

Pressman, R., & Maxim, B. (2020). Software engineering: A practitioner’s approach. McGraw Hill.

Pujawan, I. G. N., Rediani, N. N., Antara, I. G. W. S., Putri, N. N. C. A., & Bayu, G. W. (2022). Revised Bloom taxonomy-oriented learning activities to develop scientific literacy and creative thinking skills. Jurnal Pendidikan IPA Indonesia, 11(1), 47–60. https://doi.org/10.15294/jpii.v11i1.34628

Raj, H., Rosati, D., & Majumdar, S. (2022). Measuring reliability of large language models through semantic consistency. NeurIPS 2022 ML Safety Workshop, 1–7. https://doi.org/10.48550/arXiv.2211.05853

Ramanathan, C. (2022). Technologies for teaching learning process, its evaluations and assessments. In Development of Employability Skills Through Pragmatic Assessment of Student Learning Outcomes (pp. 1–18). IGI Global. https://doi.org/10.4018/978-1-6684-4210-4.ch003

Ramírez, M. D. V., & Gerena, L. (2010). Bilingual education from learner perspectives. In Handbook of Research on Bilingual and Intercultural Education (pp. 1–19). IGI Global. https://doi.org/10.4018/978-1-7998-2588-3.ch017

Ratner, N., Levine, Y., Belinkov, Y., Ram, O., Magar, I., Abend, O., Karpas, E., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2023). Parallel context windows for large language models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 6383–6402. https://doi.org/10.18653/v1/2023.acl-long.352

Rosson, M. B. (2014). Learning by design. In Innovative Practices in Teaching Information Sciences and Technology (pp. 75–83). Springer. https://doi.org/10.1007/978-3-319-03656-4_8

Sahu, P., Cogswell, M., Gong, Y., & Divakaran, A. (2022). Unpacking large language models with conceptual consistency. ICLR, 1–14. https://doi.org/10.48550/arXiv.2209.15093

Sarsa, S., Denny, P., Hellas, A., & Leinonen, J. (2022). Automatic generation of programming exercises and code explanations using large language models. ICER ’22: Proceedings of the 2022 ACM Conference on International Computing Education Research, 27–43. https://doi.org/10.1145/3501385.3543957

Schunk, D. H. (2012). Learning theories: An educational perspective. Pearson.

Sideeg, A. (2016). Bloom’s taxonomy, backward design, and Vygotsky’s zone of proximal development in crafting learning outcomes. International Journal of Linguistics, 8(2), 158–186. https://doi.org/10.5296/ijl.v8i2.9252

Sobral, S. R. (2021). Bloom’s taxonomy to improve teaching-learning in introduction to programming. International Journal of Information and Education Technology, 11(3), 148–153. https://doi.org/10.18178/ijiet.2021.11.3.1504

Stevani, M., & Tarigan, K. E. (2023). Evaluating English textbooks by using Bloom’s taxonomy to analyze reading comprehension question. SALEE: Study of Applied Linguistics and English Education, 4(1), 1–18. https://doi.org/10.35961/salee.v0i0.526

Tamkin, A., Brundage, M., Clark, J., & Ganguli, D. (2021). Understanding the capabilities, limitations, and societal impact of large language models. Computation and Language, 1–8. https://doi.org/10.48550/arXiv.2102.02503

Troitschanskaia, O. Z., Schlax, J., Jitomirski, J., Happ, R., Thees, C. K., Brückner, S., & Pant, H. A. (2019). Ethics and fairness in assessing learning outcomes in higher education. Higher Education Policy, 32(1), 537–556. https://doi.org/10.1057/s41307-019-00149-x

Wei, X., Saab, N., & Admiraal, W. (2021). Assessment of cognitive, behavioral, and affective learning outcomes in massive open online courses: A systematic literature review. Computers & Education, 163(1), 1–24. https://doi.org/10.1016/j.compedu.2020.104097

Whitelock, D., & Rienties, B. (2016). #Design4learning: Designing for the future of higher education. Journal of Interactive Media in Education, 2016(1), 1–3. https://doi.org/10.5334/JIME.417

Xiao, L., & Shan, X. (2023). PatternGPT: A pattern-driven framework for large language model text generation. Computation and Language, 1–14. https://doi.org/10.48550/arXiv.2307.00470

Zamir, S., & Jan, H. (2023). Assessment of papers of English of Sukkur BISE Sindh, Pakistan: An exploration of the reflection of Bloom’s taxonomy. Voyage Journal of Education Studies, 3(1), 220–240. https://doi.org/10.58622/vjes.v3i1.41

Zhang, H. (2021). Transfer training from smaller language model. ArXiv, 1–7.

Zorluoğlu, S. L., & Güven, Ç. (2020). Analysis of 5th grade science learning outcomes and exam questions according to revised Bloom taxonomy. Journal of Educational Issues, 6(1), 58–69. https://doi.org/10.5296/jei.v6i1.16197
