HOTS checker: Quick reviewing cognitive levels of learning outcomes using large language models

Dwi Soca Baskara, Universitas Negeri Malang, Indonesia
Hardika Hardika, Universitas Negeri Malang, Indonesia
Dio Lingga Purwodani, Universitas Negeri Malang, Indonesia
Nabil Muttaqin, Universitas Negeri Malang, Indonesia


Tools for efficient and effective assessment of learning outcomes are crucial in education, yet educators often find it challenging to identify the appropriate cognitive level of a learning outcome. This study develops a tool that addresses this challenge by combining the strengths of large language models (LLMs) with Bloom's taxonomy. The tool benefits educators by streamlining the review process and strengthening their ability to assess learning outcomes. The research followed the prototype development model by Pressman, whose stages are communication, quick plan, modeling and quick design, construction of the prototype, delivery, and feedback. Validation assessed the tool's accuracy, its consistency, and its potential for use by educators in real educational settings. The overall validation score was 76.92%, with the highest scores in the category covering the tool's potential. This result indicates that the tool has potential as a valuable educational aid. The insights gained from the expert validation will guide future iterations of the tool, aligning them more closely with the goal of enhancing learning outcomes in educational settings.


Learning outcomes; Large language models; Artificial intelligence; Cognitive levels; Bloom’s taxonomy; Learning design

Abulhul, Z. (2021). Teaching strategies for enhancing student learning. Journal of Practical Studies in Education, 2(3), 1–4.

Albatti, H. (2023). A review of intended learning outcomes of English lessons and learning motivation. AWEJ: Arab World English Journal, 14(2), 205–220.

Alhazmi, A. K., Zafar, H., & Al-Hammadi, F. (2015). Framework for integrating outcome-based assessment in online assessment: Research in progress. 2015 Science and Information Conference (SAI), 217–221.

Amien, M. S., & Hidayatullah, A. (2023). Assessing students’ metacognitive strategies in e-learning and their role in academic performance. Jurnal Inovasi Teknologi Pendidikan, 10(2), 158–166.

Arcas, B. A. y. (2022). Do large language models understand us? Daedalus, 151(2), 183–197.

Bowman, R. F. (2022). Cornerstones of productive teaching and learning. The Clearing House: A Journal of Educational Strategies, Issues and Ideas, 95(2), 57–63.

Caines, A., Benedetto, L., Taslimipoor, S., Davis, C., Gao, Y., Andersen, O., Yuan, Z., Elliott, M., Moore, R., Bryant, C., Rei, M., Yannakoudakis, H., Mullooly, A., Nicholls, D., & Buttery, P. (2023). On the application of large language models for language teaching and assessment technology. AIED2023 Workshop: Empowering Education with LLMs - the Next-Gen Interface and Content Generation, 1–25.

Chang, E., Demberg, V., & Marin, A. (2021). Jointly improving language understanding and generation with quality-weighted weak supervision of automatic labeling. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 818–829.

Charoensap, K., & Saeheaw, T. (2022). Customer experiences identification process using Bloom taxonomy and customer knowledge management. 2022 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON), 122–127.

Chen, S., Gao, S., & He, J. (2023). Evaluating factual consistency of summaries with large language models. Computation and Language, 1–12.

Crichton, S., & Kinsel, E. (2003). Learning plans as support for the development of learner identity: A case study in rural Western Canada. Journal of Adult and Continuing Education, 8(2), 213–226.

Elkins, S., Kochmar, E., Cheung, J. C. K., & Serban, I. (2023). How useful are educational questions generated by large language models? AIED Late Breaking Results 2023, 536–542.

Gilbert, H., Sandborn, M., Schmidt, D. C., Spencer-Smith, J., & White, J. (2023). Semantic compression with large language models. 2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS), 1–8.

Goel, N., Deshmukh, K., Patel, B. C., & Chacko, S. (2021). Tools and rubrics for assessment of learning outcomes. In Assessment Tools for Mapping Learning Outcomes With Learning Objectives (pp. 1–44). IGI Global.

Goštautaitė, D. (2019). Principal component analysis and Bloom taxonomy to personalise learning. Proceedings of EDULEARN19 Conference, 2910–2920.

Harden, R. M. (2002). Learning outcomes and instructional objectives: Is there a difference? Medical Teacher, 24(2), 151–155.

Hyder, I., & Bhamani, S. (2016). Bloom’s taxonomy (cognitive domain) in higher education settings: Reflection brief. Journal of Education and Educational Development, 3(2), 288–300.

Koh, J. H. L. (2022). Designing for designerly ways of knowing: Creating learning design futures in higher education. In Design Praxiology and Phenomenology. Springer.

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 41(4), 212–218.

Ma, Z., Dou, Z., Zhu, Y., Zhong, H., & Wen, J.-R. (2021). One chatbot per person: Creating personalized chatbots based on implicit user profiles. SIGIR, 1–10.

Maxwell, G. S. (2021). Defining and assessing desired learning outcomes. In The Enabling Power of Assessment (pp. 1–399). Springer.

Mehany, M. S. H. M., & Gebken, R. (2021). Assessing the importance and cognition level of ACCE’s student learning outcomes: Industry, educator, and student perceptions. International Journal of Construction Education and Research, 17(4), 333–351.

Mulcare, D. M., & Shwedel, A. (2017). Transforming Bloom’s taxonomy into classroom practice: A practical yet comprehensive approach to promote critical reading and student participation. Journal of Political Science Education, 13(2), 121–137.

Muse, H., Bulathwela, S., & Yilmaz, E. (2023). Pre-training with scientific text improves educational question generation (student abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 16288–16289.

Olsen, D. R., Keith, J., & Rosenberg, J. (2019). The quest for increased learning: Systematically aligning and assessing learning outcomes. 2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI), 314–319.

Pallagani, V., Muppasani, B., Murugesan, K., Rossi, F., Srivastava, B., Horesh, L., Fabiano, F., & Loreggia, A. (2023). Understanding the capabilities of large language models for automated planning. Computer Science, 1–12.

Park, H. W., Grover, I., Spaulding, S., Gomez, L., & Breazeal, C. (2019). A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. Proceedings of the AAAI Conference on Artificial Intelligence, 687–694.

Pressman, R., & Maxim, B. (2020). Software engineering: A practitioner’s approach. McGraw Hill.

Pujawan, I. G. N., Rediani, N. N., Antara, I. G. W. S., Putri, N. N. C. A., & Bayu, G. W. (2022). Revised Bloom taxonomy-oriented learning activities to develop scientific literacy and creative thinking skills. Jurnal Pendidikan IPA Indonesia, 11(1), 47–60.

Raj, H., Rosati, D., & Majumdar, S. (2022). Measuring reliability of large language models through semantic consistency. NeurIPS 2022 ML Safety Workshop, 1–7.

Ramanathan, C. (2022). Technologies for teaching learning process, its evaluations and assessments. In Development of Employability Skills Through Pragmatic Assessment of Student Learning Outcomes (pp. 1–18). IGI Global.

Ramírez, M. D. V., & Gerena, L. (2010). Bilingual education from learner perspectives. In Handbook of Research on Bilingual and Intercultural Education (pp. 1–19). IGI Global.

Ratner, N., Levine, Y., Belinkov, Y., Ram, O., Magar, I., Abend, O., Karpas, E., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2023). Parallel context windows for large language models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 6383–6402.

Rosson, M. B. (2014). Learning by design. In Innovative Practices in Teaching Information Sciences and Technology (pp. 75–83). Springer.

Sahu, P., Cogswell, M., Gong, Y., & Divakaran, A. (2022). Unpacking large language models with conceptual consistency. ICLR, 1–14.

Sarsa, S., Denny, P., Hellas, A., & Leinonen, J. (2022). Automatic generation of programming exercises and code explanations using large language models. ICER ’22: Proceedings of the 2022 ACM Conference on International Computing Education Research, 27–43.

Schunk, D. H. (2012). Learning theories: An educational perspective. Pearson.

Sideeg, A. (2016). Bloom’s taxonomy, backward design, and Vygotsky’s zone of proximal development in crafting learning outcomes. International Journal of Linguistics, 8(2), 158–186.

Sobral, S. R. (2021). Bloom’s taxonomy to improve teaching-learning in introduction to programming. International Journal of Information and Education Technology, 11(3), 148–153.

Stevani, M., & Tarigan, K. E. (2023). Evaluating English textbooks by using Bloom’s taxonomy to analyze reading comprehension question. SALEE: Study of Applied Linguistics and English Education, 4(1), 1–18.

Tamkin, A., Brundage, M., Clark, J., & Ganguli, D. (2021). Understanding the capabilities, limitations, and societal impact of large language models. Computation and Language, 1–8.

Troitschanskaia, O. Z., Schlax, J., Jitomirski, J., Happ, R., Thees, C. K., Brückner, S., & Pant, H. A. (2019). Ethics and fairness in assessing learning outcomes in higher education. Higher Education Policy, 32(1), 537–556.

Wei, X., Saab, N., & Admiraal, W. (2021). Assessment of cognitive, behavioral, and affective learning outcomes in massive open online courses: A systematic literature review. Computers & Education, 163(1), 1–24.

Whitelock, D., & Rienties, B. (2016). #Design4learning: Designing for the future of higher education. Journal of Interactive Media in Education, 2016(1), 1–3.

Xiao, L., & Shan, X. (2023). PatternGPT: A pattern-driven framework for large language model text generation. Computation and Language, 1–14.

Zamir, S., & Jan, H. (2023). Assessment of papers of English of Sukkur BISE Sindh, Pakistan: An exploration of the reflection of Bloom’s taxonomy. Voyage Journal of Education Studies, 3(1), 220–240.

Zhang, H. (2021). Transfer training from smaller language model. ArXiv, 1–7.

Zorluoğlu, S. L., & Güven, Ç. (2020). Analysis of 5th grade science learning outcomes and exam questions according to revised Bloom taxonomy. Journal of Educational Issues, 6(1), 58–69.



Copyright (c) 2024 Dwi Soca Baskara, Hardika, Dio Lingga Purwodani, Nabil Muttaqin

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
