Large language models are effective for summarizing student feedback
ORAL
Abstract
Student feedback on instruction is important for instructors because it provides a means of understanding students' learning needs, preferences, and challenges. Yet for courses with large enrollments, such as introductory physics courses, reading through the feedback can be time-consuming, a potential barrier that keeps already time-strapped instructors from regularly collecting feedback in their courses. Large language models (LLMs) offer a potential solution because they are effective at summarizing large volumes of text, and their ease of use through chat interfaces means instructors could summarize large amounts of feedback quickly and easily. In this study, we compared four popular LLMs (ChatGPT, Claude, Gemini, and Llama) with the course instructors themselves in summarizing end-of-semester teaching evaluations from three instructors and eight course offerings at a large university in the southeastern United States. In general, we find that the LLMs identify trends in the evaluations similar to those identified by the human summarizers, though the detected themes differ somewhat and some models perform better than others on this task. Our work thus suggests that LLMs are a useful tool for quickly extracting insights from student feedback. We end by providing best practices for instructors interested in using LLMs to summarize student feedback in their courses.
Presenters
- Nicholas Young, University of Georgia

Authors
- Nicholas Young, University of Georgia
- Christopher Overton, University of Georgia
- Ania Majewska, University of Georgia
- Hina Shaikh, University of Georgia
- Nandana Weliweriya, University of Georgia