Assessing introductory physics students with a large open item bank created using GPT-3 and Wolfram Alpha
ORAL
Abstract
Most assessments in introductory physics are based on secure assessment items, which are supposed to be revealed to test takers only at the time of assessment. However, the rise of resource-sharing websites such as Chegg is making it increasingly difficult and expensive to maintain item security.
This pilot study explores a different assessment method based on large open isomorphic item banks. Isomorphic items are created by systematically varying features of a template problem, which can be done efficiently by training the natural language model GPT-3 to write problem text and using Wolfram Alpha to generate solutions. A bank of 45 isomorphic problems was created and opened to students for practice, and on a following exam one problem was randomly chosen from the bank. A second isomorphic problem, not included in the bank, was also given on the same exam. The correct-response rate was 23% on the first problem and 19% on the second, with a correlation coefficient of 0.47 between the two. I will also compare the item response theory properties of those items and the relation between students' practice strategies and their performance. This new assessment scheme has three major potential advantages. First, it significantly reduces the incentive for students to use websites such as Chegg. Second, item banks can be openly shared among instructors, leading to lower creation cost, higher quality, and comparable outcomes. Finally, assessments can be administered asynchronously and more frequently, and can allow multiple attempts.
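As a rough illustration of the generation pipeline described in the abstract, the Python sketch below shows one way a single item variant could be produced: a GPT-3 completion call writes the problem text from a template whose numeric features are varied, and the Wolfram|Alpha Short Answers API returns the answer key. This is not the author's actual code; the model name, template wording, environment-variable names, and the example acceleration query are illustrative assumptions, and the study trained GPT-3 on example problems rather than prompting it zero-shot as done here.

```python
# Minimal sketch: generate one isomorphic item variant and its numeric answer.
# Assumptions: the legacy openai Python client (pre-1.0) with a key in
# OPENAI_API_KEY, and a Wolfram|Alpha Short Answers API app ID in WOLFRAM_APPID.
import os
import random

import openai
import requests

openai.api_key = os.environ["OPENAI_API_KEY"]
WOLFRAM_APPID = os.environ["WOLFRAM_APPID"]

# Template problem whose numeric features are systematically varied.
TEMPLATE = (
    "Rewrite the following physics problem with a new everyday context, keeping "
    "the same quantities and question: A {mass} kg box slides down a frictionless "
    "ramp inclined at {angle} degrees. What is its acceleration along the ramp?"
)


def generate_variant(mass: float, angle: float) -> str:
    """Ask a GPT-3-era completion model to write the problem text (zero-shot here)."""
    response = openai.Completion.create(
        model="text-davinci-002",  # placeholder model choice
        prompt=TEMPLATE.format(mass=mass, angle=angle),
        max_tokens=150,
        temperature=0.7,
    )
    return response["choices"][0]["text"].strip()


def solve_with_wolfram(query: str) -> str:
    """Send a computable query (not the full prose) to the Short Answers API."""
    resp = requests.get(
        "https://api.wolframalpha.com/v1/result",
        params={"appid": WOLFRAM_APPID, "i": query},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.text


if __name__ == "__main__":
    # Vary the template's numeric features to populate the item bank.
    mass = random.choice([2.0, 3.5, 5.0])
    angle = random.choice([20, 30, 40])
    print(generate_variant(mass, angle))
    # Answer key for the template physics: a = g * sin(angle).
    print("Answer key:", solve_with_wolfram(f"9.81 m/s^2 * sin({angle} degrees)"))
```

In practice each generated variant and its Wolfram Alpha answer would be reviewed by the instructor before being added to the open bank, and the exam system would draw one item per student at random from that bank.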
Publication: None
Presenters
-
Zhongzhou Chen
University of Central Florida
Authors
-
Zhongzhou Chen
University of Central Florida