Abstract
Grading and providing personalized feedback on short-answer questions is time-consuming. Professional incentives often push instructors toward multiple-choice assessments instead, reducing opportunities for students to develop critical-thinking skills. Using large language model (LLM) assistance, we augment the productivity of instructors grading short-answer questions in large classes. Through a randomized controlled trial across four undergraduate courses and almost 300 students in 2023/2024, we assess the effectiveness of AI-assisted grading and feedback compared with human grading. Our results demonstrate that AI-assisted grading can mimic what an instructor would do in a small class.
BibTeX Citation
@article{Heinrichetal2025,
author = {Heinrich, Tobias and Baily, Spencer and Chen, Kuan-Wu and DeOliveira, Jack and Park, Sanghoon and Wang, Navida Chun-Han},
journal = {PLOS ONE},
volume = {20},
number = {8},
pages = {e0328041},
title = {{{AI-assisted}} Grading and Personalized Feedback in Large Political Science Classes: {{Results}} from Randomized Controlled Trials},
month = aug,
year = {2025},
doi = {10.1371/journal.pone.0328041}}