Abstract
Grading short-answer questions and providing feedback on them is tremendously time-consuming for instructors of large political science classes. Competing professional obligations may push instructors toward inferior learning tools, such as multiple-choice questions. We study the potential of artificial intelligence-assisted grading using large language models to augment instructors' productivity, so that a large class can function more like a small one. Through a randomized controlled trial across four undergraduate courses and almost 300 students in 2023/2024, we assess the effectiveness of AI-assisted grading and feedback in comparison to purely human grading. Our results demonstrate that AI-assisted grading can largely mimic what an instructor would do in a small class.