AI vs. Real Students: A Wake-Up Call for Educational Integrity
June 28, 2024

A recent study by the University of Reading has revealed that artificial intelligence (AI) can outperform real students in university exams. The study carries significant implications for the integrity of educational assessments and calls for a re-evaluation of current practices in the face of advancing AI technology.
Researchers created 33 fictitious students and used ChatGPT, an AI tool, to generate answers for undergraduate psychology exams. On average, the AI-generated responses scored half a grade higher than those of actual students. Surprisingly, 94% of these AI submissions did not raise any concerns among markers, indicating that the AI-generated content was nearly undetectable.
“This is particularly worrying as AI submissions robustly gained higher grades than real student submissions,” the study stated. The findings suggest that students could cheat using AI and achieve better grades than their peers who do not cheat.
Associate Prof Peter Scarfe and Prof Etienne Roesch, who led the study, emphasized the urgency of addressing this issue. Scarfe noted, “Many institutions have moved away from traditional exams to make assessment more inclusive. Our research shows it is of international importance to understand how AI will affect the integrity of educational assessments. We won’t necessarily go back fully to handwritten exams – but the global education sector will need to evolve in the face of AI.”
The study involved submitting AI-generated answers and essays for first-, second-, and third-year modules without the markers’ knowledge. AI outperformed real students in the first two years, but humans scored better in the third-year exams. This discrepancy suggests that AI currently struggles with more abstract reasoning, which is more prevalent in advanced coursework.
As the largest and most robust blind study of its kind to date, the research carries significant weight. The low detection rate of AI-generated content, coupled with its higher average scores, underscores the need for new strategies in assessment design and academic integrity.
Academics have already voiced concerns about AI’s influence in education. For instance, Glasgow University recently reintroduced in-person exams for one course to mitigate the potential misuse of AI. Additionally, a Guardian report earlier this year found that most undergraduates used AI programs to help with their essays, though only 5% admitted to pasting unedited AI-generated text into their assessments.
These findings serve as a wake-up call for educators worldwide. As AI continues to advance, educational institutions must adapt their assessment methods to ensure fairness and integrity. This may involve a combination of traditional and innovative approaches, such as in-person exams and advanced AI detection tools, to safeguard against academic dishonesty.
Ultimately, the study from the University of Reading highlights a critical issue in modern education. AI’s ability to produce high-quality, undetectable exam answers poses a significant challenge to the integrity of academic assessments. Educators and institutions must take proactive measures to address this issue and ensure that the evolution of education keeps pace with technological advancements.