
By Karen Sloan
June 5 (Reuters) - The latest generation of generative artificial intelligence can ace most law school final exams, a new study has found.
OpenAI’s newest model, called o3, earned grades ranging from A+ to B on eight spring finals given by faculty at the University of Maryland Francis King Carey School of Law, researchers found in a new paper published on SSRN.
Those high grades represent a significant improvement from previous studies done on earlier versions of ChatGPT, also from OpenAI, which scored B's, C's, and even one D when researchers had them take law school finals in 2022 and 2023, according to the paper.
Studies conducted earlier by other researchers had also found that ChatGPT earned “mediocre” grades on law school finals and that although it improved the speed of legal writing, it did not improve the quality. Researchers also have found that AI can pass the bar exam.
However, generative AI looks to be catching up to actual high-performing law students, based on the latest study. Unlike ChatGPT, which immediately generates text in response to a user’s query, o3 is what is known as a reasoning model. This means that it generates tentative answers and multiple approaches to questions after internally evaluating and revising those responses, after which it produces the final text for the user.
The study’s authors — seven law professors from the University of Maryland — graded the final answers from o3 on the same curve they use for their students. The program's answers earned an A+ in Constitutional Law, Professional Responsibility, and Property. Its answers got an A in Income Taxation, and an A- in Criminal Procedure. It scored a B+ in Secured Transactions and Torts, and a B in Administrative Law. The program's answers did well on both multiple choice questions and essays, the study found.
However, there were some limitations on o3's answers. The program’s relatively low grade in administrative law was attributable to the fact that o3 did not know about the 2024 U.S. Supreme Court opinion in Loper Bright Enterprises v. Raimondo, which overturned the Chevron doctrine, a longstanding cornerstone of administrative law. That ruling had come shortly after o3’s knowledge cutoff date.
The o3 program performed worse on one final when given access to the professor’s notes — an unanticipated outcome the researchers attributed to the program being “distracted” by too much text.
OpenAI did not immediately respond to a request for comment on Thursday about the study's findings.
The study’s authors wrote that they are already contemplating an updated experiment to determine how much of a cheating threat AI poses: they would instruct the program to make occasional spelling and grammar mistakes, so that its exam answers would be difficult to distinguish from those completed by real students.
Read more:
ChatGPT passes law school exams despite 'mediocre' performance
AI improves legal writing speed, not quality - study