
Legal AI past the productivity question



May 28, 2026 - 2 min read

The legal profession has moved past the question of whether AI will be used and into the question of what its use leaves behind. The productivity claim is now empirically settled. Schwarcz et al. (2026), in the Journal of Law and Empirical Analysis, ran a randomised controlled trial in which law students completed six legal tasks using a RAG-grounded tool (Vincent AI), a reasoning model (o1-preview), or no AI. They report productivity gains of 38% to 140% on five of the six tasks, with the RAG tool producing roughly the same hallucination rate as no AI. The reasoning model introduced fresh hallucinations even as it improved analytical depth.

Once vendor tools are tested at scale, the reliability picture sharpens. Magesh et al. (2025), in the Journal of Empirical Legal Studies, conducted the first preregistered evaluation of proprietary legal AI tools. Lexis+ AI hallucinated 17% of the time, Westlaw AI-Assisted Research 33%, and GPT-4 43%. The residual errors are often subtle: real cases mischaracterised, inapplicable authorities cited, doctrine paraphrased into something the source does not say.

A second risk is distributional. Cofone & Khern-am-nuai (2025), in the Indiana Law Journal, returned to the COMPAS recidivism dataset with causal inference methods and found that the algorithm does not merely reproduce the racial bias in its training data; it worsens it by roughly 20%. They also show that fairness constraints can correct distortions from flawed outcome variables, inverting the dominant framing in which fairness and accuracy trade off. Whether AI enlarges access to justice rather than concentrating it depends on how it is introduced. Chien & Kim (2025), in the Loyola of Los Angeles Law Review, ran the first field study of 91 legal aid lawyers. Organic uptake skewed male despite a majority-female workforce, and participants assigned to concierge support reported statistically better outcomes than the control group.

The regulatory frame is the piece that lags. Rangone & Megale (2025), in the European Journal of Risk Regulation, argue that the EU AI Act classifies the administration of justice as high-risk but addresses law and rule-making itself only obliquely, distinguishing core, quasi-core, and ancillary applications without binding obligations for the upstream activities where AI is now being introduced. Read across the five studies, the direction is consistent: residual hallucination in vendor-validated tools, bias amplification in predictive systems, uneven diffusion in legal aid, and a regulatory framework that meets the technology one step downstream of where it is being deployed.




Legal AI hallucinations, Algorithmic bias amplification, Access to justice, EU AI Act, Vendor tool reliability