Tuesday, 12 May 2026 / Published in Artificial Intelligence, Law

When AI Writes the Law: The Risks and Limits of Large Language Models in the Justice System

This article is part of the Hardwiring Justice series on Artificial Intelligence and the Justice System. This is Part 6 in the series examining how AI is shaping policing, prosecution, defense practice, and the courts.*

AI-Generated Legal Writing Is Expanding Rapidly

As this series has explained, artificial intelligence (AI) is entering the justice system through many doors, but perhaps the most visible is the rapid adoption of large language models (LLMs).^[1] Judges, lawyers, court staff, and law enforcement agencies are experimenting with systems capable of drafting briefs, summarizing records, generating memos, and even producing proposed judicial orders.^[2] The appeal is obvious: LLMs can process enormous volumes of text and generate readable prose in seconds.^[3] For courts facing overwhelming caseloads and resource constraints, the efficiency gains are real and worth taking seriously.

Understanding what these systems actually do, however, is essential to using them well. Large language models are not reasoning engines.^[4] They are language engines.^[5] Their function is to generate text based on statistical patterns learned from massive collections of written material.^[6] They do not understand facts, weigh evidence, apply legal standards, or exercise judgment.^[7] They predict the next word in a sequence based on statistical probability.^[8] Practitioners who understand that distinction are better positioned to deploy these tools effectively and to catch problems before they reach a filing or an opinion.
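For readers who want to see what "predicting the next word" means in practice, the toy sketch below counts which word follows which in a few invented sentences and then picks the most frequent follower. It is an illustration only: real LLMs use neural networks trained on vastly more text, and every name here (train_bigram_counts, next_word_probabilities, predict_next, the sample corpus) is hypothetical rather than drawn from any actual system.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Convert raw follow-counts into probabilities for the next word."""
    followers = counts.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {w: n / total for w, n in followers.items()}


def predict_next(counts, word):
    """Return the most probable next word, or None for an unseen word."""
    probs = next_word_probabilities(counts, word)
    return max(probs, key=probs.get) if probs else None


# Invented sample text, used only to illustrate the idea.
corpus = (
    "the court granted the motion . "
    "the court denied the motion . "
    "the court granted the petition ."
)
counts = train_bigram_counts(corpus)

# "granted" follows "court" twice and "denied" once, so "granted" wins.
print(predict_next(counts, "court"))
print(next_word_probabilities(counts, "court"))
```

The sketch "predicts" that the court granted something simply because that wording was more common in its sample text. Nothing in it examines a record or applies a legal standard, which is the distinction the paragraph above draws.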

The Risk of Evidentiary Laundering

One risk worth knowing by name is evidentiary laundering.^[9] LLMs can rephrase complex technical or scientific information in ways that strip away the uncertainty embedded in the original material.^[10] Error rates, confidence intervals, methodological limitations, and probabilistic reasoning can disappear during summarization.^[11] What begins as cautious analysis may emerge as confident declarative prose. Evidence law is built around the careful evaluation of reliability, methodology, and limitations. Lawyers and judges who understand this dynamic can build verification habits that catch the problem before it affects how evidence is weighed.
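One simple verification habit can be sketched in code: compare a source passage with its AI-generated summary and list the uncertainty language that appears in the first but not the second. The sketch below is a rough aid under stated assumptions, not a reliable detector. The marker list is hypothetical and far from exhaustive, the sample passages are invented, and a clean result does not mean the summary is faithful.

```python
import re

# Hypothetical, non-exhaustive list of phrases that signal uncertainty.
UNCERTAINTY_MARKERS = [
    "error rate",
    "confidence interval",
    "margin of error",
    "may",
    "approximately",
    "limitation",
    "probability",
]


def contains_marker(marker, text):
    """True if the marker (or its simple plural) appears as a whole phrase."""
    pattern = r"\b" + re.escape(marker) + r"s?\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def dropped_markers(source, summary, markers=UNCERTAINTY_MARKERS):
    """Return markers present in the source but absent from the summary."""
    return [
        m for m in markers
        if contains_marker(m, source) and not contains_marker(m, summary)
    ]


# Invented example text, used only to illustrate the check.
source = (
    "The method has a reported error rate of 4 percent, and the match "
    "may reflect a limitation of the reference database."
)
summary = "The method matched the sample to the defendant."

# Lists the hedges that the summary silently removed.
print(dropped_markers(source, summary))
```

A non-empty result is a prompt to reread the original, not a finding in itself. The human reviewer still has to decide whether the omitted qualification matters to how the evidence should be weighed.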

AI Errors in Judicial Opinions

Recent incidents illustrate what happens when those habits are absent. In July 2025, federal courts in New Jersey and Mississippi withdrew published rulings after lawyers discovered errors traceable to unvetted AI research.^[12] In In re CorMedix Inc. Securities Litigation, the opinion cited authorities for propositions they did not support, misstated case outcomes, and attributed statements to defendants that were never alleged.^[13] Counsel flagged the errors in a letter to the court; the judge withdrew the opinion and its accompanying order within twenty-four hours.^[14]

The Mississippi case unfolded the same week.^[15] Both courts responded by adopting internal policies requiring that any AI-assisted research or drafting be independently verified before appearing in an opinion, a straightforward safeguard that practitioners across the system would do well to adopt.

Sanctions against attorneys who filed AI-generated submissions without verifying citations have followed in multiple jurisdictions, and courts have begun treating unverified AI use as a distinct category of professional misconduct. The practical lesson is clear: AI-assisted drafting requires a verification step that is at least as rigorous as the drafting itself. The underlying stakes go beyond individual liability. The legal system depends on confidence in the reliability of sources, precedents, and legal reasoning. Practitioners who build verification into their AI workflows protect not only their clients but the credibility of the system they work in.

Judicial Policies on AI Use

The judicial context raises additional considerations. When chambers staff rely on AI-generated summaries or legal analysis without careful review, errors can migrate into orders and opinions that carry the authority of the court. Because judicial writing shapes precedent, even small inaccuracies can propagate. Courts that have moved earliest to establish clear internal AI policies, defining permissible uses, requiring verification, and assigning responsibility for review, are best positioned to capture the efficiency benefits while managing the risks.

AI Cannot Replace Legal Judgment

Used with clear eyes, LLMs offer genuine value: faster document review, quicker synthesis of large records, reduced time on routine administrative drafting. The goal is not to avoid these tools but to use them in ways that match their actual capabilities. Large language models generate language.^[16] They do not evaluate truth, measure scientific reliability, or apply evidentiary standards. Workflows built on that understanding tend to work. Workflows that treat LLM output as a substitute for legal judgment tend not to.

The justice system depends on more than well-written text. It depends on reasoning, accountability, and transparency about how decisions are made. LLMs cannot be allowed to decide. Keeping that boundary clear is what allows the efficiency these tools offer to serve justice rather than quietly undermine it.

* This article was edited with the assistance of AI in the form of a large language model. It was used solely for grammar and editing support. All substantive content and conclusions reflect human authorship.

^[1] Adam Allen Bent, Large Language Models: AI’s Legal Revolution, 44 Pace L. Rev. (2023).

^[2] Id.

^[3] Id.

^[4] Gary Marcus, BREAKING: LLM “Reasoning” Continues to Be Deeply Flawed, The Road to AI We Can Trust (Feb. 10, 2026).

^[5] Id.

^[6] Eljas Linna & Tuula Linna, Judicial Requirements for Generative AI in Legal Reasoning (Aug. 2025).

^[7] Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio & Mehrdad Farajtabar, The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity (June 2025).

^[8] Id.

^[9] Mohammad Beigi, Sijia Wang, Ying Shen, Zihao Lin, Adithya Kulkarni, Jianfeng He, Feng Chen, Ming Jin, Jin-Hee Cho, Dawei Zhou, Chang-Tien Lu & Lifu Huang, Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models (Oct. 26, 2024).

^[10] Id.

^[11] Id.

^[12] How AI Misled Two US Courts and the Urgent Case for AI Rules in Judging, THE AI FORUM (Aug. 28, 2025).

^[13] Id.

^[14] Id.

^[15] Id.

^[16] AI Demystified: Introduction to Large Language Models: LLMs for the Non-Technical (AI Simplified Series), Stanford University, University Information Technology (last modified Dec. 13, 2024), https://uit.stanford.edu/service/techtraining/ai-demystified/llm.
