Lexos v. Overstock: Five Lawyers, Zero Verification
The “Human in the Loop” Is a Myth
February 2026
On February 2, Judge Julie A. Robinson of the District of Kansas sanctioned five attorneys for filing a brief containing hallucinated case law. The incident went viral immediately. Not because a lawyer used ChatGPT - we have seen that before - but because of the sheer scale of the failure. Lead counsel, senior partner, managing partner, associate, local counsel; five pairs of eyes all missed the problem.
The truly troubling part of Lexos v. Overstock is not the technology. It is the psychology. It is the systematic collapse of the “human in the loop” theory of verification. Every layer of defense that the legal profession relies on failed in the face of plausible-sounding text.
This is why we spend so much of our time solving this problem. Relying on humans to spot hallucinations buried in 95% valid prose is a gamble.
The Simulation of Judgment
A revealing detail in the court’s opinion is the specific prompt the attorney used. He did not ask ChatGPT to find cases. He didn’t ask it to research the standard for striking expert testimony.
He instructed ChatGPT: “Taking the role of a judge, write an order that denies the motion to strike with caselaw support…”
This is the prompt of a user who fundamentally misunderstands the tool. He did not ask for information retrieval. He asked for a simulation. He asked a prediction engine to generate a specific outcome (“denies the motion”) and the engine obliged by generating the necessary reality to achieve it.
In our own research and development we use AI personas all the time, both to test and to achieve certain outcomes. But asking a generalist tool - like ChatGPT, Claude, or Gemini - with no citator safeguards built in to play a specific role is a recipe for hallucination.
When you ask an LLM to play a role, it plays the role. A judge denying a motion needs authority. If the authority doesn’t exist, the model invents it, because the model’s imperative is to do what it has been asked to do, not to second-guess why you are asking.
The attorney claimed to have written “approximately 95%” of the brief. But that 5% of AI-generated content contained the only things that mattered: the citations, the case quotations, the legal authority.
The Plausibility Trap
The brief cited Liquid Dynamics Corp. v. Vaughan Co. for the proposition that an expert’s incorrect claim construction goes to weight, not admissibility.
This is the most dangerous kind of hallucination. Liquid Dynamics is a real Federal Circuit patent case. But the court actually held the opposite: it affirmed the exclusion of an expert specifically because they relied on an impermissible claim construction.
The AI reversed the holding to satisfy the user’s prompt.
Then there was Hockett v. City of Topeka. The brief cited it for the “drastic remedy” of exclusion. It included a parenthetical quotation. It looked convincing.
But Hockett v. City of Topeka does not exist. The judge noted that she presided over the case number cited—but it was a Social Security appeal with a different name entirely. Yes, this sounds like an awkward interaction with a judge; in fact it’s the kind of thing that would make a lawyer’s blood run cold.
The Failure of Eyes-On Review
The conventional wisdom in legal tech is that AI is safe as long as a human reviews the output. Lexos dismantles this defense.
Five attorneys signed these documents.
The drafting attorney, with thirty years of experience, admitted he did not check the citations. He blamed a “scrambled state of mind” due to a family health crisis.
The associate made “minor edits” but assumed the senior attorney’s work was sound.
Two partners signed without reading the document at all.
But the most critical failure was local counsel.
Local counsel are the final gatekeepers. They sign to give a filing the imprimatur of the court’s bar. This local counsel read the brief in its entirety. He testified that “none of the quotes or citations drew [his] attention, so [he] did not cite check.”
This is a phenomenon we observe constantly. When a document is well-formatted, grammatically correct, and tonally appropriate, the human brain switches off its skepticism. The prose was fluent. The citations looked like citations. The arguments sounded like arguments.
He didn’t check because the document didn’t look like it needed checking. The simulation was too good.
Verification as Infrastructure
The court’s remedy was predictable: it ordered the lawyers to implement “stricter internal review procedures.”
With respect, this solves nothing. Technology helped create this problem, and I firmly believe technology will solve it.
You cannot policy your way out of this issue, or just tell lawyers to “be more careful.” When the cost of generating text drops to zero, and the quality of that text approaches human parity, the cognitive load of verifying it becomes unmanageable for a human reader.
Verification must be infrastructure.

This is the premise of MotionValidator. Instead of asking a human to scan a brief and see if the citations “look right,” we strip the document to its assertions. We dismantle the document using advanced natural language processing methods, and we ensure that what it says accurately reflects the cases it cites. We can do that with confidence because we understand how to use this technology correctly.
- Does Hockett v. City of Topeka exist? A database lookup would have flagged this instantly.
- Does Liquid Dynamics support the proposition that claim construction errors go to weight? Our adversarial use of powerful reasoning models would have flagged that this was an incorrect application of the case.
- Are the quotes real? A simple string match against the cited opinions would have come up empty, flagging the fabricated quotations.
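The first and third checks above are the cheapest to illustrate. Here is a minimal sketch of an existence lookup and a verbatim quote match; the `KNOWN_CASES` table, the `normalize` helper, and all function names are my own illustrative assumptions, not MotionValidator’s actual implementation, and a real system would query a citator database rather than an in-memory dictionary:

```python
import re

# Toy stand-in for a real citator/case-law database lookup.
# The opinion text here is paraphrased for illustration only.
KNOWN_CASES = {
    "Liquid Dynamics Corp. v. Vaughan Co.": (
        "The court affirmed the exclusion of the expert, who relied "
        "on an impermissible claim construction."
    ),
}

def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so cosmetic differences
    (line breaks, smart quotes aside) don't mask a real match."""
    return re.sub(r"\s+", " ", text).strip().lower()

def case_exists(case_name: str) -> bool:
    """Check 1: does the cited case exist at all?"""
    return case_name in KNOWN_CASES

def quote_is_real(case_name: str, quote: str) -> bool:
    """Check 3: does the quoted language actually appear
    in the text of the cited opinion?"""
    opinion = KNOWN_CASES.get(case_name, "")
    return normalize(quote) in normalize(opinion)

# The nonexistent case fails the existence check outright.
print(case_exists("Hockett v. City of Topeka"))  # False

# A fabricated parenthetical fails the string match even against
# a case that really exists.
print(quote_is_real(
    "Liquid Dynamics Corp. v. Vaughan Co.",
    "goes to weight, not admissibility",
))  # False
```

Neither check requires any AI at all, which is the point: the fabrications in Lexos would have been caught by deterministic lookups before a reasoning model ever needed to assess how the case was applied.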
In a world where one tired lawyer using a role-playing prompt can infect a federal court filing with fiction, relying on “eyes-on review” alone is professional negligence.
When a document passes all of MotionValidator’s checks - existence, treatment, application, authority - the system generates a “Citation Verification Report” (CVR): a downloadable PDF documenting exactly what was verified and when. If a brief cites a case, the CVR will demonstrate how your citation matches specific language in the case you cited - this approach of “show don’t tell” increases confidence.
My instinct is that as courts begin requiring disclosure of AI use in document preparation, having timestamped, verifiable proof of citation checking will shift from good practice to necessity.

The Standard Has Shifted
The Lexos opinion is a warning shot. The court sanctioned the lawyers not just for using AI, but for abdicating their duty of inquiry. The judge noted that given the “well-publicized risks” of AI hallucinations, failing to verify is no longer an innocent mistake. It is sanctionable conduct.
The attorney told the court he was a “novice” with AI. But professional rules of conduct in just about every jurisdiction I can think of require lawyers - even if unfamiliar with a tool or concept - to be diligent enough not to mislead the court (intentionally or otherwise).
If you are filing a brief you didn’t write, whether it was drafted by a colleague, opposing counsel, or an LLM, you need to know if the law in it is real. You cannot rely on the fact that it sounds like a lawyer wrote it.
That is the confidence trap. And as five lawyers in Kansas just discovered, the price of falling into it is getting higher.
You can read the full 2 February 2026 opinion in Lexos v. Overstock here.
Learn more about MotionValidator at motionvalidator.com.
Luke Cohen is the CEO of Citational, and the former Head of AI R&D for LegalZoom.