The privilege trap in legal AI, and how we avoid it

12 February 2026

A motion filed on February 6, 2026 in the Southern District of New York should be required reading for every lawyer using commercial AI. It will also, I suspect, be ignored by most of the companies building legal AI tools.

In United States v. Bradley Heppner (1:25-cr-00503-JSR), the U.S. Attorney’s office moved for a ruling that 31 documents the defendant created using Anthropic’s Claude public-facing chatbot are not privileged. The Government’s reasoning is direct and, on the law, almost certainly correct. The defendant used a commercial chatbot to query legal issues related to the government’s investigation of him, then shared those documents with his lawyers and claimed privilege.

The Government’s position is that privilege never attached in the first place. Not on any of the three grounds that might save it.

I have been building for this exact problem since I first got involved with legal tech back in 2017. It can be a headache, certainly, but it is a technical, legal, and intellectual exercise that should be mandatory for anyone working at the nexus of law and tech.

What the SDNY is being asked to decide

The Government advances three independent arguments, each sufficient on its own.

First, Claude is not a lawyer. This sounds obvious, but the implications are precise. Attorney-client privilege requires a communication with an attorney, made for the purpose of obtaining legal advice, in confidence. The Government’s brief is blunt:

“The AI tool is plainly not an attorney… The AI tool that the defendant used has no law degree and is not a member of the bar. It owes no duties of loyalty and confidentiality to its users.”

Anthropic’s own terms of service disclaim any attorney-client relationship. Its design principles include choosing responses that avoid “giving the impression of giving specific legal advice.” The tool itself is built not to be what the defendant needed it to be.

Second, and this is the argument with the widest implications, using a commercial platform constitutes voluntary disclosure to a third party. Anthropic’s privacy policy states that it collects user prompts and may disclose that data to “governmental regulatory authorities.” The moment Heppner typed his queries into Claude, he was sharing them with Anthropic. Privilege does not survive voluntary disclosure to a third party. Full stop.

Third, you cannot make something privileged after the fact by handing it to your lawyer. The Government’s framing is clean:

“The defendant cannot retroactively cloak unprivileged documents with privilege by later transmitting them to counsel… The defendant’s use of the AI tool here is no different than if he had asked friends for their input on his legal situation.”

The work product doctrine fails too. Defense counsel admitted they never directed Heppner to use Claude. He did it on his own initiative. The doctrine protects materials prepared by or at the direction of counsel. Independent research by a layperson, even sophisticated research, does not qualify.

Why this is not just Heppner’s problem

The Heppner facts are dramatic. A CEO indicted for securities fraud, $300 million allegedly stolen, a Dallas mansion searched by the FBI. But the legal principle at stake is mundane and universal.

Every time a lawyer, or a client, or a paralegal uploads case details into a commercial AI tool, the same analysis applies. The AI is not counsel. The platform operator is a third party. The data may be collected, stored, analyzed, or disclosed under the operator’s privacy policy.

This is not a novel legal theory. It is a straightforward application of existing privilege law to a new set of facts. The Government cites In re OpenAI, Inc., Copyright Infringement Litigation (S.D.N.Y. 2025) for the proposition that “in the absence of an attorney-client relationship, the discussion of legal issues between two non-attorneys is not protected by attorney-client privilege.”

The trap is that commercial AI feels private. It feels like thinking out loud. But it is not. It is disclosure.

How we build to avoid this entirely (yes it is possible)

At Citational, we treat this as an infrastructure problem, not a policy problem. You do not solve privilege risk by telling users to be careful. You solve it by building systems where privileged material never reaches the inference layer.

Every project we build follows the same principle: strip the privilege-bearing content before the text ever leaves our controlled environment, which is governed by our own privacy policies and data-deletion processes. What reaches an AI model from any of our technology contains no privilege-bearing material, even if the original document did. It is an abstract fact pattern with no identifiable parties.

This is not a simple find-and-replace. Legal documents are dense with names that serve different functions. “Judge Smith” is a person whose identity may be sensitive in context. “Smith v. Jones” is a public case citation essential to the legal reasoning. Redacting both destroys the document’s utility. Redacting neither defeats the purpose.

This is a computer science problem, and it is why we are uniquely placed to do what we do. My colleagues and I have backgrounds heavily weighted toward both computer science and law, and an understanding of both systems is critical.

Our pipelines handle this through a multi-phase process. Documents are ingested, normalized, and scrubbed of structured identifiers: Social Security numbers, email addresses, phone numbers. These are pattern-matched and removed before any language model processes the text.
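The first scrubbing phase can be illustrated with a small sketch. The pattern set and token names here are hypothetical, chosen only to show the pattern-matching approach; a production scrubber uses a far larger, tested library of patterns.

```python
import re

# Hypothetical patterns for structured identifiers. A real pipeline
# would carry many more patterns and handle international formats.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"),
}

def scrub_structured_identifiers(text: str) -> str:
    """Replace pattern-matched identifiers with category tokens
    before any language model ever sees the text."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

scrub_structured_identifiers("Call 212-555-0142 or email j.smith@example.com.")
# → "Call [PHONE] or email [EMAIL]."
```

The point of doing this step first, with deterministic regexes rather than a model, is that it fails closed: no inference is involved, so nothing needs to be trusted.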

Then the system does something more difficult. It identifies and temporarily shields all legal citations, statutes, case names, and procedural references. These are public record. They carry no privilege. They need to survive intact for the document to remain legally coherent.
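One way to shield citations is to swap each one for an opaque placeholder before redaction runs, then restore it afterward. This is a minimal sketch of that idea; the two regexes below are illustrative stand-ins for a full citator grammar covering reporters, statutes, and procedural rules.

```python
import re

# Hypothetical citation patterns, illustrative only.
CITATION_PATTERNS = [
    re.compile(r"\b[A-Z][\w.]* v\. [A-Z][\w.]*\b"),  # case names, e.g. Smith v. Jones
    re.compile(r"\b\d+ U\.S\.C\. § \d+\b"),          # U.S. Code sections
]

def shield_citations(text: str):
    """Replace each citation with a placeholder so later entity
    redaction cannot touch it; return the text and a restore map."""
    restore = {}
    def repl(match):
        token = f"[CITE_{len(restore)}]"
        restore[token] = match.group(0)
        return token
    for pattern in CITATION_PATTERNS:
        text = pattern.sub(repl, text)
    return text, restore

def unshield(text: str, restore: dict) -> str:
    """Put the original citations back after redaction is done."""
    for token, citation in restore.items():
        text = text.replace(token, citation)
    return text
```

Because the placeholders carry no surface resemblance to names, the entity-redaction phase that follows cannot mistake a public case caption for a private party.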

Only after the citations are protected does the system run entity extraction to find actual people and other sensitive identifiers (SSNs, addresses, bank account numbers, and a very long list of other categories). Those names are replaced with deterministic, consistent tokens. “Mr. Smith” and “Smith” and “John Smith” all resolve to [PERSON_1]. The relationships and actions in the document are preserved. The identities are not.
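The deterministic-token step can be sketched as follows. This assumes an upstream NER step has already produced the list of person mentions, and it collapses co-referring mentions with a crude shared-surname heuristic; real coreference resolution is considerably more involved.

```python
def assign_person_tokens(mentions: list[str]) -> dict[str, str]:
    """Map every mention of the same person to one stable token.
    Heuristic (illustrative): mentions sharing a surname co-refer."""
    tokens = {}   # canonical surname -> token
    mapping = {}  # raw mention -> token
    for mention in mentions:
        key = mention.replace("Mr. ", "").replace("Ms. ", "").split()[-1]
        if key not in tokens:
            tokens[key] = f"[PERSON_{len(tokens) + 1}]"
        mapping[mention] = tokens[key]
    return mapping

def redact_persons(text: str, mentions: list[str]) -> str:
    """Replace all person mentions with their assigned tokens.
    Longest mentions first, so 'John Smith' wins over 'Smith'."""
    mapping = assign_person_tokens(mentions)
    for mention in sorted(mentions, key=len, reverse=True):
        text = text.replace(mention, mapping[mention])
    return text

redact_persons("John Smith met Jane Doe. Mr. Smith left.",
               ["Mr. Smith", "John Smith", "Smith", "Jane Doe"])
# → "[PERSON_1] met [PERSON_2]. [PERSON_1] left."
```

The determinism is the point: the same person always resolves to the same token, so the relationships and timeline in the document survive even though the identities do not.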

The result is a document that reads correctly, retains its legal reasoning and timeline, and contains zero client identities or privileged communications. The model sees structure and facts. It never sees who.

After processing, the original material is deleted. Not archived, not retained for training, not accessible to us or anyone else. Our systems are designed so that a subpoena would produce nothing, because there is nothing left to produce. Sorry, Government.

The distinction that matters

Heppner’s mistake was not using AI. It was using AI in a way that treated a commercial platform as a confidential advisor. The platform’s own policies made clear it was not.

The distinction we draw is architectural. If privileged material reaches a third-party model, privilege is at risk regardless of what the vendor’s marketing says about security. If privileged material is removed before inference, there is nothing to waive. You are not disclosing a confidential communication. You are sending an anonymized fact pattern for analysis.

This is not a workaround. It is, I believe, the only defensible approach. And it requires genuine technical depth in legal-document and natural-language processing. Not just AI capabilities, but the ability to understand what in a legal document is sensitive, what is public, and how to separate the two without breaking either. No amount of prompt engineering can solve this challenge.

What this means going forward

Judge Rakoff has not yet ruled. But the Government’s arguments in Heppner track existing law closely, and the Southern District is not known for expanding privilege beyond its established boundaries.

When the ruling comes, it will likely confirm what should already be obvious: commercial AI tools are third parties, and disclosure to them carries consequences.

The firms and institutions thinking seriously about AI adoption should be asking their vendors a simple question: does privileged material ever reach your model? If the answer is yes, or if the answer is vague, Heppner tells you what happens next.

We built all our technology around the assumption that this day would come. It has.

Luke Cohen is the CEO of Citational and former Head of AI R&D at LegalZoom - luke.cohen@citation.al