Why Trusting One AI Is Like Asking One Doctor

May 20, 2026 · 7 min read

In 2023, a New York lawyer named Steven Schwartz submitted a legal brief that cited six court cases in support of his argument. The opposing counsel couldn't find any of them. Neither could the judge.

They didn't exist. Schwartz had used ChatGPT to research case law. ChatGPT had invented the citations — complete with case names, docket numbers, and plausible-sounding legal reasoning. Schwartz hadn't checked.

The judge imposed a $5,000 fine on Schwartz, a colleague who had signed the filings, and their firm. He faced disciplinary proceedings. The case became one of the most widely reported AI failures of the year.

In 2024, Air Canada's Chatbot Cost the Airline Money Before a Tribunal

A passenger named Jake Moffatt contacted Air Canada's AI chatbot after his grandmother died. He wanted to know if he could get a bereavement fare discount retroactively. The chatbot told him yes — he had 90 days after travel to apply.

That was wrong. Air Canada's actual policy required the discount to be arranged before travel. When Moffatt applied for the refund, the airline refused.

He sued. The British Columbia Civil Resolution Tribunal ruled in his favor. Air Canada had to pay $812 CAD in damages and fees. The airline had tried to argue the chatbot was "a separate legal entity" responsible for its own statements. The tribunal rejected this.

Two very different situations. One pattern: a single AI source gave confident, specific, wrong information — and someone acted on it.

Why Single AI Sources Fail

AI models are not databases. They don't retrieve facts — they generate text that sounds like facts. The distinction matters enormously when the stakes are real.

Every model has blind spots shaped by its training data, its architecture, and the way it was fine-tuned. As rough generalizations: Claude tends toward caution, GPT-4o toward synthesis, Gemini toward challenging assumptions, and DeepSeek toward execution. None of them is wrong exactly; they're optimized differently.

When you ask one AI a question, you get one optimization. When that optimization happens to match a blind spot, you get a confident wrong answer.

The lawyer got a confident wrong answer about case law. The airline's chatbot gave a confident wrong answer about policy. In both cases, nobody checked.

What Cross-Validation Actually Does

If Schwartz had run his case law research through four models simultaneously, at least one of them would likely have flagged that the citations couldn't be verified. The disagreement itself would have been a signal to dig deeper.

This is the core logic behind AI Roundtable. Not that four AIs are always right — they're not. But when three models agree and one doesn't, that disagreement is information. It tells you something worth investigating before you act.

A single confident answer is easy to trust. Four answers with visible disagreement force you to think.
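
The disagreement check described above can be sketched in a few lines of Python. This is a minimal illustration, not how AI Roundtable is implemented: the function names `normalize` and `cross_check`, the model labels, and the hard-coded answers are all hypothetical, and comparing normalized strings is a crude stand-in for judging whether two answers actually say the same thing.

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(answer.lower().split())


def cross_check(answers: dict[str, str]) -> dict:
    """Group answers from several models and flag any that disagree.

    `answers` maps a model label to its answer text. Both the labels and
    the exact-match comparison are assumptions made for this sketch.
    """
    if not answers:
        raise ValueError("need at least one answer")

    normalized = {model: normalize(text) for model, text in answers.items()}
    counts = Counter(normalized.values())
    majority_answer, majority_votes = counts.most_common(1)[0]

    # Any model whose answer differs from the most common one is a dissenter.
    dissenters = sorted(
        model for model, text in normalized.items() if text != majority_answer
    )
    return {
        "majority_answer": majority_answer,
        "agreement": majority_votes / len(answers),
        "dissenters": dissenters,
        "needs_review": bool(dissenters),
    }


# Hypothetical answers to the bereavement-fare question, hard-coded here.
example = cross_check({
    "model_a": "Yes, within 90 days of travel",
    "model_b": "yes, within 90 days of travel",
    "model_c": "Yes,  within 90 days of travel",
    "model_d": "No, it must be arranged before travel",
})
print(example["dissenters"], example["needs_review"])
```

In this made-up example the three-model majority is the wrong answer and the lone dissenter is right, which is why the sketch reports `needs_review` instead of picking a winner: the disagreement is a prompt to check the primary source, not a vote.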

The Doctor Analogy

If your doctor told you that you needed surgery, you would probably get a second opinion before going under the knife. Not because you distrust your doctor, but because the stakes are high enough that one perspective isn't enough.

AI is no different. The tool is powerful. The confidence it projects is real. But confidence is not accuracy, and a single source — however sophisticated — is still a single source.

The lawyer trusted one source. The airline's customer trusted one chatbot.

Cross-validation isn't paranoia. It's what you do when the answer actually matters.