Decoder

The surprising case for AI judges

73 min episode · 3 min read

Topics

Artificial Intelligence

AI-Generated Summary

Key Takeaways

  • Justice System Accessibility Crisis: 92% of Americans cannot afford legal help for their problems, and neither can most small and medium businesses. State courts adjudicate 3-4 million cases annually in Michigan alone, and the majority of parties are unrepresented. The result is a fundamental access problem: the modal case type in state courts is consumer debt, and most cases are dismissed for procedural failures rather than decided on the merits.
  • AI Arbitrator Architecture: The system employs approximately 20 specialized agents operating across the arbitration process. Front-end agents parse claims, organize arguments, identify legal elements, and verify understanding with parties through iterative feedback loops. Reasoning agents then analyze the dispute summary, while a human arbitrator remains in the loop throughout, ultimately issuing the final award after reviewing and modifying the AI-generated draft as needed.
  • Procedural Fairness Through Transparency: The system addresses trust by making sure parties feel heard through explicit confirmation loops. Agents present their understanding of the claims, elements, evidence, and legal frameworks, then ask the parties to verify it. The process repeats until both parties confirm the understanding is accurate, a level of transparency that human judges rarely provide. Research shows that parties' trust in institutions grows when they understand the process and reasoning, even when they lose.
  • Error Rates in Human Judgment: Appellate courts reverse lower court decisions at significant rates, indicating systemic human error. DNA exoneration data reveals wrongful conviction rates of 3-5% in criminal cases, demonstrating that judges and juries make substantial mistakes. McCormack argues that such an error rate is acceptable for shooting free throws but not for landing planes, and that the justice system, with AI assistance, should be held to the latter standard.
  • Governed AI Systems Required: Generic large language models like ChatGPT would hallucinate if used directly for arbitration. The AAA system requires governance through training on specific dispute types, grounding in relevant legal frameworks, and continuous human oversight. The platform launched narrowly with documents-only construction cases because the team could build a properly governed agentic system using their historical case library and construction arbitrator collaboration for that specific domain.
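The confirmation loop described in the takeaways above can be illustrated with a minimal Python sketch. This is not the AAA's implementation, which the episode does not detail. Every name here (`Understanding`, `confirm_understanding`, the reviewer functions, `max_rounds`) is hypothetical, the "revision" step is a stand-in for an agent re-parsing the claims, and handing off to the human arbitrator after a round limit is an assumption added for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Understanding:
    """Hypothetical record of what the agents believe the dispute is about."""
    claims: List[str]
    revision: int = 0


# A reviewer stands in for one party: it returns the points the summary is
# missing. An empty list means that party confirms the summary.
Reviewer = Callable[[Understanding], List[str]]


def confirm_understanding(
    initial: Understanding,
    reviewers: Dict[str, Reviewer],
    max_rounds: int = 5,
) -> Tuple[Understanding, bool]:
    """Repeat present-and-verify rounds until every party confirms.

    Returns the latest summary and whether all parties confirmed it.
    An unconfirmed result would go to the human arbitrator (assumption).
    """
    current = initial
    for _ in range(max_rounds):
        corrections: List[str] = []
        for review in reviewers.values():
            corrections.extend(review(current))
        if not corrections:
            return current, True
        # Stand-in for the agent revising its understanding of the dispute.
        current = Understanding(
            claims=current.claims + corrections,
            revision=current.revision + 1,
        )
    return current, False


def claimant(u: Understanding) -> List[str]:
    return [] if "unpaid invoice" in u.claims else ["unpaid invoice"]


def respondent(u: Understanding) -> List[str]:
    return [] if "defective work" in u.claims else ["defective work"]


summary, confirmed = confirm_understanding(
    Understanding(claims=["late delivery"]),
    {"claimant": claimant, "respondent": respondent},
)
print(summary.claims, summary.revision, confirmed)
```

In this toy run both parties add a missing point in the first round and confirm in the second. The point of the design is that the loop ends only on agreement from every party or on a hand-off, never on the agent's own judgment that it has understood.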

What It Covers

Bridget McCormack, former Michigan Supreme Court Chief Justice and current American Arbitration Association (AAA) CEO, discusses the AI arbitrator system launched in November 2025. The platform uses multiple AI agents to resolve documents-only construction disputes and has one active case so far. McCormack argues that AI can increase access to justice, and she addresses concerns about hallucinations, bias, and trust in automated legal systems.

Key Questions Answered

  • B2B Agentic Commerce Implications: Walmart already uses agents to negotiate and execute contracts, with estimates suggesting 40% of B2B contracts may be agent-negotiated by 2027. This creates demand for automated dispute resolution when agents make mistakes. McCormack positions the AI arbitrator as necessary infrastructure for agentic commerce, arguing the system is only as reliable as the process for fixing breakdowns in automated contract execution.

Notable Moment

McCormack recounts confronting a Michigan probate judge who publicly criticized her on a statewide listserv for statements made at a forum. When she called to explain that she had not attended the event and offered an alibi, the judge simply insisted she had been there. His refusal to accept the facts illustrates how human judges can be unreliable in ways that properly audited AI systems cannot be.
