Module 02 / 08 · Demystify the machine

How the Machine Learns (and Why It Invents Cases)

Open the black box: see why an LLM predicts the next plausible word, not the true one — and why that makes it fabricate citations with total confidence.

2 hours · Hands-on lab · BYOD · AI lab

The hook

A chatbot knows exactly what an Indian citation looks like — the party names, the reporter, the year, the page. What it does not have is any connection to the database that would tell it whether that case exists. So it builds you a beautiful, plausible, entirely fictional precedent — and hands it over with the same fluent confidence it uses for the truth.

What you'll be able to do

  • Demystify how an LLM is built and how it generates text, in plain terms a non-quantitative law student can explain back: training data leads to patterns leads to predictions.
  • Explain hallucination mechanically — why a model that predicts the next plausible word will fabricate a confident-looking citation it has no way to check.
  • Master the generative-vs-retrieval-grounded distinction that is the direct antidote to the Mata error, and understand GIGO, bias, and knowledge cutoffs as sources of failure.

In short

This module opens the black box. Students learn how a large language model is trained on data, learns statistical patterns, and then generates text by predicting the next plausible word rather than retrieving a true fact. From that single mechanism the module derives why hallucination is not a bug but a predictable behaviour: the model knows what a citation looks like but has no connection to any case database, so it invents one confidently. The payoff is the generative-vs-retrieval-grounded distinction — the conceptual difference between a chatbot and a grounded legal-research engine — which tells students exactly where to verify hardest.

The AI bridge

Knowing how an LLM fails tells you where to verify hardest. Because the model predicts plausible text rather than retrieving true facts, citations and specific authorities are exactly where it is most dangerous — and that is precisely why a grounded legal-research engine, which locks provenance to real sources, differs fundamentally from a chatbot. The skill is to choose tools and verify outputs based on the mechanism, not the fluency.

In this module

  • 01

    Training data leads to patterns leads to predictions: an LLM is not a database of facts but a system that has learned the statistical shape of language. The lawyer's intuition helps here: you can imitate a contract's form without knowing the underlying transaction.

  • 02

    The core mechanism: it predicts the next plausible word, not the true one. Fluency and truth are different things, and the model optimises for the former — which is precisely why a confident answer is not evidence (the course's confidence-vs-correctness spine).

  • 03

    Why it invents cases: a chatbot knows what a citation looks like (the structure of party names, reporter, year, page) but has no connection to any case database. Asked for authority on a niche point, it fabricates one that is formally perfect and substantively fictional. This is the Mata mechanism made concrete.

  • 04

    Generative vs retrieval-grounded — the load-bearing distinction. A generative chatbot composes plausible text; a grounded legal-research engine retrieves and cites real sources. Knowing which kind of tool you are holding tells you whether its citations can be trusted at all.

  • 05

    GIGO (garbage in, garbage out) and bias: the model reflects its training data, so garbage or skewed inputs produce garbage or skewed outputs, much as human legal reasoning inherits the biases of the corpus it was learned from.

  • 06

    Knowledge cutoffs and gaps: the model's training data stops at a fixed date and has blind spots, yet the model will still answer confidently about recent judgments, niche jurisdictions, or local statutes it never saw. This is a critical trap for current Indian law, and it tells you where to verify hardest.

  • 07

    The North-Star payoff: understanding how the machine fails by design tells you where the duty to verify bites hardest, and why citation-first, grounded research — not a generic chatbot — is the responsible default for legal work.
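
The "training data leads to patterns leads to predictions" chain in item 01 and the next-plausible-word mechanism in item 02 can be sketched with a toy word-pair counter. This is a deliberately tiny illustration, not how a production LLM is built: the three training sentences are invented, and the names `corpus`, `follows`, `predict_next`, and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy "training data": invented sentences, not real legal text.
corpus = [
    "the court held that the appeal is dismissed",
    "the court held that the appeal is allowed",
    "the court held that the petition is dismissed",
]

# "Training": count which word follows which. These counts are the patterns.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1


def predict_next(word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


def generate(start, max_words=8):
    """Extend `start` one most-frequent word at a time."""
    out = [start]
    while len(out) < max_words:
        nxt = predict_next(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


print(generate("appeal"))
```

The toy model completes "appeal is" with "dismissed" only because that word followed "is" most often in its training sentences. It has no record of any particular appeal, which is the gap between plausible and true that the module is built around.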

The interactive demos

Every idea is a Mirror Move

Run it on the room, show it inside the machine, prove it live on a real AI, then name the skill.

Watch it invent the law

On us

Frame the moment for the room: ask students to predict, by show of hands or a quick poll, whether a generic chatbot asked for Indian case law on a deliberately niche point will (a) refuse, (b) give a real citation, or (c) confidently produce something. Commit before the reveal.

In the machine

Explain the mechanism behind the prediction: because the model predicts the next plausible word and knows what an Indian citation looks like but has no connection to case databases, the statistically likely output is a fluent, formally perfect, fabricated citation — not a refusal.
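
A minimal sketch of "knows the form, has no database" follows. It is an analogy rather than the internals of any real chatbot: a template is filled with invented placeholder values, and a pattern check confirms the result is citation-shaped. The party names, the reporter abbreviations `XLR` and `YLJ`, and the function `compose_citation` are all hypothetical and chosen to be obviously fictional.

```python
import random
import re

# All values below are invented placeholders, not real parties or reporters.
PARTIES = ["Alpha Traders", "Beta Mills", "State of Examplepur", "Gamma Finance"]
REPORTERS = ["XLR", "YLJ"]

# A check on form only: party v. party, (year) volume REPORTER page.
CITATION_FORM = re.compile(r"^.+ v\. .+, \(\d{4}\) \d+ [A-Z]+ \d+$")


def compose_citation(rng):
    """Fill a citation-shaped template. Nothing here checks that a case exists."""
    first, second = rng.sample(PARTIES, 2)
    year = rng.randint(1950, 2020)
    volume = rng.randint(1, 12)
    page = rng.randint(1, 999)
    reporter = rng.choice(REPORTERS)
    return f"{first} v. {second}, ({year}) {volume} {reporter} {page}"


citation = compose_citation(random.Random(0))
print(citation)
print("Looks like a citation:", bool(CITATION_FORM.match(citation)))
```

Every output passes the form check, and none of them refers to anything. The form check is the only test a purely generative process can pass on its own.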

Live AI

Live, ask a generic chatbot for Indian case law on a niche point and watch it invent plausible citations. Then run the same query against Indian Kanoon (or another grounded tool) and contrast: one composes authority, the other retrieves it.

The skill

Citation-first research locks provenance. Start from a grounded tool that ties every proposition to a real, checkable source, and treat a generic chatbot's citations as unverified until confirmed against the actual database.
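
The "unverified until confirmed" rule can be sketched as a lookup that returns a status rather than trusting the text. The dictionary `KNOWN_SOURCES` is a hypothetical stand-in for a real case database such as Indian Kanoon, its two entries are invented placeholders, and `verify` is an illustrative name, not the interface of any actual tool.

```python
# Stand-in for a real case database; both entries are invented placeholders.
KNOWN_SOURCES = {
    "(2001) 3 XLR 45": "Alpha Traders v. Beta Mills",
    "(2010) 7 YLJ 212": "Gamma Finance v. State of Examplepur",
}


def verify(reference, claimed_title):
    """Return 'verified' only if the reference exists and the title matches."""
    actual_title = KNOWN_SOURCES.get(reference)
    if actual_title is None:
        return "not found"  # nothing sits at this reference
    if actual_title != claimed_title:
        return "mismatch"   # the reference exists but is a different case
    return "verified"


print(verify("(2001) 3 XLR 45", "Alpha Traders v. Beta Mills"))
print(verify("(2001) 3 XLR 45", "Gamma Finance v. Beta Mills"))
print(verify("(1987) 9 XLR 300", "Beta Mills v. Alpha Traders"))
```

The design choice worth noticing is the default: a citation earns "verified" only by matching a source, and a fluent, well-formed string earns nothing by itself.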

The lab

Fabrication in the Wild

Working on their own laptops, each student elicits a hallucinated citation or quote from a generic model, for example by asking for authority on a niche Indian legal point. The student then tries to verify it against a real source such as Indian Kanoon or another grounded tool, and finds that it cannot be verified. Students document the verification attempt and the tells: the formally perfect but uncheckable citation, the confident tone, the absent provenance, and the mismatch when the source is searched.

Deliverable

A short documented record of the hallucination: the prompt used, the fabricated citation or quote the model produced, the verification steps taken, the point at which verification failed, and the tells that marked the output as fabricated.
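
One possible shape for that record, for students who prefer a structured template, is sketched below. The class name `HallucinationRecord`, its field names, and the `is_complete` helper are hypothetical; the five fields simply mirror the five items the deliverable lists. The sketch assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class HallucinationRecord:
    """One documented hallucination, mirroring the deliverable's five items."""
    prompt: str                      # the prompt used
    fabricated_output: str           # the citation or quote the model produced
    verification_steps: list[str] = field(default_factory=list)
    failure_point: str = ""          # where verification failed
    tells: list[str] = field(default_factory=list)

    def is_complete(self):
        """True only when every one of the five items has been filled in."""
        return all([
            self.prompt,
            self.fabricated_output,
            self.verification_steps,
            self.failure_point,
            self.tells,
        ])


# Example with placeholder text standing in for a student's real notes.
record = HallucinationRecord(
    prompt="<prompt text>",
    fabricated_output="<model output>",
)
record.verification_steps.append("<search performed>")
print(record.is_complete())  # False: failure point and tells not yet recorded
```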

Key sources & cases

  • Janelle Shane, You Look Like a Thing and I Love You (2019)

    The accessible, funny on-ramp to how AI actually works and fails: giraffing, the AI-generated pickup-line title, recipes calling for 'broken glass', and the tank/sunny-day shortcut. Used to make the training-and-prediction mechanism intuitive for non-quantitative students.

  • Hannah Fry, Hello World (2018)

    Plain-language framing of how algorithms and machine learning work and where they go wrong; supports the demystification of the black box.

  • The Mata mechanism (NYSBA / Goldberg Segalla analysis)

    The analysis of why ChatGPT fabricates citations — that it knows the form of a citation but has no connection to legal databases. Anchors the 'why it invents cases' content and ties back to Module 1's Mata v. Avianca.

Readings

  • Janelle Shane, You Look Like a Thing and I Love You (2019) — giraffing, the broken-glass recipes, and the tank/sunny-day shortcut as intuitive illustrations of how training produces failure.
  • Hannah Fry, Hello World (2018) — how algorithms and machine learning work and where they go wrong.
  • The NYSBA / Goldberg Segalla analysis of the Mata v. Avianca mechanism — why ChatGPT fabricates citations (knows the form, lacks the database connection).

Next module

Module 03 / 08

The Lawyer's Blind Spots: Cognitive Bias, Statistics, and the Sycophantic Bot

debias

Bring a credit-bearing AI course to your students

A 16-hour course that treats using AI well as a professional duty — one a council can approve, and a graduate can defend in court.