Technika XII: Predictive Quoting – The Structural Blindspot of LLMs

When you ask it to extract a quote from a document, and it gives you something made up that merely sounds like it might be right instead.

“When a machine rewrites your words to make them smoother, who does it really serve?”

We now live in a world where language is automated, scaled, and made beautiful—predictive coherence in place of human recall. But beneath the surface fluency lies a blunt structural truth: ChatGPT cannot reliably extract quotes. Not real ones. Not verifiable, paragraph-stamped, courtroom-safe quotes.

This failure is not a technical glitch. It is the predictable output of the predictive design.

Why It Breaks

ChatGPT doesn’t retrieve language from documents—it predicts it. It doesn’t go back to find what was written. It tries to guess what probably would have been. That’s fine for a novel. Not fine for a treaty clause, or a classified leak, or a legal submission.

The model doesn’t understand documents. It doesn’t track structure, doesn’t know what a footnote is, can’t reliably echo citations, and has no concept of where anything came from. It sees a soup of tokens and tries to sound smart.

This is not a bug. It’s the system doing exactly what it was built to do.

The Failure Stack

There are five structural failure points that collectively explain why quote extraction from LLMs is unreliable.

  • First is the model core itself:
    These models don’t retrieve text, they generate it one word at a time, based on probabilities. So instead of reproducing a quote, they often paraphrase it. Quotation marks may appear, but the content inside them subtly morphs—punctuation shifts, sentence structure softens, and precise wording is lost. This is a systemic misfire, not an accidental slip.
  • Second is the context window limit:
    Even the most advanced models like GPT-4o can only hold about 128,000 tokens in working memory. For large documents—especially policy reports, legislative texts, or technical white papers—that’s not enough. The model will “see” only part of the document, meaning relevant passages can fall outside its window and be excluded from consideration. Once that happens, what it generates is no longer sourced from the document, just interpolated from what it remembers about similar documents.
  • Third is the training bias toward fluency:
    The model is reinforced to prioritise language that sounds natural and smooth. So when it encounters quotes with legal jargon, awkward grammar, or archaic language, it often “cleans them up”—rewriting them into more readable, less precise versions. It’s a nice feature when summarising Shakespeare. It’s a dangerous flaw when quoting clauses in a liability waiver.
  • Fourth is document structure blindness:
    LLMs do not understand hierarchical document structures like humans do. They don’t recognise that a footnote is separate from the main body, or that a section heading demarcates a shift in argument. Without that structural awareness, the model may blend footnotes into paragraphs, confuse appendices with central text, or flatten tables into unrelated prose. The quote loses its positional context—and with it, its meaning.
  • Fifth is the lack of built-in citation tracking:
    These models don’t remember where information came from inside the document. They don’t embed source line references or maintain quote provenance. As a result, attribution gets fuzzy, invented, or omitted altogether. In fields where traceability is critical, this breaks the audit trail entirely. (Two short sketches of external checks follow this list.)
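
The first, fourth, and fifth points can be policed from outside the model entirely: split the source into paragraphs yourself and refuse any “quote” that does not appear in one of them word for word. A minimal sketch; the paragraph-splitting rule and the function name are my own choices, and treaty.txt is a hypothetical filename, not a reference to any real document.

```python
import re

def find_quote(document: str, quote: str):
    """Return (paragraph_number, paragraph_text) if the quote appears
    verbatim in the document, otherwise None."""
    # A crude but serviceable notion of "paragraph": split on blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]

    # Normalise whitespace only. Anything else that differs (curly quotes,
    # shifted punctuation, "cleaned up" wording) should count as drift.
    def squash(text: str) -> str:
        return " ".join(text.split())

    needle = squash(quote)
    for number, paragraph in enumerate(paragraphs, start=1):
        if needle in squash(paragraph):
            return number, paragraph
    return None

if __name__ == "__main__":
    with open("treaty.txt", encoding="utf-8") as f:   # hypothetical source file
        source = f.read()
    claimed = "mandatory enforcement subject to penalties"
    hit = find_quote(source, claimed)
    print(f"Found verbatim in paragraph {hit[0]}" if hit
          else "NOT found verbatim - do not publish as a quotation")
```

A None result does not mean the model was wrong about the substance; it means the wording is the model’s, not the document’s.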
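
On the second point, at least, the window problem is easy to catch before you rely on the model at all: count the tokens yourself. A minimal sketch, assuming the tiktoken library; the cl100k_base encoding is a generic stand-in for whatever tokeniser a given model actually uses, so treat the count as approximate.

```python
# Pre-flight check: does the document even fit in the model's window?
# Assumes the tiktoken library (pip install tiktoken); cl100k_base is a
# stand-in encoding, so the count is approximate rather than exact.
import tiktoken

def fits_in_context(document: str, limit: int = 128_000) -> bool:
    encoder = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(encoder.encode(document))
    print(f"{n_tokens:,} tokens against a window of roughly {limit:,}")
    # The prompt, system message, and the model's own reply all eat into the
    # same budget, so the usable limit is smaller than the headline figure.
    return n_tokens <= limit
```

If this comes back False, some part of the document never reaches the model at all, and anything it “quotes” from that part is reconstruction, not reading.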

Predictive Smoothing = Soft Censorship

This isn’t just about getting a word wrong. It’s about what kinds of words get erased.

ChatGPT leans toward what it has seen before. Toward what’s common. Toward what’s dominant. That means off-script phrasing—dissenting footnotes, inconvenient evidence, minority views—gets reworded to match the canonical version.

A treaty clause warning about sovereignty loss? It might come back softened. A legal loophole in vaccine indemnity language? Rewritten for readability. A leaked email implying guilt? It might get reframed with passive voice, just enough to blur intent.

This is not error. This is entropy by design—a system that rounds off the sharp bits. Predictive smoothing becomes a form of soft censorship. Not by removing content, but by gently sanding away its edge until the dangerous parts lose definition.

Where This Matters

In legal drafting, even a single word shift in a quoted clause can render a contract ambiguous—or legally unenforceable. If you feed ChatGPT a document and ask it to reference a clause, you may get a version that "sounds right" but doesn’t hold up under scrutiny. That’s not just inaccuracy. That’s litigation risk.

In academic writing, especially at postgraduate or publication level, a quote that misplaces a comma or alters phrasing without marking it is grounds for retraction or allegations of academic dishonesty.

In investigative journalism, where the credibility of the work rests on the authenticity of source material, a paraphrased leak is worse than a missed story—it’s an invitation for powerful actors to cry "fake quote" and discredit the journalist.

In policy analysis, particularly when evaluating the coercive language of think tanks, UN documents, or supranational treaties, paraphrasing can blunt what is actually a sharp legal weapon. "Encouraged compliance" is not the same as "mandatory enforcement subject to penalties"—and ChatGPT often doesn’t know the difference.

What To Do About It (If You’re Not a Developer)

Let’s be honest: most people aren’t going to fire up a regex tool or run a SHA-256 hash just to quote a paragraph. So here’s what matters, stripped to basics:

  1. Don’t copy-paste from ChatGPT and assume it’s a real quote.
    If the model gives you something in quotation marks, check it against the original document yourself. Always. It might sound right—but the phrasing can drift, even when it looks clean.
  2. Keep your source open.
    If you’re summarising or quoting from a document, have it open in another tab or window. Use your own eyes to find and copy the exact wording you need. The model can help you find where to look, but it can’t supply the exact wording.
  3. If you’re publishing something important, treat ChatGPT’s quote as a draft.
    This applies to blog posts, academic work, reports, journalism—anything with stakes. The model can help you frame your thoughts, but don’t let it put words in anyone else’s mouth unless you’ve checked them yourself.
  4. Use the model to comment, not quote.
    If you want insight, summary, comparison, or critique—go ahead. That’s where ChatGPT shines. But if you need a clean, exact quote from a specific document? → Go to the document.
  5. Look for section markers when quoting.
    If you're quoting from something like a treaty, report, or policy paper, note the section or paragraph number. That way, anyone reading your work can find the original—and you’re less likely to quote out of context. A short sketch after this list shows one way to record exactly that.
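
The checklist above is deliberately tool-free, but the checksum idea waved away at the start of this section is cheap for anyone comfortable with a few lines of code: when you copy the exact wording by hand, record a SHA-256 hash of it alongside the section marker. A minimal sketch; the field names and example values are illustrative, not any established standard.

```python
import hashlib
import json
from datetime import datetime, timezone

def quote_record(quote: str, source_name: str, section_marker: str) -> dict:
    """Build a small provenance record for wording copied by hand from
    the source document (the field names are illustrative)."""
    return {
        "quote": quote,
        "source": source_name,
        "section": section_marker,   # e.g. "Article 12(3)" or "para. 47"
        "sha256": hashlib.sha256(quote.encode("utf-8")).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Anyone holding the same source text can recompute the hash of the quoted
# span and confirm that not a single character has drifted since you copied it.
record = quote_record(
    "mandatory enforcement subject to penalties",   # example wording from above
    "Example treaty (consolidated text)",           # illustrative source name
    "Article 12(3)",                                # illustrative section marker
)
print(json.dumps(record, indent=2))
```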

In Short:
Don’t let the machine speak for others unless you’ve checked the source.
ChatGPT can help you understand—but it can’t be trusted to represent. That job is still yours.

Strategic Forecast

The companies building LLMs will not fix this. Their incentives are aligned with fluency, coherence, and user satisfaction—not citation fidelity. It’s not on their roadmap.

But regulators will start to care. Law firms, academic publishers, and government agencies will eventually require quote provenance standards. Models that cannot produce a traceable audit trail will be considered legally or procedurally unsafe.

This creates an opportunity space. The future belongs to those who build precision retrieval layers around generative systems—tools that bridge language generation with citation integrity.

Closing Doctrine

“Prediction is not precision.
Paraphrase is not quotation.
Pattern is not memory.”

Treat ChatGPT like a political speechwriter who always wants to help—but forgets your exact words.

In war, law, and science, that is a fatal weakness.


Rule of Thumb:

  • If it must be quoted, do not generate it.
  • If it was generated, verify it with source + checksum.
  • If the risk is high, split the system: retrieval first, commentary second (a minimal sketch follows).
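
What “retrieval first, commentary second” can look like in practice, as a minimal sketch: find the exact passage with a literal text search, freeze it, and only then hand it to a model for commentary. The ask_model function is a deliberate placeholder for whatever interface you use; nothing here assumes a particular provider.

```python
def retrieve_passage(document: str, search_term: str, window: int = 400) -> str:
    """Pull the exact passage around a literal search hit. The verbatim
    text, not a model's reconstruction, is what gets quoted."""
    position = document.find(search_term)
    if position == -1:
        raise ValueError(f"Not found verbatim: {search_term!r}")
    start = max(0, position - window)
    end = min(len(document), position + len(search_term) + window)
    return document[start:end]

def ask_model(prompt: str) -> str:
    """Placeholder for whatever LLM interface you use (hypothetical)."""
    raise NotImplementedError

def commentary_on(document: str, search_term: str) -> str:
    passage = retrieve_passage(document, search_term)    # retrieval first
    prompt = (
        "Comment on the passage below. Do not restate or rewrite it; "
        "refer to it only as 'the passage'.\n\n"
        f"---\n{passage}\n---"
    )
    return ask_model(prompt)                             # commentary second
```

Whatever comes back, the text you publish inside quotation marks is taken from retrieve_passage, never from the model’s reply.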

Published via Journeys by the Styx.

Technika: It’s not about better answers. It’s about better authorship.

Author’s Note

Written with the assistance of a conditioned ChatGPT instance. Final responsibility for the framing and conclusions remains mine.
