Analytical Framework — Claim Tracking & Truth-Density

Posted May 17, 2026

Updated May 30, 2026

By PU Publish

Purpose

Extract, classify, and track factual claims in the speech corpus to:

Build a database of verifiable claims, each with a truth-status rating from authoritative fact-checkers

Track claim density (claims per minute, claims per 1,000 words)

Track repetition of false claims (how often does each false claim recur?)

Identify novel claims appearing in the corpus

Compare truth-density across speakers and over time

Provide a source-cited evidence base for civic-education and truth and reconciliation commission (TRC) documentation use

Claim Types

Following the fact-checking literature, classify claims by:

Verifiability

Verifiable factual claims (numerical, historical, attributable, specific): “Crime is up X%”, “I won by Y votes”

Causal claims (more difficult to verify but possible): “X caused Y”

Counterfactuals: “If X had happened, Y would have happened”

Predictions / future claims: “In two months, X will happen” (verify once the stated date has passed)

Subjective evaluations: “X is the worst Y” (often not verifiable in a strict sense)

Quote attributions: “X said Y” (verify that the quote is correctly attributed; a failed check is a misattribution)

Rhetorical questions / non-claims: Often not coded as claims

Domain

Economic (jobs, GDP, inflation, deficit, taxes, trade)

Crime and immigration

Foreign policy and military

Election integrity

Personal credentials and biography

Opponents’ statements and actions

Scientific or medical

Other

Truth-Status Coding

Use a calibrated scale aligned with major fact-checking organizations:

| Code | Label | Description |
| --- | --- | --- |
| `true` | True | Fully accurate |
| `mostly_true` | Mostly true | Accurate with minor missing context |
| `half_true` | Half true | Partially accurate with significant context missing |
| `mostly_false` | Mostly false | Inaccurate with some accurate elements |
| `false` | False | Inaccurate |
| `pants_on_fire` | Pants on Fire | Inaccurate and absurd |
| `unverifiable` | Unverifiable | Cannot be verified |
| `out_of_context` | Out of context | Selectively-presented true statement |
| `predicted` | Prediction | Future-tense; verify when due |
| `subjective` | Subjective | Not amenable to fact-check |

For each coded claim, record:

Source fact-check organization (PolitiFact, FactCheck.org, Washington Post Fact Checker, AP Fact Check, Snopes)

URL of fact-check article

Date of fact-check

Excerpt of fact-check finding
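
As a minimal sketch of how the codes and the provenance fields could be held together in code, the block below defines an enumeration of the truth-status codes and a small record type. The class names, field names, and the sample values (including the placeholder URL and date) are illustrative assumptions, not part of the framework itself.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class TruthStatus(str, Enum):
    """The ten truth-status codes from the Truth-Status Coding table."""
    TRUE = "true"
    MOSTLY_TRUE = "mostly_true"
    HALF_TRUE = "half_true"
    MOSTLY_FALSE = "mostly_false"
    FALSE = "false"
    PANTS_ON_FIRE = "pants_on_fire"
    UNVERIFIABLE = "unverifiable"
    OUT_OF_CONTEXT = "out_of_context"
    PREDICTED = "predicted"
    SUBJECTIVE = "subjective"


@dataclass(frozen=True)
class FactCheckRef:
    """Provenance recorded for each coded claim (hypothetical field names)."""
    organization: str
    url: str
    checked_on: date
    excerpt: str
    status: TruthStatus

    def __post_init__(self):
        # Refuse records that are missing a provenance field.
        if not (self.organization and self.url and self.excerpt):
            raise ValueError("organization, url, and excerpt are all required")


ref = FactCheckRef(
    organization="PolitiFact",
    url="https://example.org/fact-check",  # placeholder, not a real article
    checked_on=date(2025, 1, 15),          # illustrative date
    excerpt="[summary of fact-check finding]",
    status=TruthStatus("false"),           # unknown codes raise ValueError
)
```

Building the status with `TruthStatus(...)` means a typo such as `"mostly-false"` fails at data entry instead of silently entering the database.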

Claim Extraction Approach

Manual Curation

For high-stakes published analysis:

Editor reads speech transcript

Identifies candidate claims using verifiability criteria

Searches fact-check databases for prior coding

If novel, flags for new fact-check

Records claim + truth-status + source

Semi-Automated Extraction

For high-volume tracking:

Use LLM-assisted claim extraction: Prompt Claude with clear coding instructions to identify candidate factual claims from a transcript

Human review: All LLM-extracted claims are reviewed by an editor before publication

Match to existing fact-checks: Use semantic similarity (embedding-based) to match claims to existing fact-checks; flag near-matches for review

Track novel claims: Claims with no near-match get flagged for manual fact-check
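
The matching and novelty steps above can be sketched as a three-way decision on the best cosine similarity. This is a sketch under stated assumptions: the function names and both thresholds (0.85 and 0.65) are hypothetical and would need tuning against reviewed examples, and the toy three-dimensional vectors stand in for real sentence embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_claim(claim_vec, fact_checks, match_at=0.85, review_at=0.65):
    """Return (label, fact_check_id, score) for the best-scoring fact-check.

    fact_checks maps a fact-check id to its embedding vector.
    Both thresholds are illustrative assumptions.
    """
    best_id, best_score = None, -1.0
    for fc_id, vec in fact_checks.items():
        score = cosine(claim_vec, vec)
        if score > best_score:
            best_id, best_score = fc_id, score
    if best_score >= match_at:
        return "match", best_id, best_score
    if best_score >= review_at:
        return "near_match_review", best_id, best_score  # editor decides
    return "novel", None, best_score  # flag for manual fact-check


# Toy vectors standing in for embeddings of cached fact-checks.
fact_checks = {"fc-001": [1.0, 0.0, 0.0], "fc-002": [0.0, 1.0, 0.0]}
print(match_claim([0.9, 0.1, 0.0], fact_checks))  # very close to fc-001
print(match_claim([0.8, 0.6, 0.0], fact_checks))  # partial overlap with fc-001
print(match_claim([0.0, 0.0, 1.0], fact_checks))  # unrelated to both
```

Keeping a separate review band between the two thresholds sends borderline matches to an editor instead of forcing an automatic decision.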

Database Schema



{
  "claim_id": "...",
  "speech_id": "...",
  "speaker": "Donald J. Trump",
  "date": "2025-XX-XX",
  "text": "[direct quote of the claim]",
  "context": "[surrounding context]",
  "domain": "election_integrity",
  "type": "verifiable_factual_claim",
  "truth_status": "false",
  "fact_check_source": "PolitiFact",
  "fact_check_url": "...",
  "fact_check_excerpt": "[summary of fact-check finding]",
  "fact_check_date": "2025-XX-XX",
  "first_appearance_in_corpus": "2020-11-04",
  "repetition_count": 47,
  "audiences_targeted": ["rally", "interview", "social_post"]
}
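
A record like the one above could be checked before it is stored. The sketch below is one possible validator; which fields count as required is an assumption made for illustration, and the function and constant names are hypothetical.

```python
from datetime import date

# Assumption: these fields are treated as mandatory for every record.
REQUIRED_FIELDS = {
    "claim_id", "speech_id", "speaker", "date", "text",
    "domain", "type", "truth_status", "fact_check_source", "fact_check_url",
}
TRUTH_STATUSES = {
    "true", "mostly_true", "half_true", "mostly_false", "false",
    "pants_on_fire", "unverifiable", "out_of_context", "predicted", "subjective",
}
DATE_FIELDS = ("date", "fact_check_date", "first_appearance_in_corpus")


def validate_claim(record):
    """Return a list of problems; an empty list means the record passed."""
    missing = sorted(REQUIRED_FIELDS - set(record))
    problems = [f"missing field: {name}" for name in missing]

    status = record.get("truth_status")
    if status is not None and status not in TRUTH_STATUSES:
        problems.append(f"unknown truth_status: {status!r}")

    for name in DATE_FIELDS:
        if name in record:
            try:
                date.fromisoformat(record[name])
            except (TypeError, ValueError):
                problems.append(f"{name} is not an ISO date: {record[name]!r}")

    count = record.get("repetition_count", 0)
    if not isinstance(count, int) or count < 0:
        problems.append("repetition_count must be a non-negative integer")
    return problems


# A template record with an unfilled date placeholder is reported, not stored.
draft = {"claim_id": "c-001", "date": "2025-XX-XX", "truth_status": "false"}
for problem in validate_claim(draft):
    print(problem)
```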

Repetition Tracking

For each false claim, track:

First appearance: When did it first appear in the corpus?

Total repetition count: How many times has it appeared?

Repetition over time: Time-series of appearances

Audience pattern: Which audiences receive it?

Variant patterns: Does the claim morph over time?

This produces a “false-claim repetition database” useful for:

Editorial work: “How often has Trump repeated this specific false claim?”

Civic education: Public-facing tools showing which false claims persist

TRC documentation: Evidence base for any future TRC documentation work

Legal matters: When a false claim is at issue in litigation, the repetition record may be relevant evidence
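
Most of the tracked quantities in this section can be derived from a flat list of appearances. The sketch below assumes each appearance is a `(claim_id, ISO date, audience)` tuple; that input shape, the function name, and the sample rows are invented for illustration. Variant patterns are not covered, since detecting how a claim morphs needs text comparison and editorial judgment.

```python
from collections import Counter, defaultdict
from datetime import date


def repetition_summary(appearances):
    """Summarize appearances given as (claim_id, ISO date, audience) tuples."""
    by_claim = defaultdict(list)
    for claim_id, iso_date, audience in appearances:
        by_claim[claim_id].append((date.fromisoformat(iso_date), audience))

    summary = {}
    for claim_id, rows in by_claim.items():
        dates = [d for d, _ in rows]
        per_month = Counter(d.strftime("%Y-%m") for d in dates)
        summary[claim_id] = {
            "first_appearance_in_corpus": min(dates).isoformat(),
            "repetition_count": len(rows),               # total appearances
            "per_month": dict(sorted(per_month.items())),  # simple time series
            "audiences_targeted": sorted({a for _, a in rows}),
        }
    return summary


# Made-up appearances for a hypothetical claim id.
rows = [
    ("c-001", "2020-11-04", "social_post"),
    ("c-001", "2020-11-19", "rally"),
    ("c-001", "2020-12-02", "interview"),
]
print(repetition_summary(rows)["c-001"])
```

The output keys reuse the schema's field names so a summary can be merged back into the stored claim record.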

Truth-Density Metric

For each speech (or speech segment), compute:

Claim count: Total verifiable factual claims

Falsity count: Claims coded as false or pants_on_fire

Truth-density: (true + mostly_true) / total verifiable claims

Falsity-density: (false + pants_on_fire) / total verifiable claims

Claims per 1,000 words: Normalized claim density

Track these metrics over time; they provide an objective input to the rhetorical and trajectory analyses.
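
The metrics above reduce to a few counts and ratios. The sketch below makes one assumption explicit: "total verifiable claims" is taken to mean claims carrying one of the six ratings from `true` to `pants_on_fire`, so `unverifiable`, `out_of_context`, `predicted`, and `subjective` are left out of the denominator. A project could reasonably draw that line elsewhere. The function name and sample input are hypothetical.

```python
from collections import Counter

# Assumption: only these six codes count toward "total verifiable claims".
RATED = ("true", "mostly_true", "half_true",
         "mostly_false", "false", "pants_on_fire")


def speech_metrics(statuses, word_count):
    """Compute per-speech metrics from a list of truth-status codes."""
    counts = Counter(s for s in statuses if s in RATED)
    total = sum(counts.values())
    truthful = counts["true"] + counts["mostly_true"]
    false_count = counts["false"] + counts["pants_on_fire"]
    return {
        "claim_count": total,
        "falsity_count": false_count,
        # Densities are undefined for a speech with no rated claims.
        "truth_density": truthful / total if total else None,
        "falsity_density": false_count / total if total else None,
        "claims_per_1000_words": 1000 * total / word_count if word_count else None,
    }


# Made-up codes for a hypothetical 2,500-word speech.
codes = ["true", "mostly_true", "half_true", "false", "pants_on_fire", "subjective"]
print(speech_metrics(codes, word_count=2500))
```

Note that truth-density and falsity-density do not sum to 1 under this definition, because `half_true` and `mostly_false` claims fall in neither numerator.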

Cross-Source Verification

For any published analysis citing claim-tracking results:

Confirm the fact-check is current and from a reputable source

Note any change in fact-check status (rare, but it happens)

Acknowledge fact-checker biases: All fact-checkers have editorial perspectives; no source is neutral. Cite multiple where they agree.

Distinguish between “false” and “misleading”: A literally-true statement can be misleading; flag separately
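
One way to support the "cite multiple where they agree" check is to group ratings by claim and report whether the sources concur. The sketch below assumes each organization's rating has already been mapped onto this framework's truth-status codes; the function name, the `citable_as_consensus` key, and the sample rows are hypothetical.

```python
from collections import defaultdict


def agreement_report(ratings):
    """Group (claim_id, organization, truth_status) rows by claim."""
    by_claim = defaultdict(dict)
    for claim_id, organization, status in ratings:
        by_claim[claim_id][organization] = status

    report = {}
    for claim_id, by_org in by_claim.items():
        agree = len(set(by_org.values())) == 1
        report[claim_id] = {
            "ratings": dict(sorted(by_org.items())),
            "agree": agree,
            # Only multi-source agreement is treated as citable consensus.
            "citable_as_consensus": agree and len(by_org) >= 2,
        }
    return report


# Made-up ratings for hypothetical claim ids.
rows = [
    ("c-001", "PolitiFact", "false"),
    ("c-001", "FactCheck.org", "false"),
    ("c-002", "PolitiFact", "half_true"),
    ("c-002", "AP Fact Check", "mostly_false"),
    ("c-003", "Snopes", "false"),
]
for claim_id, entry in agreement_report(rows).items():
    print(claim_id, entry)
```

Claims where `agree` is false are the ones to flag and document as fact-checker disagreements.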

Implementation Stack

Anthropic Claude for LLM-assisted claim extraction (with structured prompts; require evidence quotes for each claim)

Sentence-transformers for semantic similarity matching to existing fact-checks

Fact-check databases: Cached copies of PolitiFact, FactCheck.org, Washington Post Fact Checker, AP Fact Check

Human editorial review before publication of any claim-tracking output
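
The requirement that each extracted claim carry an evidence quote can be enforced mechanically before human review. The sketch below assumes the model is asked to return a JSON array of objects with `claim` and `evidence_quote` keys; that output format and the function name are assumptions, and the call to the model itself is omitted. Any candidate whose quote does not appear in the transcript is set aside.

```python
import json


def split_by_grounding(llm_json, transcript):
    """Separate candidates whose evidence quote occurs in the transcript.

    Matching ignores case and collapses whitespace, nothing more.
    """
    def norm(text):
        return " ".join(text.split()).lower()

    haystack = norm(transcript)
    grounded, rejected = [], []
    for item in json.loads(llm_json):
        quote = norm(item.get("evidence_quote", ""))
        if quote and quote in haystack:
            grounded.append(item)
        else:
            rejected.append(item)  # missing or unmatched quote
    return grounded, rejected


# Placeholder transcript built from this article's own example claims.
transcript = "Thank you all. Crime is up X% this year.\nI won by Y votes."
llm_json = json.dumps([
    {"claim": "Crime rose by X%", "evidence_quote": "Crime is up X% this year"},
    {"claim": "Taxes fell", "evidence_quote": "We cut taxes by Z%"},
    {"claim": "Won by Y votes"},
])
grounded, rejected = split_by_grounding(llm_json, transcript)
print(len(grounded), "grounded;", len(rejected), "rejected")
```

This check only confirms that the quote exists in the transcript. Whether the quote actually supports the extracted claim remains an editorial judgment.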

Limitations

Claim extraction is hard: Claims are often nested in subordinate clauses, quoted speech, or sarcasm. LLM extraction can miss subtle claims and over-extract ambiguous ones.

Fact-checker disagreement: Different fact-checkers may rate the same claim differently; flag and document.

Claim context matters: A literally-true claim can be misleading; a literally-false claim can be metaphorical. Editorial judgment is required.

Predictions and counterfactuals: Predictions cannot be fact-checked until their stated date has passed, and counterfactuals may never be strictly verifiable.

Volume limits: At Trump-era volume (roughly 25,000 or more rally events, statements, and posts), exhaustive claim tracking is infeasible. A sampling strategy is required.
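
One possible sampling strategy, offered as an illustration and not as the framework's prescribed method, is a stratified random sample: draw a fixed number of items from each stratum (for example, each source type) so that high-volume categories do not crowd out the rest. The function name, the choice of stratum, and the toy corpus are all assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(items, key, per_stratum, seed=0):
    """Draw up to per_stratum items at random from each stratum.

    A fixed seed keeps the sample reproducible for later audit.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)

    sample = []
    for name in sorted(strata):  # stable order across runs
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Toy corpus: 100 social posts, 20 rallies, 3 interviews (made-up counts).
corpus = (
    [{"id": f"post-{i}", "kind": "social_post"} for i in range(100)]
    + [{"id": f"rally-{i}", "kind": "rally"} for i in range(20)]
    + [{"id": f"int-{i}", "kind": "interview"} for i in range(3)]
)
picked = stratified_sample(corpus, key=lambda item: item["kind"], per_stratum=5)
print(len(picked), "items sampled")
```

Because strata are sampled at different rates, any corpus-wide estimate built from such a sample would need to be reweighted by stratum size.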

Related Articles

Corpus Construction & Pipeline

Speech Corpus & Rhetorical Analysis — Overview