PunditScore — Contribution Verification Methodology
Version 1.0 · 12 Jun 2026
Applies to every community-submitted pick. No submission appears on the verified
board until it has passed ALL stages. Stages 1–2 run instantly at intake;
stages 3–6 run in the verification agent (automated) with human review on flags.
Stage 1 — Intake validation (client + server, instant)
Required fields: person/panel/model name (3–60 chars), organisation (2–60),
pick (from official team list), source URL, article publication date.
URL rules: must be http(s); no URL shorteners or redirectors (bit.ly, t.co,
tinyurl, goo.gl, ow.ly, is.gd, buff.ly, rebrand.ly, lnkd.in) — destination must be visible; no IP-address hosts; no data:/javascript: schemes.
Length caps enforced server-side regardless of client.
Rate limit: max 3 submissions per visitor per day; after 50 submissions/day
globally, the queue pauses for review (flood control).
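The URL rules above can be sketched as a single server-side check. This is a minimal illustration, not the production validator: the function name `check_source_url` and the reason strings are hypothetical, and "tinyurl.com" is an assumed expansion of the bare "tinyurl" in the shortener list.

```python
import ipaddress
from urllib.parse import urlsplit

# Shortener/redirector hosts from the URL rules. "tinyurl.com" is an assumed
# expansion of the bare "tinyurl" in the list.
SHORTENER_HOSTS = {
    "bit.ly", "t.co", "tinyurl.com", "goo.gl", "ow.ly",
    "is.gd", "buff.ly", "rebrand.ly", "lnkd.in",
}


def check_source_url(url: str) -> tuple[bool, str]:
    """Return (ok, reason) for a submitted source URL."""
    try:
        parts = urlsplit(url.strip())
        host = (parts.hostname or "").lower()
    except ValueError:
        return False, "malformed URL"
    # Rejects data:, javascript: and every other non-web scheme.
    if parts.scheme.lower() not in ("http", "https"):
        return False, "scheme must be http or https"
    if not host:
        return False, "missing host"
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, so it is a hostname
    else:
        return False, "IP-address hosts are not allowed"
    if host.removeprefix("www.") in SHORTENER_HOSTS:
        return False, "URL shorteners are not allowed"
    return True, "ok"
```

For example, `check_source_url("https://bit.ly/abc")` fails on the shortener rule, while an ordinary article URL on a named host passes.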
Stage 2 — Content safety screen (instant, automated)
Profanity & slur filter across ALL free-text fields (name, organisation,
market, note) — multilingual blocklist + obfuscation patterns (f*ck, f.u.c.k). Hard reject; nothing abusive is ever stored in the shared queue.
Spam patterns: repeated URLs, promotional language ("bonus", "promo code",
"free bets"), contact details, all-caps shouting. Reject or hold.
Impersonation check: if the named person is a private individual rather than
a public expert/pundit/model, reject (the board lists public predictions only).
No submission is publicly displayed pre-verification. The public site shows
only an aggregate count of pending items. This removes the incentive to use the form as a graffiti wall.
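A rough sketch of how the blocklist and spam screens might work on one free-text field. Everything named here is an assumption: the placeholder blocklist entry, the separator set, the 8-letter threshold for all-caps shouting, and the choice to return "hold" rather than "reject" for spam patterns (the rules allow either).

```python
import re

BLOCKLIST = {"badword"}  # placeholder; the real multilingual list is not shown here
PROMO_PHRASES = ("bonus", "promo code", "free bets")
SEPARATORS = r"[\s._-]*"  # characters tolerated between letters (dotted spellings)


def obfuscation_pattern(word: str) -> re.Pattern:
    """Match `word` with optional separators between letters and '*'
    standing in for any letter after the first."""
    first, rest = word[0], word[1:]
    parts = [re.escape(first)] + [f"(?:{re.escape(c)}|\\*)" for c in rest]
    # Lookarounds keep the match from firing inside a longer innocent word.
    return re.compile(r"(?<![a-z])" + SEPARATORS.join(parts) + r"(?![a-z])")


BLOCK_PATTERNS = [obfuscation_pattern(w) for w in BLOCKLIST]


def screen_text(text: str) -> str:
    """Return 'reject', 'hold' or 'pass' for one free-text field."""
    lowered = text.lower()
    if any(p.search(lowered) for p in BLOCK_PATTERNS):
        return "reject"  # hard reject: nothing abusive is stored
    letters = [c for c in text if c.isalpha()]
    shouting = len(letters) >= 8 and all(c.isupper() for c in letters)
    if shouting or any(phrase in lowered for phrase in PROMO_PHRASES):
        return "hold"
    return "pass"
```

The letter-count floor keeps short acronyms in organisation names from being treated as shouting.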
Stage 3 — Source authenticity (agent, minutes)
Fetch the URL. Must resolve 200 (after redirects within the same registered
domain only).
Domain tiering:
TIER A (auto-trust domain): the ~360 outlets in our 48-country media database + official broadcasters list.
TIER B (known but unlisted): legitimate outlet not in our list → proceed, flag for human spot-check.
TIER C (unknown/blog/UGC platform): hold for human review.
Personal blogs, forums and social posts are accepted ONLY for predictions BY the poster themselves if they are a verifiable public expert (e.g. a pundit's own X/Instagram post), never as second-hand reports.
Snapshot: submit the URL to the Wayback Machine at verification time and
store the snapshot URL alongside the entry. The archived copy is the record; later edits to the article cannot alter what we verified.
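The tier lookup and the same-domain redirect rule could look roughly like this. The domain sets are hypothetical stand-ins for the media database, and the registered-domain helper is deliberately naive: it keeps the last two labels, which is wrong for suffixes such as co.uk, so a real implementation would need the Public Suffix List.

```python
from urllib.parse import urlsplit

TIER_A = {"example-outlet.com"}   # hypothetical stand-in for the media database
TIER_B = {"regional-paper.com"}   # hypothetical: outlets recognised but not listed


def naive_registered_domain(url: str) -> str:
    """Last two host labels. An approximation only (fails for co.uk etc.)."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return ".".join(host.split(".")[-2:])


def tier_for(url: str) -> str:
    """Return 'A', 'B' or 'C'; anything unrecognised falls to C (human review)."""
    domain = naive_registered_domain(url)
    if domain in TIER_A:
        return "A"
    if domain in TIER_B:
        return "B"
    return "C"


def redirects_stay_on_domain(chain: list[str]) -> bool:
    """True if every URL in the redirect chain shares one registered domain."""
    domains = {naive_registered_domain(u) for u in chain}
    return len(domains) == 1
```

Here `chain` is assumed to be the ordered list of URLs visited while following redirects, starting with the submitted URL.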
Stage 4 — Claim verification (agent, minutes)
The fetched page must contain BOTH the named person/model AND the picked
team, in a prediction context (an LLM extraction confirms "X predicts/picks/tips TEAM to win the 2026 World Cup" semantics, in the page's language).
Mismatch handling: if the page shows a DIFFERENT pick than submitted →
correct it to the page's pick (the source is the truth, not the submitter) and verify the corrected entry.
AI forecast attribution: if the prediction is model-generated, the entry is
listed under the PUBLISHER with the model named as instrument. Submissions attributing predictions to "ChatGPT/Grok/Claude" without a publisher are corrected or rejected.
Video sources: accepted when a timestamp (mm:ss) is supplied. The agent stores
the timestamp, and a human reviewer confirms the quote before the entry gets verified status.
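A small sketch of the mismatch rule and the video timestamp format. It assumes the LLM extraction step has already produced `extracted_pick` (or `None` when no prediction was found); the status labels and function names are hypothetical, and what happens to a "no_claim" result is left to Stage 6.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimResult:
    status: str            # "confirmed", "corrected" or "no_claim" (hypothetical labels)
    pick: Optional[str]    # the pick that will be carried forward


def reconcile_pick(submitted_pick: str, extracted_pick: Optional[str]) -> ClaimResult:
    """The source is the truth: a differing extracted pick replaces the submitted one."""
    if extracted_pick is None:
        return ClaimResult("no_claim", None)
    if submitted_pick.strip().casefold() == extracted_pick.strip().casefold():
        return ClaimResult("confirmed", extracted_pick)
    return ClaimResult("corrected", extracted_pick)


def parse_timestamp(value: str) -> Optional[int]:
    """Convert an mm:ss video timestamp to seconds, or None if malformed."""
    match = re.fullmatch(r"(\d{1,3}):([0-5]\d)", value.strip())
    if match is None:
        return None
    return int(match.group(1)) * 60 + int(match.group(2))
```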
Stage 5 — Date verification (anti-cheat, agent)
Extract the publication timestamp from, in priority order:
(a) JSON-LD datePublished, (b) article:published_time meta, (c) visible byline date, (d) earliest Wayback Machine capture.
Cross-check: the submitted publication date must match the extracted date
(±1 day tolerance for timezones). Mismatch → use extracted date, flag.
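The priority order and the ±1 day cross-check can be sketched as follows. The dictionary keys are hypothetical names for the four extraction methods, and the sketch assumes each method has already been run and returned either a date or `None`.

```python
from datetime import date
from typing import Optional

# Extraction methods in priority order (a) to (d); key names are hypothetical.
PRIORITY = ("json_ld", "meta_published_time", "byline", "wayback_earliest")


def pick_extracted_date(candidates: dict[str, Optional[date]]) -> Optional[tuple[str, date]]:
    """Return (method, date) for the highest-priority method that produced a date."""
    for method in PRIORITY:
        found = candidates.get(method)
        if found is not None:
            return method, found
    return None  # unextractable by any method


def dates_match(submitted: date, extracted: date) -> bool:
    """True if the submitted date is within one day of the extracted date."""
    return abs((submitted - extracted).days) <= 1
```

When `dates_match` returns `False`, the entry would carry the extracted date and a flag, per the cross-check rule.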
Cheating rules:
Extracted date in the FUTURE, or no date extractable by any method → REJECT.
Published BEFORE tournament kick-off (11 Jun 2026, 19:00 ET) → eligible
for the VERIFIED badge.
Published AFTER kick-off → listed only with a permanent POST-KICKOFF label
and excluded from pre-tournament consensus stats and the accuracy leaderboard (predictions made with results in hand score nothing).
Pages whose earliest Wayback capture is materially later than the claimed
publication date are held for human review (backdating suspicion).
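The cheating rules above reduce to a short decision function. Several details are assumptions rather than stated policy: the 7-day threshold for "materially later", the backdating hold taking precedence over the kick-off test, and a timestamp exactly at kick-off counting as post-kickoff. Kick-off is expressed in UTC (19:00 ET in June is 23:00 UTC, since Eastern time is UTC-4 then).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# 11 Jun 2026, 19:00 ET == 23:00 UTC (Eastern daylight time is UTC-4).
KICKOFF = datetime(2026, 6, 11, 23, 0, tzinfo=timezone.utc)
BACKDATE_LAG = timedelta(days=7)  # assumption: "materially later" is not quantified


def classify(published: Optional[datetime],
             first_capture: Optional[datetime],
             now: datetime) -> str:
    """Return REJECTED, HELD, VERIFIED or POST-KICKOFF for one entry."""
    if published is None or published > now:
        return "REJECTED"      # unextractable or future-dated
    if first_capture is not None and first_capture - published > BACKDATE_LAG:
        return "HELD"          # backdating suspicion, human review
    if published < KICKOFF:
        return "VERIFIED"
    return "POST-KICKOFF"
```

All datetimes are expected to be timezone-aware; mixing naive and aware values would raise a `TypeError` on comparison.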
Edit detection: on the nightly re-verification pass (link checker), if a
source page's stated pick has CHANGED since the stored snapshot, the entry reverts to the snapshot and the discrepancy is logged publicly on the entry ("source amended after publication").
Stage 6 — Decision & audit
Outcomes: VERIFIED (badge, joins board) · POST-KICKOFF (listed, labelled,
excluded from consensus stats and leaderboard) · HELD (human review) · REJECTED (reason logged, submitter sees generic decline).
Every decision writes an audit row: submission id, timestamps, fetch hash,
snapshot URL, extracted date, decision, reviewer (agent/human).
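One possible shape for the audit row, shown as a frozen dataclass. The field list follows the rule above, but the split of "timestamps" into `submitted_at` and `decided_at`, the use of ISO 8601 strings, and SHA-256 as the fetch hash are all assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRow:
    submission_id: str
    submitted_at: str      # ISO 8601, assumed
    decided_at: str        # ISO 8601, assumed
    fetch_hash: str
    snapshot_url: str
    extracted_date: str
    decision: str          # VERIFIED / POST-KICKOFF / HELD / REJECTED
    reviewer: str          # "agent" or "human"


def fetch_hash(body: bytes) -> str:
    """Hex digest of the fetched page body (SHA-256 is an assumed choice)."""
    return hashlib.sha256(body).hexdigest()


def to_json(row: AuditRow) -> str:
    """Serialise a row with stable key order for append-only logging."""
    return json.dumps(asdict(row), sort_keys=True)
```

Freezing the dataclass means a row cannot be mutated after it is built, which suits a write-once audit log.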
Duplicates: same (person, pick) keeps the EARLIEST verified source; later
submissions attach as corroborating links.
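The duplicate rule can be illustrated with a small merge over verified entries. The dictionary keys (`person`, `pick`, `url`, `published`) are hypothetical, and "earliest" is taken to mean earliest publication date; matching on case-folded names is also an assumption.

```python
from datetime import date


def merge_duplicates(entries: list[dict]) -> list[dict]:
    """Keep the earliest source per (person, pick); attach later URLs to it."""
    merged: dict[tuple[str, str], dict] = {}
    # Sorting by publication date means the first entry seen per key is the earliest.
    for entry in sorted(entries, key=lambda e: e["published"]):
        key = (entry["person"].casefold(), entry["pick"].casefold())
        if key not in merged:
            merged[key] = {**entry, "corroborating": []}
        else:
            merged[key]["corroborating"].append(entry["url"])
    return list(merged.values())
```

A different pick by the same person forms a separate key, so it stays as its own entry rather than being folded in.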
Disputes: a "report this entry" link on every card routes to the same
pipeline for re-verification against the snapshot.
Public-facing rules text (shown on the form)
"Every submission needs a link to where the prediction was published. We fetch the page, confirm the prediction and its publication date, and archive a copy. Predictions published after kick-off are labelled as such and don't count toward the leaderboard. Abusive or fabricated submissions are rejected. Submissions aren't shown publicly until verified."
