Conventions & Methodology
How this knowledge base is structured and sourced, and how credibility is rated. This page is the reference for anyone reading the repo or contributing to it. If you only read one thing, read The credibility framework.
The governing principle of the whole project: separate verified facts from unverified claims, and make the basis for every credibility judgment explicit. Testimony is recorded as testimony, primary documents as primary documents, and the gap between “someone said this” and “this is established” is never silently collapsed.
Directory structure
| Directory | What lives here | Analytical level |
|---|---|---|
| sources/ | Source-of-record files: one analytical writeup per person, program, document, event, or report. The curated, interpreted layer. | High (synthesis + judgment) |
| topics/ | Cross-cutting analyses that connect multiple sources: patterns, debates, timelines, the credibility framework itself. | High (synthesis across sources) |
| raw/ | Primary material, captured verbatim. Subdivided by medium (below). The evidentiary substrate that sources/ and topics/ cite. | Low (capture + light framing) |
| queries/ | Dated question-answer notes: a specific question worked through against the sources at a point in time (e.g. “how good was the AARO report?”). Named YYYY-MM-DD-slug.md. | Medium (focused reasoning) |
| scratch/ | Working notes, audits, in-progress lists not meant for the published site. | n/a |
| scripts/ | Triage tooling (e.g. the Reddit triage.py). | n/a |
raw/ subdirectories
| Subdir | Contents |
|---|---|
| raw/articles/ | News articles, blog posts, Wikipedia captures, web pages (extracted to markdown) |
| raw/transcripts/ | Podcast / YouTube / video transcripts (with timestamps where available) |
| raw/reports/ | Government reports, FOIA productions, hearing transcripts, declassified PDFs (+ extracted text) |
| raw/reddit/ | Reddit post + comment captures (JSON + analytical markdown), plus the triage DB |
| raw/papers/ | Academic / self-published papers (PDF + extracted text) |
| raw/extracts/ | Standalone primary extracts saved with the extract tools |
| raw/data/ | Datasets (CSV, structured data) |
| raw/media/ | Images and other media |
The three-layer model
The repo distinguishes three altitudes, and the distinction is load-bearing:
- raw/: what was said/written. Verbatim capture. A transcript, an article, a PDF’s text. No judgment beyond a header documenting provenance. When you source something new, the raw extract is saved here even if you also write it up in sources/; the raw layer is not optional.
- sources/: what a specific person/program/document is and how much weight it carries. One file per entity. Synthesizes the raw material about that entity and renders an explicit credibility judgment.
- topics/: what the sources mean together. Patterns across entities: the credibility framework, the 2017 watershed, the amnesty debate, the contactee tradition, etc.
A claim should be traceable downward: a topics/ assertion cites sources/ files, which cite raw/ primaries. Don’t let a sources/ or topics/ claim float without a raw/ anchor.
File header convention
Every sources/ file opens with a tags: frontmatter block (for entity-kind identification — see below) followed by a metadata block:
---
tags: [person]
---
# <Title>
- Type: <testimony | report | article | named-figure source-of-record | ...>
- Author / Subject: <who>
- Date: <ISO dates; incident vs. publication distinguished>
- Credibility: <rating — see framework below>
- <primary URLs, archive links, related wikilinks>
raw/ files open with a lighter header documenting provenance: source URL, date, author/outlet, extraction method (e.g. pymupdf4llm, Gemini CLI OCR, requests+readability), date sourced, and [[wikilinks]] to the analytical files that cite it.
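A minimal sketch of reading a sources/ header back out, assuming the layout shown above: a `tags:` frontmatter line written as an inline list, then `- Key: value` bullets after the title. The function name `parse_source_header` is hypothetical and is not part of the repo's tooling; bullets without a `Key: ` prefix (such as bare URLs) simply end the metadata block.

```python
import re

def parse_source_header(text: str) -> dict:
    """Extract tags and `- Key: value` metadata from a sources/ file header."""
    header = {"tags": [], "meta": {}}
    lines = text.splitlines()
    body_start = 0
    # Frontmatter: a leading '---' line, its content, then a closing '---'.
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = i + 1
                break
            m = re.match(r"tags:\s*\[(.*)\]\s*$", line.strip())
            if m:
                header["tags"] = [t.strip() for t in m.group(1).split(",") if t.strip()]
    # Metadata block: '- Key: value' bullets following the '# Title' line.
    for line in lines[body_start:]:
        stripped = line.strip()
        if not stripped or stripped.startswith("# "):
            continue
        m = re.match(r"-\s+([A-Za-z /]+):\s+(.*)$", stripped)
        if not m:
            break  # first line that is not a keyed bullet ends the header
        header["meta"][m.group(1).strip()] = m.group(2).strip()
    return header
```

Calling it on a file's text returns a dict with a `tags` list and a `meta` mapping, so `header["meta"]["Credibility"]` yields the raw rating string (for example `~35`).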
Entity-kind tags
Every sources/ file carries a tags: frontmatter line classifying what kind of entity it documents. Quartz auto-generates a browsable index page per tag (e.g. /tags/person lists every person). This is the canonical way to identify people (and every other kind) — both by browsing the live site and by grep "^tags:" sources/*.md.
Controlled vocabulary (one or more per file):
| Tag | Use for | Count |
|---|---|---|
| person | An individual whose claims/credibility are the subject | 18 |
| report | Government/official reports | 3 |
| document | Articles, compilations, leaked documents, papers | 3 |
| media | Films, video evidence, fiction | 3 |
| case | A specific sighting/incident | 3 |
| organization | An entity/archive/group | 2 |
| law | Legislation | 2 |
| event | A hearing or discrete happening | 2 |
| program | A government program | 1 |
A file may carry more than one when it genuinely spans kinds (e.g. [person, case] for Fravor/Nimitz, [person, organization] for Graves/ASA). When adding a new source, tag it before anything else — it is the entity’s primary classification.
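As an illustration of how the controlled vocabulary could be checked, here is a small sketch that counts tag usage and flags files with missing or unknown tags. It assumes the tags have already been read into a mapping of filename to tag list; `check_tags` and `ENTITY_KINDS` are hypothetical names, not existing repo code.

```python
from collections import Counter

# Controlled vocabulary copied from the table above.
ENTITY_KINDS = {"person", "report", "document", "media", "case",
                "organization", "law", "event", "program"}

def check_tags(tags_by_file: dict[str, list[str]]) -> tuple[Counter, dict[str, list[str]]]:
    """Count valid tag usage and report files with missing or unknown tags."""
    counts = Counter()
    problems = {}
    for name, tags in tags_by_file.items():
        unknown = [t for t in tags if t not in ENTITY_KINDS]
        if not tags:
            problems[name] = ["<no tags>"]
        elif unknown:
            problems[name] = unknown
        # Only vocabulary tags contribute to the per-kind counts.
        counts.update(t for t in tags if t in ENTITY_KINDS)
    return counts, problems
```

A file tagged `[person, case]` counts once under each kind, which is why the per-tag counts can sum to more than the number of files.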
The credibility framework
This is the spine of the project. The full role-grouped roster lives in community-credibility-assessment; the conventions for applying it are here.
The scale
Ratings are ~0–100, expressed with a tilde (~35) to signal they are judgments, not measurements. Rough bands:
| Band | Meaning | Example |
|---|---|---|
| ~70–85 | Credentialed insiders / operators making narrow, testable, institutionally costly claims | Gallaudet (~75), Mellon (~72) |
| ~50–70 | Real credentials or real access, but advocacy posture or unverified specifics | Grusch (~50), Coulthart (~45) |
| ~30–50 | Mixed: real background, but pattern of low-evidence or escalating claims | Elizondo (~35), Davis (~30) |
| ~10–30 | Discredited or fabulist, or claims with no falsification mechanism | Doty (~25), Greer (~10) |
| ~0–10 | Fabricated biography / fantasist | Schneider (~5) |
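The band lookup can be sketched as below. Because the bands in the table share their edges, treating each lower bound as inclusive is an assumption made here, as are the names `parse_rating` and `band_for`; the bands are rough, so this is a convenience for sorting, not a substitute for the written judgment.

```python
import re

# Lower bounds of the rough bands above, highest first. Inclusive lower
# bounds are an assumption; labels use ASCII hyphens for simplicity.
BANDS = [(70, "70-85"), (50, "50-70"), (30, "30-50"), (10, "10-30"), (0, "0-10")]

def parse_rating(text: str) -> int:
    """Read a rating written like '~35' or '35/100' as an integer 0-100."""
    m = re.match(r"\s*~?(\d{1,3})(?:/100)?\s*$", text)
    if not m or not 0 <= int(m.group(1)) <= 100:
        raise ValueError(f"not a rating: {text!r}")
    return int(m.group(1))

def band_for(rating: int) -> str:
    """Return the label of the first band whose lower bound the rating meets."""
    for lower, label in BANDS:
        if rating >= lower:
            return label
    raise ValueError(f"rating below 0: {rating}")
```

Ratings above 85 fall into the top band under this sketch, since the table does not define a higher one.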
The core principle
People making the narrowest, most testable, most institutionally costly claims are the most credible. People making the broadest, most narrative-shaped, most career-aligned claims are the least.
A claim’s evidentiary weight scales with the claimant’s willingness to substantiate it. Withholding (“I know things I can’t share”) is a credibility-deferring move, not a credibility-enhancing one.
Where the rating lives
The numerical rating lives in the entity’s own sources/ file (in the header’s Credibility: field and/or a ## Credibility section), and every person-tagged source page must also appear in the community-credibility-assessment roster, both as a roster entry and in the at-a-glance index. The roster is the full set of rated people, not a curated subset.
This invariant is enforced at build time. scripts/check-person-ratings.mjs (wired into npm run build, deploy, and check via check:content) scans every person-tagged file under content/sources/ and fails the build if any is not wikilinked from the roster. So a newly-added person page that hasn’t been rated-and-rostered will block deployment. (Roster entries that have no source page — e.g. Lacatski, Doty, politicians — are fine; the check only runs source-page → roster, never the reverse.)
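The one-directional rule can be illustrated in a few lines. This is a Python sketch of the invariant only, not the code of scripts/check-person-ratings.mjs; it assumes the person-tagged pages are already known by basename and that the roster links them with ordinary wikilinks. `unrostered` is a hypothetical name.

```python
import re

# Matches [[target]], [[target|display]] and [[target#anchor|display]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def unrostered(person_pages: list[str], roster_text: str) -> list[str]:
    """Return person pages (by basename) that the roster never wikilinks."""
    linked = {m.group(1).strip().split("/")[-1] for m in WIKILINK.finditer(roster_text)}
    # Source page -> roster only: roster entries without a page are ignored.
    return sorted(p for p in person_pages if p not in linked)
```

A non-empty result corresponds to the build-failing case: a person page exists but the roster does not link to it.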
Build-time content gates
Two checks run before every build/deploy (and in check), via npm run check:content:
- check:roster: the credibility-roster invariant above.
- check:links (scripts/check-links.mjs): fails the build on any broken internal wikilink. It resolves [[target]] by basename against all of content/, and excludes links inside code spans (illustrative examples), [[x]](url) markdown-link artifacts, Quartz-generated tags/* pages, and an explicit ALLOW list of intentional forward-links (pages deliberately not yet created; currently odni-annual-uap-report-2023 and blackvault-wipe-2026-02-23). To add a deliberate forward-link, add its basename to ALLOW; otherwise fix the link. This is what catches orphaned links like the ones left over from the infobase→ufopedia split.
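The exclusion rules for the link check can be sketched in Python as follows. This illustrates the described behavior and is not the contents of scripts/check-links.mjs; `broken_links` is a hypothetical name, and the set of known basenames and the allow set are passed in rather than read from disk.

```python
import re

# Wikilink, plus an optional '(' right after it to spot '[[x]](url)' artifacts.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\](\()?")
# Fenced blocks and inline code spans, whose links are illustrative only.
CODE_SPAN = re.compile(r"```.*?```|`[^`\n]*`", re.DOTALL)

def broken_links(text: str, known: set[str], allow: set[str]) -> list[str]:
    """List wikilink targets in `text` that resolve to no known basename."""
    prose = CODE_SPAN.sub("", text)
    broken = []
    for m in WIKILINK.finditer(prose):
        if m.group(2):                      # '[[x]](url)' markdown-link artifact
            continue
        target = m.group(1).strip()
        base = target.split("/")[-1]        # resolution is by basename
        if target.startswith("tags/") or base in known or base in allow:
            continue
        broken.append(target)
    return broken
```

An empty list means the page passes; anything returned either needs fixing or, if it is a deliberate forward-link, an entry in the allow set.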
The ## Credibility assessment section format
Match this structure (see davis-career-and-claims and buchanan-stargate-career-and-claims for worked examples):
- What raises X’s credibility — numbered list
- What lowers X’s credibility — numbered list
- Net assessment — the numerical rating + one-paragraph justification
- Position relative to other UAP figures — above/below comparisons to anchor the number
- Role-category placement — which category from community-credibility-assessment applies
Bimodal / component ratings
When credibility varies dramatically by claim, a single number is misleading — break it down by component. This is the strongly preferred treatment for split-track-record figures.
- 2026-04-14-bob-lazar-credibility-rating rates Lazar per-claim: Los Alamos employment 95/100, S-4 employment 60/100, hands-on exotic tech 50/100, extraterrestrial origin 15/100.
- buchanan-stargate-career-and-claims is bimodal: service record ~85, RV operational efficacy ~30, alien-base/UFO-piloting claims ~10, composite ~35.
The composite number is a convenience; the component breakdown is the honest representation.
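One way to hold a component breakdown is sketched below. The class name, the `needs_breakdown` helper, and its threshold of 30 points are assumptions for illustration; the composite is stored as a separately judged value because this page does not define a formula for it.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ComponentRating:
    subject: str
    components: dict = field(default_factory=dict)   # claim label -> rating
    composite: Optional[int] = None                  # judged separately, never computed

    def spread(self) -> int:
        """Gap between the strongest and weakest component."""
        if not self.components:
            return 0
        values = list(self.components.values())
        return max(values) - min(values)

    def needs_breakdown(self, threshold: int = 30) -> bool:
        """True when components diverge enough that one number would mislead."""
        return self.spread() >= threshold
```

With the Buchanan figures quoted above (~85, ~30, ~10), the spread is 75 points, which is the kind of gap a single composite hides.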
Ratings move
Ratings carry their history when they change: ~42, down from ~48, originally ~55 (Kirkpatrick). Record the direction and the trigger for the update.
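A small sketch of rendering that history string from a list of ratings, oldest first. `format_history` is a hypothetical helper; it assumes consecutive ratings differ, and it leaves the trigger for each change to the surrounding prose.

```python
def format_history(ratings: list[int]) -> str:
    """Render ratings (oldest first) as the current value plus its history."""
    current = ratings[-1]
    if len(ratings) == 1:
        return f"~{current}"
    previous = ratings[-2]
    direction = "down" if current < previous else "up"
    text = f"~{current}, {direction} from ~{previous}"
    if len(ratings) > 2:
        # Only the first and the immediately preceding values are shown.
        text += f", originally ~{ratings[0]}"
    return text
```

For example, `format_history([55, 48, 42])` returns `~42, down from ~48, originally ~55`, matching the Kirkpatrick form above.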
Sourcing workflow
- Resolve the primary. For Reddit share links, resolve to the canonical post; for videos, get channel + title + date; for paywalled/blocked pages, use the extract fallback chain (requests+readability → Playwright Firefox → Gemini CLI OCR for image-scanned PDFs → manual paste).
- Save the raw extract into the appropriate raw/ subdir, with a provenance header. Do this even when you also write an analytical file: the rule is that the raw version is saved with the extract tools, not just embedded inside the writeup.
- Write or update the analytical file in sources/ (entity) or augment a topics/ file (pattern).
- Cross-link both directions: the raw/ file links up to its analytical writeup; the analytical file links down to the primary.
- Flag followup items: list under-sourced threads explicitly (a ## Followup items section) rather than silently dropping them.
- Pull cited primaries. If a source references peer-reviewed work, pull the actual arXiv/journal primary, not just the secondary characterization.
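The extract fallback chain in the first step can be sketched as an ordered list of extractors tried in turn. The extractors themselves are stand-ins supplied by the caller; `extract_with_fallback` is a hypothetical name and does not describe the repo's actual extract tools.

```python
from typing import Callable, Optional

Extractor = Callable[[str], Optional[str]]

def extract_with_fallback(url: str, chain: list[tuple[str, Extractor]]) -> tuple[str, str]:
    """Try each extractor in order; return (method name, text) from the first that works."""
    errors = []
    for name, extractor in chain:
        try:
            text = extractor(url)
        except Exception as exc:      # a failed method falls through to the next
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all extraction methods failed: " + "; ".join(errors))
```

Returning the method name alongside the text is deliberate: it is the value that belongs in the raw file's provenance header as the extraction method. Manual paste sits outside the chain, as the step taken when this function raises.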
Reddit triage
Triage is a production workflow, not a classification workflow: when triaging posts, source the relevant followup items and update/add topics in the same session, rather than only marking posts reviewed. Status values: untriaged | reviewed | followup | sourced.
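The four status values map naturally onto an enumeration. The sketch below assumes statuses are available as a mapping of post id to status string, which may not match the real triage DB schema; `TriageStatus` and `open_followups` are hypothetical names.

```python
from enum import Enum

class TriageStatus(str, Enum):
    UNTRIAGED = "untriaged"
    REVIEWED = "reviewed"
    FOLLOWUP = "followup"
    SOURCED = "sourced"

def open_followups(statuses: dict[str, str]) -> list[str]:
    """Post ids still marked 'followup': flagged for sourcing but not yet sourced."""
    # TriageStatus(s) raises ValueError on any value outside the four above.
    return sorted(pid for pid, s in statuses.items()
                  if TriageStatus(s) is TriageStatus.FOLLOWUP)
```

Under the production-workflow rule, the list this returns is the work a triage session is expected to clear rather than leave behind.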
Wikilinks
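- Internal links use [[path/to/file]] or [[path/to/file|display text]] (Quartz/Obsidian style, no .md).
- A [[link]] to a not-yet-created file is acceptable as a deliberate forward-link that marks something worth writing, but its basename must be added to the ALLOW list in scripts/check-links.mjs; otherwise check:links fails the build.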
Link the first occurrence, once (Wikipedia model)
Follow Wikipedia’s linking discipline (MOS:DUPLINK + MOS:OVERLINK + WP:SEEALSO):
- First-occurrence inline linking. Link the first prose mention of a relevantly-related entity (any figure/program/event/document with a page), inline, in the body — even if the current page isn’t about it. Subsequent mentions of the same entity stay plain text. One link per entity per page (front-matter, headers, and a final Related list don’t count toward the prose link).
- Relevant, not just central. Link an entity that’s genuinely relevant in context, not only the page’s main subject — but not loose/trivial co-mentions (a name dropped purely for contrast). “Discussed here” → link; “named in passing as a contrast” → skip.
- Don’t over-link. No linking the same entity on every mention; no linking trivially-related items.
- Related / “See also” is curated and non-duplicative. The trailing ## Related list is for pages not already linked inline in the body (primaries the page cites, sibling pages worth surfacing). If an entity is linked inline, it should not also appear in Related.
In short: inline-link the first relevant mention once; keep Related for what the prose didn’t already reach. Avoid the surname-collision trap when applying this (e.g. Harry Reid vs. Garry Reid; Eric Davis vs. other Davises) — match the specific person, not the bare surname.
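The mechanical part of this rule (link the first mention, leave the rest plain, match full names rather than surnames) can be sketched as below. `link_first_mentions` and the page targets in the usage example are hypothetical; the relevance judgment described above still has to be made by the author when choosing which names to pass in.

```python
import re

def link_first_mentions(text: str, pages: dict[str, str]) -> str:
    """Wikilink the first mention of each full name; leave later mentions plain."""
    for name, target in pages.items():
        if f"[[{target}" in text:
            continue  # already linked once on this page
        # Whole-name match, so 'Harry Reid' never matches 'Garry Reid'.
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        text = pattern.sub(lambda m, t=target: f"[[{t}|{m.group(0)}]]", text, count=1)
    return text
```

For instance, given the mapping `{"Harry Reid": "harry-reid", "Garry Reid": "garry-reid"}`, the first mention of each name becomes a link and a second "Harry Reid" later in the text stays plain.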
Two homepages
- index.md (lowercase) is the Quartz site homepage (has the title: front-matter). Keep its curated entry-point links current.
- INDEX.md (uppercase) is the full organized index of sources by category, linked from the homepage as “the full index.”
What this base does NOT do
- It does not collapse testimony into fact.
- It does not present a single credibility number where the claim-by-claim reality is bimodal.
- It does not treat aggregation of testimony as physical evidence (cf. the Age of Disclosure “34 named officials” analysis in community-credibility-assessment).
- It does not silently truncate coverage — if something is partially sourced or a gap remains, that is stated (see the Gaps section of INDEX).
Related
- INDEX — the full organized index
- community-credibility-assessment — the credibility roster
- the-evidence-question — what would actually count as evidence