Research Skill Workflow

This is the canonical operator runbook for the experimental page-content research workflow that landed in docs-v2-dev. Use this page for:

operational usage
source-of-truth boundaries
readiness status
maintenance and improvement workflow

Do not use rollout reports or tests/README.md as the canonical narrative source.

What Is Canonical

Canonical sources:

skill behavior: ai-tools/ai-skills/templates/*.template.md
fact storage: workspace/research/claims/
adjudication ledger: workspace/research/adjudication/page-content-research-outcomes.json
registry validation: operations/scripts/validators/content/veracity/docs-fact-registry.js
manual fact runner: operations/scripts/audits/content/veracity/docs-page-research.js
PR advisory runner: operations/scripts/dispatch/content/veracity/docs-page-research-pr-report.js
packet runner: operations/scripts/dispatch/content/veracity/docs-research-packet.js
adjudication workflow: operations/scripts/audits/content/veracity/docs-research-adjudication.js
research-to-plan template: docs-guide/tooling/research-to-implementation-plan-template.md
local/manual PR prep integration: operations/scripts/dispatch/ai/codex/create-codex-pr.js --advisory-research
packet planning template: docs-guide/tooling/research-review-packet-plan-template.md
forward plan: workspace/plan/future/page-content-research-trust-roadmap.md

Generated or derived:

local installed Codex skills under $CODEX_HOME/skills
saved advisory reports and validation artifacts

Historical only:

rollout evidence and exploratory pilots under workspace/plan/repo-ops-reports/

Legacy:

route-centric claim-ledger advisory helpers are retained only as legacy comparison tooling and are not the active PR advisory path

Workflow Model

The workflow is claim-led rather than route-led. It is responsible for:

extracting material factual claims
checking evidence sources
detecting contradictions across related pages
classifying claims by confidence and freshness risk
producing a propagation queue for dependent pages

It is not responsible for:

MDX syntax validation
style-guide compliance
link and import integrity
generic navigation cleanup

Route or ownership issues only belong here when they change factual ownership or contradiction resolution.

Readiness Status

Current status:

Codex-ready: yes
Cross-agent-ready: portable with minor work
Operating mode: experimental and advisory-first

Interpretation:

Codex skills can sync from the canonical template bundle immediately.
The public and internal docs now explain how to use the workflow.
Cross-agent portability exists structurally, but broader packaging and operator guidance still need hardening before claiming equal readiness across all agents.

Operator Commands

Validate the registry:

node operations/scripts/validators/content/veracity/docs-fact-registry.js --validate --registry workspace/research/claims

Run a single-page pass:

node operations/scripts/audits/content/veracity/docs-page-research.js \
  --page v2/orchestrators/guides/deployment-details/setup-options.mdx \
  --report-md /tmp/docs-page-research.md \
  --report-json /tmp/docs-page-research.json

Run a cluster review:

node operations/scripts/audits/content/veracity/docs-page-research.js \
  --files v2/orchestrators/guides/deployment-details/setup-options.mdx,v2/orchestrators/setup/prepare.mdx,v2/orchestrators/guides/operator-considerations/business-case.mdx \
  --report-md /tmp/docs-page-research-cluster.md \
  --report-json /tmp/docs-page-research-cluster.json

Run a nav-based research packet:

node operations/scripts/dispatch/content/veracity/docs-research-packet.js \
  --nav tools/config/scoped-navigation/docs-gate-work.json \
  --version v2 \
  --language en \
  --tab Orchestrators \
  --group Guides \
  --out workspace/reports/orchestrator-guides-review/research-guides-review

Run a files and folders packet:

node operations/scripts/dispatch/content/veracity/docs-research-packet.js \
  --folders v2/gateways/guides/payments-and-pricing,v2/gateways/guides/monitoring-and-tooling \
  --files v2/gateways/guides/support-and-operations/funding-and-support.mdx \
  --split-by dir \
  --out workspace/reports/gateway-guides-review/research-guides-review

Run a manifest-defined packet:

node operations/scripts/dispatch/content/veracity/docs-research-packet.js \
  --manifest workspace/reports/repo-ops/research-packet-manifest.json \
  --out workspace/reports/repo-ops/research-packet

Run the PR advisory helper:

node operations/scripts/dispatch/content/veracity/docs-page-research-pr-report.js \
  --files v2/orchestrators/guides/deployment-details/setup-options.mdx,v2/orchestrators/setup/prepare.mdx,v2/orchestrators/guides/operator-considerations/business-case.mdx \
  --report-md /tmp/page-content-research-pr.md \
  --report-json /tmp/page-content-research-pr.json

Run PR prep with advisory research:

node operations/scripts/dispatch/ai/codex/create-codex-pr.js \
  --advisory-research \
  --changed-files v2/orchestrators/guides/deployment-details/setup-options.mdx,v2/orchestrators/setup/prepare.mdx,v2/orchestrators/guides/operator-considerations/business-case.mdx

Validate the adjudication ledger:

node operations/scripts/audits/content/veracity/docs-research-adjudication.js \
  --validate \
  --ledger workspace/research/adjudication/page-content-research-outcomes.json

Record one adjudicated outcome from a report artifact:

node operations/scripts/audits/content/veracity/docs-research-adjudication.js \
  --record \
  --ledger workspace/research/adjudication/page-content-research-outcomes.json \
  --report-json workspace/reports/repo-ops/2026-03-16-page-content-research-pilot-gateway-trust-hardening.json \
  --reviewer codex \
  --claim-id gw-startup-program-current \
  --human-verdict time-sensitive \
  --outcome-class true_positive \
  --cause-tag wording_only_conflict \
  --action "keep advisory and continue current verification"

Record a missing-coverage outcome that was not detected in the report:

node operations/scripts/audits/content/veracity/docs-research-adjudication.js \
  --record \
  --ledger workspace/research/adjudication/page-content-research-outcomes.json \
  --report-json workspace/reports/repo-ops/2026-03-16-page-content-research-pilot-gateway-trust-hardening.json \
  --reviewer codex \
  --claim-family gateway-support-contact-channel \
  --human-verdict time-sensitive \
  --outcome-class false_negative \
  --cause-tag missing_coverage \
  --action "expand claim-family coverage for the support contact channel"

Write a trust summary from adjudicated outcomes:

node operations/scripts/audits/content/veracity/docs-research-adjudication.js \
  --summary \
  --ledger workspace/research/adjudication/page-content-research-outcomes.json \
  --report-md /tmp/page-content-research-adjudication.md \
  --report-out-json /tmp/page-content-research-adjudication.json

Use packet mode when:

the request covers a full nav section or several logical tranches
the findings need reusable packet artifacts for later fix execution
you want page-run and PR-run views preserved together across a larger scope

Use a single page or cluster run when:

the request is limited to one page or one tight claim family
a packet root would add more operational overhead than value
the next action is immediate page editing rather than section-wide reporting

Use the research-to-plan handoff when:

the research output is complete but the fixes span content, registry, and runner behavior
another agent needs a decision-complete implementation plan before execution
the next task is sequencing, not more source verification

The dedicated follow-on skill is docs-research-to-implementation-plan. It consumes page reports, PR advisory reports, or research packets and turns them into a planning-only implementation artifact.
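The mode choices in this section can be read as a small decision procedure. The sketch below is illustrative only: the request fields (`pages`, `coversNavSection`, `needsPacketArtifacts`, `researchComplete`, `fixesSpanSurfaces`) are hypothetical names invented for this example and are not read by any runner.

```javascript
// Hypothetical request shape; these field names are illustrative and are not
// part of any runner's interface.
function chooseRunMode(request) {
  const {
    pages = [],
    coversNavSection = false,
    needsPacketArtifacts = false,
    researchComplete = false,
    fixesSpanSurfaces = false,
  } = request;

  // Research is finished and the remaining work is sequencing fixes across
  // content, registry, and runner behavior: hand off to planning.
  if (researchComplete && fixesSpanSurfaces) return 'research-to-plan';

  // A full nav section or a need for reusable packet artifacts justifies a packet root.
  if (coversNavSection || needsPacketArtifacts) return 'packet';

  // Otherwise stay small: one page, or one tight cluster of pages.
  return pages.length > 1 ? 'cluster' : 'single-page';
}

console.log(chooseRunMode({ pages: ['v2/orchestrators/setup/prepare.mdx'] }));
```

The handoff check comes first on the assumption that a completed research pass should not be re-run just because its scope was large.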

Expected Outputs

Every substantive run should surface some combination of:

verified claims
conflicted claims
time-sensitive claims
unresolved or historical-only claims
cross-page contradictions
propagation queue items
explicit evidence sources
trust summary counts

Operators should prefer conservative interpretation:

if evidence is weak, treat the claim as unresolved
if wording is stronger than the evidence, downgrade the wording
if the same claim appears elsewhere, queue propagation work instead of fixing one page in isolation
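The three conservative rules above can be sketched as a single function. This is a minimal illustration, not the runner's logic: the claim shape, the ordinal strength scores, and the `WEAK_EVIDENCE` cutoff are all assumptions made for the example.

```javascript
// Hypothetical claim shape: { id, evidenceStrength, wordingStrength, otherPages }.
// Strengths are illustrative ordinal scores (0 = none, 3 = strong).
const WEAK_EVIDENCE = 1; // assumed cutoff, not a value taken from the runner

function interpretConservatively(claim) {
  const actions = [];
  // Rule 1: weak evidence means the claim stays unresolved.
  if (claim.evidenceStrength <= WEAK_EVIDENCE) {
    actions.push('treat-as-unresolved');
  }
  // Rule 2: wording that outruns its evidence gets downgraded.
  if (claim.wordingStrength > claim.evidenceStrength) {
    actions.push('downgrade-wording');
  }
  // Rule 3: a claim repeated elsewhere needs propagation, not a one-page fix.
  if (claim.otherPages.length > 0) {
    actions.push('queue-propagation');
  }
  return actions;
}

console.log(interpretConservatively({
  id: 'example-claim',
  evidenceStrength: 1,
  wordingStrength: 3,
  otherPages: ['v2/orchestrators/setup/prepare.mdx'],
}));
```

The rules are independent, so one claim can trigger all three actions at once.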

Discovery Boundaries

The runner can now discover supporting evidence beyond explicit evidence_refs, but the ranking stays strict:

active repo files and official pages remain the highest default sources for current-state claims
v1/** is a historical lineage lane, not a silent current-state override
_contextData/**, _plans-and-research/**, _workspace/research/**, and v2/x-archived/** are context lanes only
GitHub discovery is strongest for implementation-status and support-status families
DeepWiki is corroboration only and should not become primary evidence for current product truth
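As a rough illustration of the lane boundaries above, the sketch below sorts repo paths into lanes by prefix. The prefixes come from the list above; the lane names, the numeric ranks, and the `classifyLane` helper are assumptions for this example and do not describe how the runner ranks evidence internally.

```javascript
// Context-only prefixes, copied from the discovery boundaries listed above.
const CONTEXT_PREFIXES = [
  '_contextData/',
  '_plans-and-research/',
  '_workspace/research/',
  'v2/x-archived/',
];

// Illustrative lane names and ranks (lower rank = stronger default source).
function classifyLane(repoPath) {
  if (CONTEXT_PREFIXES.some((prefix) => repoPath.startsWith(prefix))) {
    return { lane: 'context', rank: 3 };
  }
  if (repoPath.startsWith('v1/')) {
    return { lane: 'historical', rank: 2 };
  }
  return { lane: 'active', rank: 1 };
}

const candidates = [
  'v2/x-archived/old-setup.mdx',
  'v1/orchestrators/setup.mdx',
  'v2/orchestrators/setup/prepare.mdx',
];

// Order candidate evidence so active files are considered first.
const ranked = [...candidates].sort(
  (a, b) => classifyLane(a).rank - classifyLane(b).rank
);
console.log(ranked);
```

The context check runs before the active fallback so that `v2/x-archived/**` is never mistaken for an active v2 page.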

Trust Summary

Each report now includes a compact trust summary with:

unresolved_claims: how many tracked claims still lack strong enough evidence
contradiction_groups: how many factual collisions the run found across reviewed pages
evidence_sources: how many evidence records were actually checked
explicit_page_targets: how many propagation targets came from explicit registry ownership or dependencies
inferred_page_targets: how many propagation targets came from IA/path inference

Interpretation:

higher contradiction_groups usually means a real review problem, not report noise
higher unresolved_claims means the registry or evidence adapters still need work before trusting wording changes
higher inferred_page_targets is acceptable when path inference is covering current siblings, but it should not dominate stable high-confidence families
low evidence_sources on a broad review usually means claim-family coverage or source mapping is still too thin

The trust summary is a proxy only. Trust-promotion decisions should come from adjudicated review outcomes, not from raw report counts alone.
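A reviewer-side reading of the summary might look like the sketch below. The field names match the trust summary fields listed above. The comparison rules (any non-zero count, inferred targets outnumbering explicit ones, fewer evidence sources than pages reviewed) are assumed thresholds for illustration. The `pagesReviewed` argument is also hypothetical. The notes are prompts for review, not trust decisions.

```javascript
// Turns raw trust summary counts into reviewer prompts.
// All thresholds here are assumptions, not values used by the runner.
function reviewTrustSummary(summary, pagesReviewed) {
  const notes = [];
  if (summary.contradiction_groups > 0) {
    notes.push('review contradiction groups as real findings');
  }
  if (summary.unresolved_claims > 0) {
    notes.push('hold wording changes until evidence improves');
  }
  if (summary.inferred_page_targets > summary.explicit_page_targets) {
    notes.push('propagation is dominated by path inference');
  }
  if (summary.evidence_sources < pagesReviewed) {
    notes.push('evidence coverage looks thin for this scope');
  }
  return notes;
}

console.log(reviewTrustSummary({
  unresolved_claims: 2,
  contradiction_groups: 0,
  evidence_sources: 1,
  explicit_page_targets: 1,
  inferred_page_targets: 4,
}, 3));
```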

Source-of-Truth Boundaries

Use this split consistently:

public contributor usage: v2/resources/documentation-guide/
internal operator workflow: this runbook in docs-guide/frameworks/
rollout/adoption record: workspace/plan/repo-ops-reports/
future hardening plan: workspace/plan/future/
executable behavior: scripts, templates, tests, and claim registries

If these sources disagree:

scripts and tests define runtime behavior
template bundles define skill behavior
this runbook defines operator workflow and readiness
public documentation guide pages summarize contributor usage
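The precedence above amounts to a lookup from the kind of question to the source that answers it. The sketch below shows that lookup. The topic keys (`runtime-behavior` and so on) are labels invented for this example.

```javascript
// Topic keys are illustrative; the authorities are the ones listed above.
const AUTHORITY = new Map([
  ['runtime-behavior', 'scripts and tests'],
  ['skill-behavior', 'template bundles'],
  ['operator-workflow', 'this runbook'],
  ['contributor-usage', 'public documentation guide pages'],
]);

function authorityFor(topic) {
  if (!AUTHORITY.has(topic)) {
    throw new Error(`no authority defined for topic: ${topic}`);
  }
  return AUTHORITY.get(topic);
}

console.log(authorityFor('runtime-behavior'));
```

Throwing on an unknown topic keeps a disagreement outside these four categories from being silently resolved in favor of any one source.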

Maintenance Workflow

When improving the research skill:

expand claim-family coverage in workspace/research/claims/
improve evidence matching and classification logic in the runner
validate on real orchestrator and gateway page clusters
run PR advisory on tracked factual docs pages
update this runbook when the operator contract changes

Operator Review Rubric

Use this rubric when deciding whether a run was useful enough to trust:

useful:
- primary evidence is current and from the right source class
- contradiction groups are concrete and explainable
- propagation queue points at pages that really repeat or depend on the claim
noisy:
- weak sources outrank stronger official or GitHub evidence
- contradiction groups collapse unrelated wording into one family
- propagation is mostly speculative sibling fan-out
expand a claim family when:
- the same fact keeps recurring across active pages
- reviewers repeatedly need to verify the same current-state claim manually
- source classes and canonical ownership are clear enough to defend
narrow a claim family when:
- wording overlap keeps producing false contradictions
- the claim is really style guidance, not factual truth
- evidence quality is too weak to classify reliably

Adjudication Workflow

Adjudicate runs when:

a report is used to make or block a real content decision
a contradiction group looks noisy or unexpectedly broad
a reviewer had to manually rediscover facts that should have been tracked
a gateway status claim is being considered for stronger PR-time trust

Classify outcomes like this:

true_positive: the report surfaced a real issue or useful current-state warning
false_positive: the report surfaced a claim family that was not actually useful or was misleadingly grouped
false_negative: the reviewer had to manually verify a factual claim that the system failed to track
needs_split: one family is collapsing multiple concepts and needs to be divided
needs_narrowing: the family exists but its matching or propagation logic is too broad
needs_more_sources: the family is valid but current source coverage is too weak

Treat these as the default family-status interpretations:

stable: repeated adjudications show the family is useful and low-noise
advisory-only: keep reporting, but do not move toward stronger PR behavior yet
needs-split: separate mixed concepts before trusting the family further
needs-narrowing: reduce matching or inference breadth before trusting the family further
needs-more-sources: expand or improve source coverage before trusting the family further
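One way to picture how outcome classes roll up into a family status is the sketch below. It takes a plain list of outcome classes for one family and is not tied to the ledger's schema. The `MIN_REPEATS` threshold and the precedence among the `needs_*` outcomes are assumptions for illustration.

```javascript
const MIN_REPEATS = 3; // assumed number of clean adjudications before "stable"

// Suggests a family status from a list of adjudicated outcome classes.
function suggestFamilyStatus(outcomes) {
  const counts = new Map();
  for (const outcome of outcomes) {
    counts.set(outcome, (counts.get(outcome) || 0) + 1);
  }
  const seen = (name) => counts.get(name) || 0;

  // Structural problems take precedence over any positive signal (assumed order).
  if (seen('needs_split') > 0) return 'needs-split';
  if (seen('needs_narrowing') > 0) return 'needs-narrowing';
  if (seen('needs_more_sources') > 0) return 'needs-more-sources';

  // Repeated useful results with no noise or misses suggest a stable family.
  const clean = seen('false_positive') === 0 && seen('false_negative') === 0;
  if (clean && seen('true_positive') >= MIN_REPEATS) return 'stable';

  return 'advisory-only';
}

console.log(suggestFamilyStatus(['true_positive', 'true_positive', 'needs_split']));
```

The result is a suggestion for a reviewer to confirm, in keeping with the advisory-first operating mode.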

Trust Tiers

Trust tiers are metadata only in the current phase:

experimental: not enough adjudicated evidence yet
advisory: usable, but still too noisy or under-evidenced for stronger handling
advisory-high-confidence: a narrow family with low noise and strong current source fit
not-eligible: outside the current trust-candidate slice

Current trust-candidate slice:

clearinghouse-public-readiness
remote-signer-current-scope
programme-availability
community-signer-testing-surface
gateway-support-contact-channel
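The relationship between the tiers and the candidate slice can be sketched as a guard. The family names and tier names are the ones listed above. The `effectiveTier` helper is hypothetical and assumes that any family outside the slice resolves to `not-eligible`, whatever tier is proposed for it.

```javascript
// Current trust-candidate slice, as listed above.
const TRUST_CANDIDATES = new Set([
  'clearinghouse-public-readiness',
  'remote-signer-current-scope',
  'programme-availability',
  'community-signer-testing-surface',
  'gateway-support-contact-channel',
]);

const TIERS = ['experimental', 'advisory', 'advisory-high-confidence', 'not-eligible'];

// Hypothetical guard: only families in the slice may carry a proposed tier.
function effectiveTier(family, proposedTier) {
  if (!TIERS.includes(proposedTier)) {
    throw new Error(`unknown trust tier: ${proposedTier}`);
  }
  return TRUST_CANDIDATES.has(family) ? proposedTier : 'not-eligible';
}

console.log(effectiveTier('remote-signer-current-scope', 'advisory'));
```

Because tiers are metadata only in the current phase, a guard like this would label reports rather than change PR behavior.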

Do not treat any other family as eligible for stronger PR-time trust until adjudicated outcomes say otherwise.

Do not:

widen the workflow back into generic navigation QA
let tests/README.md become the primary narrative home again
treat exploratory reports as canonical instructions

Related Docs

Public contributor page: /v2/resources/documentation-guide/research-and-fact-checking
AI tools index: /docs-guide/tooling/ai-tools
Source of truth policy: /docs-guide/policies/source-of-truth-policy
Trust roadmap: workspace/plan/future/page-content-research-trust-roadmap.md
