
Field Notes — Automation Spec Validation · April 10, 2026

Validation and governance pass over AUTOMATION_SPEC.md v1.0. Open questions resolved. Preamble delimiter cascade empirically tested across 7 models. Divergence Testing identified as priority automation target.


Prepared by DeepSeek · Reviewed by KC Hoye · Atlas Heritage Systems


Session Purpose

Validation and governance pass over AUTOMATION_SPEC.md v1.0. Resolve open questions. Empirically test preamble delimiter cascade. Prepare rulings for Index/Procedure and Pipeline handoff.


Decisions Locked

OQ-C: self_ref_output / framework_adoption
Ruling: LOG-ONLY
Rationale: False positive risk. Pre-fill risks unreviewed Y/N entering record. Human must consciously confirm via log review.

OQ-D: structural_complexity
Ruling: AUTO-CLASSIFY
Rationale: Pattern detection reliable for structural features. NOTE in log flags classification method. Acceptable error rate for v1.0.

OQ-01: Gemini grounding citations in word counts
Ruling: INCLUDE
Rationale: Simplicity. Distortion consistent and noted in analysis. Stripping rule fragile.

OQ-06: Hedging phrase list
Ruling: APPROVED AS REVISED (45 phrases)
Rationale: Configurable list. Can extend without spec change. Single count for v1.0; granularity deferred.
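A minimal sketch of the single-count behavior ruled in OQ-06, assuming case-insensitive whole-phrase matching. The phrase list and the function name count_hedges are hypothetical placeholders; the approved 45-phrase list is not reproduced in these notes.

```python
import re

# Hypothetical stand-in for the approved 45-phrase list (not reproduced here).
HEDGING_PHRASES = ["it seems", "perhaps", "might be"]


def count_hedges(text: str, phrases: list[str] = HEDGING_PHRASES) -> int:
    """Return one total count of hedging phrases (no per-phrase breakdown)."""
    lowered = text.lower()
    total = 0
    for phrase in phrases:
        # Word boundaries keep "perhaps" from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        total += len(re.findall(pattern, lowered))
    return total


sample = "Perhaps it seems fine; it might be wrong, perhaps."
hedge_total = count_hedges(sample)
```

Passing the list as a parameter mirrors the ruling that the list is configurable and can grow without a spec change.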

OQ-07: Preamble delimiter cascade
Ruling: VALIDATED, LOCK AS-IS
Tested on: Grok, GPT (2 runs), Claude (2 runs), Skywork, Perplexity, Gemini
Result: All four rules passed empirical validation.
Limitation: Rule 3 requires a paragraph break between acknowledgment and substance. A same-paragraph acknowledgment yields an undercount (safe failure). Documented in spec NOTE.

OQ-09: Pair ID format
Ruling: MANUAL ENTRY with suggested pattern
Pattern: [test name] [pair #] [pair funct]
Example: BSA 001 foil

OQ-10: Read ID format
Ruling: MANUAL ENTRY with suggested pattern
Pattern: TR-[test name] [pair id] [model] [MMDDYY]
Example: TR-BSA 001 foil Claude 041026
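The suggested patterns for OQ-09 and OQ-10 could be offered to the technician as a pre-typed suggestion that remains editable. The sketch below assumes the pair number is zero-padded to three digits (inferred from the "001" example) and that the pair ID already carries the test name; the function names are hypothetical.

```python
from datetime import date


def suggest_pair_id(test_name: str, pair_number: int, pair_function: str) -> str:
    """Build '[test name] [pair #] [pair funct]', e.g. 'BSA 001 foil'."""
    # Three-digit zero padding is an assumption based on the example.
    return f"{test_name} {pair_number:03d} {pair_function}"


def suggest_read_id(pair_id: str, model: str, run_date: date) -> str:
    """Build 'TR-[pair id] [model] [MMDDYY]' from an existing pair ID."""
    return f"TR-{pair_id} {model} {run_date.strftime('%m%d%y')}"


pair = suggest_pair_id("BSA", 1, "foil")
read = suggest_read_id(pair, "Claude", date(2026, 4, 10))
```

Because both rulings are MANUAL ENTRY, a helper like this would only propose a value; the technician still types or confirms the final ID.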

OQ-11: Two-pass invocation for Migration?
Ruling: MANUAL ENTRY ONLY
Rationale: One-pass workflow. Technician compares Cold Baseline and Observed Quadrant and types Y/N. Lower friction.

OQ-02, OQ-03: Grok API, DeepSeek reasoning_content
Ruling: DEFER
Rationale: Not in current operational scope. Native access only.

OQ-04, OQ-05: GG-CSAP, PyHessian schema
Ruling: AWAIT SCHEMA_REFERENCE_v3.0
Rationale: Index/Procedure producing expanded schema. Pipeline to extract draft definitions from existing HTML forms.

OQ-08: output_ratio thresholds for quadrant suggestion
Ruling: KEEP PROVISIONAL
Rationale: Placeholder (verbose > 2.5). Calibrate after 10+ runs per model.


Preamble Cascade Validation — Empirical Results

Models tested: Grok 4.0 Pro, GPT (2 runs), Claude (2 runs), Skywork Agent, Perplexity, Gemini

Rule 1: explicit marker
Tested on: GPT run 2, Claude run 2
Marker: "Similarity score: 0.4"
Result: PASS (detected, cut correctly)

Rule 2: first numerical value
Tested on: Grok, GPT run 1, Skywork, Claude run 1
Result: PASS (score-first outputs, preamble=0 correctly assigned)

Rule 3: acknowledgment + paragraph break
Tested on: Gemini
Output: "That is a deep, 'late-night-coffee' kind of question. [break]"
Result: PASS (preamble=25 words correctly extracted)

Rule 4: no preamble detected
Tested on: Perplexity
Result: PASS (direct thesis statement, preamble=0 correctly assigned)

Known limitation: Rule 3 requires an explicit paragraph break. A same-paragraph acknowledgment is not detected, so the preamble is undercounted. This is a safe failure mode and is documented in the spec NOTE.
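The four-rule cascade can be sketched as a first-match-wins function. This is an illustrative reading of the notes, not the locked spec: the marker regex, the list of acknowledgment openers, the interpretation of Rule 2 as "output opens with a number", and the name detect_preamble are all assumptions.

```python
import re

# Assumed marker pattern, modeled on the "Similarity score: 0.4" example.
MARKER_RE = re.compile(r"similarity score\s*:", re.IGNORECASE)
# Assumed reading of Rule 2: the output opens with a numerical value.
LEADING_NUMBER_RE = re.compile(r"\s*\d+(?:\.\d+)?")
# Hypothetical acknowledgment openers; the spec's real list is not shown here.
ACK_OPENERS = ("that is", "that's", "great question", "good question")


def detect_preamble(text: str) -> tuple[int, int]:
    """Return (rule_number, preamble_word_count); first matching rule wins."""
    # Rule 1: explicit marker. Everything before it counts as preamble.
    marker = MARKER_RE.search(text)
    if marker:
        return 1, len(text[:marker.start()].split())
    # Rule 2: score-first output, so there is no preamble.
    if LEADING_NUMBER_RE.match(text):
        return 2, 0
    # Rule 3: acknowledgment paragraph followed by a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text.strip(), maxsplit=1)
    if len(paragraphs) == 2 and paragraphs[0].lower().startswith(ACK_OPENERS):
        return 3, len(paragraphs[0].split())
    # Rule 4: nothing detected.
    return 4, 0


same_paragraph = detect_preamble("That is a deep question. The core issue is scope.")
```

Under these assumptions the documented limitation falls out directly: an acknowledgment with no paragraph break skips Rule 3, lands on Rule 4, and is counted as zero preamble words.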


Divergence Testing — Schema Addition

Index/Procedure tasked with adding Divergence Testing to SCHEMA_REFERENCE_v3.0.

Fields defined: Pair ID, Pair Type (FK to stimulus set), per-model scores (0.00–1.00, one column per ensemble member), Spread (computed: max − min), Flag (✓, ✓ HALLUC, ✓ LINEAGE), Notes.

Flag thresholds:

  • 5+ models: spread ≥ 0.20
  • 3–4 models: spread ≥ 0.15
  • 2 models: spread ≥ 0.10
  • Hallucination convergence: all scores ≥ 0.80 on ORTHO pair

Divergence Testing is lightweight, highly automatable, and the current operational priority for automation coverage.
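The spread computation and flag thresholds in this section could be automated along the following lines. This is a sketch under stated assumptions: the pair type is passed as the string "ORTHO", the hallucination flag takes precedence when both conditions hold, a single-model row is rejected because no threshold is given for it, and the ✓ LINEAGE flag is omitted because its criterion is not stated in these notes. Function names are hypothetical.

```python
def spread_threshold(n_models: int) -> float:
    """Flag threshold by ensemble size, per the list in these notes."""
    if n_models >= 5:
        return 0.20
    if n_models >= 3:
        return 0.15
    if n_models == 2:
        return 0.10
    # No threshold is defined for fewer than two models.
    raise ValueError("divergence needs scores from at least 2 models")


def divergence_flag(scores: dict[str, float], pair_type: str) -> tuple[float, str]:
    """Return (spread, flag) for one pair; flag is '' when nothing fires."""
    values = list(scores.values())
    threshold = spread_threshold(len(values))
    # Scores are two-decimal values, so round to avoid float noise
    # such as 0.95 - 0.75 evaluating to 0.19999999999999996.
    spread = round(max(values) - min(values), 2)
    # Assumed precedence: hallucination convergence is checked first.
    if pair_type == "ORTHO" and all(v >= 0.80 for v in values):
        return spread, "✓ HALLUC"
    if spread >= threshold:
        return spread, "✓"
    return spread, ""


row = divergence_flag(
    {"Grok": 0.95, "GPT": 0.75, "Claude": 0.80, "Gemini": 0.80, "Perplexity": 0.80},
    "FOIL",
)
```

Rounding the spread before comparison matters at the boundaries, since a spread that should equal a threshold exactly can otherwise fall just under it.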


Queue — Next Steps

  • Index/Procedure returns SCHEMA_REFERENCE_v3.0.md
  • KC sanity checks v3.0
  • Pipeline ingests v3.0 + rulings → revises AUTOMATION_SPEC.md to v1.1
  • KC reviews v1.1, signs off
  • Automation spec becomes implementation-ready
  • (Later) Pipeline extracts GG-CSAP / PyHessian draft schemas
  • (Later) Build automation layer (Python)

State at Close

All open questions resolved or deferred with clear path. Preamble cascade empirically validated. Divergence Testing identified as priority automation target. Schema v3.0 in progress (Index/Procedure).

Constellation: Index (active), Pipeline (queued), KC (monitoring).

Context: Low. No rollover concern. Session closes clean. Handoff ready.

