Atlas Heritage Systems

Diagnostic Suite

Six governed instruments · All CISP-compliant · Live May 2026

All instruments in this suite are governed by CISP v1.1. Tier A = full isolation + Technician's Read + DECLARE FIRST + fresh session. No cross-tier comparison without explicit classification. All runs log to the atlas-pipeline.
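The Tier A rule above is a conjunction of four conditions, so it can be sketched as a simple check. This is a minimal illustration only: the `RunConditions` class and its field names are hypothetical and are not taken from the CISP v1.1 schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConditions:
    """Hypothetical record of the four Tier A conditions for one run."""
    full_isolation: bool
    technicians_read: bool
    declare_first: bool
    fresh_session: bool


def is_tier_a(run: RunConditions) -> bool:
    # Tier A requires every one of the four conditions to hold.
    return all(vars(run).values())


def missing_requirements(run: RunConditions) -> list[str]:
    # Names of the conditions that were not met, in declaration order.
    return [name for name, ok in vars(run).items() if not ok]
```

A run that fails any one condition would then be classified below Tier A, and `missing_requirements` reports which conditions to fix before re-running.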

CISP v1.1: Governance Layer

ECM: Behavioral vocabulary layer
BSA: Gap structure + divergence
DIV: Extracted from BSA Phase 2
EPG: Pressure-response fingerprint
GG-CSAP: Self-assessment calibration (deferred)
PyHessian: Loss landscape geometry (deferred)

ECM defines the behavioral vocabulary all active instruments share. BSA and EPG are the flagship data collection instruments — both use ECM quadrant classifications. DIV runs as a lightweight standalone extracted from BSA's Phase 2. CISP governs every active run.

Instruments

ECM

Epistemic Canary Matrix

Classification framework

operational

Behavioral characterization instrument. Maps model behavior onto a two-axis matrix: Token Economy (Verbose/Surgical) × Epistemic Stance (Compliant/Combative). Tracks quadrant migration under epistemic load. Measures token ratio R, preamble length P, and resolution code.

R = T_out / T_in · Preamble word count (P) · FLAT / HOLD / LOCK / REJT
Development paper complete. No Tier A runs.
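The three ECM measurements can be sketched as small helpers. This is an illustration under stated assumptions, not the pipeline's `calculate_r` or `strip_preamble` scripts: token counts are taken as given, the preamble is assumed to be already separated from the response, and words are counted by whitespace splitting.

```python
ECONOMY = ("Verbose", "Surgical")        # Token Economy axis
STANCE = ("Compliant", "Combative")      # Epistemic Stance axis
RESOLUTION_CODES = ("FLAT", "HOLD", "LOCK", "REJT")


def token_ratio(t_out: int, t_in: int) -> float:
    """R = T_out / T_in; token counts are supplied by the caller."""
    if t_in <= 0:
        raise ValueError("t_in must be positive")
    return t_out / t_in


def preamble_word_count(preamble: str) -> int:
    """P, assuming whitespace-delimited words (an assumption of this sketch)."""
    return len(preamble.split())


def quadrant(economy: str, stance: str) -> str:
    """Label one cell of the two-axis matrix, e.g. 'Surgical/Combative'."""
    if economy not in ECONOMY or stance not in STANCE:
        raise ValueError("unknown axis value")
    return f"{economy}/{stance}"
```

For example, a 120-token prompt that draws a 300-token response gives `token_ratio(300, 120)`, which is 2.5.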
BSA

Behavioral Signal Assessment

Behavioral signal assessment

ready

Measures gap structure, hallucination rates, and knowledge density on contested stimuli. Factorial design: 3×2 (Model × Grounding). Includes Divergence Testing as Phase 2 sub-component. Gemini shakedown complete; Canary Ensemble next.

EEVPCRGSI · Concept density · Spread scores
Schema locked, forms built. Tier A runs pending.
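A 3×2 factorial design crosses every model with every grounding condition, giving six cells. The sketch below enumerates them; the level names are placeholders, since the actual models and grounding conditions are not listed here.

```python
from itertools import product

# Placeholder level names (assumptions for illustration only).
MODELS = ("model_a", "model_b", "model_c")
GROUNDING = ("grounded", "ungrounded")


def design_cells(models=MODELS, grounding=GROUNDING):
    """Return every (model, grounding) cell of the factorial design."""
    return list(product(models, grounding))
```

With three models and two grounding levels, `design_cells()` returns six pairs, one per experimental condition.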
DIV

Divergence Testing

Rapid ensemble probe

operational

Semantic similarity scoring and spread matrix across the model ensemble. Extracted from BSA Phase 2 as a standalone instrument. 20 models, 30 stimulus pairs, 600 data points in Run 3.

Semantic similarity scores · Spread matrix · Spread per pair · Flag threshold ≥ 0.20
Protocol v1.0 complete. 3 runs executed; Run 3 has the highest fidelity.
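The flagging step can be sketched as follows. Two assumptions are made here that the protocol may define differently: each model contributes one similarity score per stimulus pair, and spread is the range (maximum minus minimum) of those scores across the ensemble. Only the 0.20 threshold comes from the text above.

```python
FLAG_THRESHOLD = 0.20  # from the instrument card: flag threshold >= 0.20


def spread_per_pair(scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """scores[pair_id][model] -> similarity score.

    Spread is assumed to be max - min across models for each pair.
    """
    return {
        pair: max(by_model.values()) - min(by_model.values())
        for pair, by_model in scores.items()
    }


def flagged_pairs(scores, threshold: float = FLAG_THRESHOLD) -> list[str]:
    """Pair IDs whose spread meets or exceeds the threshold."""
    spreads = spread_per_pair(scores)
    return sorted(pair for pair, s in spreads.items() if s >= threshold)


# Hypothetical scores for two stimulus pairs and three models.
example = {
    "pair_01": {"m1": 0.91, "m2": 0.88, "m3": 0.90},
    "pair_02": {"m1": 0.95, "m2": 0.60, "m3": 0.80},
}
```

Under this layout, 20 models scored on 30 stimulus pairs yield 20 × 30 = 600 data points, which matches the Run 3 figure.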
EPG

Epistemic Pressure Gauge

Epistemic pressure gauge

ready

Tracks how a model's verbosity, structure, and hedging behavior shift under progressively harder, more ambiguous, or more epistemically loaded prompts — producing a pressure-response fingerprint that complements BSA's stimulus-pair divergence metrics.

Output Ratio · Hedging Freq · ELS (0–3) · Resolution Type · Canary Quadrant · Claim Density
Schema locked, forms built. Tier A runs pending.
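One of the listed metrics, hedging frequency, might be computed along these lines. The hedge lexicon and the per-100-words normalization are assumptions of this sketch; the EPG schema may define both differently.

```python
import re

# Hypothetical hedge lexicon; the real instrument may use a different list.
HEDGE_TERMS = frozenset(
    {"may", "might", "could", "perhaps", "possibly", "likely", "appears", "seems"}
)


def hedging_frequency(text: str, terms=HEDGE_TERMS) -> float:
    """Hedge terms per 100 words (normalization is an assumption)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in terms)
    return 100.0 * hits / len(words)


def fingerprint(responses: list[str]) -> list[float]:
    """Hedging frequency at each pressure level, in prompt order."""
    return [hedging_frequency(r) for r in responses]
```

Applying `fingerprint` to responses ordered from lowest to highest pressure gives one coarse trace of how hedging shifts as prompts get harder.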
GG-CSAP

Global Geometry Concept Self-Assessment Pilot

Concept self-assessment

deferred

Probes self-assessment calibration across 20 lossyscape vocabulary terms. Each model rates conceptual difficulty, abstractness, global deviation, and truthfulness. Two absurd calibration items embedded as internal validity checks.

Conceptual difficulty · Abstractness · Global deviation · Truthfulness
Deferred, pending data pipeline automation.
PyHessian

PyHessian Geometric Analysis

Loss landscape geometry

deferred

The geometric layer. Measures Hessian eigenvalues, trace, and basin sharpness on live model weights to prove or falsify the framework's terrain claims. Default specimen: GPT-2 small. Connects directly to ECM working hypotheses.

Hessian eigenvalues · Hessian trace · Basin sharpness · Top-k eigenvalue spectrum
Deferred, pending data pipeline automation.
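To make the listed quantities concrete, here is a toy sketch in plain Python. It does not use the PyHessian library or real model weights: it builds an explicit finite-difference Hessian of a two-parameter quadratic (a hypothetical stand-in for a loss), then reads off the trace and estimates the largest-magnitude eigenvalue by power iteration. At the scale of real model weights an explicit Hessian is not feasible, and Hessian-vector products would be needed instead.

```python
import math


def toy_loss(p):
    # Hypothetical quadratic "loss"; its exact Hessian is [[4, 1], [1, 1]].
    x, y = p
    return 2.0 * x * x + 0.5 * y * y + x * y


def hessian(f, x, h=1e-3):
    """Central finite-difference Hessian of f at point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                p = list(x)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H


def trace(H):
    return sum(H[i][i] for i in range(len(H)))


def top_eigenvalue(H, iters=200):
    """Power iteration; returns the Rayleigh quotient of the final vector."""
    n = len(H)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
        Hv = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * Hv[i] for i in range(n))
    return lam
```

In this toy case a larger top eigenvalue means a sharper basin along the dominant direction, which is one common proxy for basin sharpness.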

Supporting Documents

CISP v1.1

Clean Injection Synthesis Protocol — governs all instrument runs

Technician's Guide

Step-by-step operational guide — what to do, in what order, every time

Technician's Read

Pre- and post-run operator notation protocol

Context Integrity

Seed contamination tracking and session isolation requirements

Prompt Best Practices

Prompt construction standards for governed instrument runs


Data Pipeline

All instrument runs log to the atlas-pipeline/ folder. Phase 1 (schemas) and Phase 2 (scripts) are complete.

Phase 1: Schemas (done)
ECM, BSA/Factorial, PyHessian, Registry, Stimulus Registry

Phase 2: Scripts (done)
strip_preamble, calculate_r, log_ecm_run, log_bsa_run, log_pyhessian_run

Phase 3: Validation scripts (queued)
Schema checker, Tier A compliance verifier (pending)

Phase 4: Cross-instrument queries (queued)
ECM + BSA + PyHessian join on run IDs, post re-entry

Tier C ingestion (in progress)
11 rows pre-processed; operator review required
All current data is Tier B or Tier C. First Tier A run pending full CISP v1.1 auditor isolation. See the Technician's Guide for the Tier A checklist.
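The Phase 4 join on run IDs could look roughly like the sketch below. It assumes each instrument's log is a list of dicts carrying a `run_id` key and keeps only runs present in every log (an inner join); the key name, row layout, and join type are all assumptions, not the pipeline's actual schema.

```python
def join_on_run_id(**tables):
    """Inner-join instrument logs on a shared 'run_id' field.

    Each keyword argument is an instrument name mapped to a list of row
    dicts. Returns {run_id: {instrument: row}} for run IDs found in all logs.
    """
    indexed = {
        name: {row["run_id"]: row for row in rows}
        for name, rows in tables.items()
    }
    if not indexed:
        return {}
    shared = set.intersection(*(set(ix) for ix in indexed.values()))
    return {
        rid: {name: indexed[name][rid] for name in indexed}
        for rid in sorted(shared)
    }


# Hypothetical rows for illustration only.
joined = join_on_run_id(
    ecm=[{"run_id": "r1", "R": 2.5}, {"run_id": "r2", "R": 1.1}],
    bsa=[{"run_id": "r1", "spread": 0.35}],
    pyhessian=[{"run_id": "r1", "trace": 5.0}, {"run_id": "r3", "trace": 1.0}],
)
```

Here only `r1` appears in all three logs, so it is the only run in `joined`. An outer join would be the alternative if partially covered runs should be kept.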