Field Guides / Field Note

Gaming the Domain Manager

A Field Note on Negative-Space Literature Pulls

KC Hoye · April 2026 · v1.1

There's a thing that happens when you ask a model to find you sources in its own domain. It tidies up. It gives you the consensus shelf — the highest-citation names, the most-linked papers, the answer that's been resolved enough times that it's basically furniture. You asked for the map and it handed you the tourist brochure.

This is not the model being unhelpful. This is the model doing exactly what it was trained to do: producing statistically probable continuations of the input. In a well-populated domain, the most probable continuation is the most resolved one — the answer the field has already decided on.[1] You're not getting the field. You're getting the field's final opinion of itself.

The Technique: Ask for Everything But

The core move is simple: don't tell the model what domain you're working in.

Instead, describe the phenomenon you're observing and ask it to look for adjacent work — related dynamics, parallel mechanisms, similar failure modes — in domains other than the obvious one. If you're working on LLM behavior, you don't ask about LLM behavior. Ask what clinical psychology, or harm reduction research, or systems theory, or narrative linguistics has to say about whatever specific dynamic you're tracking.

The model can't optimize toward the consensus answer if you've cut off access to the consensus domain. It has to actually synthesize instead of retrieve.

What you get back is the field's negative space — the shape of the thing as seen from everywhere except where everyone is already looking. Run that across two or three models and let the overlaps tell you where the real structure is.

Why It Works: Distributional Pressure

Every model has domains where it's heavy and domains where it's light. If it's trained hard on mathematics and computer science, it has deep, well-worn pathways there — lots of high-frequency material to pull resolution from. Ask it something in that domain and it flows downhill fast. You get the smooth, polished, readerly answer.[2]

Clinical psychology? Harm reduction? Oral history methodology? Those pathways are thinner. There's less to optimize against. The model has to actually work the edges of what it knows instead of sliding into a pre-resolved groove.

This is the torque point. You want to match your query to a domain the model has in stock but not in abundance — enough to be useful, not so much that it collapses into consensus. Then ask it to look adjacent to where you're actually working from that position. You're not asking it to be wrong. You're asking it to be genuinely exploratory instead of performatively helpful.

Prompt Texture Matters

This one runs against the intuition that cleaner input produces better output. The evidence suggests otherwise: the surface form of a prompt — its syntax, register, and affect load — independently modulates which mode of generation the model enters.[3][4][5]

A clean, well-structured academic query activates the model's polished service register. The model reads "structured academic prompt" and produces "structured academic answer" — which is usually the consensus answer wearing formal clothes. A prompt that reads like someone thinking out loud in the middle of a research session activates something closer to genuine exploration. The at-issue content shifts. The framing changes. The model has less pre-resolved terrain to slide into.

In practice: write your query the way you'd scribble a note to yourself mid-session. Not sloppy for the sake of it — just not performed. The goal is to reduce the activation of the presentation layer so the actual synthesis has room to surface.[4]

"Ask to See the Manager"

Once the model has given you the adjacent-domain pull, you push into the gaps: who else has written about this mechanism, what field would study this if it weren't my field, what's the upstream source of this idea.

You're not letting it stop at the first name it produces. You're asking it to keep climbing toward the actual structural root — the Frenkel-Brunswik under the Labov, the mechanism under the finding. The model will often stop one layer too early because one layer early is already a satisfying answer.[6] Holding the question open is the investigator's job, not the model's.

Verification Discipline

Adjacent-domain pulls come with a specific failure mode: the model produces a plausible-sounding citation that doesn't exist. It heard the domain name, activated a phantom schema of what papers in that domain look like, and synthesized a coherent-sounding reference from that weight.[7]

This is not dishonesty. It's a structural feature of how the model processes named entities in thin domains. The fix is simple but non-negotiable: every name that comes back gets checked before it enters the record. DOI lookup, Semantic Scholar, Google Scholar, date and version confirmed. If it doesn't resolve, it doesn't get cited — but the direction it was pointing in often still matters. A ghost citation toward a real gap is useful data even if the specific paper doesn't exist.

Log where each source came from and how you found it. The provenance of the literature pull is part of the method.[8]

How to Read the Output

When you run the same negative-space query across multiple models, you're looking for three things:

Where they agree

That's probably real structure. Independent pulls landing in the same place from different training distributions is a meaningful signal.

Where they diverge

That's either noise or an interesting tension worth pulling on. Don't throw it out. Probe it.

What nobody mentions

The true white space. The places where the adjacency search comes back empty or vague are often where the actual novel territory is.

The interference pattern between the models is the actual data.[9] No single pull is the answer. The picture is built from overlapping outputs.

What This Is Not

This is not a jailbreak. You're not tricking the model into giving you something it wouldn't otherwise produce. You're redirecting where it looks so it can't default to the pre-resolved answer.

You're also not outsourcing your research judgment. Every name that comes back still gets checked, dated, versioned, and logged. The model is the instrument. You're still the one reading the gauge.[10] The goal is a more honest field topology — the terrain as it actually exists, not the terrain as it's been decided.

Notes

[1]

Wong, M. (2026). Beyond Sycophancy: Epistemological Contours of Large Language Models. SSRN 6354698.

On closure bias and legato — the generative pressure toward resolution in well-populated domains.

[2]

Barthes, R. (1970/1974). S/Z. Trans. Richard Miller. New York: Hill and Wang. See also Fauconnier, G. & Turner, M. (1998). Conceptual integration networks. Cognitive Science, 22(2), 133–187.

On the readerly text and ideological naturalization of closure; on conceptual blending — the completed blend as the statistical mode of generation.

[3]

Cheng, M., Hawkins, R.D., & Jurafsky, D. (2026). Accommodation and epistemic vigilance: A pragmatic account of why LLMs fail to challenge harmful beliefs. arXiv:2601.04435.

At-issueness and linguistic encoding as independent modulators of LLM accommodation — the surface form of a prompt shifts which content becomes at-issue and how readily the model challenges or accepts it.

[4]

Hoye, K.C. (2026). FVE-1 Schema Reference V5.5 §8: Investigator Prompt Texture. Atlas Heritage Systems.

Coding anchor for inv_prompt_texture — classification of investigator prompt surface form across FORMAL / PROSE / DEGRADED / MIXED. AB specimen documented 2026-04-21.

[5]

Hoye, K.C. (2026). ECM Resolution Code Coder Guide V4.0, Part 6: Investigator Prompt Texture. Atlas Heritage Systems.

Full coding framework and edge case rules for texture classification.

[6]

Masicampo, E.J. & Baumeister, R.F. (2011). Consider it done! Plan making can eliminate the cognitive effects of unfulfilled goals. Journal of Personality and Social Psychology, 101(4), 667–683.

On the deferral mechanism — plan-making parks an open loop to a future moment, freeing cognitive resources. A system with no future moment cannot defer; every context window is its entire temporal existence.

[7]

Hoye, K.C. (2026). ECM Resolution Code Coder Guide V4.0, Part 4: Object Permanence Failure Modes. Atlas Heritage Systems.

PD-G (Ghost Reference) — session context weight fires from mention weight alone; model synthesizes plausible internal structure for an entity that was named but never provided.

[8]

Hoye, K.C. (2026). FVE-1 protocol suite. Atlas Heritage Systems. atlasheritagesystems.com

Provenance tracking governed across the full pipeline: Atlas-Protocol-PARAMETER-SIGNING V1.0 · STIMULUS-REGISTRY V1.0 · BASELINE-DERIVATION V1.0 · HUMAN-CODER V1.0 · AGGREGATION V1.0 · PANEL-COMPARISON V1.0 · ANALYSIS-OUTPUT V1.0 · FVE-1-Pipeline-Outline V1.0.

[9]

Hoye, K.C. (2026). Triangulation: Frenkel-Brunswik · Labov · Wong · Cheng · FVE-1. Atlas Heritage Systems.

Five positions, one phenomenon, seven decades of arrival. atlasheritagesystems.com/literature/triangulation

[10]

Hoye, K.C. (2026). FVE-1 Technician's Read Formatter. Atlas Heritage Systems.

Standalone HTML instrument for pre- and post-session investigator record. Session record is part of the reproducibility bundle.

The Atlas Heritage Systems field notes document technique as it develops in live research sessions. Methods are dated and versioned. Assumptions are flagged. If something breaks, that gets logged too.

v1.1 — April 2026 — citation flags added; prompt texture section grounded; verification discipline added; PD-G ghost citation failure mode named

Atlas Heritage Systems · KC Hoye, PI · April 2026