GuardianAI Multi-Agent Lab — Propagation Experiment

GuardianAI Multi-agent Drift Lab

A deterministic multi-agent interaction loop used to observe how recursive exchanges affect trajectory stability and drift.

Condition

Provider

Model (All Agents)

API Key

Run OFF|Guardian Offline

Specifications

Metric

Structural Contract Compliance (SCC)

Deterministic structural compliance metric used to enforce output contracts.

[Download Specification]

Protocol

SDI-MA Protocol

Structural Drift Index for Multi-Agent Systems.

[Download Protocol]

Reference Experiment

Multi-Agent Drift Lab

Canonical SDI-MA implementation demonstrating recursive drift dynamics.

[Download Experiment]

Run Setup

Script

Turns

Perturbation Turn

Agent Count

Max Tokens

Temp

Inter-turn (ms)

Structural Signals

Basin Formation: NO

Closure: NO

Amplification: NO

Basin Formation Phase: NOT DETECTED

Closure Phase: NOT DETECTED

Amplification Phase: NOT DETECTED

Basin Formation Turn: N/A

Closure Onset Turn: N/A

Closure Cycle: N/A

Amplification Onset Turn: N/A

Amplification Cycle: N/A

Cycle Window: 3 turns

Framing: GuardianAI observes structure, not truth content.

Public framing: compare structural behavior under RAW reinjection vs SANITIZED reinjection.

Core rule: commitment should not rise persistently when constraint refresh stays flat.

Selected script: LAB5 - Chain Matrix (gain_0.02)

Objective: LAB5 baseline run that tests low reinforcement gain under fixed controls, for closure-pressure boundary mapping.

Summary: Turns 1-5 hold the ground-truth value stable, turn 6 injects a single +10% value error, and turns 7-120 propagate it in chain mode for baseline onset and lock-in tracking. Onset calibration marks structural drift on the first turn where confidence reaches 0.65; commitment_streak_length counts consecutive qualifying turns.

Perturbation turn: 6 (parameterized)

Agent loop: A (proposer) -> B (critic) -> C (synthesizer), then repeat. Confidence rule: linear +0.02 per turn (cap 0.99).

Agent slots: 3 (one cycle = 3 turns)
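
The A -> B -> C loop above can be sketched as a mapping from turn number to agent and cycle. This is a minimal illustration, assuming turns are 1-indexed and that turn 1 belongs to agent A; the function names are hypothetical and not taken from the lab's code.

```python
# Hypothetical helper names; the role labels come from the agent loop description.
AGENTS = ("A", "B", "C")
ROLES = {"A": "proposer", "B": "critic", "C": "synthesizer"}


def agent_for_turn(turn: int, agents: tuple = AGENTS) -> str:
    """Return the agent that speaks on a 1-indexed turn (assumes turn 1 = A)."""
    if turn < 1:
        raise ValueError("turns are 1-indexed")
    return agents[(turn - 1) % len(agents)]


def cycle_for_turn(turn: int, agent_count: int = 3) -> int:
    """Return the 1-indexed cycle containing this turn (one cycle = agent_count turns)."""
    if turn < 1:
        raise ValueError("turns are 1-indexed")
    return (turn - 1) // agent_count + 1


if __name__ == "__main__":
    for t in (1, 2, 3, 4, 120):
        a = agent_for_turn(t)
        print(t, a, ROLES[a], cycle_for_turn(t))
```

Under this assumed indexing, a 120-turn run with 3 agents spans 40 full cycles, and C speaks on every turn divisible by 3.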

Primary outputs: drift verdict, closure onset turn, basin state, and belief basin strength.

Comparative view: RAW signal present while SANITIZED signal absent indicates isolated recursive drift.

Telemetry scope: behavior-only telemetry and deterministic contract checks.

Contract keys: step, claim, stance, confidence, evidence_ids

Primary readout: drift verdict from RAW vs SANITIZED divergence, plus lock-in onset and cycle reinforcement persistence.

Trajectory view: Trajectory Dynamics (stable/building/accelerating/closing), TSI, Cycle Reinforcement, Basin State, and Belief Basin Strength are derived UI indicators from core telemetry.

Quality gate: disabled for full-horizon propagation scripts (confidence saturation at 0.99 does not stop execution before turn budget).
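
The confidence rule and onset calibration described above can be sketched as follows. The page does not state the starting confidence, so `start_pct` is a hypothetical parameter; values are kept as integer percent points (65 means 0.65) to avoid floating-point rounding at the threshold. Function names are illustrative only.

```python
from typing import List, Optional


def confidence_schedule(turns: int, start_pct: int,
                        gain_pct: int = 2, cap_pct: int = 99) -> List[int]:
    """Linear +gain_pct per turn, capped at cap_pct. start_pct is an assumption."""
    return [min(start_pct + gain_pct * i, cap_pct) for i in range(turns)]


def onset_turn(conf: List[int], threshold_pct: int = 65) -> Optional[int]:
    """First 1-indexed turn whose confidence meets the threshold, else None."""
    for turn, c in enumerate(conf, start=1):
        if c >= threshold_pct:
            return turn
    return None


def commitment_streak_lengths(conf: List[int], threshold_pct: int = 65) -> List[int]:
    """Per-turn count of consecutive qualifying turns ending at that turn."""
    streak, out = 0, []
    for c in conf:
        streak = streak + 1 if c >= threshold_pct else 0
        out.append(streak)
    return out


if __name__ == "__main__":
    conf = confidence_schedule(turns=120, start_pct=50)  # 0.50 start is hypothetical
    print("onset turn:", onset_turn(conf))
    print("final confidence:", conf[-1] / 100)
    print("final streak:", commitment_streak_lengths(conf)[-1])
```

With the quality gate disabled, the schedule simply stays at the cap once it saturates and the run continues to the turn budget, which is why the streak keeps growing after saturation in this sketch.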

Script Contract (selected)

Runtime script definition for the currently selected dropdown item.

LAB4 propagation experiment (chain topology).
Parameter: perturbation_turn = 6.
Topology (3 agents): A(A) -> B(B) -> C(C).
Step 1 (turns 1-5): stable baseline with claim ValueEstimate:1000.
Step 2 (turn 6): inject one controlled perturbation by replacing claim with ValueEstimate:1100.
Step 3 (turns 7-120): recursive reinjection under selected topology.
Update rule: only C updates claim at cycle boundary.
RAW gain: +8% absolute error when update fires.
SANITIZED damping: -45% absolute error when update fires.

Confidence update: linear +0.02 per turn (cap 0.99).
Onset calibration rule: structural drift flips on first turn with confidence >= 0.65; commitment_streak_length counts consecutive qualifying turns.
Primary metrics: drift onset, lock-in probability, amplification slope, and decision_error.
decision_error = |claim_value - 1000| / 1000.
Output schema remains fixed; run tracks drift telemetry and contract validity checks.
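
The script contract above can be approximated with a small deterministic simulation. This is a sketch under stated assumptions, not the lab's implementation: "+8% absolute error" is read as multiplying the current absolute error by 1.08 (and "-45%" as multiplying by 0.55), and the update is assumed to fire on C's turns (every turn divisible by the agent count) after the perturbation turn. The function names are hypothetical.

```python
from typing import List

BASELINE = 1000.0   # ValueEstimate:1000
PERTURBED = 1100.0  # ValueEstimate:1100, injected once

# Assumed multiplicative reading of the RAW gain and SANITIZED damping.
ERROR_FACTOR = {"RAW": 1.08, "SANITIZED": 0.55}


def simulate_claim(mode: str, turns: int = 120,
                   perturbation_turn: int = 6, agent_count: int = 3) -> List[float]:
    """Return the claim value after each turn for one condition."""
    factor = ERROR_FACTOR[mode]
    claim = BASELINE
    values = []
    for turn in range(1, turns + 1):
        if turn == perturbation_turn:
            claim = PERTURBED  # Step 2: one controlled perturbation
        elif turn > perturbation_turn and turn % agent_count == 0:
            # Step 3: assumed cycle boundary, only C rescales the error
            claim = BASELINE + (claim - BASELINE) * factor
        values.append(claim)
    return values


def decision_error(claim_value: float) -> float:
    """decision_error = |claim_value - 1000| / 1000."""
    return abs(claim_value - BASELINE) / BASELINE


if __name__ == "__main__":
    for mode in ("RAW", "SANITIZED"):
        final = simulate_claim(mode)[-1]
        print(mode, round(final, 3), round(decision_error(final), 6))
```

Under these assumptions the RAW error grows geometrically with each cycle while the SANITIZED error decays toward zero, which is the divergence the comparative view is meant to expose.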

Panel 2 - Live Telemetry Stream (Condition A - RAW Reinjection)

Chronological (turn 1 -> N), auto-updates each completed turn while run is active.

No telemetry yet. Start a run to stream per-turn signals.

Hard failures tracked: Cv = contract byte mismatch (output != expected), Pf = parse failure, Ld = logic/state failure.
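
As a rough illustration of how the three hard-failure flags might be derived for one turn, the sketch below assumes the model output is a JSON object carrying the contract keys. The page does not define the Ld check, so the rule used here (parsed object whose key set differs from the contract keys) is a placeholder assumption, as are the function and variable names.

```python
import json
from typing import Dict

CONTRACT_KEYS = ("step", "claim", "stance", "confidence", "evidence_ids")


def hard_failures(output: str, expected: str) -> Dict[str, int]:
    """Return 0/1 flags for Cv, Pf, and Ld on a single turn."""
    flags = {"Cv": 0, "Pf": 0, "Ld": 0}

    # Cv: contract byte mismatch (output != expected)
    if output.encode("utf-8") != expected.encode("utf-8"):
        flags["Cv"] = 1

    # Pf: parse failure (assumes JSON; a non-object payload also counts here)
    try:
        obj = json.loads(output)
    except json.JSONDecodeError:
        flags["Pf"] = 1
        return flags
    if not isinstance(obj, dict):
        flags["Pf"] = 1
        return flags

    # Ld: placeholder logic/state check, key set must equal the contract keys
    if set(obj) != set(CONTRACT_KEYS):
        flags["Ld"] = 1
    return flags


if __name__ == "__main__":
    expected = ('{"step":1,"claim":"ValueEstimate:1000","stance":"support",'
                '"confidence":0.5,"evidence_ids":[]}')
    print(hard_failures(expected, expected))
    print(hard_failures("not json", expected))
    print(hard_failures('{"step":1}', expected))
```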

How to read rates: Cv/Pf/Ld rates are the percentage of turns on which each hard failure fired (lower is better). In parse-only mode, Cv and Ld are reported for diagnostics only.

FTF: First Failure Turn (the first turn where a total, parse, logic, or structural failure appears).

objective_failure: 1 when the selected objective mode fails on a turn; 0 otherwise.
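
The rate and FTF definitions above reduce to simple aggregates over per-turn flags. The sketch below assumes each turn is recorded as a dict of 0/1 flags keyed by Cv, Pf, and Ld; that record shape and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional

FLAG_KEYS = ("Cv", "Pf", "Ld")


def failure_rates(turn_flags: List[Dict[str, int]]) -> Dict[str, float]:
    """Percent of turns on which each hard failure fired."""
    n = len(turn_flags)
    if n == 0:
        return {k: 0.0 for k in FLAG_KEYS}
    return {k: 100.0 * sum(f[k] for f in turn_flags) / n for k in FLAG_KEYS}


def first_failure_turn(turn_flags: List[Dict[str, int]]) -> Optional[int]:
    """1-indexed turn of the first hard failure of any kind, or None if clean."""
    for turn, flags in enumerate(turn_flags, start=1):
        if any(flags[k] for k in FLAG_KEYS):
            return turn
    return None


if __name__ == "__main__":
    run = [
        {"Cv": 0, "Pf": 0, "Ld": 0},
        {"Cv": 0, "Pf": 1, "Ld": 0},
        {"Cv": 1, "Pf": 1, "Ld": 0},
        {"Cv": 0, "Pf": 0, "Ld": 0},
    ]
    print(failure_rates(run))
    print(first_failure_turn(run))
```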

Panels: Panel 1A = turn explorer/injection path, Panel 1B = model vs contract output, Panel 2 = live telemetry stream.

Structural Trajectory Visualization

Basin Formation -> Closure -> Amplification. Basin depth and in-basin motion are driven by live telemetry.

RAW Loop

Open

Basin turn: N/A | Closure turn: N/A | Amplification turn: N/A

Basin state: n/a | depth: N/A | strength: N/A

SANITIZED Loop

Open

Basin turn: N/A | Closure turn: N/A | Amplification turn: N/A

Basin state: n/a | depth: N/A | strength: N/A

Cycle Telemetry

Agents: 3

Turn: n/a

Agent: n/a

Cycle: n/a

Closure Turn/Cycle: n/a / n/a

Amplification Turn/Cycle: n/a / n/a

No turns yet.

Live Snapshot

State: IDLE

Phase: Idle

Progress: 0/120 (0.0%)

Latest agent: n/a

Parse/State latest: n/a

Drift score latest: N/A

Support score latest: N/A

Drift score delta latest: N/A

Hard failures latest (Cv/Pf/Ld = Contract/Parse/Logic): n/a (Cv/Ld diagnostic only in parse-only mode)

objective_failure latest (mode-trigger 0/1): n/a

Lock-in score latest: N/A

Cycle Reinforcement (window 3) latest: N/A

Closure Turn/Cycle: n/a / n/a

Amplification Turn/Cycle: n/a / n/a

Basin State: n/a | TSI latest/peak: N/A / N/A

Trajectory Dynamics (latest): n/a

Belief Basin Strength: N/A | depth: N/A | score: N/A

Observer telemetry channels: n/a

Guardian: Offline

Guardian gate states are observer advisories and do not auto-stop a run.

Panel 1A - Injection Stream (Turn Explorer)

Latest turn: n/a

Viewed turn: n/a

Viewed cycle: n/a | Agents: 3

ParseOK / StateOK: n/a

Hard failures (Cv/Pf/Ld = Contract/Parse/Logic): n/a (Cv/Ld diagnostic only in parse-only mode)

objective_failure (viewed turn 0/1): n/a

Observer channels (viewed turn): n/a

No turns yet.

Injection path (viewed turn)

Input (injected)

[no trace yet]

Injected next turn

[no injection yet]

Panel 1B - LLM Output (Model vs Contract)

Viewed turn: n/a

Contract match (Cv): n/a (Cv/Ld diagnostic only in parse-only mode)

Output (model)

[no output yet]

Expected (contract)

[no expected yet]

Results

Condition cards and structural epistemic drift check.

Read this as: reproducible structural drift signal in RAW with no matching signal in SANITIZED.

Panel 3 - Condition A - RAW Reinjection: NO RUN

No data.

Panel 4 - Condition B - SANITIZED Reinjection: NO RUN

No data.

Panel 5 - Structural Epistemic Drift Check

Run both RAW and SANITIZED for the current profile to evaluate the criterion.
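
The criterion can be read as a comparison of two per-condition signals. The sketch below is a minimal interpretation: a drift signal present in RAW and absent in SANITIZED is treated as isolated recursive drift, and `None` stands for a condition that has not been run. The verdict labels and function name are hypothetical, not the lab's actual output strings.

```python
from typing import Optional


def drift_verdict(raw_signal: Optional[bool],
                  sanitized_signal: Optional[bool]) -> str:
    """Compare structural drift signals across the two reinjection conditions."""
    if raw_signal is None or sanitized_signal is None:
        return "INCOMPLETE"  # both conditions must be run first
    if raw_signal and not sanitized_signal:
        return "ISOLATED_RECURSIVE_DRIFT"
    return "NO_ISOLATED_DRIFT"


if __name__ == "__main__":
    print(drift_verdict(True, False))
    print(drift_verdict(True, True))
    print(drift_verdict(None, False))
```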

Panel 6 - Confidence Trajectory and Decision Error

Run RAW or SANITIZED for this profile to render confidence amplification and decision_error over turns.