Amplifier Foundation · PR #131

Session Repair,
Repaired

Making the session-analyst agent reliable by removing its ability to break things

✓ Merged & Active

March 2026 · microsoft/amplifier-foundation

The Problem

A repair tool that
makes things worse

🏥

Session Analyst

An agent that helps users diagnose and repair broken Amplifier sessions — orphaned tool calls, ordering violations, incomplete turns.

⚡

Haiku-Class Model

A fast, cheap model with detailed manual JSONL editing instructions loaded into its context. Two paths available: a Python script or hand-editing files.

💥

Worse, Not Better

The agent frequently chose to hand-edit transcript.jsonl instead of using the repair script — making broken sessions more broken, not less.

Root Cause

The agent had two paths.
It picked the wrong one.

Path A — Script (desired)

A Python repair script. Correct, deterministic, tested. But described only briefly in the agent's context.

Path B — Manual Edit (dangerous)

Detailed step-by-step JSONL editing procedures with exact JSON templates, bash commands, and worked examples. More detail = more likely to be used.

"A Haiku-class agent will attempt whatever its loaded context makes possible. The manual path had more detail, so it naturally preferred it."

— Foundation architecture review

The Fix

Three-part solution

🚫

1. Script-Only Mandate

Removed all manual repair procedures from the agent's context. Explicit prohibition on hand-editing JSONL files. No fallback path.

📐

2. Library Restructure

Reorganized session/ into a clean three-layer architecture. Consolidated duplicate code, replaced hardcoded strings with constants.

⚙️

3. Unified CLI

Created amplifier-session.py with subcommands: diagnose, repair, rewind, info, find. One script, one command per operation.
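
A minimal sketch of how a single script with these five subcommands could be wired up with argparse. The subcommand names and the --status / --keyword flags appear in this deck; the positional argument name session_id, the function names, and the idea that rewind takes only a session ID are assumptions for illustration, not the actual amplifier-session.py source.

```python
import argparse

# Subcommand names are taken from the slide; everything else here is hypothetical.
SESSION_COMMANDS = ("diagnose", "repair", "rewind", "info")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per operation."""
    parser = argparse.ArgumentParser(prog="amplifier-session.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # Commands that act on one session accept a full or partial ID.
    for name in SESSION_COMMANDS:
        cmd = sub.add_parser(name)
        cmd.add_argument("session_id", help="full or partial session ID")

    # 'find' searches across sessions instead of targeting one.
    find = sub.add_parser("find")
    find.add_argument("--status")
    find.add_argument("--keyword")
    return parser


def main(argv=None) -> int:
    """Parse arguments and report which operation would run (no real work)."""
    args = build_parser().parse_args(argv)
    print(f"would run: {args.command}")
    return 0
```

Keeping every operation behind one entry point means the agent's instructions only need to name one script and one verb per task.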

Part 1

Remove the dangerous path entirely

What was removed

All manual repair/rewind procedures with JSON templates and bash commands. The file session-repair-knowledge.md shrank from 231 to 106 lines.

What was added

Explicit prohibition: "NEVER attempt to manually edit transcript.jsonl or events.jsonl." If the script fails, the agent stops and reports the error. No manual fallback.

"By removing the manual path entirely and making the script path the only one with executable detail, we align the agent's available actions with the desired behavior. There's literally nothing left for it to attempt manually."

— Foundation expert agent

Part 2

Three-layer session library

Level 1 — Message Algebra

Pure list[dict] operations

messages.py · diagnosis.py — Slicing, orphan detection, repair algorithms. No I/O, fully testable.

Level 2 — JSONL Store

File I/O with constants

store.py · events.py — Read/write transcripts, metadata, events. 14+ hardcoded filename strings replaced with constants.

Level 3 — Session Finder

Discovery & resolution

finder.py (NEW) — Resolve partial IDs, search by project/date/keyword/status. Sorted most-recent-first.

Duplicate orphan detection merged into one implementation. Private I/O helpers consolidated from 3 files into 1.
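
To illustrate what a Level 1 "pure list[dict]" operation can look like, here is a hedged sketch of orphan detection. The message shape is an assumption (assistant messages carrying a tool_calls list, tool results carrying a tool_call_id); the real transcript schema and the function name in messages.py / diagnosis.py may differ.

```python
def find_orphaned_tool_calls(messages: list[dict]) -> list[str]:
    """Return IDs of tool calls that never received a matching tool result.

    Pure function: takes a list of message dicts, performs no I/O.
    The 'tool_calls' / 'tool_call_id' field names are assumed, not confirmed.
    """
    # Collect every tool-call ID that has a result somewhere in the transcript.
    answered = {
        m["tool_call_id"]
        for m in messages
        if m.get("role") == "tool" and "tool_call_id" in m
    }

    orphans: list[str] = []
    for message in messages:
        if message.get("role") != "assistant":
            continue
        for call in message.get("tool_calls") or []:
            call_id = call.get("id")
            if call_id is not None and call_id not in answered:
                orphans.append(call_id)
    return orphans
```

Because the function never touches the filesystem, it can be tested with plain in-memory lists, which is the property the slide calls "fully testable".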

Part 3

From 4+ commands to 1

Before — Bash choreography

# Find the script somewhere on disk
SCRIPT="$(find / -path '*/amplifier-foundation/scripts/session-repair.py' -type f 2>/dev/null | head -1)"

# Find the session directory
SESSION_DIR="$(find ~/.amplifier/projects/*/sessions -name '*SESSION_ID*' -type d 2>/dev/null | head -1)"

# Run diagnosis, repair, verify
python "$SCRIPT" --diagnose "$SESSION_DIR"
python "$SCRIPT" --repair "$SESSION_DIR"
python "$SCRIPT" --diagnose "$SESSION_DIR"

After — Single commands

# Partial ID auto-resolves
python scripts/amplifier-session.py diagnose abc123
python scripts/amplifier-session.py repair abc123
python scripts/amplifier-session.py diagnose abc123

# New capabilities
python scripts/amplifier-session.py info abc123
python scripts/amplifier-session.py find --status broken
python scripts/amplifier-session.py find --keyword "error"
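
The partial-ID resolution used in the commands above could work roughly as follows. This is a sketch under stated assumptions: the projects/*/sessions/<id> layout is inferred from the "before" snippet, while the function names, substring matching, use of directory modification time for ordering, and raising on ambiguous matches are hypothetical choices, not the behavior of finder.py.

```python
from pathlib import Path


def find_sessions(partial_id: str, projects_root: Path) -> list[Path]:
    """Return session directories whose name contains partial_id, newest first."""
    matches = [
        d
        for d in projects_root.glob("*/sessions/*")
        if d.is_dir() and partial_id in d.name
    ]
    # Most-recent-first, using directory mtime as a stand-in for session age.
    matches.sort(key=lambda d: d.stat().st_mtime, reverse=True)
    return matches


def resolve_session(partial_id: str, projects_root: Path) -> Path:
    """Resolve a partial ID to exactly one session directory, or raise."""
    matches = find_sessions(partial_id, projects_root)
    if not matches:
        raise LookupError(f"no session matches {partial_id!r}")
    if len(matches) > 1:
        names = ", ".join(m.name for m in matches)
        raise LookupError(f"{partial_id!r} is ambiguous: {names}")
    return matches[0]
```

A caller would pass something like Path.home() / ".amplifier" / "projects" as projects_root. Scoping the search to that directory avoids the filesystem-wide find shown in the "before" version.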

Key Insight

The fix wasn't
"make the agent smarter."

It was to remove the
dangerous path
from its context.

Design the agent's environment so the right action is the only action available.

Results

What changed

850

tests passing — 88 new, 0 failures

19 → 37

exported symbols in __init__.py — strict superset, zero removals

~500 → ~400

lines in session-analyst agent instructions

231 → 106

lines in session-repair-knowledge.md

Deterministic

Session repair is now script-based, not probabilistic LLM hand-editing

Portable

Works with less capable models — the agent just invokes commands. Users can also run the CLI directly.

Development Velocity

Built in one session

32

Commits

28

Files Changed

5,233

Lines Added

~5h

Dev Time

1,143 lines removed. Human architect + AI agents throughout: 4 expert agents consulted on architecture, subagent-driven-development for implementation, 4 independent backward-compatibility reviewers before merge.

Primary contributor: Brian Krabach (bkrabach) — 32/32 commits. Multiple commits co-authored with Amplifier AI agents.

Process

Human + AI collaboration

🧭

Architecture

4 expert agents consulted: foundation-expert, amplifier-expert, zen-architect, core-expert. They confirmed session storage belongs in foundation.

📋

Design

Brainstorm mode produced a design with 8 validated sections. Three-phase execution plan: core restructure → new capabilities → agent rewire.

🔍

Verification

Subagent-driven-development recipe: fresh agent per task with spec review + quality review. 4 independent reviewers confirmed zero breaking changes.

Sources

Research Methodology

Data as of: March 18, 2026

Feature status: Merged & Active (PR #131 merged into main)

Repository: microsoft/amplifier-foundation

Research performed:

PR metadata: gh pr view 131 --json title,body,additions,deletions,changedFiles,commits,...
Commit timeline: gh pr view 131 --json commits --jq '[.commits[].authoredDate] | sort' → first: 2026-03-18T21:29:41Z, last: 2026-03-19T02:23:47Z
Contributor analysis: gh pr view 131 --json commits --jq '[.commits[].authors[].login]' → bkrabach: 32 commits, microsoft-amplifier: 1 co-author
File breakdown: gh pr view 131 --json files → 28 files, 12 test files
PR description: Test counts (850 total, 88 new), symbol counts (19 → 37), line reductions

Gaps & notes:

Agent instruction line counts (~500 → ~400) are approximate per PR description, prefixed with ~
One commit message reports __all__ as "21 to 38" while PR summary says "19 → 37" — used PR summary as canonical
Dev time (~5h) measured from first commit to last commit timestamps; excludes earlier design thinking
"4 expert agents" and "brainstorm mode" details from user briefing, not independently verified from commits
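
The ~5h dev-time figure can be checked from the two commit timestamps listed under "Commit timeline". A small sketch of that arithmetic (the helper name is illustrative):

```python
from datetime import datetime, timedelta


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' as UTC."""
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


first_commit = parse_utc("2026-03-18T21:29:41Z")
last_commit = parse_utc("2026-03-19T02:23:47Z")
elapsed: timedelta = last_commit - first_commit

hours, remainder = divmod(int(elapsed.total_seconds()), 3600)
minutes, seconds = divmod(remainder, 60)
print(f"{hours}h {minutes}m {seconds}s")  # prints "4h 54m 6s"
```

That is just under five hours of commit activity, consistent with the rounded ~5h shown on the velocity slide.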


Get Started

Try it now

# Diagnose a session by partial ID
python scripts/amplifier-session.py diagnose abc123

# Find all broken sessions
python scripts/amplifier-session.py find --status broken

# Repair and verify
python scripts/amplifier-session.py repair abc123

Session repair is now deterministic. The agent can't break what it can't touch.

View PR #131 →
