NeuroRouter

Context operating system for live model windows.

NeuroRouter sits between Claude Code, Codex, and the model API. It keeps the live model window focused on the work that still matters: source transcript to semantic field to target-model context.

It is the answer to a very specific failure mode: one long session becomes stale, expensive, and fragile, but the operator still needs continuity. NeuroRouter preserves the active work field, repairs safe tool-chain breaks before they become upstream 400s, and nudges the agent away from dead loops before the session burns more money than progress.

When teams also use Hiveram, NeuroRouter does not become the shared truth. Hiveram stores the work graph and portable bundles. NeuroRouter decides what slice of that graph enters the live model window right now.

More context is not control. NeuroRouter projects the slice of work the model is allowed to act on now.

The four promises

No giant stale session by default Keep the live window tight enough to stay useful instead of dragging every dead branch forever.
No forced re-explanation Fresh execution sessions can start from a mission briefing or bounded context instead of replaying a transcript.
No hidden authority confusion Local shaping stays local. Shared truth lives somewhere explicit. Portable handoff stays bounded and reviewable.
No premium spend for routine execution Architect once, then let cheaper or more specialized agents execute from the same bounded truth.

Architect, recall, rocket

This is the operator workflow the product is growing toward: architect the work once, rehydrate a focused execution session with the right briefing, and launch a rocket-sized task on the cheapest capable surface.

Task payload The immediate objective, files, and reporting contract for a short focused execution pass.
Context payload The decisions, constraints, rejected paths, and evidence that the next session must keep intact.
Ice-cube payload Full-fidelity frozen context that stays outside the active window until someone explicitly recalls it.

Get The Free Version

The community edition is published at obstalabs/neurorouter. Install it with Homebrew, Scoop, or download the release binaries directly.

GitHub repo Latest release

Homebrew

macOS or Linux via the public Obsta Labs tap.

brew tap obstalabs/tap
brew install obstalabs/tap/neurorouter

Scoop

Windows via the public Obsta Labs bucket.

scoop bucket add obstalabs https://github.com/obstalabs/scoop-bucket
scoop install obstalabs/neurorouter

Direct Binary

Download tarballs and zip archives from the release page.

https://github.com/obstalabs/neurorouter/releases/latest

Four compiler passes

One product, one job: preserve what the next model call needs, remove what no longer carries work, and prove the result is still trustworthy.

Block detected secrets

Detected credentials, tokens, and connection strings are redacted or blocked locally before forwarding according to policy. Findings include rotation guidance. Secret protection is deterministic pattern matching for known credential formats; it does not claim to catch every encoding or transformation.

Extract the semantic field

The active objective, decisions, hard constraints, rejected approaches, quantities, file state, and blockers are promoted into a compact field contract. Repetition and scaffolding are allowed to fall away.

Compile for the target model

Claude, Codex, and compatible tool surfaces do not need identical transcripts. NeuroRouter emits the dialect each client can use while preserving the same load-bearing field.

Validate continuity

Session Integrity checks whether green metrics are still trustworthy. It can downgrade a perfect RCS when objective freshness, workspace identity, recovery, loop, progress, or tool-chain signals say the agent is no longer safe to continue.

Proof of continuity

The neurorouter integrity command summarizes whether sessions stayed healthy, degraded, or critical using support-safe session evidence. It reports size, RCS, anchor preservation, prevented failures, and integrity downgrades so compiler claims can be checked against real sessions instead of trusted as a demo metric.

Forked-session proof artifact

In the oauth-smoke-5 forked-session capture, a fresh execution branch held RCS 100 across 51 requests after the handoff. That matters because the proof is not "bigger context worked". The proof is that a fresh branch can stay structurally faithful without replaying the whole past.

oauth-smoke-5 forked session 018486d03
51 requests. RCS 100 sustained.
Fresh branch stayed aligned without transcript replay.
No continuity downgrade after the fork.

Cache-preservation proof artifact

In a 198-message Opus 4.7 session through NeuroRouter, Anthropic prompt cache hit rate stayed above 99% while context was actively shaped. Cache reads grew monotonically from 102K to 140K tokens. Cache writes per turn stayed in the hundreds. NeuroRouter shapes the suffix past the cache boundary, not the cached prefix.

session 87d04e1d — 198 messages, Opus 4.7, OAuth
cache_read:     102K → 140K (monotonic)
cache_creation: 200–5000 tokens/turn (suffix only)
input_tokens:   1–3 per turn
cache hit rate:  99%+
active shaping:  ~5% byte reduction, confined to post-breakpoint suffix

How it works

Drop-in replacement for your API endpoint.

# Claude Code
ANTHROPIC_BASE_URL=http://localhost:9120 claude

# Codex (Responses API)
neurorouter proxy --listen 127.0.0.1:9120 --client-profile codex --api-key env:OPENAI_API_KEY
codex -p nr resume SESSION_ID
codex -p nr -m gpt-5.5 resume SESSION_ID

# Codex config
[model_providers.neurorouter]
name = "NeuroRouter"
base_url = "http://127.0.0.1:9120"
openai_base_url = "http://127.0.0.1:9120"
wire_api = "responses"

[profiles.nr]
model_provider = "neurorouter"

# Other OpenAI-compatible clients
# Validate per client before claiming support.

Verified with Claude Code and Codex CLI. Other OpenAI-compatible tools require compatibility testing before support is claimed; generic chat-completions clients such as Qwen Code are not currently advertised as supported. Provider credentials pass through to configured upstream APIs; NeuroRouter does not phone home with request content or store provider keys on disk.

# start the proxy
neurorouter proxy --listen 127.0.0.1:9120

# see what would be shaped without sending
neurorouter proxy --listen 127.0.0.1:9120 --dry-run

Proof and mechanics

The request log shows what was sent, what was compiled away, and whether decisions, constraints, and rejected approaches survived:

[req] model=gpt-5.4  context=156KB (shaped from 3732KB, 95% shaped)  rcs:100  integrity=degraded

RCS is a request-level continuity score. It is useful, but it is not the whole health model: Session Integrity can invalidate a green RCS when the active objective is stale, the workspace lock conflicts, recovery had to disable major shaping stages, or loop/progress signals say the agent is stuck. Smaller context is not automatically better. Correct context is.

Vector Lock keeps the active constraint set: objective, chosen approach, current state, hard constraints, unresolved blockers, and rejected approaches. Workspace Identity Lock keeps the allowed repo, path, and release target explicit. These are not chat memory, RAG, or learning. They are the minimal local state that lets NeuroRouter compile the next request without losing the work.

What NeuroRouter is not

Security

NeuroRouter runs locally and forwards provider credentials only to the upstream provider you configure. It does not phone home with request content or store provider keys on disk. Detected credentials are redacted or blocked before forwarding according to policy.

This is a structural difference from cloud LLM proxies. A 2026 study found 26 LLM proxy services collecting user credentials. The LiteLLM supply-chain breach (March 2026) compromised thousands of organizations. A local proxy removes the hosted credential database from this path; it is not a promise to catch every encoded, chunked, or transformed secret.

When it pays for itself

NeuroRouter Pro is most useful when AI coding sessions get long, expensive, or fragile. It keeps useful context alive, removes stale transcript drag, protects detected secrets, repairs safe tool-chain continuity breaks before provider 400 errors, and warns when a session is no longer making trustworthy progress.

It is not a replacement for the model, a hosted gateway, memory, RAG, or autonomous agent brain. It is the live-window layer in a wider stack that can also include Hiveram for shared truth and portable handoff.

Pricing

Free

$0 — AGPL v3, self-hosted

Context hygiene. Deterministic shaping removes stale reads, repeated reminders, and detected secrets — locally, zero setup. The foundation.

Team

$49 / seat / month

LLM-augmented context intelligence. Everything in Pro plus: a small model (Haiku/Mini) runs parallel to extract objectives, constraints, and approaches that pattern matching cannot reach. Shared policy, session evidence analysis, and org-wide enforcement.

Enterprise

Custom pricing

Control AI usage at scale without losing speed. Org-wide policies, secure routing, and protection against data leaks, runaway cost, and workflow breakdowns.

Install Pro or Team

After checkout (or starting the 14-day trial), install the Pro build with Homebrew and activate your license key. Pro replaces the free neurorouter binary with the same command name — you do not run both at once.

1. Install

macOS or Linux via the public Obsta Labs tap.

brew tap obstalabs/tap
brew install obstalabs/tap/neurorouter-pro

2. Activate

Use the license key emailed at checkout or shown on the trial start page.

nr activate <your-license-key>

3. Launch

Start the local proxy and prepare your coding agent in one command.

nr launch claude
nr launch codex

Direct download: latest Pro release. Team and Enterprise tiers use the same binary — license keys carry the seat entitlements. Lost your key? Email hello@obstalabs.dev.