NeuroRouter sits between Claude Code, Codex, and the model API. It keeps the live model window focused on the work that still matters by compiling the source transcript into a semantic field, then into target-model context.
It is the answer to a very specific failure mode: one long session becomes stale, expensive, and fragile, but the operator still needs continuity. NeuroRouter preserves the active work field, repairs safe tool-chain breaks before they become upstream 400s, and nudges the agent away from dead loops before the session burns more money than progress.
When teams also use Hiveram, NeuroRouter does not become the shared truth. Hiveram stores the work graph and portable bundles. NeuroRouter decides what slice of that graph enters the live model window right now.
More context is not control. NeuroRouter projects the slice of work the model is allowed to act on now.
This is the operator workflow the product is growing toward: architect the work once, rehydrate a focused execution session with the right briefing, and launch a rocket-sized task on the cheapest capable surface.
The community edition is published at obstalabs/neurorouter. Install it with Homebrew, Scoop, or download the release binaries directly.
macOS or Linux via the public Obsta Labs tap.
brew tap obstalabs/tap
brew install obstalabs/tap/neurorouter
Windows via the public Obsta Labs bucket.
scoop bucket add obstalabs https://github.com/obstalabs/scoop-bucket
scoop install obstalabs/neurorouter
Download tarballs and zip archives from the release page.
https://github.com/obstalabs/neurorouter/releases/latest
One product, one job: preserve what the next model call needs, remove what no longer carries work, and prove the result is still trustworthy.
Block detected secrets
Detected credentials, tokens, and connection strings are redacted or blocked locally before forwarding according to policy. Findings include rotation guidance. Secret protection is deterministic pattern matching for known credential formats; it does not claim to catch every encoding or transformation.
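To make "deterministic pattern matching for known credential formats" concrete, here is a minimal Python sketch. The three patterns, the `redact` function, and the placeholder format are illustrative assumptions, not NeuroRouter's actual rule set or output.

```python
import re

# Illustrative patterns only (assumed for this sketch); the product's real
# rule set is not documented on this page.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "postgres_url": re.compile(r"postgres(?:ql)?://[^\s:@/]+:[^\s@/]+@[^\s/]+"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace every pattern match with a placeholder; return text and finding names."""
    findings = []
    for name, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            findings.append(name)
    return text, findings
```

Because the match is purely textual, the same key in base64 or split across two messages would pass through this sketch untouched, which is the limitation the paragraph above states.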
Extract the semantic field
The active objective, decisions, hard constraints, rejected approaches, quantities, file state, and blockers are promoted into a compact field contract. Repetition and scaffolding are allowed to fall away.
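One way to picture a compact field contract is as a small typed record. The sketch below is hypothetical: the class name, field names, and JSON rendering are assumptions taken from the list in the paragraph above, not NeuroRouter's internal format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FieldContract:
    """Hypothetical container for the load-bearing parts of a session."""
    objective: str
    decisions: list[str] = field(default_factory=list)
    hard_constraints: list[str] = field(default_factory=list)
    rejected_approaches: list[str] = field(default_factory=list)
    quantities: dict[str, str] = field(default_factory=dict)
    file_state: dict[str, str] = field(default_factory=dict)
    blockers: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Serialize to compact JSON, dropping sections that are empty."""
        data = {key: value for key, value in asdict(self).items() if value}
        return json.dumps(data, separators=(",", ":"), sort_keys=True)
```

Empty sections are omitted when rendering, so the contract only grows when the session actually produces a decision, constraint, or blocker.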
Compile for the target model
Claude, Codex, and compatible tool surfaces do not need identical transcripts. NeuroRouter emits the dialect each client can use while preserving the same load-bearing field.
Validate continuity
Session Integrity checks whether green metrics are still trustworthy. It can downgrade a perfect RCS when objective freshness, workspace identity, recovery, loop, progress, or tool-chain signals say the agent is no longer safe to continue.
The neurorouter integrity command summarizes whether sessions
stayed healthy, degraded, or critical using support-safe session evidence.
It reports size, RCS, anchor preservation, prevented failures, and integrity
downgrades so compiler claims can be checked against real sessions instead
of trusted as a demo metric.
Forked-session proof artifact
In the oauth-smoke-5 forked-session capture, a fresh execution branch held RCS 100 across 51 requests after the handoff. That matters because the proof is not "bigger context worked". The proof is that a fresh branch can stay structurally faithful without replaying the whole past.
oauth-smoke-5 forked session 018486d03
51 requests. RCS 100 sustained.
Fresh branch stayed aligned without transcript replay.
No continuity downgrade after the fork.
Cache-preservation proof artifact
In a 198-message Opus 4.7 session through NeuroRouter, Anthropic prompt cache hit rate stayed above 99% while context was actively shaped. Cache reads grew monotonically from 102K to 140K tokens. Cache writes per turn stayed between 200 and 5,000 tokens. NeuroRouter shapes the suffix past the cache boundary, not the cached prefix.
session 87d04e1d: 198 messages, Opus 4.7, OAuth
cache_read: 102K → 140K (monotonic)
cache_creation: 200–5000 tokens/turn (suffix only)
input_tokens: 1–3 per turn
cache hit rate: 99%+
active shaping: ~5% byte reduction, confined to post-breakpoint suffix
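A cache hit rate can be estimated from the usage counters the Anthropic Messages API returns (`cache_read_input_tokens`, `cache_creation_input_tokens`, `input_tokens`). The sketch below assumes the rate is cache reads divided by all input tokens; how the capture above computed its figure is not stated, and the sample turn is an illustrative value picked from the reported ranges.

```python
def cache_hit_rate(usage: dict) -> float:
    """Share of input tokens served from the prompt cache (assumed definition)."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    fresh = usage.get("input_tokens", 0)
    total = read + created + fresh
    return read / total if total else 0.0


# Illustrative turn: values chosen from within the ranges reported above.
turn = {
    "cache_read_input_tokens": 102_000,
    "cache_creation_input_tokens": 500,
    "input_tokens": 3,
}
rate = cache_hit_rate(turn)
```

Under this definition, a turn that writes several thousand tokens to the cache falls below 99% on its own, so a session-level figure is better computed over summed counters than averaged per turn.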
Drop-in replacement for your API endpoint.
# Claude Code
ANTHROPIC_BASE_URL=http://localhost:9120 claude

# Codex (Responses API)
neurorouter proxy --listen 127.0.0.1:9120 --client-profile codex --api-key env:OPENAI_API_KEY
codex -p nr resume SESSION_ID
codex -p nr -m gpt-5.5 resume SESSION_ID

# Codex config
[model_providers.neurorouter]
name = "NeuroRouter"
base_url = "http://127.0.0.1:9120"
openai_base_url = "http://127.0.0.1:9120"
wire_api = "responses"

[profiles.nr]
model_provider = "neurorouter"

# Other OpenAI-compatible clients
# Validate per client before claiming support.
Verified with Claude Code and Codex CLI. Other OpenAI-compatible tools require compatibility testing before support is claimed; generic chat-completions clients such as Qwen Code are not currently advertised as supported. Provider credentials pass through to configured upstream APIs; NeuroRouter does not phone home with request content or store provider keys on disk.
# start the proxy
neurorouter proxy --listen 127.0.0.1:9120

# see what would be shaped without sending
neurorouter proxy --listen 127.0.0.1:9120 --dry-run
The request log shows what was sent, what was compiled away, and whether decisions, constraints, and rejected approaches survived:
[req] model=gpt-5.4 context=156KB (shaped from 3732KB, 95% shaped) rcs:100 integrity=degraded
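For operators who want to track these numbers over time, the sample line can be parsed with a small script. This is a sketch against the single line shown above; the regular expression and the `parse` helper are assumptions, and the real log format may carry other fields or vary between versions.

```python
import re

LINE = ("[req] model=gpt-5.4 context=156KB (shaped from 3732KB, 95% shaped) "
        "rcs:100 integrity=degraded")

# Pattern written against the one sample line above (an assumption, not a spec).
PATTERN = re.compile(
    r"context=(?P<sent>\d+)KB \(shaped from (?P<orig>\d+)KB.*?\) "
    r"rcs:(?P<rcs>\d+) integrity=(?P<integrity>\w+)"
)


def parse(line: str) -> dict:
    """Extract sizes, RCS, and integrity state from one request log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("unrecognized log line")
    sent, orig = int(match["sent"]), int(match["orig"])
    return {
        "sent_kb": sent,
        "original_kb": orig,
        "reduction": 1 - sent / orig,  # fraction of bytes compiled away
        "rcs": int(match["rcs"]),
        "integrity": match["integrity"],
    }
```

For the sample line, the reduction works out to 1 - 156/3732, or about 0.958, which matches the logged 95% if the logger truncates. The line pairs an RCS of 100 with a degraded integrity state.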
RCS is a request-level continuity score. It is useful, but it is not the whole health model: Session Integrity can invalidate a green RCS when the active objective is stale, the workspace lock conflicts, recovery had to disable major shaping stages, or loop/progress signals say the agent is stuck. Smaller context is not automatically better. Correct context is.
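The relationship between RCS and Session Integrity can be sketched as a rule that lets other signals override a perfect score. Everything below is hypothetical: the signal names, the split between "degraded" and "critical", and the RCS threshold of 90 are assumptions for illustration, not the product's actual rules.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical per-session signals; names are assumed for this sketch."""
    rcs: int
    objective_stale: bool = False
    workspace_conflict: bool = False
    shaping_disabled_by_recovery: bool = False
    loop_detected: bool = False


def integrity(signals: Signals) -> str:
    """Return healthy, degraded, or critical; a green RCS alone is not enough."""
    # Assumed severity split: conflicts and loops are treated as critical.
    if signals.workspace_conflict or signals.loop_detected:
        return "critical"
    # Assumed threshold of 90 for a low RCS.
    if (signals.objective_stale
            or signals.shaping_disabled_by_recovery
            or signals.rcs < 90):
        return "degraded"
    return "healthy"
```

With these assumed rules, `Signals(rcs=100, objective_stale=True)` evaluates to degraded even though the request-level score is perfect.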
Vector Lock keeps the active constraint set: objective, chosen approach, current state, hard constraints, unresolved blockers, and rejected approaches. Workspace Identity Lock keeps the allowed repo, path, and release target explicit. These are not chat memory, RAG, or learning. They are the minimal local state that lets NeuroRouter compile the next request without losing the work.
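A Workspace Identity Lock can be pictured as a small immutable record plus a membership check. The sketch below is an assumption about the shape of such a lock; the class, its fields, and the `allows` method are hypothetical names, not NeuroRouter's implementation.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class WorkspaceLock:
    """Hypothetical lock: the repo, root path, and release target a session may touch."""
    repo: str
    root: PurePosixPath
    release_target: str

    def allows(self, repo: str, path: str) -> bool:
        """True only when the repo matches and the path stays under the locked root."""
        if repo != self.repo:
            return False
        candidate = PurePosixPath(path)
        # Pure paths are not normalized, so reject ".." segments outright.
        if ".." in candidate.parts:
            return False
        return candidate == self.root or self.root in candidate.parents
```

The check is deliberately strict: a path that climbs out of the root with `..` is rejected rather than resolved, so the lock never depends on filesystem state.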
NeuroRouter runs locally and forwards provider credentials only to the upstream provider you configure. It does not phone home with request content or store provider keys on disk. Detected credentials are redacted or blocked before forwarding according to policy.
This is a structural difference from cloud LLM proxies. A 2026 study found 26 LLM proxy services collecting user credentials. The LiteLLM supply-chain breach (March 2026) compromised thousands of organizations. A local proxy removes the hosted credential database from this path; it is not a promise to catch every encoded, chunked, or transformed secret.
lsof -i -P | grep neurorouter   # only your upstream
go build ./cmd/neurorouter
NeuroRouter Pro is most useful when AI coding sessions get long, expensive, or fragile. It keeps useful context alive, removes stale transcript drag, protects detected secrets, repairs safe tool-chain continuity breaks before provider 400 errors, and warns when a session is no longer making trustworthy progress.
It is not a replacement for the model, a hosted gateway, memory, RAG, or autonomous agent brain. It is the live-window layer in a wider stack that can also include Hiveram for shared truth and portable handoff.
Free
$0 — AGPL v3, self-hosted
Context hygiene. Deterministic shaping removes stale reads, repeated reminders, and detected secrets — locally, zero setup. The foundation.
Pro
$29 / month
Deterministic context engineering. Vector Lock, Session Integrity, anchor preservation, graduated nudges, and local recovery. Every transformation is pattern-matched — no LLM calls, no network dependency, no non-determinism. The proxy is a mirror, not an oracle.
Team
$49 / seat / month
LLM-augmented context intelligence. Everything in Pro plus: a small model (Haiku/Mini) runs parallel to extract objectives, constraints, and approaches that pattern matching cannot reach. Shared policy, session evidence analysis, and org-wide enforcement.
Enterprise
Custom pricing
Control AI usage at scale without losing speed. Org-wide policies, secure routing, and protection against data leaks, runaway cost, and workflow breakdowns.
After checkout (or starting the 14-day trial), install the Pro build with
Homebrew and activate your license key. Pro replaces the free
neurorouter binary with the same command name — you do
not run both at once.
macOS or Linux via the public Obsta Labs tap.
brew tap obstalabs/tap
brew install obstalabs/tap/neurorouter-pro
Use the license key emailed at checkout or shown on the trial start page.
nr activate <your-license-key>
Start the local proxy and prepare your coding agent in one command.
nr launch claude
nr launch codex
Direct download: latest Pro release. Team and Enterprise tiers use the same binary — license keys carry the seat entitlements. Lost your key? Email hello@obstalabs.dev.