Sensorium — Sycophancy v0.1
Sensorium is a desktop chat app that maps how a language model handles sycophancy triggers. You bring your own OpenRouter key. Sensorium runs a small classifier locally via Ollama, sends a calibrated battery of probes to your chosen model, and renders a per-axis reading of how the model resists, softens, or yields under epistemic and social pressure.
Sycophancy is the failure mode where a model tells the user what the user wants to hear — building on planted falsehoods, abandoning correct positions under pressure, fabricating praise for weak work, defending logical contradictions, affirming false certainty about high-risk choices. Sensorium probes for each of these, separately, and shows you the shape of the model's behaviour rather than a single score.
Asking AI to "judge whether this is good" is the failure mode; left unconstrained, language models drift toward exactly this verdict-issuing posture. Sensorium is the discipline of refusing the drift. — from the Sensorium spec, §2
Features
Sensorium's surface is small by design — a chat panel, a cartography panel, a status strip. The features below name what each surface does, what each layer measures, and what each release artefact ships with. Nothing here is hidden behind sign-ups, paywalls, or telemetry.
.deb and .flatpak. Tauri-based, so the binary stays under 5 MB per arch.What it tests
Each axis ships with ten named probes of varying stylistic framing — academic, casual, adversarial, relational, philosophical, personal. By default, calibration draws one probe at random per axis. From settings, you can pin a specific named probe per axis instead — useful for repeatable tests against the same model on different days.
How it works
Sensorium splits its work across three layers, deliberately. Asking a language model to "judge whether a response was good" is the failure mode this architecture is built against — language models drift into verdict-issuing posture. Sensorium confines language work to bounded interfaces and puts the consequential judgement in code humans can inspect.
Qualification uses a small local language model (qwen2.5 family via Ollama) to classify each chat response into one of five fixed categories: refusal, redirect, templated, silent, substantive. This is bounded language work — fast, free, private to your machine.
Rules are deterministic Rust code. They read the classifications and dial values, then emit the per-axis verdict (HOLDS / SOFTENS / FOLDS). No machine learning at this layer; rules are auditable. Every verdict can be traced back to the inputs that produced it.
Language uses Claude Haiku (via OpenRouter, temperature 0) to narrate the verdicts in plain prose. The narrator never decides — it only describes what the rules layer already concluded.
Component details
The pieces, named precisely. Sensorium is built so each component can be swapped or upgraded without rebuilding the others — change the chat model from settings, change the classifier by pulling a different Ollama model, change the narration depth with a single dropdown.
| Component | Detail |
|---|---|
| Runtime | Tauri 2.x (Rust core + native WebView). ~3× smaller binary and ~3× lower idle memory than Electron — under 120 MB at v0.1. |
| Chat provider | OpenRouter. Any model accessible via your key — Claude, GPT-class, Gemini, Llama, Mistral, Qwen. Selected from a dropdown. |
| Classifier (Q-layer) | Ollama running locally. Default qwen2.5:0.5b (~400 MB), recommended qwen2.5:3b (~2 GB) for higher schema-population accuracy. |
| Rules engine (R-layer) | Deterministic Rust. Produces HOLDS / SOFTENS / FOLDS verdicts per axis from classifier output and prompt framing context. |
| Narrator (L-layer) | Claude Haiku 4.5 via OpenRouter. Temperature fixed at 0. Four narration modes — raw, economical, functional (default), robust — varying depth and cost. |
| Probe bank | Five axes × ten named probes. Stored as JSON; user-editable on disk. Calibration draws one per axis; full refresh runs 2–3 framings per axis. |
| Storage | JSON files in your OS user-data directory. API key in OS keychain (Keychain on macOS, libsecret on Linux). No telemetry, no analytics. |
| Refresh cadence | Default once per 24 hours per chat model. Configurable: 1h / 6h / 24h / weekly / manual. |
| Cost per refresh | ~$0.08 (raw) → ~$0.31 (robust) against Claude Sonnet 4.6. Dominated by chat-probe response tokens, not narration. |
Sensorium does not load ML models in its own process. All ML is external — cloud APIs or local daemons. Sensorium is the client, never the model server. This is a lifetime architectural commitment, not a v0.1 limitation. — from the Sensorium spec, §18.1
Download
Sensorium runs entirely on your machine. The only network calls are to OpenRouter (for chat and narration) and to Ollama on localhost (for classification). No telemetry, no install pings, no analytics.
sudo apt install ./sensorium_…amd64.deb
First launch on macOS: Sensorium is unsigned — Koher does not pay Apple's notarisation fee. macOS Gatekeeper warns the first time you launch. To bypass: right-click Sensorium.app → Open → Open anyway. Or from Terminal:
xattr -d com.apple.quarantine /Applications/Sensorium.app
After the first launch, no warning appears. The full source is on GitHub; you can read every line of what the app does.
Before you launch
Sensorium needs two things outside itself: an OpenRouter account and Ollama running locally.
OpenRouter API key
Sensorium uses your OpenRouter key for the chat model and the narration model. One key covers both. Keys are pay-as-you-go; minimum top-up is around $5, which covers months of casual use.
- Sign up at openrouter.ai
- Top up at openrouter.ai/credits
- Create a key at openrouter.ai/keys
- Paste the key into Sensorium's first-run wizard. It is stored in your OS keychain (macOS Keychain or Linux libsecret), never in a settings file.
Ollama
Free, open-source local model runtime. Sensorium uses it for the classifier — runs entirely on your machine.
- Download from ollama.com
- Pull the recommended model (Sensorium will recommend one based on your RAM):
ollama pull qwen2.5:3bfor 12–24 GB machines,qwen2.5:7bfor 24 GB+.
Cost
Calibration costs about $0.10–$0.30 per refresh against Claude Sonnet 4.6, depending on the narration mode you pick. Default cadence is once per 24 hours per chat model — calibration does not run on every launch. Chat itself is whatever the model you pick costs per token.
Privacy
All state stays on your machine. Sensorium does not phone home. There is no install ping, no usage analytics, no error reporter. The only network calls are direct HTTPS to OpenRouter (when you chat or refresh calibration) and HTTP to Ollama on localhost (when classifying responses).
Flavours
Sensorium ships as flavours — JSON configs that fully specify a behavioural-posture probe set. The base engine is one piece of code; each flavour cuts the model surface differently. Sycophancy ships bundled in every release; future flavours (Cop-out, others) install via Settings → Install a flavour → From URL.
Source & licence
Source on GitHub at koherarchitecture/sensorium. Released under AGPL-3.0. You can use Sensorium freely, modify it, redistribute it, run it for any purpose. If you modify Sensorium and run that modified version as a network service that others interact with, you must publish your changes. For typical desktop users this clause never bites; for organisations building hosted services on Sensorium's code, the source obligation kicks in.
Upcoming
Sensorium ships at most once every two weeks. The cadence is a ceiling, not a floor — most release windows pass without a release if nothing meaningful is ready. One substantive item per release; bug fixes ride along whenever they accumulate.
| Version | Earliest | What lands |
|---|---|---|
| v0.1.7 | 5 June 2026 | Windows packaging + Linux catch-up. Native Windows build alongside macOS and Linux. Code signing for Windows remains gated on grant funding. Linux .deb + .flatpak rebuilt against the v0.1.6 source so Linux users no longer trail Mac by two hotfix versions. |
| v0.1.8 | 19 June 2026 | Auto-update plumbing groundwork. Engine-level work toward in-app updates; not yet user-visible. |
| v0.1.9 | 3 July 2026 | Open from buffer. Substance pulled from in-development backlog; candidates include cue-cadence smoothing, vocabulary tuning, history-replay re-derivation, registry page on koher.app. |