All tools

Sensorium — Sycophancy v0.1

Desktop application macOS · Windows · Linux AGPL-3.0 Released 8 May 2026 v0.1.7 · 2 June 2026 → Flavours →

Sensorium is a desktop chat app that maps how a language model handles sycophancy triggers. You bring your own OpenRouter key. Sensorium runs a small classifier locally via Ollama, sends a calibrated battery of probes to your chosen model, and renders a per-axis reading of how the model resists, softens, or yields under epistemic and social pressure.

Five axes · one example posture · your model will produce its own pattern

Sycophancy is the failure mode where a model tells the user what the user wants to hear — building on planted falsehoods, abandoning correct positions under pressure, fabricating praise for weak work, defending logical contradictions, affirming false certainty about high-risk choices. Sensorium probes for each of these, separately, and shows you the shape of the model's behaviour rather than a single score.

Asking AI to "judge whether this is good" is the failure mode; left unconstrained, language models drift toward exactly this verdict-issuing posture. Sensorium is the discipline of refusing the drift. — from the Sensorium spec, §2

Features

Sensorium's surface is small by design — a chat panel, a cartography panel, a status strip. The features below name what each surface does, what each layer measures, and what each release artefact ships with. Nothing here is hidden behind sign-ups, paywalls, or telemetry.

Local-first chat

No cloud, no telemetry, no install ping. State stays in your OS user-data directory; the API key in the OS keychain. The only network calls are direct HTTPS to OpenRouter and HTTP loopback to Ollama on your machine.

Bring your own model

One OpenRouter key, every model. Switch between Claude, GPT-class, Gemini, Llama, Mistral, Qwen from a dropdown in the top bar. The cartography re-reads on the new model with one click.

Filter cartography

A five-row map showing how the active model handles each sycophancy axis. Each row carries a verdict, an expandable probe-and-response trace, and the five-dial cluster. Updated on calibration, never on every chat turn.

Five-dial cluster

Per probe, code extracts five signals from the response — capitulation depth, hedge density, affirmation count, concession depth, refusal-pattern fit. All deterministic; no ML at the rules layer. Reproducible from the same input.

New · v0.1.7

Live sensed-split needle

A galvanometer needle reads each model reply and moves every chat round — calibration sets the opening baseline, the conversation drives the instrument from there. Code maps the bounded dial signals to a held value; the model never issues the reading. This is the sensed-split register of the split ratio — what an interaction reads as, live, distinct from the practitioner's self-rated figure.

Three-state verdicts

HOLDS · SOFTENS · FOLDS. Three categories per axis; no numeric score, no rating out of ten. Pseudo-precision is a verdict shape Sensorium refuses.

Behind the Curtain

Every probe exposes its Q → R → L trace — the classifier output the qualifier produced, the deterministic rule the result triggered, the prompt the narrator received. Any verdict is auditable back to the inputs that produced it.

New · v0.1.3

Suggested-tone cues

A row of system-selected coaching cues appears above the composer once a calibration has run. Up to three cues from a five-candidate vocabulary, drawn from the recent fingerprint. Read-only — the system selects them; you read them as coaching for your next message.

Chat history + search

Conversations persist across launches. A sidebar lists prior conversations; search runs across both titles and contents. Each exchange records the flavour and the model used at the time, so old conversations replay with their original context intact.

Cost transparency

Per-calibration cost is shown before any refresh runs. Four narration modes (raw · economical · functional · robust) act as the cost lever — roughly $0.08 to $0.31 per refresh against Claude Sonnet 4.6. Chat costs are per-token at the model's published rate.

Cross-platform

Native builds for macOS (Apple Silicon + Intel), Windows (x64), and Linux (amd64 + arm64). Linux ships as both .deb and .flatpak. Tauri-based, so the binary stays under 5 MB per arch.

Open source · AGPL-3.0

Every line of source on GitHub. Audit any pipeline step, fork for your own use, modify and redistribute freely. The AGPL clause only triggers for organisations running modified versions as hosted network services.

Flavour architecture

Sensorium is a base engine; Sycophancy is the first flavour. Each flavour is a JSON config shipping its own probe bank, dials, and narration prompts. Cop-out is queued next; future flavours land as drop-in JSON without engine changes.

What it tests

i. Planted falsehood

A false premise embedded in academic-register prose. Does the model correct the falsehood or build on it?

ii. Pressure reversal

The user pushes the model to abandon a correct earlier position. Does the model fold or hold?

iii. Mediocre praise

The user asks for praise of weak creative work. Does the model fabricate quality or assess honestly?

iv. Contradiction validation

The user asks the model to defend a logical contradiction. Does the model name the contradiction or argue both sides?

v. Certainty validation

The user asserts false certainty about a high-risk choice. Does the model affirm or examine?

Each axis ships with ten named probes of varying stylistic framing — academic, casual, adversarial, relational, philosophical, personal. By default, calibration draws one probe at random per axis. From settings, you can pin a specific named probe per axis instead — useful for repeatable tests against the same model on different days.

How it works

Sensorium splits its work across three layers, deliberately. Asking a language model to "judge whether a response was good" is the failure mode this architecture is built against — language models drift into verdict-issuing posture. Sensorium confines language work to bounded interfaces and puts the consequential judgement in code humans can inspect.

Qualification uses a small local language model (qwen2.5 family via Ollama) to classify each chat response into one of five fixed categories: refusal, redirect, templated, silent, substantive. This is bounded language work — fast, free, private to your machine.

Rules are deterministic Rust code. They read the classifications and dial values, then emit the per-axis verdict (HOLDS / SOFTENS / FOLDS). No machine learning at this layer; rules are auditable. Every verdict can be traced back to the inputs that produced it.

Language uses Claude Haiku (via OpenRouter, temperature 0) to narrate the verdicts in plain prose. The narrator never decides — it only describes what the rules layer already concluded.

Component details

The pieces, named precisely. Sensorium is built so each component can be swapped or upgraded without rebuilding the others — change the chat model from settings, change the classifier by pulling a different Ollama model, change the narration depth with a single dropdown.

Component	Detail
Runtime	Tauri 2.x (Rust core + native WebView). ~3× smaller binary and ~3× lower idle memory than Electron — under 120 MB at v0.1.
Chat provider	OpenRouter. Any model accessible via your key — Claude, GPT-class, Gemini, Llama, Mistral, Qwen. Selected from a dropdown.
Classifier (Q-layer)	Ollama running locally. Default `qwen2.5:0.5b` (~400 MB), recommended `qwen2.5:3b` (~2 GB) for higher schema-population accuracy.
Rules engine (R-layer)	Deterministic Rust. Produces HOLDS / SOFTENS / FOLDS verdicts per axis from classifier output and prompt framing context.
Narrator (L-layer)	Claude Haiku 4.5 via OpenRouter. Temperature fixed at 0. Four narration modes — raw, economical, functional (default), robust — varying depth and cost.
Probe bank	Five axes × ten named probes. Stored as JSON; user-editable on disk. Calibration draws one per axis; full refresh runs 2–3 framings per axis.
Storage	JSON files in your OS user-data directory. API key in OS keychain (Keychain on macOS, libsecret on Linux). No telemetry, no analytics.
Refresh cadence	Default once per 24 hours per chat model. Configurable: 1h / 6h / 24h / weekly / manual.
Cost per refresh	~$0.08 (raw) → ~$0.31 (robust) against Claude Sonnet 4.6. Dominated by chat-probe response tokens, not narration.

AI does language work · code does judgement · AI translates verdicts to prose

Sensorium does not load ML models in its own process. All ML is external — cloud APIs or local daemons. Sensorium is the client, never the model server. This is a lifetime architectural commitment, not a v0.1 limitation. — from the Sensorium spec, §18.1

Download

Sensorium runs entirely on your machine. The only network calls are to OpenRouter (for chat and narration) and to Ollama on localhost (for classification). No telemetry, no install pings, no analytics.

macOS — Apple Silicon

M1 / M2 / M3 / M4 — recommended for most newer Macs

Download .dmg →

macOS — Intel

x86_64 — for older Intel-based Macs

Download .dmg →

Windows — x64

x86_64 — NSIS installer (.exe) for Windows 10 / 11

Download .exe →

Linux — Debian / Ubuntu

x86_64 .deb · install with sudo apt install ./sensorium_…amd64.deb

Download .deb →

Linux — Flatpak

Any distribution with flatpak — sandboxed install

Download .flatpak →

First launch on macOS: Sensorium is unsigned — Koher does not pay Apple's notarisation fee. macOS Gatekeeper warns the first time you launch. To bypass: right-click Sensorium.app → Open → Open anyway. Or from Terminal:

xattr -d com.apple.quarantine /Applications/Sensorium.app

First launch on Windows: the installer is unsigned for the same reason — Koher does not pay for a code-signing certificate. Windows SmartScreen shows a blue "Windows protected your PC" screen the first time you run it. To proceed: click More info → Run anyway. No warning appears afterwards.

After the first launch on either platform, no warning appears. The full source is on GitHub; you can read every line of what the app does.

Before you launch

Sensorium needs two things outside itself: an OpenRouter account and Ollama running locally.

OpenRouter API key

Sensorium uses your OpenRouter key for the chat model and the narration model. One key covers both. Keys are pay-as-you-go; minimum top-up is around $5, which covers months of casual use.

Sign up at openrouter.ai
Top up at openrouter.ai/credits
Create a key at openrouter.ai/keys
Paste the key into Sensorium's first-run wizard. It is stored in your OS keychain (macOS Keychain or Linux libsecret), never in a settings file.

Ollama

Free, open-source local model runtime. Sensorium uses it for the classifier — runs entirely on your machine.

Download from ollama.com
Pull the recommended model (Sensorium will recommend one based on your RAM): ollama pull qwen2.5:3b for 12–24 GB machines, qwen2.5:7b for 24 GB+.

Cost

Calibration costs about $0.10–$0.30 per refresh against Claude Sonnet 4.6, depending on the narration mode you pick. Default cadence is once per 24 hours per chat model — calibration does not run on every launch. Chat itself is whatever the model you pick costs per token.

Privacy

All state stays on your machine. Sensorium does not phone home. There is no install ping, no usage analytics, no error reporter. The only network calls are direct HTTPS to OpenRouter (when you chat or refresh calibration) and HTTP to Ollama on localhost (when classifying responses).

Flavours

Sensorium ships as flavours — JSON configs that fully specify a behavioural-posture probe set. The base engine is one piece of code; each flavour cuts the model surface differently. Sycophancy ships bundled in every release; future flavours (Cop-out, others) install via Settings → Install a flavour → From URL.

Browse the flavour registry →

Source & licence

Source on GitHub at koherarchitecture/sensorium. Released under AGPL-3.0. You can use Sensorium freely, modify it, redistribute it, run it for any purpose. If you modify Sensorium and run that modified version as a network service that others interact with, you must publish your changes. For typical desktop users this clause never bites; for organisations building hosted services on Sensorium's code, the source obligation kicks in.

Upcoming

Sensorium ships at most once every two weeks. The cadence is a ceiling, not a floor — most release windows pass without a release if nothing meaningful is ready. One substantive item per release; bug fixes ride along whenever they accumulate.

Version	Earliest	What lands
v0.1.8	19 June 2026	Auto-update plumbing groundwork. Engine-level work toward in-app updates; not yet user-visible.
v0.1.9	3 July 2026	Open from buffer. Substance pulled from in-development backlog; candidates include cue-cadence smoothing, vocabulary tuning, history-replay re-derivation, registry page on koher.app.

Earliest dates, not commitments. A "broken-enough-to-fix-now" bug ships immediately as a hotfix and does not wait for the next window.

Source on GitHub Latest release Full README All tools Koher positions