How it works
How Ren works
Ren is a small, local-first FastAPI service that turns plain intent into action through Claude and a pluggable tool system — with your memory and identity kept on the device it runs on.
The lifecycle of a turn
From a sentence to an action
A request arrives
You talk to Ren over HTTP (POST /chat, or SSE on /chat/stream) or over the /ws/audio voice socket. The service binds to 127.0.0.1 by default, so it is reachable only from the machine it runs on.
The agent assembles context
It loads recent turns from on-device memory and a cached system prompt (a stable base framework + your persona), then calls Claude at the requested tier.
Bounded multi-round tools
Claude can request tools; the registry dispatches them (async) and feeds results back, up to a round budget. If the budget is hit, one final call forces a written answer.
Persist & reply
The turn is written to the local SQLite store (WAL), older context is compacted into a rolling summary, and the reply streams back to you. Nothing leaves the device but the Claude call itself.
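The bounded tool loop described in the steps above can be sketched roughly as follows. This is a minimal illustration, not Ren's actual code: `ModelReply`, `run_turn`, and the `(messages, allow_tools)` model callable are hypothetical names, and tool dispatch is synchronous here for brevity, whereas the text describes it as async.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ModelReply:
    """One model response: prose, or a list of (tool_name, args) requests.

    Hypothetical shape, invented for this sketch.
    """
    text: str = ""
    tool_calls: list[tuple[str, dict]] = field(default_factory=list)


# Hypothetical model callable: (messages, allow_tools) -> ModelReply
Model = Callable[[list[dict], bool], ModelReply]


def run_turn(model: Model, tools: dict[str, Callable[..., str]],
             messages: list[dict], max_rounds: int = 3) -> str:
    """Let the model call tools for at most max_rounds rounds, then answer."""
    for _ in range(max_rounds):
        reply = model(messages, True)
        if not reply.tool_calls:
            return reply.text  # the model answered in prose; the turn is done
        for name, args in reply.tool_calls:
            result = tools[name](**args)
            # Feed each tool result back so the next round can see it.
            messages.append({"role": "tool", "name": name, "content": result})
    # Budget exhausted: one last call with tools disabled forces a written answer.
    return model(messages, False).text
```

The final call outside the loop is what keeps a turn from ending on an unanswered tool request when the budget runs out.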
curl -s localhost:8000/chat \
-H 'content-type: application/json' \
-d '{"message": "dim the living room to movie night"}'
# → Ren picks a tier, runs the home tools, persists the turn, and replies.
The pieces
Small parts, clean seams
Each component does one thing and stays replaceable — the design is built to grow without rewrites.
FastAPI surface
The HTTP/WebSocket service — /health, /chat, /memory, /threads, /identity, /ws/audio — with lifespan-managed state.
Agent loop
History + system prompt → Claude → bounded multi-round tool calls → persist. The heart of every turn.
Tool registry
A plugin registry with a dangerous gate. Register a tool (name, JSON schema, sync/async handler) and nothing else changes.
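One way such a registry could look is sketched below. The names `Tool`, `ToolRegistry`, and `allow_dangerous` are assumptions made for illustration, and the gate shown here (refuse tools flagged dangerous unless explicitly allowed) is only a guess at what the "dangerous gate" does.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    schema: dict[str, Any]        # JSON schema describing the arguments
    handler: Callable[..., Any]   # may be a plain function or a coroutine function
    dangerous: bool = False       # hypothetical flag checked by the gate


class ToolRegistry:
    def __init__(self, allow_dangerous: bool = False) -> None:
        self._tools: dict[str, Tool] = {}
        self.allow_dangerous = allow_dangerous

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    async def dispatch(self, name: str, args: dict[str, Any]) -> Any:
        tool = self._tools[name]
        # The gate: refuse flagged tools before the handler is ever called.
        if tool.dangerous and not self.allow_dangerous:
            raise PermissionError(f"tool {name!r} is gated as dangerous")
        result = tool.handler(**args)
        # Sync handlers return a value; async handlers return an awaitable.
        if inspect.isawaitable(result):
            result = await result
        return result
```

Because dispatch only looks tools up by name, adding a tool is a single `register` call and the agent loop stays unchanged.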
Local memory
Async SQLite (WAL, migrations) on your device — conversation turns and durable notes. Gitignored; never synced.
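As a rough sketch of the storage idea, the snippet below opens a SQLite file in WAL mode and stores conversation turns. It uses the synchronous standard-library `sqlite3` module for brevity (the text describes an async store), and the `turns` table and the function names are assumptions rather than Ren's real schema.

```python
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the local store with write-ahead logging enabled."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS turns ("
        "id INTEGER PRIMARY KEY, role TEXT NOT NULL, content TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def add_turn(conn: sqlite3.Connection, role: str, content: str) -> None:
    # Using the connection as a context manager commits on success.
    with conn:
        conn.execute(
            "INSERT INTO turns (role, content) VALUES (?, ?)", (role, content)
        )


def recent_turns(conn: sqlite3.Connection, limit: int = 20) -> list[tuple[str, str]]:
    """Return the newest `limit` turns in chronological order."""
    rows = conn.execute(
        "SELECT role, content FROM turns ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return rows[::-1]
```

WAL mode lets readers proceed while a write is in flight, which suits a service that reads history at the start of a turn and writes at the end.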
Identity
A local Ed25519 keypair; attest() returns a self-signed identity card, which is the seam for a future trust network.
Model tiers
Code asks for fast / default / hard; one place maps intent → Claude model. A tiny heuristic picks the tier today, and that same spot is the seam where a real router can slot in later.
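The tier idea can be illustrated with a single lookup table plus a toy heuristic. Everything here is an assumption for the sake of the example: the model ids are placeholders rather than real model names, and `pick_tier` uses made-up rules, not Ren's actual heuristic.

```python
from typing import Literal

Tier = Literal["fast", "default", "hard"]

# The one place that knows about model ids. These strings are placeholders.
MODEL_FOR_TIER: dict[str, str] = {
    "fast": "model-id-for-fast",
    "default": "model-id-for-default",
    "hard": "model-id-for-hard",
}


def pick_tier(message: str) -> Tier:
    """Toy heuristic: long or explicitly stepwise requests go to 'hard',
    short ones to 'fast', everything else to 'default'."""
    words = len(message.split())
    if words > 60 or "step by step" in message.lower():
        return "hard"
    if words <= 8:
        return "fast"
    return "default"


def resolve_model(tier: Tier) -> str:
    """Map an intent tier to a concrete model id."""
    return MODEL_FOR_TIER[tier]
```

Callers only ever pass a tier, so swapping `pick_tier` for a smarter router or changing a model id touches this module and nothing else.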
Home engine
A provider abstraction over Home Assistant, a generic webhook, and the capability-model connectors — rooms, scenes, automations, a state watcher.
Voice pipeline
Optional, CPU-only: Whisper STT, Piper TTS, Silero VAD, and wake-word — loaded only when you enable it.
Design commitments
The principles it holds to
Private by default
Memory is a SQLite file on your machine. Nothing about Ren requires a cloud account beyond the Claude API call itself.
Hardware-agnostic now, sovereign later
Ren runs anywhere Python runs. Only identity/ knows about hardware — today a key file; later, silicon-rooted, with no caller changes.
Speak in intent, not model ids
Callers ask for fast/default/hard. The mapping lives in one place — the seam where a real router will later live.
Configurable persona
The system prompt is a stable base framework plus your name + free-text overlay, composed once at startup as a prompt-cache anchor.
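A minimal sketch of composing the prompt once is shown below. `BASE_FRAMEWORK`, `compose_system_prompt`, and `build_system_blocks` are hypothetical names and the base text is a placeholder. The `cache_control` marker follows the Anthropic Messages API format for prompt caching; whether Ren marks its prompt this way is an assumption.

```python
# Placeholder text standing in for the stable base framework.
BASE_FRAMEWORK = "You are a local-first assistant. Use tools when they help."


def compose_system_prompt(name: str, overlay: str) -> str:
    """Join the base framework, the persona name, and the free-text overlay."""
    parts = [BASE_FRAMEWORK, f"Your name is {name}."]
    if overlay.strip():
        parts.append(overlay.strip())
    return "\n\n".join(parts)


def build_system_blocks(name: str, overlay: str) -> list[dict]:
    """Wrap the prompt as a system content block marked as cacheable."""
    return [
        {
            "type": "text",
            "text": compose_system_prompt(name, overlay),
            "cache_control": {"type": "ephemeral"},
        }
    ]


# Built once at startup and reused for every call: the text must stay
# byte-identical between requests for the cached prefix to be reused.
SYSTEM_BLOCKS = build_system_blocks("Ren", "Be brief and warm.")
```

Composing at startup rather than per request is what keeps the prefix stable; a prompt that embedded, say, the current time would change on every call and defeat the cache.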