// What you get
Everything a home assistant should do — on your terms.
Kenzy hears you, understands you, acts on your home, and talks back — with every piece able to run on hardware you own. Here's what that means in practice. (Each capability is also its own small service, so you can run them on one machine or spread them across the house — the technical view lives on the architecture page.)
01 · IN EVERY ROOM · kenzy-node
Wake word, listening on-device
Every room node runs openWakeWord on each audio frame, locally. Nothing is streamed anywhere until the wake word actually fires — the mic isn't an open line to the cloud.
An optional Silero VAD gate suppresses false triggers on near-silence, so you can lower the threshold for better real-speech sensitivity without the assistant waking to a creaky floorboard.
- Bundled wake model, or drop in a .tflite/.onnx you trained yourself
- VAD-gated detection to cut false wakes
- Wake-word interrupt — talk over playback to start a new request
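The gating idea can be sketched in a few lines. This is a minimal illustration rather than Kenzy's actual code: the `WakeGate` name, the example scores, and both thresholds are assumptions, and the per-frame scores stand in for whatever a wake-word model and a VAD would report.

```python
from dataclasses import dataclass


@dataclass
class WakeGate:
    """Combine a wake-word score with a VAD score (both 0.0 to 1.0).

    Hypothetical helper for illustration; the thresholds are made-up defaults.
    """
    wake_threshold: float = 0.4   # can sit lower because the VAD filters noise
    vad_threshold: float = 0.5    # minimum speech probability to accept a wake

    def should_wake(self, wake_score: float, vad_score: float) -> bool:
        # Fire only when the frame looks like the wake word AND like speech.
        return wake_score >= self.wake_threshold and vad_score >= self.vad_threshold


gate = WakeGate()
creaky_floorboard = gate.should_wake(wake_score=0.45, vad_score=0.05)
spoken_wake_word = gate.should_wake(wake_score=0.45, vad_score=0.92)
```

The same wake score is rejected for the floorboard and accepted for real speech, which is what makes a lower wake threshold safe to use.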
02 · UNDERSTANDS YOU · kenzy-stt
Transcription on your machine
Speech-to-text runs on faster-whisper — locally, by default. Pick a model size to match your hardware, from tiny on a Pi to large-v3 on a GPU workstation.
Because it's a service behind a URL, you choose where it lives. Keep it on the LAN and your spoken words are transcribed without ever touching someone else's server.
- CPU or CUDA, with int8 / float16 compute options
- Model size is one line of config
- Prefer the cloud? An OpenAI-backed mode is one line of config — and if it fails, Kenzy quietly falls back to local whisper
    whisper:
      model: "base"          # tiny → large-v3
      device: "cpu"          # or "cuda"
      compute_type: "int8"
      language: "en"
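The cloud-then-local fallback described in the list above follows a familiar pattern. Here is a rough sketch under stated assumptions: `transcribe_with_fallback`, `flaky_cloud`, and `local_whisper` are hypothetical names, and the two backends are plain stand-in functions rather than real OpenAI or faster-whisper calls.

```python
from typing import Callable

Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(audio: bytes, primary: Transcriber, local: Transcriber) -> str:
    """Try the primary (for example cloud) backend, then fall back to local."""
    try:
        return primary(audio)
    except Exception:
        # Any failure (network, auth, timeout) degrades to the local backend.
        return local(audio)


def flaky_cloud(audio: bytes) -> str:
    # Stand-in for a hosted backend that is currently unreachable.
    raise ConnectionError("cloud STT unreachable")


def local_whisper(audio: bytes) -> str:
    # Stand-in for a real local transcription.
    return "turn on the lights"


text = transcribe_with_fallback(b"\x00\x01", flaky_cloud, local_whisper)
```

The caller never sees the cloud failure; it simply gets the local transcript.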
03 · YOUR AI BRAIN · kenzy-llm
Bring your own language model
This is the heart of the "100% local" story. kenzy-llm runs on LiteLLM, which talks to local runtimes — Ollama, LM Studio, vLLM — exactly the same way it talks to OpenAI or Anthropic.
So you decide the privacy/quality trade-off, per install. Run a small model entirely offline, or route to a frontier model in the cloud. Changing your mind is two lines of YAML.
- Local or cloud — same config, swap freely
- Optional local fallback — a cloud outage degrades to a local model instead of silence
- Per-room conversation memory with a short TTL
- Structured responses that carry a TTS voice style
    # fully offline
    model: "ollama/llama3.1"
    base_url: "http://localhost:11434"
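The per-room memory with a short TTL could look roughly like the sketch below. Everything here is an assumption for illustration: the `RoomMemory` class, the 120-second default, and the injectable clock (used so the expiry can be shown without waiting) are not taken from Kenzy's source.

```python
import time
from collections import defaultdict
from typing import Callable


class RoomMemory:
    """Keep recent chat turns per room and drop them after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._turns: dict[str, list[tuple[float, dict]]] = defaultdict(list)

    def add(self, room: str, role: str, content: str) -> None:
        self._turns[room].append((self.clock(), {"role": role, "content": content}))

    def history(self, room: str) -> list[dict]:
        # Discard turns older than the TTL, then return what is left in order.
        cutoff = self.clock() - self.ttl
        fresh = [(t, m) for t, m in self._turns[room] if t >= cutoff]
        self._turns[room] = fresh
        return [m for _, m in fresh]


# Fake clock so the example is deterministic.
now = [0.0]
memory = RoomMemory(ttl_seconds=120.0, clock=lambda: now[0])
memory.add("kitchen", "user", "What's the weather?")
now[0] = 60.0
kitchen_at_60 = memory.history("kitchen")    # still within the TTL
now[0] = 200.0
kitchen_at_200 = memory.history("kitchen")   # expired, so the list is empty
```

Keying the store by room keeps a follow-up in the kitchen from picking up context from the office.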
04 · ACTS ON YOUR WORLD · skills
Tool-calling skills, zero boilerplate
A skill is just an async Python function in skills/ with a @skill decorator. Kenzy reads its signature and docstring to build the tool schema automatically — the model calls it when it fits.
Weather, news, stocks, web search, timers & reminders, shopping lists, and Home Assistant control ship in the box. Adding your own is one file, no registration.
- Auto-generated tool schemas from type hints
- Per-skill config in llm.yaml, secrets from .env
- Disable any skill by name without deleting it
    @skill
    async def set_scene(name: str) -> str:
        """Activate a lighting scene by name."""
        return f"Scene {name} is on."
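To make the "reads its signature and docstring" step concrete, here is one way such a schema could be derived with the standard library. This is a sketch, not Kenzy's implementation: `build_tool_schema` and the `JSON_TYPES` mapping are hypothetical, and the output follows the common OpenAI-style function-tool layout as an assumption about the target format.

```python
import inspect
from typing import get_type_hints

# Map Python annotations to JSON Schema type names (small, illustrative subset).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_schema(func) -> dict:
    """Derive an OpenAI-style function tool schema from a callable."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


async def set_scene(name: str) -> str:
    """Activate a lighting scene by name."""
    return f"Scene {name} is on."


schema = build_tool_schema(set_scene)
```

The docstring becomes the tool description the model sees, so a clear one-line docstring matters as much as the type hints.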
05 · THE DAILY STUFF · timers, reminders & lists
The boring things, done brilliantly
"Set a timer for 20 minutes." "Wake me at 6." "Remind me to call Mom at 8." Spoken into the air from any room, delivered back in that room — with a proper tone first, so an alarm still rings even if the voice pipeline is having a bad day.
Shopping and to-do lists live in Home Assistant's own lists, so whatever you add by voice is on everyone's phone at the store. And Kenzy can hold a short conversation — answer a follow-up, finish a knock-knock joke — without you repeating the wake word.
- Timers, alarms & recurring reminders — set, check, and cancel by voice
- "Turn off the lights in 30 minutes" — deferred commands, spoken now, done later
- Lists sync through Home Assistant to every phone in the house
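Set, check, and cancel map naturally onto a small priority queue. The sketch below is purely illustrative: `TimerBook`, its method names, and the explicit `now` arguments are assumptions chosen to keep the example deterministic, not a description of how Kenzy stores timers.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Timer:
    due: float                        # seconds on some monotonic clock
    seq: int                          # unique ID, also breaks ties in the heap
    room: str = field(compare=False)
    label: str = field(compare=False)


class TimerBook:
    """Hypothetical in-memory timer store: set, check, cancel, and pop due timers."""

    def __init__(self) -> None:
        self._heap: list[Timer] = []
        self._cancelled: set[int] = set()
        self._seq = itertools.count()

    def set(self, now: float, seconds: float, room: str, label: str) -> int:
        timer = Timer(now + seconds, next(self._seq), room, label)
        heapq.heappush(self._heap, timer)
        return timer.seq

    def cancel(self, seq: int) -> None:
        self._cancelled.add(seq)      # lazily skipped when it comes due

    def remaining(self, now: float, room: str) -> list[tuple[str, float]]:
        return sorted((t.label, t.due - now) for t in self._heap
                      if t.room == room and t.seq not in self._cancelled)

    def pop_due(self, now: float) -> list[Timer]:
        due = []
        while self._heap and self._heap[0].due <= now:
            timer = heapq.heappop(self._heap)
            if timer.seq not in self._cancelled:
                due.append(timer)
        return due


book = TimerBook()
pasta = book.set(now=0, seconds=20 * 60, room="kitchen", label="pasta")
lights = book.set(now=0, seconds=30 * 60, room="den", label="lights off")
book.cancel(lights)
kitchen_status = book.remaining(now=600, room="kitchen")
fired = [(t.room, t.label) for t in book.pop_due(now=1800)]
```

Because each timer carries its room, whatever fires can be delivered back to the room that set it.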
06 · INSTANT ANSWERS · fast path
Instant commands, no round-trip
Some things shouldn't wait on a language model. "Turn on the lights" should just happen. Kenzy's deterministic fast path parses common commands locally and acts immediately — the LLM is the fallback, not the bottleneck.
It uses padacioso for intent parsing and rapidfuzz for device matching, with the model catching anything ambiguous. The lights are usually on before the confirmation finishes speaking.
- High-frequency commands answer in milliseconds
- Falls through to the LLM when it isn't sure
- Past-tense confirmations — the device already changed
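The parse-then-match flow can be sketched as follows. To stay dependency-free, this example swaps in a regular expression for padacioso and `difflib` for rapidfuzz, so it shows the shape of the technique rather than those libraries' APIs. The device list, the pattern, and the 0.75 cutoff are all assumptions.

```python
import difflib
import re
from typing import Optional

# Hypothetical device names; a real install would load these from Home Assistant.
DEVICES = ["kitchen lights", "living room lamp", "garage door"]

PATTERN = re.compile(r"^turn (?P<state>on|off) (?:the )?(?P<device>.+)$")


def fast_path(utterance: str, cutoff: float = 0.75) -> Optional[tuple[str, str]]:
    """Return (device, state) for a confident match, or None to defer to the LLM."""
    match = PATTERN.match(utterance.lower().strip())
    if match is None:
        return None   # not a command shape we know: let the language model handle it
    candidates = difflib.get_close_matches(match["device"], DEVICES, n=1, cutoff=cutoff)
    if not candidates:
        return None   # device name too ambiguous to act on deterministically
    return candidates[0], match["state"]


handled = fast_path("Turn on the kitchen light")
deferred = fast_path("Make it cozy in here")
```

Returning `None` instead of guessing is the important design choice: the fast path only acts when it is sure, and everything else falls through.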
07 · KNOWS WHO'S TALKING · kenzy-speaker
Knows who is speaking
SpeechBrain's ECAPA-TDNN model identifies enrolled speakers from a short voice profile. Enroll each person once with kenzy-enroll and Kenzy can tell the household apart — all locally.
That powers real guardrails: unlocking a door or opening a garage by voice can be restricted to a recognized person, refusing an unidentified voice outright.
- On-device speaker embeddings, no cloud
- Speaker-gated secure actions (locks, covers)
- Runs in parallel with transcription — no added latency
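Speaker gating boils down to comparing an embedding against enrolled profiles and refusing secure actions when nobody matches. The sketch below assumes cosine similarity with a fixed threshold; the function names, the 0.7 threshold, the `SECURE_DOMAINS` set, and the toy three-dimensional vectors are all hypothetical.

```python
import math
from typing import Optional

SECURE_DOMAINS = {"lock", "cover"}   # assumed set of speaker-gated domains


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def identify(embedding: list[float], profiles: dict[str, list[float]],
             threshold: float = 0.7) -> Optional[str]:
    """Return the best-matching enrolled speaker, or None if nobody is close enough."""
    best_name, best_score = None, threshold
    for name, profile in profiles.items():
        score = cosine(embedding, profile)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def allowed(domain: str, speaker: Optional[str]) -> bool:
    # Secure domains need a recognized speaker; everything else is open.
    return domain not in SECURE_DOMAINS or speaker is not None


# Toy 3-dimensional embeddings; real speaker embeddings are much longer.
profiles = {"alex": [1.0, 0.0, 0.0], "sam": [0.0, 1.0, 0.0]}
known = identify([0.9, 0.1, 0.0], profiles)
stranger = identify([0.5, 0.5, 0.7], profiles)
```

With these values, `allowed("lock", stranger)` is refused while `allowed("light", stranger)` still goes through.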
08 · SPEAKS BACK · kenzy-tts
A voice of its own
Text-to-speech runs through OpenAI's TTS for polish, or Kokoro for a fully local PyTorch voice — your choice of where the audio is synthesized.
The LLM can attach a voice style to each reply, so Kenzy can sound calm, upbeat, or matter-of-fact to match the moment.
- Local Kokoro voices or hosted OpenAI voices
- Cloud voice down? Kenzy falls back to the local one — and if a request truly fails, she says so out loud instead of going silent
- Per-response voice styling
- Streamed back to the room as raw audio
09 · YOUR SMART HOME · Home Assistant
Home Assistant, in both directions
Kenzy speaks Home Assistant natively: lights, locks, covers, scenes — resolved by room, matched by name, with a curation layer for the aliases and defaults your house actually uses ("the big lamp", "movie lights").
And it flows the other way too. Kenzy surfaces into HA over MQTT discovery: every room node appears as an HA device — who spoke last, room state, trigger and mute controls — so your automations can react to voice activity, and an automation can make Kenzy announce anything, house-wide.
- Voice control of your HA devices, room-aware and name-fuzzy
- Nodes appear in HA automatically — no HA-side code, just MQTT discovery
- Announcements from automations: "the wash is done," said out loud
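For a sense of what MQTT discovery involves, the sketch below builds one discovery message for a "last speaker" sensor. The topic layout and the `name`, `unique_id`, `state_topic`, and `device` keys follow Home Assistant's MQTT discovery convention; the `kenzy/...` state topic, the ID scheme, and the `discovery_message` helper are assumptions for illustration.

```python
import json


def discovery_message(room: str, prefix: str = "homeassistant") -> tuple[str, str]:
    """Build the (topic, payload) pair announcing a 'last speaker' sensor for one room.

    The kenzy/... state topic and the ID scheme are assumptions for illustration.
    """
    node_id = f"kenzy_{room}"
    topic = f"{prefix}/sensor/{node_id}/last_speaker/config"
    payload = {
        "name": "Last speaker",
        "unique_id": f"{node_id}_last_speaker",
        "state_topic": f"kenzy/{room}/last_speaker",
        # Entities sharing the same identifiers are grouped under one HA device.
        "device": {"identifiers": [node_id], "name": f"Kenzy {room.title()}"},
    }
    return topic, json.dumps(payload)


topic, payload = discovery_message("kitchen")
```

A node would typically publish this once as a retained message, after which the sensor shows up in Home Assistant without any HA-side configuration.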
10 · WHOLE-HOUSE ROLLOUT · kenzy-deploy
Roll it out across the house
One command syncs source, manages virtualenvs, writes systemd units, and controls services across a fleet of Debian hosts over SSH. Put a node in every room without hand-configuring each one.
- init → install → upgrade workflow
- Source or PyPI install mode, per host
- Health checks across every host
    $ kenzy-deploy init      # prep hosts
    $ kenzy-deploy install   # first deploy
    $ kenzy-deploy upgrade   # push updates
    $ kenzy-deploy status    # health check
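"Writes systemd units" usually means filling a template per service and host. The sketch below shows that idea only; the template contents, the `render_unit` helper, the `kenzy_stt` module name, the virtualenv path, and the user are all hypothetical and not taken from kenzy-deploy.

```python
# Hypothetical unit template; a real deploy tool may set more options.
UNIT_TEMPLATE = """\
[Unit]
Description=Kenzy {service} service
After=network-online.target

[Service]
ExecStart={venv}/bin/python -m {module}
Restart=on-failure
User={user}

[Install]
WantedBy=multi-user.target
"""


def render_unit(service: str, module: str, venv: str, user: str) -> str:
    """Fill in the template for one service on one host (all values hypothetical)."""
    return UNIT_TEMPLATE.format(service=service, module=module, venv=venv, user=user)


unit = render_unit("stt", "kenzy_stt", "/opt/kenzy/venv", "kenzy")
```

Pointing `ExecStart` at the virtualenv's own interpreter is what lets each host run the service without a system-wide install.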
11 · MANAGE THE WHOLE FLEET · dashboard
One web dashboard for every room
An opt-in web dashboard, served by the server, turns the whole house into a single screen. See every room node and backend service live, and manage them from a browser — no SSH, no YAML spelunking — all behind a login, on your LAN.
Configure a room and run a guided audio-calibration wizard, manage skills and enrolled voices, watch the pipeline (transcripts, latency, fast-path hit rate), see and cancel timers, send announcements, and apply updates in place.
- Live fleet & service health — every room card shows CPU, memory, disk and temperature — plus per-room config + audio calibration
- Manage skills, speaker profiles, and API keys without touching files
- One-click upgrades for the server, services, and nodes — and a one-click backup of your whole config
// How it compares
Yours, not theirs.
Big-tech speakers are easy to buy and easy to outgrow — they live in someone else's cloud, on someone else's terms. Here's how Kenzy lines up against the assistants you already know.
| What matters | Kenzy | Alexa | Apple | |
|---|---|---|---|---|
| Works without the cloud | ✓ | ✗ | ✗ | ✗ |
| No account or subscription | ✓ | ✗ | ✗ | ✗ |
| Use any AI model — local or cloud | ✓ | ✗ | ✗ | ✗ |
| Recognizes who's talking | ✓ | – | – | – |
| Announce to the whole house | ✓ | ✓ | ✓ | ✓ |
| Live voice intercom | ✓ | ✓ | ✗ | – |
| Local smart-home control | ✓ | ✗ | ✗ | ✗ |
| Open source | ✓ | ✗ | ✗ | ✗ |
| Self-hosted web dashboard | ✓ | ✗ | ✗ | ✗ |
| Runs on cheap hardware you own | ✓ | ✗ | ✗ | ✗ |
✓ yes · – partial / varies · ✗ no — general comparison as of 2026; competitors change, so check specifics for your setup.
Comparing self-hosted projects instead? Home Assistant Assist, OVOS, and Rhasspy are good projects built by people we admire — try them too. Kenzy's particular obsession is the whole-home experience: room-aware nodes everywhere, speaker ID, announcements and intercom, one dashboard.
// See how it fits together
One pipeline, six moving parts.
Every feature above is a service with a clear job and a simple interface. The architecture page shows how they connect.