// What you get

Everything a home assistant should do — on your terms.

Kenzy hears you, understands you, acts on your home, and talks back — with every piece able to run on hardware you own. Here's what that means in practice. (Each capability is also its own small service, so you can run them on one machine or spread them across the house — the technical view lives on the architecture page.)

01 · IN EVERY ROOM · kenzy-node

Wake word, listening on-device

Every room node runs openWakeWord on each audio frame, locally. Nothing is streamed anywhere until the wake word actually fires — the mic isn't an open line to the cloud.

An optional Silero VAD gate suppresses false triggers on near-silence, so you can lower the threshold for better real-speech sensitivity without the assistant waking to a creaky floorboard.

  • Bundled wake model, or drop in a .tflite/.onnx you trained yourself
  • VAD-gated detection to cut false wakes
  • Wake-word interrupt — talk over playback to start a new request
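
How the wake score, the threshold, and the VAD gate interact can be sketched in a few lines. This is an illustrative sketch only, not Kenzy's code: `should_wake`, its parameter names, and the default thresholds are assumptions, and in a real node the two scores would come from openWakeWord and Silero VAD.

```python
def should_wake(wake_score: float, speech_prob: float,
                wake_threshold: float = 0.5,   # assumed default
                vad_threshold: float = 0.5,    # assumed default
                use_vad: bool = True) -> bool:
    """Fire only when the wake model is confident and, if the gate is on,
    the VAD also believes the frame contains speech."""
    if wake_score < wake_threshold:
        return False
    if use_vad and speech_prob < vad_threshold:
        return False  # loud but not speech, e.g. a creaky floorboard
    return True


# A borderline wake score is accepted for real speech and rejected for noise.
speech_frame = should_wake(wake_score=0.6, speech_prob=0.9)
noise_frame = should_wake(wake_score=0.6, speech_prob=0.05)
```

With the gate on, the wake threshold can sit lower than it otherwise would, because non-speech frames are filtered out before they count.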

Loaded models

hey_ken_zee, your_model.tflite

02 · UNDERSTANDS YOU · kenzy-stt

Transcription on your machine

Speech-to-text runs on faster-whisper — locally, by default. Pick a model size to match your hardware, from tiny on a Pi to large-v3 on a GPU workstation.

Because it's a service behind a URL, you choose where it lives. Keep it on the LAN and your spoken words are transcribed without ever touching someone else's server.

  • CPU or CUDA, with int8 / float16 compute options
  • Model size is one line of config
  • Prefer the cloud? An OpenAI-backed mode is one line of config, and if it fails, Kenzy quietly falls back to local whisper

configs/stt.yaml
whisper:
  model: "base"      # tiny → large-v3
  device: "cpu"      # or "cuda"
  compute_type: "int8"
  language: "en"
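
The cloud-to-local fallback described in the list above follows a simple pattern. Here is a minimal sketch of it, assuming each backend is a plain callable; `transcribe_with_fallback` and the two stub backends are hypothetical names for illustration, not Kenzy's API.

```python
from typing import Callable

Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(audio: bytes,
                             primary: Transcriber,
                             fallback: Transcriber) -> tuple[str, str]:
    """Try the primary backend; on any error, use the fallback.
    Returns (text, name of the backend that produced it)."""
    try:
        return primary(audio), "primary"
    except Exception:
        return fallback(audio), "fallback"


def cloud_stub(audio: bytes) -> str:
    # Stands in for a hosted backend that is currently unreachable.
    raise ConnectionError("cloud STT unreachable")


def local_stub(audio: bytes) -> str:
    # Stands in for a local whisper model.
    return "turn on the lights"


text, used = transcribe_with_fallback(b"\x00\x01", cloud_stub, local_stub)
```

The caller gets a transcript either way, and the second return value makes it easy to log how often the fallback was needed.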

03 · YOUR AI BRAIN · kenzy-llm

Bring your own language model

This is the heart of the "100% local" story. kenzy-llm runs on LiteLLM, which talks to local runtimes — Ollama, LM Studio, vLLM — exactly the same way it talks to OpenAI or Anthropic.

So you decide the privacy/quality trade-off, per install. Run a small model entirely offline, or route to a frontier model in the cloud. Changing your mind is two lines of YAML.

  • Local or cloud — same config, swap freely
  • Optional local fallback — a cloud outage degrades to a local model instead of silence
  • Per-room conversation memory with a short TTL
  • Structured responses that carry a TTS voice style

configs/llm.yaml
# fully offline
model: "ollama/llama3.1"
base_url: "http://localhost:11434"

Ollama, LM Studio, vLLM, OpenAI, Anthropic
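
Per-room conversation memory with a short TTL could look roughly like this. It is a sketch under assumptions: the `RoomMemory` class, the 120-second default, and the message shape are illustrative choices, not taken from Kenzy's source.

```python
import time
from collections import defaultdict


class RoomMemory:
    """Keeps recent conversation turns per room and forgets them after a TTL."""

    def __init__(self, ttl_seconds: float = 120.0, clock=time.monotonic):
        self.ttl = ttl_seconds           # assumed value; the page only says "short"
        self.clock = clock               # injectable so expiry is easy to test
        self._turns = defaultdict(list)  # room -> [(timestamp, role, text)]

    def add(self, room: str, role: str, text: str) -> None:
        self._turns[room].append((self.clock(), role, text))

    def history(self, room: str) -> list[dict]:
        """Return unexpired turns for one room as chat-style messages."""
        cutoff = self.clock() - self.ttl
        fresh = [turn for turn in self._turns[room] if turn[0] >= cutoff]
        self._turns[room] = fresh        # drop expired turns as a side effect
        return [{"role": role, "content": text} for _, role, text in fresh]


memory = RoomMemory(ttl_seconds=60)
memory.add("kitchen", "user", "knock knock")
kitchen_context = memory.history("kitchen")  # the office has its own, empty history
```

Keying the history by room is what lets a follow-up in the kitchen continue the kitchen's conversation without leaking into another room.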

04 · ACTS ON YOUR WORLD · skills

Tool-calling skills, zero boilerplate

A skill is just an async Python function in skills/ with a @skill decorator. Kenzy reads its signature and docstring to build the tool schema automatically — the model calls it when it fits.

Weather, news, stocks, web search, timers & reminders, shopping lists, and Home Assistant control ship in the box. Adding your own is one file, no registration.

  • Auto-generated tool schemas from type hints
  • Per-skill config in llm.yaml, secrets from .env
  • Disable any skill by name without deleting it

skills/my_skill.py
@skill
async def set_scene(name: str) -> str:
    """Activate a lighting scene by name."""
    return f"Scene {name} is on."
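
To make the "reads its signature and docstring" idea concrete, here is a minimal sketch of how a tool schema can be derived with the standard library. `tool_schema` is a hypothetical helper, the type mapping is deliberately tiny, and the output uses the common OpenAI-style function-tool layout; Kenzy's actual generator may differ.

```python
import inspect
from typing import get_type_hints

# Minimal Python-type to JSON-Schema-type mapping (illustrative, not exhaustive).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func) -> dict:
    """Build a function-tool schema from a function's type hints and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the model must supply it
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {"type": "object",
                           "properties": properties,
                           "required": required},
        },
    }


async def set_scene(name: str, brightness: int = 100) -> str:
    """Activate a lighting scene by name."""
    return f"Scene {name} is on."


schema = tool_schema(set_scene)
```

Because everything comes from the function itself, the schema cannot drift out of sync with the code: rename a parameter and the tool definition changes with it.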

05 · THE DAILY STUFF · timers, reminders & lists

The boring things, done brilliantly

"Set a timer for 20 minutes." "Wake me at 6." "Remind me to call Mom at 8." Spoken into the air from any room, delivered back in that room — with a proper tone first, so an alarm still rings even if the voice pipeline is having a bad day.

Shopping and to-do lists live in Home Assistant's own lists, so whatever you add by voice is on everyone's phone at the store. And Kenzy can hold a short conversation — answer a follow-up, finish a knock-knock joke — without you repeating the wake word.

  • Timers, alarms & recurring reminders — set, check, and cancel by voice
  • "Turn off the lights in 30 minutes" — deferred commands, spoken now, done later
  • Lists sync through Home Assistant to every phone in the house

9:41 PM: "remind me to call Mom at 8 tomorrow"
↓ stored on your server (no phone, no cloud)
8:00 PM · SAME ROOM: "You asked me to remind you to call Mom."

06 · INSTANT ANSWERS · fast path

Instant commands, no round-trip

Some things shouldn't wait on a language model. "Turn on the lights" should just happen. Kenzy's deterministic fast path parses common commands locally and acts immediately — the LLM is the fallback, not the bottleneck.

It uses padacioso for intent parsing and rapidfuzz for device matching, with the model catching anything ambiguous. The lights are usually on before the confirmation finishes speaking.

  • High-frequency commands answer in milliseconds
  • Falls through to the LLM when it isn't sure
  • Past-tense confirmations, because the device has already changed

UTTERANCE: "turn off the kitchen lights"
↓ local parse
FAST PATH · ~MS: "Turned off the kitchen lights." (no model call)
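
The parse-then-match flow can be sketched with the standard library alone. Note the substitutions: a regular expression stands in for padacioso's intent templates and `difflib` stands in for rapidfuzz, so this shows the shape of the technique rather than Kenzy's implementation. The device list, the 0.8 cutoff, and `fast_path` itself are assumptions.

```python
import difflib
import re

# Hypothetical device names; a real install would pull these from Home Assistant.
DEVICES = ["kitchen lights", "living room lamp", "garage door"]

_COMMAND = re.compile(r"^turn (?P<state>on|off) (?:the )?(?P<device>.+)$")


def fast_path(utterance: str, devices=DEVICES, cutoff: float = 0.8):
    """Return (state, device) for a confidently parsed command,
    or None so the caller can fall through to the LLM."""
    match = _COMMAND.match(utterance.lower().strip())
    if not match:
        return None  # not a command shape we recognise
    candidates = difflib.get_close_matches(match["device"], devices,
                                           n=1, cutoff=cutoff)
    if not candidates:
        return None  # device name too ambiguous to act on
    return match["state"], candidates[0]


handled = fast_path("Turn off the kitchen lights")
deferred = fast_path("what's the weather tomorrow")
```

Returning `None` instead of guessing is the important design choice: the fast path only acts when both the intent and the device are clear, and everything else goes to the model.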

07 · KNOWS WHO'S TALKING · kenzy-speaker

Knows who is speaking

SpeechBrain's ECAPA-TDNN model identifies enrolled speakers from a short voice profile. Enroll each person once with kenzy-enroll and Kenzy can tell the household apart — all locally.

That powers real guardrails: unlocking a door or opening a garage by voice can be restricted to recognized people, and an unidentified voice is refused outright.

  • On-device speaker embeddings, no cloud
  • Speaker-gated secure actions (locks, covers)
  • Runs in parallel with transcription — no added latency

Enrolled profiles

jon, jane, guest

Secure action

UNLOCK · UNKNOWN VOICE: Refused (requires a recognized speaker)
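
Speaker gating usually comes down to comparing an embedding of the incoming voice against enrolled profiles. The sketch below assumes cosine similarity with a fixed threshold and uses toy three-number vectors; real embeddings are much longer, and `identify`, `allow_action`, the 0.7 threshold, and the `SECURE_DOMAINS` set are illustrative assumptions rather than Kenzy's code.

```python
import math

SECURE_DOMAINS = {"lock", "cover"}  # actions that need a recognized speaker


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(embedding, profiles: dict, threshold: float = 0.7):
    """Return the best-matching enrolled name, or None if nobody is close enough."""
    best_name, best_score = None, threshold
    for name, reference in profiles.items():
        score = cosine(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


def allow_action(domain: str, speaker) -> bool:
    """Secure domains require an identified speaker; everything else is open."""
    return domain not in SECURE_DOMAINS or speaker is not None


profiles = {"jon": [1.0, 0.0, 0.0], "jane": [0.0, 1.0, 0.0]}  # toy profiles
known = identify([0.9, 0.1, 0.0], profiles)
stranger = identify([0.5, 0.5, 0.7], profiles)
```

An unknown voice can still turn on a light, but `allow_action("lock", stranger)` is false, which matches the refusal shown above.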

08 · SPEAKS BACK · kenzy-tts

A voice of its own

Text-to-speech runs through OpenAI's TTS for polish, or Kokoro for a fully local PyTorch voice — your choice of where the audio is synthesized.

The LLM can attach a voice style to each reply, so Kenzy can sound calm, upbeat, or matter-of-fact to match the moment.

  • Local Kokoro voices or hosted OpenAI voices
  • Cloud voice down? Kenzy falls back to the local one — and if a request truly fails, she says so out loud instead of going silent
  • Per-response voice styling
  • Streamed back to the room as raw audio

Backends

Kokoro · local, OpenAI · hosted

09 · YOUR SMART HOME · Home Assistant

Home Assistant, in both directions

Kenzy speaks Home Assistant natively: lights, locks, covers, scenes — resolved by room, matched by name, with a curation layer for the aliases and defaults your house actually uses ("the big lamp", "movie lights").

And it flows the other way too. Kenzy surfaces into HA over MQTT discovery: every room node appears as an HA device — who spoke last, room state, trigger and mute controls — so your automations can react to voice activity, and an automation can make Kenzy announce anything, house-wide.

  • Voice control of your HA devices, room-aware and name-fuzzy
  • Nodes appear in HA automatically — no HA-side code, just MQTT discovery
  • Announcements from automations: "the wash is done," said out loud

In Home Assistant

kitchen node: state, last speaker, trigger, mute

From an automation

kenzy/announce: "The wash is done." (spoken in every room)
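
For readers curious what "just MQTT discovery" involves, here is a rough sketch of the kind of message a room node could publish. The `homeassistant/<component>/<node_id>/<object_id>/config` topic and the `name`, `unique_id`, `state_topic`, `command_topic`, and `device` keys follow Home Assistant's discovery convention; the `kenzy/<room>/...` topic layout and the `discovery_message` helper are assumptions made for illustration.

```python
import json


def discovery_message(room: str, component: str, key: str, label: str, **extra):
    """Build the (topic, JSON payload) pair that announces one entity to HA."""
    node_id = f"kenzy_{room}"
    topic = f"homeassistant/{component}/{node_id}/{key}/config"
    payload = {
        "name": label,
        "unique_id": f"{node_id}_{key}",
        "state_topic": f"kenzy/{room}/{key}/state",  # assumed topic layout
        # Sharing one device block groups every entity under a single HA device.
        "device": {"identifiers": [node_id], "name": f"Kenzy {room} node"},
        **extra,
    }
    return topic, json.dumps(payload)


topic, payload = discovery_message("kitchen", "sensor",
                                   "last_speaker", "Last speaker")
mute_topic, mute_payload = discovery_message(
    "kitchen", "switch", "mute", "Mute",
    command_topic="kenzy/kitchen/mute/set",  # assumed topic
)
```

Publishing these payloads as retained messages is what makes the node show up in Home Assistant with no configuration on the HA side.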

10 · WHOLE-HOUSE ROLLOUT · kenzy-deploy

Roll it out across the house

One command syncs source, manages virtualenvs, writes systemd units, and controls services across a fleet of Debian hosts over SSH. Put a node in every room without hand-configuring each one.

  • init → install → upgrade workflow
  • Source or PyPI install mode, per host
  • Health checks across every host

fleet rollout
$ kenzy-deploy init      # prep hosts
$ kenzy-deploy install   # first deploy
$ kenzy-deploy upgrade   # push updates
$ kenzy-deploy status    # health check

11 · MANAGE THE WHOLE FLEET · dashboard

One web dashboard for every room

An opt-in web dashboard, served by the server, turns the whole house into a single screen. See every room node and backend service live, and manage them from a browser — no SSH, no YAML spelunking — all behind a login, on your LAN.

Configure a room and run a guided audio-calibration wizard, manage skills and enrolled voices, watch the pipeline (transcripts, latency, fast-path hit rate), see and cancel timers, send announcements, and apply updates in place.

  • Live fleet & service health: every room card shows CPU, memory, disk, and temperature, alongside per-room config and audio calibration
  • Manage skills, speaker profiles, and API keys without touching files
  • One-click upgrades for the server, services, and nodes — and a one-click backup of your whole config

Dashboard

Fleet, Config, Skills, Home Assistant, Speakers, Scheduled, Activity, Logs, Settings

Update available · one click

kenzy-server: Upgrade available, apply everywhere (honors your dependency pins)

// How it compares

Yours, not theirs.

Big-tech speakers are easy to buy and easy to outgrow — they live in someone else's cloud, on someone else's terms. Here's how Kenzy lines up against the assistants you already know.

What matters | Kenzy | Alexa | Google | Apple
Works without the cloud
No account or subscription
Use any AI model — local or cloud
Recognizes who's talking
Announce to the whole house
Live voice intercom
Local smart-home control
Open source
Self-hosted web dashboard
Runs on cheap hardware you own

Legend: yes · partial / varies · no
General comparison as of 2026; competitors change, so check specifics for your setup.

Comparing self-hosted projects instead? Home Assistant Assist, OVOS, and Rhasspy are good projects built by people we admire — try them too. Kenzy's particular obsession is the whole-home experience: room-aware nodes everywhere, speaker ID, announcements and intercom, one dashboard.

// See how it fits together

One pipeline, six moving parts.

Every feature above is a service with a clear job and a simple interface. The architecture page shows how they connect.