That’s right, Version 2.0 has finally landed. It’s complete with a host of updates and a completely overhauled communication setup. Kenzy still has all the same devices you’re familiar with, but everything in this version is new and improved. Kenzy’s Watcher (kenzy.image) is more powerful with motion, object, and face recognition. Kenzy’s Speaker (kenzy.tts) now sounds more human. Kenzy’s Listener (kenzy.stt) has significantly improved accuracy. And Kenzy’s dashboard has been revamped to provide more information in a web-app style interface.
New skills are available too, including “Tell me a Joke”, “Tell me a knock knock Joke”, “What’s the weather?”, and a full HomeAssistant integration to “Turn on the Lights” or “Unlock the front door”.
Full changelog below:
Added
- Settings handler for consistency when customizing per device settings
- GPUs can be leveraged for torch and cuda enabled models
- Added options for saving video of detected people
- Directly incorporated kenzy_image into kenzy.image.core.detector
- Added reloadFaces logic to kenzy.image.detector (formerly of the kenzy-image package)
- Added voice activation with configurable timeout
- Added multi-model support for speech-to-text
- Added configurable timeout for SSDP client requests
- Added extras helpers to extract numbers from strings and convert numbers to English words
- Added clean text routine for supporting the rich output from OpenAI’s Whisper model
- Basic support for simultaneous actions (such as two listener+speakers in two rooms connected to the same skillmanager)
- Object recognition, Face detection, and Face recognition with optimizations to minimize processing time with support for multiple models
- Configurable saving of videos based on object detection alerts
- Added a default configuration for the base kenzy startup (saved to .kenzy/config.yml).
- Core support for versioning skills (use `self._version` to set the version number)
- Added `--skip` and `--only` options to skip or include device configs in the provided file
- New skill option for WeatherSkill (requires API key from openweathermap.org)
- Added option to set default value when getting settings in skills
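
The number helpers listed under Added (extracting numbers from strings and converting numbers to English words) can be pictured with a rough sketch like the one below. This is not Kenzy's actual code: the names `extract_numbers` and `number_to_words`, and the 0 to 999,999 range limit, are assumptions made purely for illustration.

```python
import re

# Lookup tables for spelling out numbers (illustrative only).
_ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
_TENS = [
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety",
]


def extract_numbers(text):
    """Return every unsigned integer or decimal found in text, in order."""
    matches = re.findall(r"\d+(?:\.\d+)?", text)
    return [float(m) if "." in m else int(m) for m in matches]


def number_to_words(n):
    """Spell out an integer between 0 and 999,999 (hypothetical limit)."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + ("-" + _ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return _ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + number_to_words(rest) if rest else ""
    return number_to_words(thousands) + " thousand" + tail
```

For example, `extract_numbers("set a timer for 15 minutes")` returns `[15]`, and `number_to_words(42)` returns `"forty-two"`, which is the sort of form a speech engine tends to read more naturally than raw digits.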
Modified
- Settings/Configuration files can now be stored in JSON or YAML files
- Moved watcher to `kenzy.image.device.VideoReader`
- Moved listener to `kenzy.stt.device.AudioReader`
- Moved speaker to `kenzy.tts.device.AudioWriter`
- Restructured devices to allow for direct calls for “main” in each of image, stt, and tts
- Split out detector/creator processes for each of the core functions into their own modules (e.g. kenzy.image.detector, kenzy.stt.detector, etc.)
- Moved all devices to their own HTTP server module when run as clients
- Fixed the UPnP logic so that it honors the full UPnP spec for control interface lookups
- Updated skills intent function signature to include `**kwargs` for additional values like raw text captured
- Fixed the context inclusion and usage for action/response activities (uses “location” for relative responses)
- Completely overhauled dashboard
- Fixed bug in skillmanager.device.collect
- Fixed bug in core.KenzyRequestHandler.log_message
- Fixed bug in *Cameras* count on dashboard
- Changed startup to use Multiprocessing instead of Threads for each device main runtime
- Added ThreadingMixIn to HTTPServer (oops!)
- Set default of “Kenzy’s Room” and “Kenzy’s Group” for location and group respectively
- Improved responses to the “How are you” question.
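
The `ThreadingMixIn` change noted under Modified relies on a standard-library pattern: mixing `socketserver.ThreadingMixIn` into `HTTPServer` so each request is handled in its own thread instead of blocking the next one. Here is a minimal sketch using only Python's standard library. The names `ThreadedHTTPServer`, `PingHandler`, and `fetch_once` are hypothetical and do not reflect Kenzy's own server code.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """HTTPServer that handles each request in a separate thread."""

    # Let the process exit even if handler threads are still running.
    daemon_threads = True


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with 'ok'."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def fetch_once():
    """Start the server on a free local port, issue one GET, shut down."""
    server = ThreadedHTTPServer(("127.0.0.1", 0), PingHandler)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        resp = conn.getresponse()
        result = (resp.status, resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Python 3.7 and later also ship `http.server.ThreadingHTTPServer`, which packages the same combination.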
Removed
- Dropped support for PyQt5 panels
- Dropped direct support for Kasa smart switch/plug devices
- Dropped unnecessary libraries (urllib3, netifaces)
- Dropped support for Mycroft libraries “mimic3” (created a forked version of padatious for future internal support)
- Dropped direct support for Raspberry Pi due to hardware limitations
Current version is 2.0.3
Install Kenzy with one line:
wget -q -O install.sh https://kenzy.ai/installer && sh install.sh
Read the documentation.