If you have a smart assistant such as Alexa, there's a good chance you've dabbled with automation. Asking the assistant to play something, to relay the current time, or to inquire about dietary requirements are all automations of sorts. The issue with these closed cloud-based platforms is the limit of complexity. Alexa can only perform so much before you hit a brick wall. If there's no skill available, you're out of luck. That's why I decided to go it alone and connect a large language model (LLM) already running in the home lab to Home Assistant.
What I found was, after throwing Alexa out the front door, a smarter and more private method of running automations with nothing but the sound of my voice.
Assistants aren't as smart as they think
Try asking them more complex queries
There's very little reasoning with a cloud-based smart assistant. These are very much locked in the period when smart home technologies were perceived as pointless and largely irrelevant. That's not to say Siri, Google Assistant, and others don't have their uses. These assistants may be passable for the average consumer, but they quickly run out of steam for anything advanced. I'm talking about a true smart home with custom automations through Home Assistant, and a few devices based on ESP32s.
These are cloud-based assistants, meaning they only work when you have an Internet connection, and they're bound to what tasks their respective company enables them to perform. If you say something that doesn't quite match their predefined template, you're out of luck, but Amazon, Google, and other brands will happily collect all your private data and store it somewhere in the cloud. When you take a step back and look at these assistants, it's apparent just how limited they are on the LAN.
I had already started to switch everything to Home Assistant, even ditching the Philips Hue bridge in the process. The next point on the list was Alexa, since it didn't play too well with the open-source controller. Interestingly, Home Assistant is designed to work well with most protocols, be it Zigbee or even REST. I didn't require the backing of the cloud anymore and could ultimately remove an entire layer of abstraction, using something that was locally hosted within the home network.
If you're in a similar boat, I have some good news for you. The answer? A local LLM.
Setting up a local assistant
DIY Alexa, but better
Getting your own AI models up and running from home is simple. What we'll be doing is creating our very own ChatGPT, but hosted on our own hardware. I already have LLMs running, and we have some exceptional in-depth guides on how to achieve this using Proxmox. Once you're up to speed and can interact with LLMs from within the LAN, you're pretty much there. Armed with Ollama and OpenWeb UI, I could easily integrate Qwen into Home Assistant through an API.
Whisper and Piper on Home Assistant make the magic happen, and we use Home Assistant's Voice Preview as the smart speaker to voice commands and even play some music through connected speakers — it works flawlessly through Music Assistant and Jellyfin. Piper synthesizes speech on the fly (text-to-speech), and Whisper is a transcription tool (speech-to-text), both of which are vital to get an LLM-powered smart assistant up and running without needing to touch the cloud.
Adding support for Ollama in Home Assistant is super easy. All you need to do is activate the Ollama interaction, load the IP of where Ollama is running (be sure to use the Ollama port and not OpenWeb UI), pick the desired model loaded into the LLM platform, and you're essentially good to go! Your downloaded models can now be selected within Home Assistant. Using an LLM and feeding it the right data allows your assistant to understand intent and not just the command itself.
Before, I had to say a couple of commands to turn everything off before going to bed. Now, I can simply say, "Okay, Nabu. Good night." All the lights turn off, as well as any smart plugs that control devices other than our server, to ensure nothing is running overnight. I can also integrate Frigate into the mix, allowing the LLM to utilize sensors and data from IP camera feeds to control external lighting only if it's me or a family member returning home in the dark. It's pretty neat just how natural you can ask Home Assistant to do stuff with an LLM.
The best part? It's all local. The LLM can store data, so it knows the time, data, weather, who's at home, and past interactions. Responses are tailored, and it can handle much more complicated tasks than a cloud-based assistant. It takes some work to get everything configured, but that's again a positive since you control how everything works and learn something in the process. Nothing is sent outside the LAN, and new entities can be later added to enhance the experience further.
Use your own speakers
You don't have to use official Home Assistant hardware, though I recommend it because the Voice Preview is excellent and you're also supporting further development of this comprehensive smart home platform. We've covered ways to create your own smart speaker using an ESP32 or even repurpose an old microphone.
