Alexa feels like it's become a bit of a salesman, interjecting with "by the way" suggestions and constantly pushing subscriptions like Alexa+. Then there's the lack of privacy: every time you say a wake word, your audio is processed on a server you don't own or control.
The shift towards local AI is the biggest trend in smart homes today, and the ESP32-S3 has become the gold standard for replacing cloud-dependent Echo and Nest devices. At around $4 for the chip, or $15 with a microphone, you can build a voice satellite that is faster, cleaner, and 100% private using ESPHome and Home Assistant's Assist pipeline. You get quicker response times without a single byte of audio ever leaving your front door.
How to get set up
It's easier than you think
If you want to get started, setting up this Alexa alternative isn't hard at all. There are options both for those who want plug-and-play and for those who want to do some tinkering.
When picking out the hardware, you can opt for the ESP32-S3-BOX-3. This is the polished, out-of-the-box option, made to be a voice assistant from the start. Or, if you're more of a DIY fanatic, a base ESP32-S3-DevKit paired with an INMP441 microphone and an amp does the job too. Just be warned that you have to put it together yourself. While this isn't a difficult process, it's one that some people just can't be bothered to do.
Once you've got the hardware sorted, it's time for the one-click install. You can use the ready-made Home Assistant voice assistant web flasher. You don't need to write any code; just plug the ESP32 into your PC via USB, then click Install. From there, ESPHome handles the ears, in the form of the microphone, and the mouth, in the form of the speaker, while your Home Assistant server handles the brain: the speech-to-text and the intent recognition.
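To make that ears, mouth, and brain split a little more concrete, here is a toy sketch in Python. It is not how ESPHome or Home Assistant are actually implemented; the function names, the `Intent` type, and the "audio is just text bytes" shortcut are all assumptions made for illustration. The point is only that the satellite hands over audio, and everything else happens on your own server.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Intent:
    action: str  # e.g. "turn_on" (hypothetical action name)
    target: str  # e.g. "bedside lamp"


def fake_speech_to_text(audio: bytes) -> str:
    # Stand-in for a local speech-to-text engine.
    # Assumption: the "audio" here is simply UTF-8 text.
    return audio.decode("utf-8").lower()


def fake_intent_parser(text: str) -> Intent:
    # Stand-in for intent recognition: expects "turn on/off the <name>".
    for phrase, action in (("turn on the ", "turn_on"), ("turn off the ", "turn_off")):
        if text.startswith(phrase):
            return Intent(action, text[len(phrase):])
    raise ValueError(f"no intent matched: {text!r}")


def run_pipeline(
    audio: bytes,
    stt: Callable[[bytes], str] = fake_speech_to_text,
    parse: Callable[[str], Intent] = fake_intent_parser,
) -> Intent:
    # The satellite (the "ears") would only capture `audio`.
    # Everything inside this function is the server-side "brain".
    return parse(stt(audio))
```

Calling `run_pipeline(b"Turn on the bedside lamp")` yields an `Intent` with the action `turn_on` and the target `bedside lamp`, and at no point does the audio go anywhere except the function running on your own machine.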
When switching to an ESP32 paired with ESPHome and Home Assistant, you're using a local server rather than the cloud to run your home. This means that you'll get faster response times because the audio doesn't have to travel to an Amazon data center and back. As a result, any action you ask your voice assistant to perform, like turning on a light, will happen almost instantly.
You can even opt to use software like microWakeWord. In 2026, we now have high-accuracy wake-word detection, like "Hey Jarvis," running directly on the $4 chip, accelerated by the ESP32-S3's vector instructions. Unlike Alexa, which might get confused by your specific room names, this speaker is tied directly to your Home Assistant entities. It knows exactly what the bedside lamp is and what the garage is, so it works for you rather than against you, and you don't have to keep repeating commands just to get them to work, like you might with Alexa.
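As a rough illustration of why a fixed list of your own entities helps, here is a minimal Python sketch of matching a spoken phrase against known names. This is not Home Assistant's actual matching logic, and the entity IDs and the `resolve_entity` helper are hypothetical; it just shows that when the assistant only has to choose between names you defined, there is much less room for confusion.

```python
from typing import Optional

# Hypothetical spoken names mapped to hypothetical entity IDs.
ENTITIES = {
    "bedside lamp": "light.bedside_lamp",
    "garage": "cover.garage_door",
}


def resolve_entity(spoken: str, entities: dict = ENTITIES) -> Optional[str]:
    # Normalize case and collapse repeated whitespace.
    cleaned = " ".join(spoken.lower().split())
    # Try longer names first so a more specific name wins over a shorter one.
    for name in sorted(entities, key=len, reverse=True):
        if name in cleaned:
            return entities[name]
    # Nothing matched: the caller can ask the user to repeat.
    return None
```

With this table, `resolve_entity("Turn on the Bedside  Lamp")` returns `light.bedside_lamp`, while a name that isn't in the list returns `None` instead of a wrong guess.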
What are the perks of setting up your own voice assistant?
Privacy isn't the only benefit
Using the ESP32 also gives you advanced features beyond just turning the lights on and off. The ESP32 can act as a media player: you can send text-to-speech announcements like "The laundry is done" or even stream low-bitrate internet radio. It also acts as a Bluetooth proxy. While the ESP32 is listening for your voice, it can also work as a Bluetooth bridge, pulling in data from distant Bluetooth devices.
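For the announcement example, here is a hedged Python sketch that builds a request for Home Assistant's REST API using the `tts.speak` service. The base URL, token, and both entity IDs are placeholders you would swap for your own, and the `build_announcement` helper is hypothetical. The sketch only constructs the request; sending it is left as a commented-out line.

```python
import json
import urllib.request


def build_announcement(
    base_url: str,
    token: str,
    message: str,
    tts_entity: str,
    player_entity: str,
) -> urllib.request.Request:
    # Payload fields for the tts.speak service; the entity IDs are placeholders.
    payload = {
        "entity_id": tts_entity,
        "media_player_entity_id": player_entity,
        "message": message,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/tts/speak",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # A long-lived access token created in your Home Assistant profile.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_announcement(
    "http://homeassistant.local:8123",  # assumed address of your server
    "YOUR_LONG_LIVED_TOKEN",            # placeholder
    "The laundry is done",
    "tts.example_engine",               # hypothetical TTS entity
    "media_player.esp32_satellite",     # hypothetical satellite speaker
)
# urllib.request.urlopen(request)  # uncomment to actually send it
```

Because the request goes to a server on your own network, the announcement never has to pass through anyone else's cloud.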
On top of that, if you decide to build your own, you can add a physical toggle switch that cuts power to the microphone, something a software mute button can never truly guarantee. You know for a fact that when that switch is off, the microphone has no power at all, because you wired it yourself.
While a bare circuit board won't look as good on a coffee table as an Echo does, it doesn't have to stay that way. There is a massive community making 3D-printed enclosures that let these DIY devices look like professional consumer products. Even if you buy 3D-printed enclosures or print them yourself, it's still cost-effective to put one of these in every room for the price of a single Amazon Echo Show.
Make your smart speaker smart again
Stop settling for a speaker that throws ads at you
It feels like the smart speaker was the Trojan horse of the cloud era. The ESP32 is a tool that you can use to take your walls back. It's not just about saving money on a speaker, though that is a benefit. It's about owning the ears of your own home, without your voice having to travel to the opposite side of the planet just so you can turn off a light.
