I've been developing for the ESP32 for nearly a year now, and in that time I've gained a lot of experience in understanding what works, the limitations, and some of the best ways to implement different features. The ESP32-P4 upended a lot of what I knew about these chips, as it's a powerful chip with unique capabilities, with one major drawback: no Wi-Fi capabilities. Thankfully, many boards implementing it pair it with an additional ESP32 for Wi-Fi (commonly the C6, like in this instance), but I was curious: how would Claude Code handle a large, complex project aimed directly at it?

I often see people say that AI helps them write code for the ESP32, and I've sometimes used it to save time myself. For example, it can save a lot of time when it comes to pulling all of the pins needed to instantiate hardware from a code sample, but I wanted to take it a step further. Specifically, I wanted to know whether Claude Code could have saved me time when I built my own example project using the Elecrow CrowPanel Advance ESP32-P4 display. I used a tool called Auto Claude for this, which automates much of the development process and creates subtasks, implementation plans, and more.

As it turns out, even when given the context of Elecrow's sample source code (and my own, which I wrote by figuring out Elecrow's sample code and re-implementing it myself), Claude Code massively struggled. I was able to make a pretty cool Google Nest Hub-esque device, but with a lot of manual fixing, tweaking, and polishing.

Auto Claude, token usage, and Vibe Kanban

Auto Claude builds an entire plan for you

In the AI coding space, there are a ton of tools that position themselves as essential, acting as an all-encompassing harness capable of breaking down your prompts into actionable subtasks. They're primarily aimed at non-developers, essentially promising to bridge the gap between a "vibe coder" and an engineer. If I'm prototyping a quick idea with an LLM, I know what the architecture will be, I know the communication protocols, and I know the hardware. Auto Claude, though, like other tools in its category, attempts to break down the request in order to compensate for a user's presumed lack of knowledge.

Naturally, this has numerous drawbacks. Not only do you burn through token usage limits considerably quicker than an engineer who knows what they want and how to achieve it, but mistakes and assumptions made by the model will fly over the head of the user. Here's the prompt I gave Auto Claude using Anthropic's Claude Sonnet 4.5:

I would like to create a program for an ESP32-based Elecrow CrowPanel Advance ESP32-P4. This GitHub repository should be used to understand the hardware layout and pins required for working features: https://github.com/Incipiens/Elecrow-CrowPanel-Advance-ESP32-P4

The program will aim to replicate a Google Nest Hub's functionality. It should be able to show a clock on the screen at all times. It should be possible to interface with the display to swipe across to different pages. One page will be a calendar, for example. There should be a way to stream video to it, as it supports H264, possibly from a Jellyfin server that has transcoding. Those videos should have audio. As well, events, such as calendar events that are currently ongoing, should be shown on the active clock display when currently relevant. It should also be themable.

Data can be pulled from a Home Assistant instance, where the user generates a long-lived token for data retrieval. There should be a centralized area where a user can set the entity names for data retrieval. Also plan future features that could be added, too.

Auto Claude did well to break down the task into 29 separate subtasks, but by the end of the development cycle, the output contained numerous compilation errors, architectural mistakes, and wildly incorrect assumptions. The H264 request was, in itself, a test, as the ESP32-P4 will struggle with decoding even 720p video, though it will work. My request was also overly verbose, and some of the criteria I gave, like the usage of a long-lived token from Home Assistant and the link to a GitHub repository, are considerably more detailed than the requests that harnesses like these typically expect.

However, to be fair to Auto Claude, it did a good job at breaking down my request, splitting it up, and documenting the process throughout. While you can achieve most of what it did through detailed prompting and agents (such as the superpowers collection of agents), it abstracts much of that from the user, creating a Git repository with frequent commits and even displaying a Kanban board of tasks currently being undertaken. In that regard, it's similar to Vibe Kanban, but fully integrated with, and built for, Claude Code.

Because Auto Claude burns through tokens trying to plan the implementation of your project, you're better off manually prompting with specific requests, rather than giving a vague, open-ended prompt like I did. This would include specifying technical details, libraries to use, and the pins to interface with. I was curious how it would do on a task like this, though. And the truth is, well, it didn't do great.

Compilation errors and manual fixes

A lot was broken

Once the task completed (which took quite a while), I had a repository containing upwards of 22,000 lines of code, 28 individual source files, and markdown files with implementation details, self-generated critiques, and research files that were used for code generation. Attempting to compile it, though, was a different story. It failed at generating a valid LVGL configuration for graphics, and no "lv_config" parameter was configured to point to lv_conf.h.
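For reference, a hand-written lv_conf.h can be very small. The sketch below is illustrative rather than the configuration this project ended up with: the color depth and font choice are assumptions, and any option left out falls back to LVGL's built-in default. LVGL also has to be told where the file lives, typically by defining LV_CONF_INCLUDE_SIMPLE (with the file's directory on the include path) or LV_CONF_PATH in the build.

```c
/* lv_conf.h: minimal illustrative sketch, not the project's real configuration. */
#ifndef LV_CONF_H
#define LV_CONF_H

/* 16 bits per pixel (RGB565); an assumed format, check your panel driver. */
#define LV_COLOR_DEPTH 16

/* Keep LVGL's logging off unless you are debugging the UI layer. */
#define LV_USE_LOG 0

/* Enable one built-in font so labels can render; assumed choice. */
#define LV_FONT_MONTSERRAT_14 1

#endif /* LV_CONF_H */
```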

When I fixed this, the next error related to NVS encryption. NVS is the Non-Volatile Storage on the ESP32, and it can be used to hold persistent values like Wi-Fi connection details, user data, and more. Claude Code had enabled encryption but never actually configured it. However, the last error was the most surprising, and that was how it implemented Wi-Fi.

As previously mentioned, the ESP32-P4 does not have Wi-Fi, and boards like this often pair it with an additional ESP32 with Wi-Fi. They then communicate over SDIO, and all of this is enabled by the esp-hosted library. However, Claude Code implemented a native Wi-Fi manager that ran on the ESP32-P4, without using the esp-hosted library, which meant that no Wi-Fi support was present. At all. Plus, LVGL was incredibly slow, and the H.264 implementation used several functions that don't actually exist in the library.

You can see the Wi-Fi manager Claude generated below, where there is no esp_hosted.h include or associated code.

Thankfully, I have enough experience with all of these that I was able to take apart what Claude had built and implement everything correctly. However, given the code that's already out there, including my own repository, I would likely have been significantly quicker just doing it myself in the first place. The boilerplate code for hardware instantiation from the given repo is nice, but all of these errors make it hard to recommend for development of complicated embedded projects like these.

"What about Context7?" you may wonder. Unfortunately, Context7 doesn't have any documentation for the ESP32-P4 or the esp-hosted library. In other words, it wouldn't have helped, but the fact that the library reference was already present in the GitHub repository should have been a clue. Once it compiled, it crashed on start, which I solved by specifically allocating my LVGL frame buffers to PSRAM.
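To make the frame buffer point concrete, here is a rough sketch of the arithmetic behind that fix. The panel resolution, pixel format, and internal RAM budget are assumptions for illustration, not values taken from the project, and the helper names are hypothetical. On ESP-IDF, the allocation goes through heap_caps_malloc with MALLOC_CAP_SPIRAM; the plain malloc fallback only exists so the sketch also compiles on a desktop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h" /* ESP-IDF capability-aware allocator */
#endif

/* Assumed values for illustration only; check your panel and chip datasheets. */
#define PANEL_WIDTH_PX      1024u
#define PANEL_HEIGHT_PX     600u
#define BYTES_PER_PIXEL     2u             /* RGB565 */
#define INTERNAL_RAM_BUDGET (256u * 1024u) /* hypothetical budget left for graphics */

/* Bytes needed for one full-screen frame buffer. */
size_t frame_buffer_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0u);
    return (size_t)width * height * bytes_per_pixel;
}

/* True when the requested buffers do not fit in the internal RAM budget. */
bool needs_psram(size_t buffer_bytes, unsigned buffer_count, size_t internal_budget)
{
    return buffer_bytes * buffer_count > internal_budget;
}

/* Allocate one frame buffer, placing it in PSRAM when built for ESP-IDF. */
void *alloc_frame_buffer(size_t bytes)
{
#ifdef ESP_PLATFORM
    return heap_caps_malloc(bytes, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
#else
    return malloc(bytes); /* host fallback so the sketch compiles anywhere */
#endif
}
```

With the assumed values, frame_buffer_bytes(PANEL_WIDTH_PX, PANEL_HEIGHT_PX, BYTES_PER_PIXEL) comes to 1,228,800 bytes, so even a single full-screen buffer overshoots the hypothetical 256 KB internal budget and needs_psram returns true. That is the kind of check a model will not make for you unless it knows the actual panel and memory map.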

All in all, I spent about as much time debugging this program and rebuilding portions of it in order to get it to compile as I would have building many aspects of it myself. While Claude Code did a good job of getting the initial hardware configuration working, everything else still required my personal experience and development capabilities to get it over the line.

Other projects might have more success

New hardware is out of the question

None of this is to say that Claude Code, or AI coding assistants generally, are without merit. For well-documented frameworks with extensive training data, like React, Django, and even vanilla ESP-IDF for common chips like the ESP32-S3, the results can be genuinely impressive. The problem here was caused by specifics: the ESP32-P4 is new, esp-hosted is niche, and ESP32 problems, in general, aren't plastered across Stack Overflow. With that said, these tools aren't the catch-all that they appear to be on the surface.

LLMs work best when they're interpolating between known patterns, not extrapolating inferred knowledge into undocumented territory. For mainstream web development or even standard microcontroller projects, tools like Auto Claude could absolutely save time. But for bleeding-edge hardware with sparse documentation, you're still the one who needs to understand the hardware, work backwards from existing code samples, and understand the intricacies of simple decision-making like whether your frame buffer belongs in PSRAM or not.

If you're looking to make use of AI in your projects, you should use these tools for what they're good at: scaffolding, boilerplate, and rapid iteration on existing, well-understood ground, rather than something new. If you're using them to fill in the gaps in your knowledge, they won't work well, as you won't be able to tell what a right or wrong answer even looks like. If you're working with hardware that came out six months ago, expect to spend just as much time fixing the output as you would have writing it yourself. The AI got me halfway there, but the remaining half was still all me, and I don't see that changing for embedded development for a long time to come.