For a long time, running an AI model locally felt like a gimmick, rather than something actually useful. You could generate a paragraph of text, edit or generate an image if you were patient, all while putting up with subpar results compared to the cloud-based behemoths operated by the likes of OpenAI, Google, and Anthropic. They did everything better, faster, and with fewer compromises, and if you wanted something that actually worked, you had to send your data off to someone else's servers... all while accepting the trade-offs.
Since those early days, things have significantly shifted. Quietly, and likely faster than most people expected, local AI models have crossed that threshold from an interesting experiment to becoming a genuinely useful tool. They may still not compete with cloud-based alternatives, but they don't need to. Latency, privacy, cost, and control are all things that matter to many people, and a local LLM enables exactly that.
Do they replace cloud-based models? Absolutely not, but there are a ton of real-world problems that a local model can be used to tackle.
Local data processing made simple
From notes to documents
One of the best uses I've found for a local LLM is for raw data processing. With a large context window, many local models like gpt-oss-20b, gemma-27b, and seed-oss-36b are fantastic at taking data inputs and providing a valid output. That includes finding information in a PDF, converting unstructured data to a table, or even doing things like tagging your Obsidian notes with an Obsidian MCP plugin. These are all repetitive, mundane tasks for humans, but the exact kind of thing an LLM is great at when it has the necessary tools and context window to interact with and keep track of data.
What's more, these are the exact kinds of tasks that many people would feel uncomfortable with using a cloud-based alternative for. Do you really want to send all of your notes, PDFs, or personal data in general to the likes of Google or OpenAI? Your local model won't ship everything you say to it to a cloud provider where it may be used for review or training, and it's just as good at those menial tasks as a cloud model. It may not be as fast (though smaller models often are when working with processing local data), but for many of those tasks, they're not time sensitive anyway. Another great use is for text extraction, tagging, and organization with PaperlessNGX, where you would expect many people to store private documents not suitable for the cloud.
The same goes for proofreading, simple code fixes, and powerful search capabilities when paired with the likes of SearXNG. I've even built my own Chrome extension that I use for proofreading, as I grew uncomfortable with the fact that Grammarly is borderline a keylogger. When I invoke my extension, it takes what I've written, sends it to my local model, and comes back to me with grammar-based nitpicks and highlighted typos.
I don't use that extension all the time, but I find it just as powerful as Grammarly, or even its Pro edition, without the trade-offs of sending everything I type in my browser to a third-party that, by default, trains its AI models on user data. I have other applications that I've written that pass data to my local LLM as well, but all of these are some of the most powerful that I actually get the most use out of.
A voice assistant I can trust
Who needs Google?
I've talked a lot about voice assistants powered by a local LLM, but with good reason. Many people are distrustful of voice assistants powered by the likes of Google and Amazon as they're seen as providing a way for those big companies to listen in on your conversations, collect data, and, more or less, spy on consumers. A voice assistant powered by your own voice processing pipeline provides none of that, while still offering a more than apt (and sometimes even better) experience in comparison.
Take, for example, my GLaDOS-powered voice assistant that uses my own Whisper pipeline and local LLM for responses. For local home control, thanks to Home Assistant, it's just as fast as a cloud-based provider; arguably more so given that there's no off-site processing or latency to speak of. Then, as a party trick of sorts, I can ask an off-the-wall question and get a response from something imitating the personality of GLaDOS in the voice of GLaDOS. It's fun, it's quick, and it's unique.
Separately, I'm also granted a level of control that I simply wouldn't get with a cloud-based provider, though that's more a quality of Home Assistant rather than a local LLM. Still, Home Assistant provides the functionality, and the local LLM fills in the gaps so that a differently-worded request still works. With Music Assistant, for example, you can be more vague with your requests, or the same goes for the weather. With a local LLM, both of these voice requests are perfectly valid and will get you a response with the appropriate blueprint:
- "Okay Nabu, will it rain in the afternoon on Saturday?"
- "Okay Nabu, play the latest Fred Again album"
It requires some setup, but the result is a better, more tailored experience when compared with any of the major voice assistants that I've used. You get the benefit of fast, instant reactions to common requests like "turn off the light," and the unique and powerful capabilities of a large language model that can contextually process a request.
Local LLMs are powerful
Just be aware of their limitations
Local LLMs have come a long way, but they're not magic. They still struggle with tasks that require the sheer scale and reasoning power of frontier models. Complex multi-step problem solving, nuanced creative writing, and tasks requiring broad world knowledge are still areas where cloud-based models pull ahead, sometimes significantly. If you're expecting your local setup to match Claude or GPT-5 head-to-head, you'll be disappointed.
But that's not really the point. The value of a local LLM isn't in competing with the best; it's just about being good enough for the things that matter to you, while giving you something the cloud never can: complete ownership of your data and workflow. Nobody's logging your queries, training on your documents, or quietly deprecating the model you've built your tools around. When you run a model locally, it stays exactly as it is until you decide to change it.
There's also something satisfying about the self-sufficiency of it all. When my internet goes down, my voice assistant still works. When a cloud provider has an outage or decides to change their API, my local tools keep humming along. That resilience has real value, even if it's hard to quantify.
If you've been on the fence about dipping into the local LLM space, now is genuinely a good time to start. Hardware requirements have become more reasonable, tooling has matured, and the models themselves have reached a point where they're no longer just impressive demos. You don't need a server rack or a computer science degree. A decent GPU and a bit of patience will get you surprisingly far, especially with free tools like LM Studio or the open-source Kobaldcpp.
Cloud models aren't going anywhere, and for good reason. But for the tasks where privacy, control, and reliability matter most, a local LLM might just be the better tool for the job.
