Pretty much all mainstream AI tools live in the cloud, and the way you use them is fairly straightforward too. Just type out your prompt or command, beam it over to OpenAI, Google or Anthropic's servers where the most cutting-edge models run, and get back your response. While that workflow is certainly convenient, it also means that your prompts, documents, and personal information are being sent over to third-party servers. Additionally, there's a constant reliance on internet connectivity. It's not a huge hassle most times, but if your internet acts spotty, or if you're traveling, it can be a pretty serious hassle.

As a lifelong tinkerer, I wanted something different. I wanted an AI assistant that ran on my own network, on hardware that I had full control over, and, most importantly, didn't require a subscription to unlock full functionality. That curiosity led me to build a small local AI setup using Ollama and Open WebUI on a $200 mini PC. The idea was straightforward: run Ollama to handle the local LLMs and expose them using Open WebUI, which provides a very familiar browser-based interface.

Having used it for a while now, I'm surprised by just how practical the setup is at solving real problems in day-to-day use.

Ollama dramatically simplifies setting up local AI

Get started with a single command

The biggest issue with local AI has been the setup process. It's not particularly hard, but I'd describe it as tedious. Running language models often means dealing with dependency conflicts and manual configuration. Ollama simplifies that entire process dramatically. Ollama works as a local runtime for large language models, packaging them in a way that makes downloading and launching them extremely easy. Just install Ollama and pull the models you want to try.

Once Ollama is up and running, a single command downloads an optimized version and begins running it locally. Ollama also exposes a local API which allows other tools to communicate with the model as if it were a remote AI service. I've used this API to integrate Ollama with other self-hosted tools like Karakeep before. That small detail also turns my mini PC into something closer to a full-fledged AI server.

Now, let's be clear, the hardware itself isn't particularly powerful. I'm using a small x86 mini PC with a modest processor and just 16GB of RAM and no dedicated GPU. But for the kind of use cases I'm looking at, that's not been an impediment. The mini PC can easily handle modern models reasonably well. Especially, optimized models like Llama 3 and Mistral work well in this environment. Using Ollama for the task is appealing because it makes swapping out models just as easy. In case one doesn't perform to my liking, I can remove it and download another with a simple command.

Open WebUI turns a local model into a useful assistant

A browser-based interface makes local AI accessible to all devices

As easy as Ollama makes it to get a local LLM running on your machine, it doesn't actually provide a straightforward interface to actually use the LLM. That's where Open WebUI steps in by providing a full-fledged chat interface on top of Ollama. It runs as a web application and connects directly to the models running on your system. Once configured, it automatically detects the available models and lets me interact with them through a browser window, which is a very similar experience to using a commercial model like ChatGPT or Gemini.

You can open a tab, choose a model from the list, and start typing out your prompt. The model replies inside a threaded conversation, and your chat history remains available for later reference. Precisely how things work with commercial solutions. Switching between models only takes a click, and it is specifically this interface which turns the setup into something genuinely useful as an everyday solution.

Since the LLM runs locally on my mini PC, I can access the AI assistant from any device on my network. As long as I'm on my network, I can tap into it from my desktop, my laptop, or even my phone without needing to install any additional software.

This setup gives me the benefit of privacy too. The data never really leaves my network because the model runs right there on my own hardware. Of course, performance varies quite a bit depending on what model I'm running, and some tasks like image generation are a no-go on my lowly mini PC, but as long as you keep your expectations in check, smaller models feel quick enough.

It can't replace cloud AI, but a small local AI server is still worth it

Between Ollama and Open WebUI, it is surprising how well this stack has integrated into my workflow. The mini PC runs in the background, acting as a small AI host which can be accessed across devices. Once everything is set up, there's very little ongoing maintenance or configuration required unless you are downloading fresh models.

Let's be real, this setup doesn't replace cloud AI entirely. Cloud AI is significantly faster and capable of performing tasks like image generation and video generation that cannot be done with local models, especially not on a mini PC. But for everyday tasks like brainstorming, grammar correction, summarization, you get an always-available private assistant entirely on the premises.

Ollama

Ollama is a platform to download and run various open-source large language models (LLM) on your local computer.