Voozh

Reading about people setting up self-hosted LLMs always felt too technical and out of reach for me. Especially since most guides seem to assume you already understand what you’re doing. So it wasn’t really a lack of interest that held me back, but more so the fear of failing to set up a local LLM properly.

Turns out, it’s much easier than I thought - at least with the right tools. NotebookLM was one of them. It helped me slow the whole process down and actually understand what I was doing instead of skimming past jargon hoping for the best. I dropped in documentations and tutorials, and asked it to break things down for me in small steps and plain language. Here’s how it went…

Learning what a self-hosted LLM is

Anyone can set one up

I started with some basic research to get myself up to speed. To simplify what I'd learned: A self-hosted LLM is a large language model that runs on hardware you control instead of on someone else’s servers. When you use something like ChatGPT or Claude, your prompts are sent over and processed on a remote data center, and the results sent back to you.

With a self-hosted LLM, that loop happens locally. The model is downloaded to your machine, loaded into memory, and runs directly on your CPU or GPU. So you’re not dependent on an internet connection or limited by usage caps and subscriptions, and your data never leaves your device. So basically, you’re running the model instead of renting it.

A lot of people associate self-hosting with setting up Docker containers or managing similar infrastructure. While that is one way to do it, it’s not a requirement. Tools like LM Studio handle all the technicalities for you - they load the model into memory, manage resources, and give you an easy interface to work with.

Even without self-deployment tools or command-line setups, using something like LM Studio still counts as self-hosting because the models you run in it live on your machine and execute locally. So, as a local LLM newbie, I stuck with LM Studio.

Setting up NotebookLM to help me get started with LM Studio

Modifying the AI

When I create a notebook for learning about a specific topic, I always start with a system prompt. In NotebookLM’s case, this is called the Custom Mode and you can find it via the sliders icon in the top-right of the chat. I told NotebookLM my skill level and instructed it on how to relay information to me.

Given that I had already downloaded LM Studio with the help of Perplexity, I wanted to focus this more on actually using LM Studio, the different models, and how to get the most out of my setup. So most of my sources were official LM Studio documentation and guides.

I have the NotebookLM Web Importer extension to quickly add web links to my notebooks. But LM Studio actually lets you copy its documentation as Markdown text, which NotebookLM accepts as a source too.

LM Studio has some information for each of the models it offers (most of which are open-source or open-weight), which I also plugged into NotebookLM. This would help me get a good grasp on which ones are best for my needs and hardware.

Using NotebookLM to get the most out of LM Studio

Putting my self-hosted LLM to work

The first thing I wanted to gauge was which models were right for me and what their best use cases were. LM Studio had over 30 to explore, so this was my first prompt to NotebookLM:

“For each model I’ve added, explain what it’s best used for in plain terms. For example: writing, summarizing, coding help, brainstorming, long-context work. Skip marketing language.”

However, I should have just directly asked for the model best suited to me. Something along the lines of:

“Best models specifically for reasoning, natural conversation, and general knowledge. Prioritize clarity, consistency, and low hallucination over creativity or coding.”

NotebookLM told me that Nemotron 3 and gpt-oss were some of the top fits. I leaned more toward gpt-oss because it has a configurable reasoning effort with access to its reasoning process. It also seemed to have the most adjustable properties out of any of the models, so I asked NotebookLM for some tips on configuring gpt-oss to my needs.

My primary goal was setting up the model to help me with my UX design studies and some other technical topics (such as self-hosting). LM Studio’s interface makes it incredibly easy to configure your models - all you have to do is open the settings (wrench icon in the top right). Depending on your model, you’ll find options like system prompt and response length.

I kept going back and forth between NotebookLM and LM Studio to get more clarity as I was using it. For example, NotebookLM helped me understand what “tokens” meant in the context of LLMs, and when I’d want to adjust them in LM Studio.

A practical way to self-host your own LLM

I always assumed that self-hosting starts with complicated setups. But tools like LM Studio bypasses all of that - it already runs models locally, and you completely control how it behaves. Using NotebookLM to condense the model and configuration information made the learning process much smoother, and I ended up with a model that’s honestly better and more personalized than tools like ChatGPT. Plus, all my data remains private and in my control.

LM Studio

See at LM Studio

NotebookLM

See at NotebookLM

URL: https://www.xda-developers.com/learn-how-to-self-host-llm-with-notebooklm/

⇱ How NotebookLM made self-hosting an LLM easier than I ever expected