Local LLMs have been gaining traction these past couple of years, and I wanted to see what it's like building a small, self-contained setup for my projects, research, and day-to-day tasks. My gpt-oss20B model ended up being more usable than expected, so I kept it in my rotations rather than discarding it as a side project.
As powerful as local LLMs have gotten, however, they face a fundamental limitation: their knowledge is frozen at their time of training, so they’re prone to making things up. They’re fine for general knowledge and structured tasks, but they’re not very reliable for any work that depends on recent or current information.
But there’s a solution to this - you can add search tools as MCP servers to your local LLM, enabling them to pull in real-time data. I was recommended the Brave Search MCP, and since I already use Brave Browser, it felt like a good fit. It was quicker to set up than I expected, and it makes a real difference in the way my local LLM responds…
The Brave Search MCP
Letting local LLMs access real-time web results
Brave Search MCP is like a bridge between your local LLM (or whichever AI you set it up with) and the Brave Search engine. It uses the Brave Search API to query the search index, and the MCP server handles the communication so your model can incorporate fresh data directly into its answers.
The payoff is basically mixing the model’s pretrained data with current news, trends, or updates - all handled locally and privately, without depending on cloud service. For anyone frustrated with the issue of local models making things up on occasion when they don’t have access to the web, adding Brave Search via MCP fixes that gap while keeping the setup under your control.
Setting up Brave Search MCP with my local model
It’s not as technical as I thought it would be
I’m not a very technical AI user - which is why I like working with user-friendly graphical interfaces like LM Studio. And this setup wasn’t as complicated as I expected it to be. The first step was signing up for a Brave Search API key. You will need to create an account and load in your payment method - but Brave gives you $5 in monthly credits for free, which is approximately 1,000 prompts. So with light to moderate use, your card won’t be charged. You can also set a limit to ensure it doesn’t exceed this cap.
Once I had that, I opened my mcp.json file in a text editor - you’ll likely find it in the user configuration directory of your local LLM runner. And then I added this to it with my Brave API key:
{
"mcpServers": {
"brave-search": {
"command": "npx",
"args": [
"-y",
"@modelcontextprotocol/server-brave-search"
],
"env": {
"BRAVE_API_KEY": "PASTE YOUR API KEY HERE"
}
},
"fetch": {
"command": "uvx",
"args": [
"mcp-server-fetch"
]
}
}
} Also, ensure you have Node.js installed (which comes with npx), as well as uvx, which the server will use to fetch web results. The latter isn’t necessary for this to work, but it will give you more context beyond the search result snippets as it actually extracts the full content from the web pages.
Then it was time to get it running in LM Studio. First, you’re going to want to enable the MCP Brave Search and Fetch plugins. In my runner, they were located at the bottom in the text bar. I actually ran into an issue the first time trying to use it, and all it took was restarting LM Studio - you can also force-restart the plugins.
Last but not least: include a system prompt pointing your model to use the plugins when necessary. The LLM will decide when it needs fresh information, call Brave Search automatically, and incorporate the results into its responses. Something along the lines of "Use the brave-search tool to search the web if you don't know the answer."
Using Brave Search in my local model
It was hit-and-miss at first, until I wrote better prompts
Once everything was set up, I expected Brave Search to just work right out of the box and give me more accurate results. Instead, the first few prompts gave me really weird and even more inaccurate responses - it returned chunks of random text with weird tags, or got stuck in tool calling loops without finishing the response. What happened here is that the model knew it had access to the tool, but didn’t know how to use it cleanly. So instead of calling Brave and getting results, it would either expose the raw tool call or keep trying the process over and over.
Funnily enough, this seems to have been triggered whenever I specifically instructed my model to call Brave in the prompt. The solution was simple: using more natural prompts. The first prompting technique I experimented with was asking it for information that I knew its database wouldn’t have, like “design trends of 2026”, but without mentioning Brave. It started calling Brave immediately, and gave me fast and accurate results, with citations.
But to actually get useful responses, you have to give your prompts structure and include everything you expect to get from your model, just as you normally would without web access. This is just the nature of working with local LLMs since they’re a bit more matter-of-fact and don’t infer context as well as cloud models.
Another technique is to leverage freshness triggers; this can include words like “recent, latest, currently, etc.” This is more likely to trigger your model to call Brave. Limiting the scope also helps - asking for two or three results instead of “top 10” or “everything trending” helps avoid long tool-calling loops.
Upgrading my local model with a simple tool
Adding Brave Search MCP to my local LLM didn’t completely transform it into a perfect system, but it solved one of its biggest weaknesses. Instead of relying on purely static knowledge and occasionally making things up, it now has a way to pull in not only current information, but more niche information it never had to begin with. The setup itself was straightforward - it’s the prompting experimentation that took a minute to figure out. Once I ironed out my prompts and found what works, the results became consistent enough to rely on.
