URL: https://thenewstack.io/how-to-set-up-and-run-a-local-llm-with-ollama-and-llama-2/

How To Use Ollama: Set Up and Run a Local LLM With Llama 3

Take a look at how to run an open source LLM locally, which allows you to run queries on your private data without any security concerns.
Jan 29th, 2025 8:15am by David Eastman
[Featured image. Photo by Mónica Cisneros Parasí on Unsplash.]
This post was originally published on Feb 17, 2024; it has been updated.

I’ve posted about coming off the cloud, and now I’m looking at running an open source LLM locally on my MacBook. If this feels like part of some “cloud repatriation” project, it isn’t: I’m just interested in tools I can control to add to any potential workflow chain.

Assuming your machine can spare the disk space and memory, what are the arguments for doing this? Apart from not having to pay the running costs of someone else’s server, you can run queries on your private data without any security concerns.

For this, I’m using Ollama. This is “a tool that allows you to run open-source large language models (LLMs) locally on your machine.” It gives access to a full list of open source models with different specializations, such as bilingual models, compact-sized models, or code generation models. This started out as a Mac-based tool, but a Windows version is now available as a preview. It can also be used via Docker.

If you were looking for an LLM as part of a testing workflow, then this is where Ollama fits in:

For testing, local LLMs controlled from Ollama are nicely self-contained, but their quality and speed may suffer compared to the options you have on the cloud. Building a mock framework will result in much quicker tests, but setting these up — as the slide indicates — can be tedious.

Installing Ollama

I installed Ollama by downloading the app onto my MacBook. When I ran the app, I was prompted to try llama3.2 (for now I’ll ignore the argument that this isn’t actually open source). Opening up my Warp terminal, I assumed I’d have to install the model first, but the run command took care of that:

[Screenshot: the run command fetching llama3.2 and opening a chat prompt]

Note that it plops you into a chat mode, so you can test it immediately. It came back rapidly.
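If you want to confirm from code which models the local server has pulled, here is a minimal Python sketch. It assumes Ollama's default local address and the `/api/tags` endpoint, which returns a `models` array whose entries carry a `name` field. The helper names `model_names` and `list_local_models` are my own, not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default local address


def model_names(tags_payload: dict) -> list[str]:
    """Pull the model names out of a parsed /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the local Ollama server which models it has available."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))
```

With the app running, `list_local_models()` should include the model you just pulled, most likely with a tag suffix after the name.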

Inspecting Llama 3

Looking at the specs for the llama3.2 model, I see it defaults to the 3B parameter model and my 16GB MacBook Pro M4 was quite comfortable running it. I made one quick test query:

[Screenshot: the test query and the model's reply]

This was quick, so the model is clearly alive. Well, when I say “alive” I don’t quite mean that, as the model is trapped temporally at the point it was built:

[Screenshot: further queries, including an arithmetic problem]

If you were wondering, the correct answer to the arithmetic problem is actually 1,223,834,880. Better models simply hand these problems off to calculator apps when they spot them. Paradoxically, the inability to do simple maths marks out the limits of the new AI. Remember, LLMs are not intelligent; they are just extremely good at extracting linguistic meaning from their models. But you know this, of course.
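To make the calculator idea concrete, here is a small Python sketch of the kind of helper an application could route arithmetic to instead of trusting the model's own answer. This is an illustration only, not how Ollama or any particular model does it. The function name `calculate` is hypothetical, and the numbers in the usage note are arbitrary, since the operands from my test are not reproduced here.

```python
import ast
import operator

# Only plain arithmetic operators are allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression: str):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)
```

For example, `calculate("12 * (3 + 4)")` returns 84, while anything that is not simple arithmetic raises a `ValueError` rather than being executed.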

The convenient console is nice, but I wanted to use the available API. Ollama sets itself up as a local server on port 11434. We can do a quick curl command to check that the API is responding. Here is a non-streaming REST call (that is, the whole response comes back in one piece) via the terminal with a JSON-style payload:

curl http://localhost:11434/api/generate -d '
{
 "model": "llama3.2",
 "prompt": "Why is the sky blue?",
 "stream": false
}'

The response was:

[Screenshot: the JSON response in the Warp terminal]

The full response, which covered Rayleigh scattering, light’s wavelength, and the sun’s angle, looked correct to me. It took 7 seconds, as you can see recorded in the Warp terminal block.
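The same non-streaming call can be sketched in Python using only the standard library. The endpoint, port, and payload fields are the ones from the curl command above. The function names are my own, and I am assuming the reply is a single JSON object with the generated text under a `response` key.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="llama3.2", base_url=OLLAMA_URL):
    """Build the POST request for a non-streaming /api/generate call."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.2"):
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)["response"]
```

With Ollama running, `print(generate("Why is the sky blue?"))` should print the same kind of answer as the curl call. Keeping the request construction separate from the network call makes it easy to check the payload without a live server.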

Using the Model

The common route to gain programmatic control would be to use Python, and maybe a Jupyter Notebook. But my choice is to try some C# bindings. I found some in OllamaSharp, which is fortunately also available as a package via NuGet.

I’m not too keen on Visual Studio Code, but once you set up a C# console project with NuGet support, it is quick to get going.

Open VS Code from your terminal in your project directory:

[Screenshot: opening VS Code from the terminal]

Start a new .NET project via the Command Palette, choose a Console App, and name your project:

[Screenshot: creating a new .NET Console App from the Command Palette]

Then add OllamaSharp as a NuGet package, again from the Command Palette.

Here is the code to contact Ollama with a query, written into Program.cs and generating a completion straight into the console:

using OllamaSharp;

var uri = new Uri("http://localhost:11434");
var ollama = new OllamaApiClient(uri);

// select the model to use for subsequent operations
ollama.SelectedModel = "llama3.2";

// stream the completion to the console as it arrives
await foreach (var stream in ollama.GenerateAsync("How are you today?"))
 Console.Write(stream.Response);

In a few seconds we get the response directly in the terminal tab:

[Screenshot: the model's response in the terminal tab]

That’s nice.
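The C# loop above consumes the reply piece by piece. For readers taking the Python route, here is a rough sketch of the same idea against the raw API. It assumes that a streaming `/api/generate` reply arrives as newline-delimited JSON objects, each carrying a text fragment under `response` and a `done` flag on the last one. The function name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from a newline-delimited JSON stream."""
    for raw in lines:
        if not raw.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        yield chunk.get("response", "")
        if chunk.get("done"):
            break  # the server marks the final chunk with done=true
```

An HTTP response object from `urllib.request.urlopen` can be iterated line by line, so it can be passed straight in, printing each fragment as it arrives with `print(fragment, end="", flush=True)`.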

Further Tests of the Model

OK, so now we are ready to ask something a little bit more specific. I’ve seen people asking for categorized summaries of their bank accounts, but before I entrust it with that, let me try something more mundane. I’ll ask for a recipe based on the food in my fridge:

...
string question =
@"I have the following ingredients in my fridge:
aubergine, milk, cheese, peppers.
What food could I cook with this and other basic ingredients?";

// stream the answer to the console as it arrives
await foreach (var stream in ollama.GenerateAsync(question))
 Console.Write(stream.Response);

It took a few seconds to come back. For the record, here is the result:

With the ingredients you have (aubergine, milk, cheese, peppers), here are a few dish ideas that don’t require additional basic ingredients:

  1. Eggplant Fritters: Finely chop the aubergine, mix it with grated cheese, and shape into patties. Pan-fry until crispy and serve hot.
  2. Cheesy Aubergine Bake: Slice the aubergine, layer it in a baking dish with sliced peppers, and top with grated cheese. Drizzle with milk to create a creamy sauce. Bake in the oven until the cheese is melted and bubbly.
  3. Grilled Pepper and Aubergine Salad: Grill the peppers and aubergine slices until tender. Chop and mix with crumbled cheese, served with a side of creamy milk sauce (heat the milk with grated cheese until smooth).
  4. Aubergine and Cheese Omelet: Finely chop the aubergine and sauté it in a pan with some oil. Then, whip up an omelet with eggs and add chopped aubergine, sliced peppers, and grated cheese.

These ideas should inspire you to create a tasty dish using your available ingredients!

Given that we did not train the LLM, and didn’t add any recipe texts via Retrieval-augmented generation (RAG) to improve the quality by supplementing the LLM’s internal representation, I think this answer is fine. It comprehended what “basic ingredients” meant, and each recipe covers a different style. It also intuited that I didn’t need every one of my ingredients to be used, and correctly figured the distinct ingredient was the aubergine.
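If you wanted to repeat this test with whatever happens to be in the fridge, the question string is easy to parameterize. A minimal Python sketch, with a hypothetical helper name `fridge_prompt`, that rebuilds the same wording from a list:

```python
def fridge_prompt(ingredients: list[str]) -> str:
    """Build the recipe question from a list of ingredients."""
    items = ", ".join(ingredients)
    return (
        "I have the following ingredients in my fridge:\n"
        f"{items}.\n"
        "What food could I cook with this and other basic ingredients?"
    )
```

Calling `fridge_prompt(["aubergine", "milk", "cheese", "peppers"])` gives the same three-line question I used above, ready to send to the model.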

I would certainly have the confidence to let this summarize a bank account with set categories, if that was a task I valued — we are running locally after all. While things are still in flux with open source LLMs, especially around the issues of training data and bias, the maturity of the solutions is clearly improving, giving reasonable hope for future capability under considered conditions.

David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....
TNS owner Insight Partners is an investor in: Docker, Slice.