DeepSeek is a cutting-edge AI model designed for natural language processing, offering powerful capabilities such as text generation, summarization, and reasoning. It can run locally on Linux, making it an excellent choice for users who want privacy, control, and offline access to AI.
One of DeepSeek’s strengths is its flexibility: while it can run on CPU-only systems, performance improves significantly with a dedicated GPU. On a CPU, response times are slower, and larger models require substantial RAM. With a GPU, DeepSeek generates responses much faster by leveraging parallel processing, which makes real-time interaction practical.
This guide will walk you through the installation and setup of DeepSeek on Ubuntu or Debian-based Linux distributions, ensuring you can get started with AI on your own machine, whether you have a high-end GPU or not.
In this tutorial you will learn:
How to install and configure Ollama for running DeepSeek
How to optimize system resources for best performance
Software Requirements and Linux Command Line Conventions

| Category | Requirements, Conventions or Software Version Used |
|---|---|
| System | Ubuntu/Debian, at least 16GB RAM recommended |
| Software | Ollama, Python 3.8+, DeepSeek-R1 models |
| Other | At least 10GB free disk space (more for larger models) |
| Conventions | # – requires given linux commands to be executed with root privileges, either directly as a root user or by use of the sudo command; $ – requires given linux commands to be executed as a regular non-privileged user |
Prerequisites
Before we begin, ensure your system meets the minimum requirements. While DeepSeek can run on a CPU-only machine, having a high-performance processor and sufficient RAM will improve execution speed.
If a compatible GPU is installed, Ollama will automatically detect and utilize it for accelerated processing. If no GPU is found, a message will be displayed indicating that the model is running on the CPU.
No manual configuration is required.
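As a quick pre-flight check, the requirements above can be verified from the shell. The following is a minimal sketch and not part of Ollama: the helper names `mem_total_gb` and `free_disk_gb` are hypothetical, and the GPU check assumes an NVIDIA card with `nvidia-smi` installed (other vendors need their own tools).

```shell
#!/bin/sh
# Hypothetical helpers for checking RAM, disk space and GPU presence.

# Total RAM in whole GB, read from a meminfo-style file (default: /proc/meminfo).
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Free disk space in whole GB on the filesystem that holds the given path.
free_disk_gb() {
    df -Pk "${1:-$HOME}" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

echo "RAM:       $(mem_total_gb) GB"
echo "Free disk: $(free_disk_gb) GB"

# Assumption: only NVIDIA GPUs are probed here.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
        || echo "GPU:       nvidia-smi could not query the driver"
else
    echo "GPU:       nvidia-smi not found, expect CPU-only mode"
fi
```

Note that `MemTotal` excludes memory reserved by the firmware and kernel, so a machine with 16 GB installed typically reports 15 GB here.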
DID YOU KNOW? DeepSeek-R1 isn’t just another AI model: it was trained with large-scale reinforcement learning. Rather than learning only to imitate example text, the model was rewarded for reaching correct answers, which is why it works through a problem step by step before responding.
DeepSeek’s 671B model is one of the largest openly available AI models, and running it takes data-center hardware with multiple GPUs. Yet the smaller distilled variants, down to the 1.5B model, can still produce useful results on consumer hardware.
Installation Steps
Install Ollama: Ollama is required to run DeepSeek models. It provides an optimized local runtime for running machine learning models efficiently. First, ensure that curl is installed on your system. If it is not, install it with:
$ sudo apt install curl
Once curl is available, download and run the official Ollama installation script:
$ curl -fsSL https://ollama.com/install.sh | sh
After installation, verify that Ollama is installed correctly by checking its version:
$ ollama --version
Additionally, ensure that the Ollama service is running.
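One way to check this is sketched below. It assumes a systemd-based system on which the install script registered a unit named `ollama`, and falls back to probing the server's default address, `127.0.0.1:11434`. The function name `ollama_status` is hypothetical.

```shell
#!/bin/sh
# Report whether a local Ollama server appears to be up.
ollama_status() {
    # Assumption: the systemd unit is called "ollama".
    if command -v systemctl >/dev/null 2>&1 \
        && systemctl is-active --quiet ollama 2>/dev/null; then
        echo "running"
        return 0
    fi
    # Fallback: probe the default API address with a short timeout.
    if command -v curl >/dev/null 2>&1 \
        && curl -fsS --max-time 2 "http://127.0.0.1:11434/" >/dev/null 2>&1; then
        echo "running"
        return 0
    fi
    echo "not running"
    return 1
}

ollama_status || echo "Try: sudo systemctl start ollama"
```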
Download DeepSeek-R1: Now, fetch the model you want to run. DeepSeek-R1 models vary in size, trading speed against accuracy depending on your hardware capabilities. Larger models provide better reasoning and accuracy but require more RAM, VRAM, and disk space. To install the 7B model as an example, run:
$ ollama pull deepseek-r1:7b
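To confirm that the download succeeded, `ollama list` prints the models available locally. The sketch below wraps that in a small check. `has_model` is a hypothetical helper name, and the parsing assumes that the first column of the `ollama list` output is the model tag, with one header row on top.

```shell
#!/bin/sh
# Succeed if the given model tag appears in the first column of `ollama list`.
has_model() {
    ollama list 2>/dev/null \
        | awk -v tag="$1" 'NR > 1 && $1 == tag { found = 1 } END { exit !found }'
}

MODEL="deepseek-r1:7b"
if has_model "$MODEL"; then
    echo "$MODEL is available locally"
else
    echo "$MODEL not found, run: ollama pull $MODEL"
fi
```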
Available DeepSeek-R1 Models, Hardware Requirements and Recommendations

| Model | Parameters | Disk Space Required | Minimum RAM | Recommended GPU VRAM | Performance |
|---|---|---|---|---|---|
| deepseek-r1:1.5b | 1.5 Billion | 3 GB | 8 GB | 4 GB | Fastest, low memory usage, less accuracy |
| deepseek-r1:7b | 7 Billion | 15 GB | 16 GB | 8 GB | Good balance of speed and accuracy |
| deepseek-r1:8b | 8 Billion | 17 GB | 24 GB | 10 GB | Similar to 7B, slightly improved reasoning |
| deepseek-r1:14b | 14 Billion | 30 GB | 32 GB | 16 GB | Better understanding, needs more RAM |
| deepseek-r1:32b | 32 Billion | 70 GB | 64 GB | 24 GB | High accuracy, slower response times |
| deepseek-r1:70b | 70 Billion | 160 GB | 128 GB | 48 GB | Very accurate, slow inference speed |
| deepseek-r1:671b | 671 Billion | 1.5 TB | 512 GB+ | Multiple GPUs, 100 GB+ VRAM | Cutting-edge accuracy, extremely slow |
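The disk-space figures above roughly track a simple rule of thumb: the weights alone take about (number of parameters) × (bits per parameter) / 8 bytes. The sketch below is an estimate under that assumption, with 16 bits per parameter as the default. It ignores runtime overhead, and quantized builds that store fewer bits per parameter come out proportionally smaller. `weights_gb` is a hypothetical helper, and GB here means 10^9 bytes.

```shell
#!/bin/sh
# Rough size of the model weights in GB.
#   $1 = parameters in billions, $2 = bits per parameter (default 16)
weights_gb() {
    awk -v p="$1" -v b="${2:-16}" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 1.5      # prints 3.0
weights_gb 7        # prints 14.0
weights_gb 671      # prints 1342.0
weights_gb 7 4      # prints 3.5 (a 4-bit quantized build, as an illustration)
```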
Choosing the Right Model:
1.5B – 7B models: Best for everyday tasks, chat applications, and lightweight inference.
32B – 70B models: Highly advanced, suitable for research and deep analysis, but require substantial resources.
671B model: Requires data-center-level hardware. Used for cutting-edge AI research.
NOTE
Even with 512+ GB RAM and multiple GPUs with 100+ GB VRAM, the DeepSeek-R1:671B model remains slow due to its massive 671 billion parameters, requiring an immense number of calculations per response. While multiple GPUs improve overall throughput, they don’t significantly reduce latency for a single request, as data movement, memory bandwidth, and computational limits create bottlenecks. Even high-end AI infrastructure struggles with this scale, making smaller models (7B–14B) far more practical for real-time applications. The 671B model is best suited for research and large-scale AI experiments, where precision outweighs speed.
If you are unsure, start with deepseek-r1:7b as a general-purpose model.
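The Minimum RAM column of the table can also be turned into a quick lookup. The following sketch returns the largest tag whose minimum RAM fits a given amount of memory. `suggest_model` is a hypothetical helper, the thresholds are copied from the table, and the result is an upper bound rather than a recommendation, since a smaller model will usually respond faster.

```shell
#!/bin/sh
# Print the largest DeepSeek-R1 tag whose minimum RAM (in GB) fits $1.
suggest_model() {
    ram_gb="$1"
    if   [ "$ram_gb" -ge 512 ]; then echo "deepseek-r1:671b"
    elif [ "$ram_gb" -ge 128 ]; then echo "deepseek-r1:70b"
    elif [ "$ram_gb" -ge 64 ];  then echo "deepseek-r1:32b"
    elif [ "$ram_gb" -ge 32 ];  then echo "deepseek-r1:14b"
    elif [ "$ram_gb" -ge 24 ];  then echo "deepseek-r1:8b"
    elif [ "$ram_gb" -ge 16 ];  then echo "deepseek-r1:7b"
    elif [ "$ram_gb" -ge 8 ];   then echo "deepseek-r1:1.5b"
    else
        echo "none"
        return 1
    fi
}

suggest_model 16    # prints deepseek-r1:7b
suggest_model 48    # prints deepseek-r1:14b
```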
Begin Using DeepSeek: Once the model is downloaded, you can start interacting with it directly. To run the DeepSeek-R1 model, use:
$ ollama run deepseek-r1:7b
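Beyond the interactive prompt, the model can be scripted. `ollama run` accepts a prompt as an extra argument, and the Ollama server exposes an HTTP API, by default on port 11434. The sketch below assumes that default address and a running server. `API_URL`, `json_escape`, `build_payload` and `ask` are hypothetical names, and the escaping only handles backslashes and double quotes in a single-line prompt.

```shell
#!/bin/sh
API_URL="${API_URL:-http://127.0.0.1:11434}"   # assumed default address

# Escape backslashes and double quotes so the text is safe inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for a non-streaming generation request.
build_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}\n' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send one prompt and print the raw JSON reply (requires a running server).
ask() {
    curl -fsS "$API_URL/api/generate" -d "$(build_payload "$1" "$2")"
}

# Examples (not executed here):
#   ask deepseek-r1:7b "Explain what a symlink is in one sentence."
#   ollama run deepseek-r1:7b "Explain what a symlink is in one sentence."
build_payload deepseek-r1:7b 'Say "hi"'
```

With `"stream": false` the server returns a single JSON object whose `response` field holds the generated text.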
You can explore more advanced usage and configurations in the Ollama Documentation.
DeepSeek offers various model sizes, each with different hardware requirements. If your system struggles with larger models, switch to a smaller variant such as `1.5b`. Running DeepSeek without a GPU is possible, but responses will be slower, so model size matters even more on CPU-only machines.