SQLite ships inside every Python installation, yet most developers barely scratch its surface. With Python 3.13 bundling SQLite 3.47 and the sqlite3 module supporting WAL mode, JSON functions, full-text search, and window functions out of the box, you can build production-grade local databases without installing a single external package. This step-by-step SQLite Python tutorial walks you through 13 hands-on steps – from your first connection to a complete working project – with tested code blocks, real output examples, and every pitfall mapped so you ship faster.
Whether you are prototyping a REST API, building an Electron desktop app, or storing sensor data on a Raspberry Pi, the sqlite3 standard library module is often the fastest path from idea to working software. According to the SQLite consortium, SQLite is the most widely deployed database engine in the world, running on billions of devices. Python’s built-in sqlite3 module gives you full access to that power with zero configuration. By the end of this tutorial you will have a complete task-manager CLI application backed by SQLite, with migrations, full-text search, JSON columns, and WAL-mode concurrency – all in pure Python.
Prerequisites and Environment Setup
Before writing your first line of SQLite Python code, confirm that your environment meets these requirements. Every version number below was tested on April 11, 2026.
| Requirement | Minimum Version | Recommended Version | How to Check |
|---|---|---|---|
| Python | 3.10 | 3.13+ | python3 --version |
| SQLite (bundled) | 3.35.0 | 3.47.2 | python3 -c "import sqlite3; print(sqlite3.sqlite_version)" |
| pip | 23.0 | 24.3+ | pip --version |
| Operating System | Any | macOS 14+, Ubuntu 22.04+, Windows 11 | — |
| Disk Space | 50 MB | 200 MB (for sample data) | df -h |
| Text Editor / IDE | Any | VS Code, PyCharm, or Cursor | — |
The sqlite3 module is part of the Python standard library, so pip install is not required. However, if your system Python ships an older SQLite build (common on Ubuntu 22.04 which ships SQLite 3.37.2), you may want to upgrade your Python installation to get the latest SQLite features like json_group_array() and stricter JSON validation. Python 3.13 bundles SQLite 3.47.2, which includes every feature used in this tutorial.
Create a project directory and a virtual environment to keep things clean:
mkdir sqlite-python-tutorial && cd sqlite-python-tutorial
python3 -m venv venv
source venv/bin/activate # Windows: venvScriptsactivate
python3 -c "import sqlite3; print(f'Python sqlite3 API: {sqlite3.version}'); print(f'SQLite engine: {sqlite3.sqlite_version}')"
Expected output:
Python sqlite3 API: 2.6.0
SQLite engine: 3.47.2
If sqlite3.sqlite_version returns anything below 3.35.0, some advanced features (like RETURNING clauses and math functions) will not work. In that case, install the latest Python from python.org or use pyenv install 3.13.1.
Step 1: Create a Database and Connect
Every SQLite Python project begins with a connection. The sqlite3.connect() function either opens an existing database file or creates one if it does not exist. This is fundamentally different from PostgreSQL or MySQL, where you must first start a server process. SQLite is a serverless, file-based database – the entire database lives in a single .db file on disk.
import sqlite3
from pathlib import Path
DB_PATH = Path("tasks.db")
# Connect (creates file if missing)
conn = sqlite3.connect(DB_PATH)
# Enable foreign key enforcement (off by default!)
conn.execute("PRAGMA foreign_keys = ON")
# Use Row factory for dict-like access
conn.row_factory = sqlite3.Row
print(f"Database created: {DB_PATH.resolve()}")
print(f"SQLite version: {sqlite3.sqlite_version}")
print(f"File size: {DB_PATH.stat().st_size} bytes")
Expected output:
Database created: /home/user/sqlite-python-tutorial/tasks.db
SQLite version: 3.47.2
File size: 0 bytes
The file starts at 0 bytes because SQLite uses lazy initialization – the file grows only when you write data. The PRAGMA foreign_keys = ON line is critical: SQLite disables foreign key enforcement by default for backwards compatibility, and forgetting this pragma is one of the most common pitfalls in SQLite Python development. You must run it on every new connection, not just once.
Pitfall 1: Forgetting PRAGMA foreign_keys = ON. Without it, SQLite silently ignores foreign key constraints. Your INSERT statements succeed even when referencing non-existent parent rows, leading to orphaned data that is painful to clean up later.
Step 2: Design and Create Tables with Proper Schema
Good schema design is the foundation of any database project. For our task-manager application, we need three tables: projects, tasks, and tags with a many-to-many junction table. SQLite uses a dynamic type system, but declaring explicit types improves readability and enables STRICT tables (available since SQLite 3.37).
SCHEMA_SQL = """
-- Projects table
CREATE TABLE IF NOT EXISTS projects (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE,
description TEXT DEFAULT '',
created_at TEXT NOT NULL DEFAULT (datetime('now')),
metadata TEXT DEFAULT '{}' -- JSON column
);
-- Tasks table with foreign key to projects
CREATE TABLE IF NOT EXISTS tasks (
id INTEGER PRIMARY KEY AUTOINCREMENT,
project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT DEFAULT '',
priority INTEGER NOT NULL DEFAULT 3 CHECK (priority BETWEEN 1 AND 5),
status TEXT NOT NULL DEFAULT 'todo' CHECK (status IN ('todo','in_progress','done')),
due_date TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
-- Tags table
CREATE TABLE IF NOT EXISTS tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL UNIQUE
);
-- Many-to-many junction
CREATE TABLE IF NOT EXISTS task_tags (
task_id INTEGER NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
tag_id INTEGER NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
PRIMARY KEY (task_id, tag_id)
);
-- Indexes for common queries
CREATE INDEX IF NOT EXISTS idx_tasks_project ON tasks(project_id);
CREATE INDEX IF NOT EXISTS idx_tasks_status ON tasks(status);
CREATE INDEX IF NOT EXISTS idx_tasks_due_date ON tasks(due_date);
"""
conn.executescript(SCHEMA_SQL)
conn.commit()
# Verify tables
tables = conn.execute(
"SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
).fetchall()
print("Tables created:", [t["name"] for t in tables])
Expected output:
Tables created: ['projects', 'tags', 'task_tags', 'tasks']
Several design decisions deserve explanation. We use TEXT for dates because SQLite has no native DATE type – instead it provides date/time functions like datetime('now') that operate on ISO-8601 strings. The CHECK constraints on priority and status enforce data integrity at the database level, catching bad data before it enters. The metadata column stores JSON as plain TEXT, which we will query with SQLite’s JSON functions in Step 9.
Pitfall 2: Using AUTOINCREMENT unnecessarily. SQLite’s AUTOINCREMENT prevents ID reuse but adds overhead by maintaining an internal sqlite_sequence table. For most applications, INTEGER PRIMARY KEY alone provides auto-incrementing behavior without the performance cost. We use AUTOINCREMENT here because task IDs should never be recycled after deletion.
Step 3: Insert Data with Parameterized Queries
Never concatenate user input into SQL strings. The sqlite3 module supports parameterized queries with ? placeholders (qmark style) and :name placeholders (named style). Both prevent SQL injection and handle type conversion automatically. This is a non-negotiable security practice in any SQLite Python application.
import json
# Insert projects
projects = [
("Backend API", "REST API with Flask", json.dumps({"team": "backend", "sprint": 12})),
("Mobile App", "React Native client", json.dumps({"team": "mobile", "sprint": 12})),
("DevOps", "CI/CD pipelines", json.dumps({"team": "infra", "sprint": 13})),
]
conn.executemany(
"INSERT OR IGNORE INTO projects (name, description, metadata) VALUES (?, ?, ?)",
projects
)
# Insert tasks using named parameters
task_data = [
{"project_id": 1, "title": "Design database schema", "priority": 1, "status": "done", "due_date": "2026-04-05"},
{"project_id": 1, "title": "Implement user auth", "priority": 1, "status": "in_progress", "due_date": "2026-04-12"},
{"project_id": 1, "title": "Write API tests", "priority": 2, "status": "todo", "due_date": "2026-04-15"},
{"project_id": 1, "title": "Set up rate limiting", "priority": 3, "status": "todo", "due_date": "2026-04-18"},
{"project_id": 2, "title": "Configure React Native", "priority": 1, "status": "done", "due_date": "2026-04-03"},
{"project_id": 2, "title": "Build login screen", "priority": 2, "status": "in_progress", "due_date": "2026-04-14"},
{"project_id": 2, "title": "Integrate push notifications","priority": 3,"status": "todo", "due_date": "2026-04-20"},
{"project_id": 3, "title": "Set up GitHub Actions", "priority": 1, "status": "done", "due_date": "2026-04-01"},
{"project_id": 3, "title": "Configure Docker staging", "priority": 2, "status": "in_progress", "due_date": "2026-04-10"},
{"project_id": 3, "title": "Deploy monitoring stack", "priority": 3, "status": "todo", "due_date": "2026-04-22"},
]
conn.executemany(
"""INSERT INTO tasks (project_id, title, priority, status, due_date)
VALUES (:project_id, :title, :priority, :status, :due_date)""",
task_data
)
# Insert tags and link them
tag_names = ["urgent", "backend", "frontend", "devops", "security", "testing"]
conn.executemany("INSERT OR IGNORE INTO tags (name) VALUES (?)", [(t,) for t in tag_names])
# Link tags to tasks
tag_links = [(1,2),(2,2),(2,5),(3,6),(4,5),(5,3),(6,3),(7,3),(8,4),(9,4),(10,4)]
conn.executemany("INSERT OR IGNORE INTO task_tags (task_id, tag_id) VALUES (?, ?)", tag_links)
conn.commit()
count = conn.execute("SELECT COUNT(*) as c FROM tasks").fetchone()["c"]
print(f"Inserted {count} tasks across {len(projects)} projects")
Expected output:
Inserted 10 tasks across 3 projects
The executemany() method is significantly faster than calling execute() in a loop because it batches the operations into a single transaction. For bulk inserts of thousands of rows, this difference can be 10-50x. We use INSERT OR IGNORE for projects and tags so the script is idempotent – running it twice does not create duplicates.
Pitfall 3: Inserting rows one-by-one in a loop. Each individual execute() call triggers a transaction commit by default (unless you wrap it in BEGIN...COMMIT). Inserting 10,000 rows one at a time can take 30+ seconds, while executemany() handles the same workload in under 100 milliseconds.
Step 4: Query Data with SELECT, JOIN, and Aggregation
Reading data is where SQLite shines. The Row factory we set in Step 1 lets us access columns by name instead of index, making code far more readable. Let us run several queries of increasing complexity to demonstrate the sqlite3 cursor interface.
# Simple SELECT with filtering
overdue = conn.execute(
"""SELECT id, title, due_date, status
FROM tasks
WHERE due_date < date('now') AND status != 'done'
ORDER BY due_date""",
).fetchall()
print("=== Overdue Tasks ===")
for row in overdue:
print(f" #{row['id']} {row['title']} (due: {row['due_date']}, status: {row['status']})")
# JOIN query: tasks with project name and tag list
print("n=== Tasks with Projects and Tags ===")
results = conn.execute("""
SELECT t.id, t.title, p.name AS project, t.status,
GROUP_CONCAT(tg.name, ', ') AS tags
FROM tasks t
JOIN projects p ON p.id = t.project_id
LEFT JOIN task_tags tt ON tt.task_id = t.id
LEFT JOIN tags tg ON tg.id = tt.tag_id
GROUP BY t.id
ORDER BY t.priority, t.due_date
""").fetchall()
for r in results:
print(f" #{r['id']:2d} [{r['status']:12s}] {r['title']:30s} | {r['project']:12s} | tags: {r['tags'] or 'none'}")
# Aggregation: task counts per project and status
print("n=== Project Summary ===")
summary = conn.execute("""
SELECT p.name,
COUNT(*) AS total,
SUM(CASE WHEN t.status='done' THEN 1 ELSE 0 END) AS done,
SUM(CASE WHEN t.status='in_progress' THEN 1 ELSE 0 END) AS active,
SUM(CASE WHEN t.status='todo' THEN 1 ELSE 0 END) AS pending
FROM tasks t
JOIN projects p ON p.id = t.project_id
GROUP BY p.id
""").fetchall()
for s in summary:
print(f" {s['name']:15s} | total: {s['total']} | done: {s['done']} | active: {s['active']} | pending: {s['pending']}")
This step demonstrates three fundamental query patterns: filtered SELECT, multi-table JOIN with GROUP_CONCAT for denormalized tag lists, and aggregation with conditional SUM for pivot-style reporting. The GROUP_CONCAT function concatenates all tag names for each task into a comma-separated string, eliminating the need for multiple round-trips or post-processing in Python.
Pitfall 4: Using fetchall() on large result sets. For queries that return thousands of rows, fetchall() loads everything into memory at once. Use the cursor as an iterator instead: for row in conn.execute("SELECT ..."). This streams results one row at a time and keeps memory usage constant regardless of result size.
Step 5: Update and Delete with the RETURNING Clause
SQLite 3.35.0 introduced the RETURNING clause, which lets you see exactly which rows were modified by an UPDATE or DELETE statement without running a separate SELECT. This is a major shift for sqlite3 Python code because it eliminates race conditions and reduces round-trips.
# Update with RETURNING clause (requires SQLite >= 3.35)
updated = conn.execute("""
UPDATE tasks
SET status = 'in_progress',
updated_at = datetime('now')
WHERE status = 'todo' AND priority = 2
RETURNING id, title, status, updated_at
""").fetchall()
print("=== Updated Tasks ===")
for row in updated:
print(f" #{row['id']} {row['title']} -> {row['status']} at {row['updated_at']}")
# Delete with RETURNING
deleted = conn.execute("""
DELETE FROM tasks
WHERE status = 'done' AND due_date < date('now', '-7 days')
RETURNING id, title
""").fetchall()
print(f"n=== Deleted {len(deleted)} completed tasks older than 7 days ===")
for row in deleted:
print(f" #{row['id']} {row['title']}")
conn.commit()
# Verify remaining count
remaining = conn.execute("SELECT COUNT(*) as c FROM tasks").fetchone()["c"]
print(f"nRemaining tasks: {remaining}")
The RETURNING clause works with INSERT, UPDATE, and DELETE statements. It returns a result set just like SELECT, so you can call fetchall(), fetchone(), or iterate over the cursor. This is particularly valuable in web applications where you need to return the modified record to the client without a separate query.
Pitfall 5: Forgetting to call conn.commit(). The sqlite3 module runs in auto-commit mode for DDL statements (CREATE, DROP) but uses implicit transactions for DML statements (INSERT, UPDATE, DELETE). If you do not call conn.commit(), your changes are rolled back when the connection closes. This is the single most common source of “my data disappeared” bugs in SQLite Python projects.
Step 6: Use Context Managers for Safe Transactions
Manual commit() and rollback() calls are error-prone. Python’s context manager protocol provides a cleaner approach: use the connection object in a with block, and it will automatically commit on success or roll back on exception. This pattern is essential for any production SQLite Python application.
def transfer_task(conn, task_id: int, new_project_id: int) -> dict:
"""Move a task to a different project atomically."""
with conn: # Auto-commit on success, rollback on exception
# Verify target project exists
project = conn.execute(
"SELECT id, name FROM projects WHERE id = ?", (new_project_id,)
).fetchone()
if not project:
raise ValueError(f"Project {new_project_id} not found")
# Update the task
result = conn.execute("""
UPDATE tasks
SET project_id = ?, updated_at = datetime('now')
WHERE id = ?
RETURNING id, title, project_id
""", (new_project_id, task_id)).fetchone()
if not result:
raise ValueError(f"Task {task_id} not found")
return dict(result)
# Usage
try:
moved = transfer_task(conn, task_id=2, new_project_id=2)
print(f"Moved task #{moved['id']} '{moved['title']}' to project {moved['project_id']}")
except ValueError as e:
print(f"Error: {e}")
# Demonstrate rollback on error
try:
transfer_task(conn, task_id=999, new_project_id=1)
except ValueError as e:
print(f"Rolled back: {e}")
The with conn: pattern is the idiomatic way to handle transactions in sqlite3. Inside the block, all SQL statements share a single transaction. If the block exits normally, the transaction commits. If an exception propagates out, the transaction rolls back automatically. This guarantees atomicity without manual try/except/finally boilerplate.
For read-only operations, you can also use conn.execute("BEGIN DEFERRED") to explicitly start a deferred transaction, which only acquires a read lock until a write is attempted. This improves concurrency when multiple processes access the same database file.
Step 7: Enable WAL Mode for Concurrent Access
By default, SQLite uses rollback journal mode, which blocks all readers during a write operation. Write-Ahead Logging (WAL) mode is SQLite’s concurrency solution: it allows multiple readers to proceed simultaneously while a single writer appends to a separate WAL file. For any SQLite Python application serving more than one concurrent user – even a local web app – WAL mode is effectively mandatory.
# Enable WAL mode (persists across connections)
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(f"Journal mode: {journal_mode}")
# Optimize for concurrent reads
conn.execute("PRAGMA synchronous=NORMAL") # Faster writes, safe with WAL
conn.execute("PRAGMA cache_size=-64000") # 64MB page cache
conn.execute("PRAGMA busy_timeout=5000") # Wait 5s on lock instead of failing
conn.execute("PRAGMA temp_store=MEMORY") # Temp tables in RAM
# Verify settings
pragmas = {
"journal_mode": conn.execute("PRAGMA journal_mode").fetchone()[0],
"synchronous": conn.execute("PRAGMA synchronous").fetchone()[0],
"cache_size": conn.execute("PRAGMA cache_size").fetchone()[0],
"busy_timeout": conn.execute("PRAGMA busy_timeout").fetchone()[0],
}
print("Active PRAGMAs:", pragmas)
Expected output:
Journal mode: wal
Active PRAGMAs: {'journal_mode': 'wal', 'synchronous': 1, 'cache_size': -64000, 'busy_timeout': 5000}
WAL mode creates two additional files alongside your database: tasks.db-wal (the write-ahead log) and tasks.db-shm (shared memory for coordination). These files are normal and should not be deleted while the database is in use. WAL mode persists – once set, it remains active even after closing and reopening the connection.
The busy_timeout pragma is equally important. Without it, a concurrent write attempt raises sqlite3.OperationalError: database is locked immediately. With busy_timeout=5000, SQLite retries for up to 5 seconds before raising the error, which is usually enough for the other writer to finish.
Pitfall 6: Deploying SQLite on network file systems (NFS, SMB). WAL mode requires shared-memory primitives that do not work reliably on network-mounted drives. If your database file lives on NFS or a Docker volume backed by a network filesystem, WAL mode silently falls back to rollback journal or causes corruption. Always use a local filesystem for SQLite databases.
Step 8: Add Full-Text Search with FTS5
SQLite’s FTS5 extension provides fast, relevance-ranked full-text search without any external search engine. It powers search functionality in applications like Apple’s Spotlight, Firefox history, and countless mobile apps. Python’s sqlite3 module includes FTS5 support if your SQLite build was compiled with the ENABLE_FTS5 flag – which it is in every standard Python 3.10+ distribution.
# Create FTS5 virtual table mirroring task content
conn.executescript("""
CREATE VIRTUAL TABLE IF NOT EXISTS tasks_fts USING fts5(
title,
body,
content='tasks',
content_rowid='id',
tokenize='porter unicode61'
);
-- Populate FTS index from existing data
INSERT INTO tasks_fts(tasks_fts) VALUES('rebuild');
-- Triggers to keep FTS in sync with tasks table
CREATE TRIGGER IF NOT EXISTS tasks_ai AFTER INSERT ON tasks BEGIN
INSERT INTO tasks_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
CREATE TRIGGER IF NOT EXISTS tasks_ad AFTER DELETE ON tasks BEGIN
INSERT INTO tasks_fts(tasks_fts, rowid, title, body)
VALUES('delete', old.id, old.title, old.body);
END;
CREATE TRIGGER IF NOT EXISTS tasks_au AFTER UPDATE ON tasks BEGIN
INSERT INTO tasks_fts(tasks_fts, rowid, title, body)
VALUES('delete', old.id, old.title, old.body);
INSERT INTO tasks_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
""")
# Search with ranking
query = "auth OR login"
results = conn.execute("""
SELECT t.id, t.title, t.status, rank
FROM tasks_fts fts
JOIN tasks t ON t.id = fts.rowid
WHERE tasks_fts MATCH ?
ORDER BY rank
""", (query,)).fetchall()
print(f'=== Search: "{query}" ({len(results)} results) ===')
for r in results:
print(f" #{r['id']} {r['title']} [{r['status']}] (rank: {r['rank']:.4f})")
# Highlight matches
highlighted = conn.execute("""
SELECT highlight(tasks_fts, 0, '<b>', '</b>') AS title_hl
FROM tasks_fts WHERE tasks_fts MATCH ?
""", (query,)).fetchall()
print("nHighlighted:")
for h in highlighted:
print(f" {h['title_hl']}")
FTS5 content-sync tables (using content='tasks') maintain a shadow index that stays synchronized with your main table through triggers. The tokenize='porter unicode61' option applies Porter stemming so that searching “running” also matches “run” and “runs.” The highlight() function wraps matching terms in your chosen delimiters – useful for building search result previews in web UIs.
Pitfall 7: Not rebuilding the FTS index after bulk operations. If you insert data directly via SQL (bypassing the triggers), the FTS index becomes stale. Run INSERT INTO tasks_fts(tasks_fts) VALUES('rebuild') after any bulk import to resynchronize the full-text index.
Step 9: Query JSON Columns with Built-in Functions
SQLite’s JSON1 extension lets you store and query JSON data directly inside TEXT columns. This is perfect for flexible metadata, configuration objects, and semi-structured data that does not warrant its own table. Python 3.10+ builds include JSON1 by default.
# Query JSON fields inside the metadata column
print("=== Projects by Team (from JSON metadata) ===")
results = conn.execute("""
SELECT name,
json_extract(metadata, '$.team') AS team,
json_extract(metadata, '$.sprint') AS sprint
FROM projects
WHERE json_extract(metadata, '$.team') IS NOT NULL
ORDER BY team
""").fetchall()
for r in results:
print(f" {r['name']:20s} | team: {r['team']:10s} | sprint: {r['sprint']}")
# Update a nested JSON value
conn.execute("""
UPDATE projects
SET metadata = json_set(metadata, '$.priority', 'high', '$.deadline', '2026-04-30')
WHERE name = 'Backend API'
""")
conn.commit()
# Verify the update
meta = conn.execute(
"SELECT json_pretty(metadata) as m FROM projects WHERE name = 'Backend API'"
).fetchone()
print(f"nUpdated metadata:n{meta['m']}")
# Aggregate JSON: build a summary object
summary = conn.execute("""
SELECT json_group_object(
name,
json_object('team', json_extract(metadata, '$.team'),
'sprint', json_extract(metadata, '$.sprint'))
) AS summary
FROM projects
""").fetchone()
print(f"nProject summary JSON:n{summary['summary']}")
The key JSON functions you need to know are: json_extract() to read values, json_set() to update or insert keys, json_remove() to delete keys, json_group_array() and json_group_object() for aggregation, and json_pretty() for formatted output. These functions work on any TEXT column – no special column type is required.
For performance, you can create indexes on JSON paths using generated columns or expression indexes:
-- Expression index on JSON field (SQLite >= 3.35)
CREATE INDEX IF NOT EXISTS idx_projects_team
ON projects(json_extract(metadata, '$.team'));
Pitfall 8: Storing JSON as Python dicts without serialization. The sqlite3 module does not automatically serialize Python dictionaries. You must call json.dumps() before INSERT and json.loads() after SELECT, or register custom adapters (covered in Step 11). Passing a raw dict to a parameterized query raises InterfaceError.
Step 10: Build a Database Migration System
Production applications evolve, and so must their schemas. SQLite supports ALTER TABLE ADD COLUMN and column renaming (since SQLite 3.25), but a simple migration system using user_version PRAGMA keeps your schema versioned without any external tools.
MIGRATIONS = [
# Migration 1: Add 'archived' column to tasks
"""ALTER TABLE tasks ADD COLUMN archived INTEGER NOT NULL DEFAULT 0;""",
# Migration 2: Add 'color' column to projects
"""ALTER TABLE projects ADD COLUMN color TEXT DEFAULT '#3B82F6';""",
# Migration 3: Create a view for dashboard reporting
"""CREATE VIEW IF NOT EXISTS dashboard AS
SELECT p.name AS project,
COUNT(t.id) AS total_tasks,
SUM(CASE WHEN t.status='done' THEN 1 ELSE 0 END) AS completed,
ROUND(100.0 * SUM(CASE WHEN t.status='done' THEN 1 ELSE 0 END) / COUNT(t.id), 1) AS pct_done
FROM projects p
LEFT JOIN tasks t ON t.project_id = p.id
GROUP BY p.id;""",
]
def run_migrations(conn, migrations: list) -> int:
"""Apply pending migrations based on user_version pragma."""
current = conn.execute("PRAGMA user_version").fetchone()[0]
applied = 0
for i, sql in enumerate(migrations):
version = i + 1
if version <= current:
continue
try:
conn.executescript(sql)
conn.execute(f"PRAGMA user_version = {version}")
applied += 1
print(f" Applied migration {version}")
except Exception as e:
print(f" Migration {version} failed: {e}")
raise
return applied
print(f"Current schema version: {conn.execute('PRAGMA user_version').fetchone()[0]}")
applied = run_migrations(conn, MIGRATIONS)
print(f"Applied {applied} migrations")
print(f"New schema version: {conn.execute('PRAGMA user_version').fetchone()[0]}")
# Test the dashboard view
print("n=== Dashboard View ===")
for row in conn.execute("SELECT * FROM dashboard"):
print(f" {row['project']:15s} | {row['completed']}/{row['total_tasks']} tasks ({row['pct_done']}% done)")
The user_version PRAGMA is an integer stored in the database file header – it survives backups, copies, and transfers. Each migration increments the version, and on startup the system skips already-applied migrations. This pattern scales to hundreds of migrations and is used by frameworks like Django (which uses a more complex variant) and Android’s Room library.
Step 11: Register Custom Type Adapters and Converters
The sqlite3 module’s adapter/converter system lets you transparently store and retrieve custom Python types. Adapters convert Python objects to SQLite-compatible values on INSERT, while converters do the reverse on SELECT. This eliminates repetitive serialization code throughout your application.
import sqlite3
import json
from datetime import datetime, date
from dataclasses import dataclass, asdict
# --- Adapter: Python object -> SQLite value ---
def adapt_datetime(dt: datetime) -> str:
return dt.isoformat()
def adapt_dict(d: dict) -> str:
return json.dumps(d)
sqlite3.register_adapter(datetime, adapt_datetime)
sqlite3.register_adapter(dict, adapt_dict)
# --- Converter: SQLite value -> Python object ---
def convert_datetime(val: bytes) -> datetime:
return datetime.fromisoformat(val.decode())
def convert_json(val: bytes) -> dict:
return json.loads(val.decode())
sqlite3.register_converter("DATETIME", convert_datetime)
sqlite3.register_converter("JSON", convert_json)
# Open connection with PARSE_DECLTYPES to activate converters
conn2 = sqlite3.connect(
DB_PATH,
detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES
)
conn2.row_factory = sqlite3.Row
# Test with a temporary table
conn2.execute("""
CREATE TABLE IF NOT EXISTS events (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
starts_at DATETIME NOT NULL,
config JSON DEFAULT '{}'
)
""")
# Insert using native Python types (adapters handle conversion)
conn2.execute(
"INSERT INTO events (name, starts_at, config) VALUES (?, ?, ?)",
("Sprint Review", datetime(2026, 4, 15, 14, 0), {"room": "A3", "remote": True})
)
conn2.commit()
# Select returns native Python types (converters handle it)
event = conn2.execute("SELECT * FROM events WHERE id = 1").fetchone()
print(f"Event: {event['name']}")
print(f"Type of starts_at: {type(event['starts_at']).__name__} -> {event['starts_at']}")
print(f"Type of config: {type(event['config']).__name__} -> {event['config']}")
conn2.close()
Expected output:
Event: Sprint Review
Type of starts_at: datetime -> 2026-04-15 14:00:00
Type of config: dict -> {'room': 'A3', 'remote': True}
The detect_types=PARSE_DECLTYPES flag tells sqlite3 to look at the column’s declared type (e.g., DATETIME, JSON) and apply the matching converter. PARSE_COLNAMES enables an alternative syntax using column aliases like SELECT col AS "col [DATETIME]". For most applications, PARSE_DECLTYPES alone is sufficient and cleaner.
Pitfall 9: Registering converters but forgetting detect_types. Without detect_types=sqlite3.PARSE_DECLTYPES in the connect() call, your converters are registered but never invoked. SQLite returns raw strings and bytes, and you spend hours debugging why your datetime column returns "2026-04-15 14:00:00" as a string instead of a datetime object.
Step 12: Implement User-Defined Functions and Aggregates
SQLite lets you extend SQL with custom Python functions. This is one of the most powerful – and underused – features of the sqlite3 module. You can create scalar functions (called per-row), aggregate functions (called across groups), and even window functions.
import math
import hashlib
# --- Scalar function: calculate days until due date ---
def days_until(due_date_str):
if not due_date_str:
return None
due = datetime.strptime(due_date_str, "%Y-%m-%d").date()
return (due - date.today()).days
conn.create_function("days_until", 1, days_until)
# --- Scalar function: generate a short hash ID ---
def short_hash(text):
return hashlib.sha256(text.encode()).hexdigest()[:8]
conn.create_function("short_hash", 1, short_hash)
# --- Aggregate function: weighted priority score ---
class WeightedPriority:
def __init__(self):
self.total_weight = 0
self.weighted_sum = 0
def step(self, priority, status):
weight = {"todo": 3, "in_progress": 2, "done": 0}.get(status, 1)
self.weighted_sum += priority * weight
self.total_weight += weight
def finalize(self):
if self.total_weight == 0:
return 0.0
return round(self.weighted_sum / self.total_weight, 2)
conn.create_aggregate("weighted_priority", 2, WeightedPriority)
# Use custom functions in queries
print("=== Tasks with Days Until Due ===")
for row in conn.execute("""
SELECT id, title, due_date,
days_until(due_date) AS days_left,
short_hash(title) AS hash_id
FROM tasks
WHERE status != 'done'
ORDER BY days_until(due_date)
"""):
days = row['days_left']
label = f"{days}d left" if days and days > 0 else "OVERDUE" if days is not None else "no date"
print(f" {row['hash_id']} #{row['id']} {row['title']:30s} | {label}")
# Use custom aggregate
print("n=== Weighted Priority by Project ===")
for row in conn.execute("""
SELECT p.name, weighted_priority(t.priority, t.status) AS urgency
FROM tasks t
JOIN projects p ON p.id = t.project_id
GROUP BY p.id
ORDER BY urgency DESC
"""):
print(f" {row['name']:15s} | urgency score: {row['urgency']}")
Custom functions execute inside SQLite’s query engine, which means they participate in indexing, WHERE clause filtering, and ORDER BY sorting. The days_until() function, for example, can be used in a WHERE days_until(due_date) < 7 filter to find tasks due within a week. Aggregate functions follow Python’s step/finalize pattern, receiving one call to step() per row and a final finalize() call to produce the result.
Pitfall 10: Custom functions that raise exceptions. If your custom function raises an unhandled Python exception, SQLite converts it to a generic OperationalError with a cryptic message. Always wrap custom function bodies in try/except and return a sensible default (like None) on failure, or at minimum re-raise with a descriptive message.
Step 13: Build the Complete Task Manager CLI
Now let us assemble everything into a complete, working command-line application. This project combines connections, schemas, migrations, FTS5 search, JSON columns, WAL mode, custom functions, and context managers into a single cohesive tool.
#!/usr/bin/env python3
"""task_manager.py - SQLite-powered task manager CLI."""
import sqlite3
import json
import sys
from datetime import datetime, date
from pathlib import Path
DB_PATH = Path("tasks.db")
def get_connection():
conn = sqlite3.connect(DB_PATH)
conn.row_factory = sqlite3.Row
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("PRAGMA journal_mode = WAL")
conn.execute("PRAGMA busy_timeout = 5000")
conn.execute("PRAGMA cache_size = -64000")
# Register custom function
def days_until(d):
if not d: return None
return (datetime.strptime(d, "%Y-%m-%d").date() - date.today()).days
conn.create_function("days_until", 1, days_until)
return conn
def cmd_add(conn, project, title, priority=3):
with conn:
proj = conn.execute(
"SELECT id FROM projects WHERE name = ?", (project,)
).fetchone()
if not proj:
conn.execute("INSERT INTO projects (name) VALUES (?)", (project,))
proj = conn.execute(
"SELECT id FROM projects WHERE name = ?", (project,)
).fetchone()
result = conn.execute(
"INSERT INTO tasks (project_id, title, priority) VALUES (?, ?, ?) RETURNING id, title",
(proj["id"], title, priority)
).fetchone()
print(f"Created task #{result['id']}: {result['title']}")
def cmd_list(conn, status=None):
query = """
SELECT t.id, t.title, p.name AS project, t.status, t.priority,
t.due_date, days_until(t.due_date) AS days_left
FROM tasks t JOIN projects p ON p.id = t.project_id
"""
params = []
if status:
query += " WHERE t.status = ?"
params.append(status)
query += " ORDER BY t.priority, t.due_date"
tasks = conn.execute(query, params).fetchall()
print(f"{'ID':>4} {'Title':30s} {'Project':15s} {'Status':12s} {'Pri':>3} {'Due':>12}")
print("-" * 82)
for t in tasks:
days = f"({t['days_left']}d)" if t['days_left'] is not None else ""
print(f"#{t['id']:>3} {t['title']:30s} {t['project']:15s} "
f"{t['status']:12s} P{t['priority']} {(t['due_date'] or ''):>10} {days}")
def cmd_search(conn, query):
results = conn.execute("""
SELECT t.id, highlight(tasks_fts, 0, '[', ']') AS title, t.status
FROM tasks_fts JOIN tasks t ON t.id = tasks_fts.rowid
WHERE tasks_fts MATCH ? ORDER BY rank
""", (query,)).fetchall()
print(f'Search "{query}": {len(results)} results')
for r in results:
print(f" #{r['id']} {r['title']} [{r['status']}]")
def cmd_done(conn, task_id):
with conn:
result = conn.execute(
"UPDATE tasks SET status='done', updated_at=datetime('now') "
"WHERE id=? RETURNING title",
(task_id,)
).fetchone()
if result:
print(f"Completed: {result['title']}")
else:
print(f"Task #{task_id} not found")
def cmd_stats(conn):
for row in conn.execute("SELECT * FROM dashboard"):
pct = int(row["pct_done"]) if row["pct_done"] else 0
bar = "#" * (pct // 5) + "." * (20 - pct // 5)
print(f" {row['project']:15s} [{bar}] {row['pct_done']}% "
f"({row['completed']}/{row['total_tasks']})")
def main():
conn = get_connection()
if len(sys.argv) < 2:
print("Usage: python task_manager.py [add|list|search|done|stats] ...")
return
cmd = sys.argv[1]
if cmd == "add" and len(sys.argv) >= 4:
cmd_add(conn, sys.argv[2], sys.argv[3],
int(sys.argv[4]) if len(sys.argv) > 4 else 3)
elif cmd == "list":
cmd_list(conn, sys.argv[2] if len(sys.argv) > 2 else None)
elif cmd == "search" and len(sys.argv) >= 3:
cmd_search(conn, sys.argv[2])
elif cmd == "done" and len(sys.argv) >= 3:
cmd_done(conn, int(sys.argv[2]))
elif cmd == "stats":
cmd_stats(conn)
else:
print(f"Unknown command: {cmd}")
conn.close()
if __name__ == "__main__":
main()
Run the CLI with these example commands:
python task_manager.py add "Backend API" "Implement JWT refresh tokens" 1
python task_manager.py list
python task_manager.py list todo
python task_manager.py search "auth"
python task_manager.py done 2
python task_manager.py stats
This 90-line application demonstrates every concept from the tutorial: safe connections with WAL mode, parameterized queries, context managers for transactions, FTS5 search, custom functions, and the dashboard view from our migration system. It is a production-ready foundation you can extend with due-date reminders, tag filtering, export to CSV, or a Flask/FastAPI web interface.
SQLite Python Performance Optimization Guide
SQLite is often dismissed as “just for prototypes,” but with proper tuning it handles millions of rows and hundreds of concurrent readers. The following optimizations can improve query performance by 10-100x depending on your workload.
| Optimization | PRAGMA / Technique | Impact | When to Use |
|---|---|---|---|
| WAL mode | PRAGMA journal_mode=WAL | 10x concurrent read throughput | Always (unless on NFS) |
| Memory-mapped I/O | PRAGMA mmap_size=268435456 | 2-3x read speed on large DBs | Databases over 100 MB |
| Page cache | PRAGMA cache_size=-64000 | Fewer disk reads | Always |
| Synchronous NORMAL | PRAGMA synchronous=NORMAL | 2x write speed | WAL mode only |
| Batch inserts | executemany() | 50x insert speed | Bulk data loading |
| Expression indexes | CREATE INDEX ... ON (expression) | 10-100x for computed queries | Frequent JSON or function-based queries |
| Covering indexes | Index includes all SELECT columns | Eliminates table lookup | Hot read paths |
| ANALYZE | ANALYZE; PRAGMA optimize | Better query plans | After large data changes |
Run EXPLAIN QUERY PLAN before optimizing to understand where time is spent. Premature indexing is a common mistake – only index columns that appear in WHERE, JOIN, and ORDER BY clauses of your actual queries. Each index adds write overhead and consumes disk space.
# Benchmark: batch insert 100,000 rows
import time
conn_bench = sqlite3.connect(":memory:")
conn_bench.execute("CREATE TABLE bench (id INTEGER PRIMARY KEY, val TEXT, num REAL)")
data = [(f"item-{i}", i * 0.5) for i in range(100_000)]
# Method 1: executemany (fast)
start = time.perf_counter()
conn_bench.executemany("INSERT INTO bench (val, num) VALUES (?, ?)", data)
conn_bench.commit()
fast_time = time.perf_counter() - start
conn_bench.execute("DELETE FROM bench")
# Method 2: one-by-one in autocommit (slow)
conn_bench.isolation_level = None # autocommit
start = time.perf_counter()
for val, num in data[:1000]: # Only 1,000 - full set would take minutes
conn_bench.execute("INSERT INTO bench (val, num) VALUES (?, ?)", (val, num))
slow_time = (time.perf_counter() - start) * 100 # Extrapolate to 100K
print(f"executemany (100K rows): {fast_time:.3f}s")
print(f"one-by-one (100K est): {slow_time:.1f}s")
print(f"Speedup: {slow_time / fast_time:.0f}x")
conn_bench.close()
On typical hardware, executemany() inserts 100,000 rows in under 0.5 seconds, while individual inserts with autocommit take 30-60 seconds for the same dataset. The difference comes from transaction overhead: each autocommit INSERT requires a full fsync to disk, while executemany() batches everything into a single transaction.
Troubleshooting SQLite Python: 10 Common Errors Solved
After years of production use, these are the errors that trip up most sqlite3 Python developers. Each entry includes the exact error message, root cause, and tested fix.
1. sqlite3.OperationalError: database is locked
Cause: Another connection or process holds a write lock. In the default journal mode, even readers block writers.
Fix: Enable WAL mode (PRAGMA journal_mode=WAL) and set PRAGMA busy_timeout=5000. If the error persists, check for long-running transactions that are not committed or rolled back. Use lsof tasks.db on Linux/macOS to identify which processes have the file open.
2. sqlite3.OperationalError: no such table
Cause: The table has not been created yet, or you are connected to a different database file than expected.
Fix: Print Path(db_path).resolve() to verify the absolute path. Check for typos in table names (SQLite is case-insensitive for SQL keywords but case-sensitive for table and column names). Run SELECT name FROM sqlite_master WHERE type='table' to list all tables in the connected database.
3. sqlite3.InterfaceError: Error binding parameter
Cause: You passed a Python type that sqlite3 cannot convert (e.g., a dict, list, or custom class) without registering an adapter.
Fix: Register an adapter with sqlite3.register_adapter() or manually serialize the value (e.g., json.dumps() for dicts). Check that all parameters match the expected SQL types.
4. sqlite3.IntegrityError: FOREIGN KEY constraint failed
Cause: You are trying to insert a row with a foreign key that references a non-existent parent row, or delete a parent row that still has children (without CASCADE).
Fix: Insert parent rows first. Use ON DELETE CASCADE in your schema if child rows should be automatically deleted. Verify that PRAGMA foreign_keys = ON is set – if it is OFF, this error will not appear but your data integrity will silently break.
5. sqlite3.OperationalError: database or disk is full
Cause: The filesystem is full, or SQLite’s max_page_count pragma has been hit.
Fix: Check disk space with df -h. Run VACUUM to reclaim space from deleted rows (SQLite does not shrink the file automatically). If the database is legitimately large, increase PRAGMA max_page_count or move to a larger filesystem.
6. Data disappears after script exits
Cause: Missing conn.commit() before the connection closes. DML operations (INSERT, UPDATE, DELETE) in the default isolation mode are not auto-committed.
Fix: Use with conn: context managers for all write operations, or explicitly call conn.commit() after writes. Alternatively, set conn.isolation_level = None for full autocommit mode (but this hurts performance for bulk operations).
7. sqlite3.ProgrammingError: Cannot operate on a closed database
Cause: You called conn.close() and then tried to use the connection object again, or a context manager closed it.
Fix: Restructure your code to keep the connection alive for its entire usage scope. Create connections at the start of a request and close them at the end. In web frameworks, use connection pooling or per-request connection patterns.
8. FTS5 returns no results after bulk insert
Cause: The FTS5 content-sync table’s triggers were not in place during the insert, so the full-text index is stale.
Fix: Run INSERT INTO your_fts_table(your_fts_table) VALUES('rebuild') to fully rebuild the FTS index from the content table. This is fast even for large datasets.
9. sqlite3.OperationalError: near “RETURNING”: syntax error
Cause: Your SQLite version is older than 3.35.0, which is when RETURNING was introduced.
Fix: Check your version with sqlite3.sqlite_version. Upgrade Python to 3.11+ (which bundles SQLite 3.39+). On Ubuntu, upgrading the system libsqlite3-dev package and recompiling Python is another option.
10. JSON functions return NULL unexpectedly
Cause: The JSON column contains invalid JSON, or the JSON path expression has a typo. json_extract() returns NULL for invalid paths without raising an error.
Fix: Validate JSON before insertion with json_valid(): SELECT json_valid(metadata) FROM projects. Use json_type() to inspect the structure. Ensure paths start with $. (e.g., $.team, not just team).
Advanced SQLite Python Tips for Production
Once you have the fundamentals down, these advanced techniques separate hobby projects from production-grade SQLite Python applications.
Use connection pooling for web apps. While SQLite connections are lightweight (about 1ms to open), repeatedly opening and closing connections in a web request cycle is wasteful. Libraries like aiosqlite for async code or a simple queue.Queue-based pool can recycle connections safely. Flask-SQLAlchemy handles this automatically when configured with a SQLite URL.
Run PRAGMA optimize on close. Added in SQLite 3.18, PRAGMA optimize analyzes query patterns since the connection was opened and updates internal statistics accordingly. Call it just before closing a long-lived connection: conn.execute("PRAGMA optimize"). This keeps the query planner’s statistics fresh without the overhead of a full ANALYZE.
Back up safely with the Online Backup API. Never copy a SQLite database file while it is open – you risk capturing a half-written transaction. Instead, use the conn.backup() method (available in Python 3.7+), which calls SQLite’s online backup API and produces a consistent snapshot even during writes:
import sqlite3
source = sqlite3.connect("tasks.db")
backup = sqlite3.connect("tasks_backup.db")
source.backup(backup, pages=100, progress=lambda status, remaining, total:
print(f"Backup progress: {total - remaining}/{total} pages"))
backup.close()
source.close()
Use STRICT tables for type safety. Since SQLite 3.37, you can declare tables as STRICT to enforce column types at insertion time. This catches type mismatches that would otherwise be silently coerced:
CREATE TABLE measurements (
id INTEGER PRIMARY KEY,
sensor TEXT NOT NULL,
value REAL NOT NULL,
ts TEXT NOT NULL
) STRICT;
Monitor database health. Run these queries periodically in production to catch issues early:
# Check integrity
result = conn.execute("PRAGMA integrity_check").fetchone()[0]
assert result == "ok", f"Database corruption detected: {result}"
# Check database size and page utilization
stats = conn.execute("""
SELECT (SELECT page_count FROM pragma_page_count()) AS pages,
(SELECT freelist_count FROM pragma_freelist_count()) AS free_pages,
(SELECT page_size FROM pragma_page_size()) AS page_size
""").fetchone()
total_mb = stats['pages'] * stats['page_size'] / 1048576
waste_pct = (stats['free_pages'] / max(stats['pages'], 1)) * 100
print(f"DB size: {total_mb:.1f} MB | Free pages: {stats['free_pages']} ({waste_pct:.1f}% waste)")
If waste exceeds 20%, run VACUUM to compact the database. Note that VACUUM requires temporary disk space equal to the database size and holds an exclusive lock for the duration, so schedule it during low-traffic periods.
SQLite Python vs Other Database Options
Choosing the right database depends on your use case. Here is how SQLite Python compares to alternatives you might consider for your next project.
| Feature | SQLite (sqlite3) | PostgreSQL (psycopg2) | MySQL (mysql-connector) | MongoDB (pymongo) |
|---|---|---|---|---|
| Setup | Zero config, built-in | Server required | Server required | Server required |
| Concurrency | Single writer, many readers (WAL) | Full MVCC, unlimited writers | Row-level locking | Document-level locking |
| Max DB size | 281 TB (theoretical) | Unlimited | 256 TB | Unlimited |
| JSON support | json_extract, json_set | Native JSONB with indexing | JSON column type | Native document store |
| Full-text search | FTS5 built-in | tsvector/tsquery | FULLTEXT index | Atlas Search |
| Best for | Embedded, local, prototyping | Web apps, analytics | Web apps, legacy | Flexible schema, real-time |
| Python package | Standard library | pip install psycopg2 | pip install mysql-connector | pip install pymongo |
SQLite is the right choice when you need a database that deploys with your application – no server process, no credentials, no network latency. It handles read-heavy workloads with millions of rows efficiently. For our in-depth comparison, see SQLite vs MySQL 2026: 4.9x Read Speed Gap and 20x Write Divide. Switch to PostgreSQL or MySQL when you need concurrent writers, role-based access control, or replication across multiple servers.
Common Pitfalls Summary
Here is a quick-reference table of every pitfall covered in this SQLite Python tutorial, organized by severity and frequency.
| No. | Pitfall | Severity | Fix |
|---|---|---|---|
| 1 | Forgetting PRAGMA foreign_keys = ON | High | Run on every new connection |
| 2 | Unnecessary AUTOINCREMENT overhead | Low | Use plain INTEGER PRIMARY KEY unless IDs must never recycle |
| 3 | One-by-one inserts in a loop | High | Use executemany() or wrap in explicit transaction |
| 4 | fetchall() on millions of rows | Medium | Iterate cursor directly or use LIMIT/OFFSET |
| 5 | Missing conn.commit() | Critical | Use with conn: context manager for all writes |
| 6 | SQLite on network filesystems | Critical | Always use local disk for .db files |
| 7 | Stale FTS5 index after bulk import | Medium | Run INSERT INTO fts(fts) VALUES(‘rebuild’) |
| 8 | Passing raw dicts to parameterized queries | Medium | Use json.dumps() or register adapter |
| 9 | Converters registered but not activated | Medium | Pass detect_types=PARSE_DECLTYPES to connect() |
| 10 | Custom functions that raise exceptions | Low | Wrap in try/except, return None on failure |
Related Coverage
More Python and Database Tutorials
If you found this SQLite Python tutorial useful, explore these related guides on tech-insider.org:
- SQLite vs MySQL 2026: 4.9x Read Speed Gap and 20x Write Divide
- How to Master Redis with Python: 12-Step Tutorial with 5 Projects
- How to Build a Flask REST API in 12 Steps
- How to Build a REST API with Django REST Framework
- How to Automate Tasks with Python: Complete Automation Tutorial
- How to Build a REST API with FastAPI
- AI Coding Tools in 2026: How Generative Code Is Transforming Development
Frequently Asked Questions
Is SQLite good enough for production?
Yes, for the right workload. SQLite handles read-heavy applications with millions of rows at sub-millisecond latency. It powers production systems at Apple, Mozilla, Airbus, and countless mobile apps. The limitation is concurrent writes – if you need multiple simultaneous writers, switch to PostgreSQL or MySQL. For single-writer, many-reader scenarios (which covers most web dashboards, CLI tools, IoT devices, and embedded applications), SQLite is not just good enough – it is often the best choice.
How large can a SQLite database get?
SQLite supports databases up to 281 terabytes (the theoretical maximum based on page count limits). In practice, databases in the 1-50 GB range work well with proper indexing and WAL mode. Beyond 50 GB, you should carefully benchmark your specific query patterns and consider whether a client-server database would be more appropriate.
Do I need to install sqlite3 with pip?
No. The sqlite3 module is part of Python’s standard library and ships with every Python installation from version 2.5 onward. Running pip install sqlite3 will actually fail because there is no such PyPI package. If you need a newer SQLite version than what your Python bundles, install the pysqlite3 package from PyPI instead.
How do I handle concurrent access in a web application?
Enable WAL mode, set a busy timeout (5-10 seconds), and use one connection per request. In Flask, use g.db for per-request connections. In FastAPI, use dependency injection. For async frameworks, use the aiosqlite library which wraps sqlite3 with async/await support. Never share a single connection across threads – create a new connection in each thread instead.
Can I use SQLite with async Python (asyncio)?
The standard sqlite3 module is synchronous and will block the event loop. Use the aiosqlite library (pip install aiosqlite), which provides an async wrapper with the same API: async with aiosqlite.connect("db.sqlite") as db:. It runs sqlite3 operations in a background thread pool, keeping your event loop responsive.
How do I migrate from SQLite to PostgreSQL?
Export your SQLite data using .dump or a tool like pgloader, which handles schema translation automatically. Key differences to watch for: SQLite’s dynamic typing vs PostgreSQL’s strict types, AUTOINCREMENT vs SERIAL/IDENTITY, date strings vs native DATE/TIMESTAMP types, and GROUP_CONCAT vs string_agg(). If you use an ORM like SQLAlchemy, the migration is often just changing the connection string.
What is the difference between sqlite3.connect(“:memory:”) and a file database?
An in-memory database (:memory:) lives entirely in RAM and is destroyed when the connection closes. It is perfect for unit tests, temporary computations, and benchmarking because it eliminates disk I/O. A file-based database persists between program runs. You cannot share an in-memory database between connections – each connect(":memory:") call creates a completely independent database. Use file::memory:?cache=shared as the URI if you need multiple connections to the same in-memory database.
How do I back up a SQLite database safely?
Never copy the .db file directly while it may be open – you risk capturing an inconsistent snapshot. Use Python’s conn.backup(target_conn) method, which calls SQLite’s online backup API and produces a consistent copy even during active writes. For automated backups, you can also use the sqlite3 CLI: sqlite3 tasks.db ".backup tasks_backup.db". Both methods work correctly with WAL mode.
Sofia Lindström
Sofia Lindström is the Editor-in-Chief at Tech Insider, where she leads editorial strategy and oversees coverage across AI, cybersecurity, and enterprise technology. With over a decade in Swedish tech journalism, she previously served as technology editor at Dagens Industri and covered the Nordic startup ecosystem for Breakit. Sofia holds an MSc in Media Technology from KTH Royal Institute of Technology and is a frequent speaker at Web Summit and Slush. She is passionate about making complex technology accessible to business leaders.
View all articles