February 10, 2026
RAG with SQLite: Why I Dropped Vector Databases
Building a RAG pipeline with sqlite-vec. Simpler deployment, fewer dependencies, and surprisingly good performance for document retrieval.
7 min read
I spent six months wrestling with vector databases. Pinecone, Weaviate, Milvus—I tried them all. They worked, but the operational overhead never felt justified for a single-developer project. Last month, I switched to SQLite with sqlite-vec. Here’s why.
The problem with vector databases
Vector databases solve real problems at scale: sharding, replication, distributed query execution. But for a personal knowledge base with a few thousand documents, those problems don’t exist.
What I experienced instead:
- Separate infrastructure — One more service to monitor, one more dependency to version
- Network latency — Even hosted vector databases add 100-300ms per query
- Complex deployment — Local development required Docker, production required cloud instances
- Cost — Free tiers are generous but limiting; paid tiers feel excessive for small projects
The SQLite alternative
SQLite is already in my stack for most projects. It’s a single file, requires no network, and has excellent Python support. With sqlite-vec, I can store vectors alongside metadata and query them in the same database.
Here’s the setup:
```python
import sqlite3
import sqlite_vec

# Load the sqlite-vec extension
conn = sqlite3.connect('rag.db')
conn.enable_load_extension(True)
sqlite_vec.load(conn)
conn.enable_load_extension(False)

# Regular table for document content and metadata
conn.execute('''
CREATE TABLE documents (
    id TEXT PRIMARY KEY,
    content TEXT
)
''')

# Virtual table for the vectors. vec0 is sqlite-vec's table module;
# the dimension (384 here, as an example) must match your embedding model.
conn.execute('''
CREATE VIRTUAL TABLE vec_documents USING vec0(
    embedding float[384]
)
''')
```

The two tables are linked through a shared rowid. Querying is straightforward: vectors go in and out as float32 blobs, so the query embedding is serialized with sqlite-vec's helper before binding, and `k = 5` asks the virtual table for the five nearest neighbors:

```python
from sqlite_vec import serialize_float32

query_embedding = embed("What is the revenue model?")
results = conn.execute('''
SELECT d.id, d.content, v.distance
FROM vec_documents v
JOIN documents d ON d.rowid = v.rowid
WHERE v.embedding MATCH ?
  AND k = 5
ORDER BY v.distance
''', [serialize_float32(query_embedding)]).fetchall()
```
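One step the snippets above skip is ingestion. A minimal sketch of writing a document and its vector into both tables, assuming a connection with sqlite-vec already loaded and an `embed()` helper; the function names here are illustrative, not from the post:

```python
import struct

def serialize_f32(vec):
    # Pack floats into the little-endian float32 blob sqlite-vec expects
    # (the sqlite-vec Python package also ships serialize_float32 for this)
    return struct.pack(f'{len(vec)}f', *vec)

def index_document(conn, rowid, doc_id, content, embedding):
    """Store content and its vector, linked through a shared rowid."""
    conn.execute(
        'INSERT INTO documents(rowid, id, content) VALUES (?, ?, ?)',
        (rowid, doc_id, content),
    )
    conn.execute(
        'INSERT INTO vec_documents(rowid, embedding) VALUES (?, ?)',
        (rowid, serialize_f32(embedding)),
    )
    conn.commit()
```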
Performance comparison
I tested on a corpus of 2,500 blog posts and documentation pages:
| Metric | Pinecone (cloud) | SQLite (local) |
|---|---|---|
| Query latency | 120-250ms | 15-30ms |
| Index build time | N/A (managed) | ~2 minutes |
| Disk usage | Unknown | ~80MB |
| Deployment complexity | Medium | Low |
For my use case, SQLite is 5-10x faster on queries and trivial to deploy.
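Latency numbers like these are easy to reproduce yourself. A small harness I'd use for this kind of measurement (not the author's benchmark code; medians are less noisy than means for query timing):

```python
import time

def time_query(conn, sql, params, runs=20):
    """Median wall-clock latency of a query, in milliseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        timings.append((time.perf_counter() - start) * 1000)
    timings.sort()
    return timings[len(timings) // 2]
```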
The tradeoffs
SQLite isn’t perfect for vector search. Here’s where it falls short:
1. No automatic sharding
If you have millions of vectors, you’ll need to shard manually. For me, 10k vectors is plenty—and at that scale, a single file is fine.
2. No approximate index
sqlite-vec currently does brute-force, exact nearest-neighbor search; it doesn't ship an approximate index like HNSW, let alone the tuning knobs dedicated vector databases expose. At this scale, a full scan is fast enough that in practice I haven't found this to be a problem.
3. No managed service
You have to host it yourself. For me, this is a feature—I can deploy the database file alongside the application.
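On the scale point, a quick back-of-envelope shows why even an exact full scan over ~10k vectors stays cheap. The 384-dimensional float32 embedding is an assumption for illustration; the post doesn't name its model:

```python
n_vectors = 10_000
dims = 384           # e.g. a small sentence-transformer model (assumed)
bytes_per_float = 4  # float32

# Data touched by one brute-force query: every vector, once
scan_bytes = n_vectors * dims * bytes_per_float
print(f"{scan_bytes / 1e6:.2f} MB scanned per query")  # 15.36 MB
```

A few tens of megabytes per query sits comfortably in memory and well within the latencies in the table above.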
The surprising benefit
Keeping everything in SQLite enabled a workflow I hadn’t anticipated: ad-hoc SQL queries on both structured and unstructured data.
```sql
-- Nearest documents about pricing that also mention "freemium":
-- pull the top 50 by vector distance, then keyword-filter down to 10.
SELECT d.id, d.content, v.distance
FROM vec_documents v
JOIN documents d ON d.rowid = v.rowid
WHERE v.embedding MATCH :query_embedding
  AND k = 50
  AND d.content LIKE '%pricing%'
  AND d.content LIKE '%freemium%'
ORDER BY v.distance
LIMIT 10;
```
This hybrid search—semantic + keyword—is surprisingly powerful and hard to replicate with a separate vector database.
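From Python, the same pattern generalizes to any keyword list with ordinary parameter binding. A sketch, assuming the tables above and sqlite-vec loaded on the connection; `hybrid_search` and `serialize_f32` are illustrative names, not from the post:

```python
import struct

def serialize_f32(vec):
    # little-endian float32 blob, the format sqlite-vec expects
    return struct.pack(f'{len(vec)}f', *vec)

def hybrid_search(conn, query_vec, keywords, k=50, limit=10):
    """KNN over vec_documents, post-filtered with LIKE on documents.content."""
    conditions = ' AND '.join('d.content LIKE ?' for _ in keywords)
    sql = f'''
        SELECT d.id, d.content, v.distance
        FROM vec_documents v
        JOIN documents d ON d.rowid = v.rowid
        WHERE v.embedding MATCH ? AND k = ? AND {conditions}
        ORDER BY v.distance
        LIMIT ?
    '''
    params = [serialize_f32(query_vec), k,
              *[f'%{kw}%' for kw in keywords], limit]
    return conn.execute(sql, params).fetchall()
```

Note that this is post-filtering: the KNN stage returns `k` candidates first and the keyword conditions prune them afterward, which is why `k` should be larger than the final limit.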
When to use each approach
My recommendation:
- Use SQLite: Personal projects, knowledge bases, <100k vectors, simple deployment
- Use vector database: Team projects, >100k vectors, need horizontal scaling, want managed service
For most solo developers building RAG applications, SQLite is the right place to start. You can always migrate to a vector database if you outgrow it—but I haven’t hit that point yet.
The biggest lesson: infrastructure should match scale. Don’t run distributed systems for problems that fit in memory.