February 10, 2026

RAG with SQLite: Why I Dropped Vector Databases

Building a RAG pipeline with sqlite-vec. Simpler deployment, fewer dependencies, and surprisingly good performance for document retrieval.

7 min read

I spent six months wrestling with vector databases. Pinecone, Weaviate, Milvus—I tried them all. They worked, but the operational overhead never felt justified for a single-developer project. Last month, I switched to SQLite with sqlite-vec. Here’s why.

The problem with vector databases

Vector databases solve real problems at scale: sharding, replication, distributed query execution. But for a personal knowledge base with a few thousand documents, those problems don’t exist.

What I experienced instead:

  • Separate infrastructure — One more service to monitor, one more dependency to version
  • Network latency — Even hosted vector databases add 100-300ms per query
  • Complex deployment — Local development required Docker, production required cloud instances
  • Cost — Free tiers are generous but limiting; paid tiers feel excessive for small projects

The SQLite alternative

SQLite is already in my stack for most projects. It’s a single file, requires no network, and has excellent Python support. With sqlite-vec, I can store vectors alongside metadata and query them in the same database.

Here’s the setup:

import sqlite3
import sqlite_vec

# Load the extension
conn = sqlite3.connect('rag.db')
conn.enable_load_extension(True)
sqlite_vec.load(conn)
conn.enable_load_extension(False)

# Regular table for content and metadata
conn.execute('''
  CREATE TABLE documents (
    id TEXT PRIMARY KEY,
    content TEXT
  )
''')

# Virtual table for vectors; vec0 manages its own storage,
# so there is no separate CREATE INDEX step
conn.execute('''
  CREATE VIRTUAL TABLE vec_documents USING vec0(
    embedding FLOAT[384]  -- match your embedding model's dimension
  )
''')
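With the tables in place, ingestion is two inserts per chunk, tied together by rowid. A minimal sketch, assuming a hypothetical embed() that returns a 384-dimensional list of floats; serialize_f32 does the same float32 packing as the sqlite-vec package's serialize_float32 helper:

```python
import sqlite3
import struct

def serialize_f32(vector):
    # Pack a list of floats into a float32 blob, the format
    # sqlite-vec expects for embedding columns
    return struct.pack('%df' % len(vector), *vector)

def index_document(conn, doc_id, content, embedding):
    # Matching rowids tie the content row to its vector row
    cur = conn.execute(
        'INSERT INTO documents (id, content) VALUES (?, ?)',
        (doc_id, content),
    )
    conn.execute(
        'INSERT INTO vec_documents (rowid, embedding) VALUES (?, ?)',
        (cur.lastrowid, serialize_f32(embedding)),
    )
    conn.commit()
```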

Querying is straightforward:

from sqlite_vec import serialize_float32

query_embedding = embed("What is the revenue model?")
results = conn.execute('''
  SELECT d.id, d.content, v.distance
  FROM vec_documents v
  JOIN documents d ON d.rowid = v.rowid
  WHERE v.embedding MATCH ?
    AND k = 5
  ORDER BY v.distance
''', [serialize_float32(query_embedding)]).fetchall()

Performance comparison

I tested on a corpus of 2,500 blog posts and documentation pages:

Metric                   Pinecone (cloud)    SQLite (local)
Query latency            120-250ms           15-30ms
Index build time         N/A (managed)       ~2 minutes
Disk usage               Unknown             ~80MB
Deployment complexity    Medium              Low

For my use case, SQLite is 5-10x faster on queries and trivial to deploy.
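For anyone reproducing these numbers, the harness doesn't need to be fancy. A minimal sketch (the function name is my own) that reports the median latency over repeated runs; pass a closure that executes your KNN query:

```python
import statistics
import time

def measure_latency_ms(run_query, n=100):
    # Time n executions and report the median in milliseconds;
    # median is less noisy than mean for local benchmarks
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```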

The tradeoffs

SQLite isn’t perfect for vector search. Here’s where it falls short:

1. No automatic sharding

If you have millions of vectors, you’ll need to shard manually. For me, 10k vectors is plenty—and at that scale, a single file is fine.

2. Exact search only

As of this writing, sqlite-vec does brute-force exact nearest-neighbor search rather than an approximate index like HNSW, so every query scans all vectors and there is little to tune compared to dedicated vector databases. At a few thousand vectors, the full scan is fast enough that I haven't found this to be a problem.
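Whatever the extension does internally, a brute-force reference search in plain Python is a handy way to sanity-check results at this scale — a sketch over (id, vector) pairs, exact and O(n·d) per query:

```python
import math

def l2_distance(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_knn(query, vectors, k=5):
    # Exact nearest neighbors by full scan over (doc_id, vector)
    # pairs; fine for a few thousand vectors
    scored = [(doc_id, l2_distance(query, vec)) for doc_id, vec in vectors]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

At 10k vectors of dimension 384, that's about four million multiply-adds per query — trivial for a modern CPU, which is why exact search holds up at personal-knowledge-base scale.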

3. No managed service

You have to host it yourself. For me, this is a feature—I can deploy the database file alongside the application.

The surprising benefit

Keeping everything in SQLite enabled a workflow I hadn’t anticipated: ad-hoc SQL queries on both structured and unstructured data.

-- Find documents about "pricing" that mention "freemium",
-- ranked by similarity to a stored query embedding
SELECT d.id, d.content, v.distance
FROM vec_documents v
JOIN documents d ON d.rowid = v.rowid
WHERE v.embedding MATCH (SELECT embedding FROM queries WHERE id = 'query-1')
  AND k = 50  -- over-fetch neighbors, then keep keyword matches
  AND d.content LIKE '%pricing%'
  AND d.content LIKE '%freemium%'
ORDER BY v.distance
LIMIT 10

This hybrid search—semantic + keyword—is surprisingly powerful and hard to replicate with a separate vector database.
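When the keyword logic outgrows LIKE, the same over-fetch-then-filter idea works in Python. A sketch (the helper name is mine) that post-filters semantic hits — rows of (id, content, distance), assumed already sorted by distance — against a set of required terms:

```python
def filter_hits(hits, required_terms, limit=10):
    # hits: iterable of (doc_id, content, distance), sorted by distance.
    # Keep only hits whose content contains every required term
    # (case-insensitive), up to `limit` results.
    kept = []
    for doc_id, content, distance in hits:
        text = content.lower()
        if all(term.lower() in text for term in required_terms):
            kept.append((doc_id, content, distance))
            if len(kept) == limit:
                break
    return kept
```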

When to use each approach

My recommendation:

  • Use SQLite: Personal projects, knowledge bases, <100k vectors, simple deployment
  • Use vector database: Team projects, >100k vectors, need horizontal scaling, want managed service

For most solo developers building RAG applications, SQLite is the right place to start. You can always migrate to a vector database if you outgrow it—but I haven’t hit that point yet.

The biggest lesson: infrastructure should match scale. Don’t run distributed systems for problems that fit in memory.