Ran a production-realistic benchmark of the three most popular vector databases for RAG workloads — not just synthetic benchmarks, but real embedding distributions from enterprise documents.
Methodology
- Dataset — 2M embeddings from real enterprise documents (1536-dim, OpenAI ada-002)
- Queries — 10K real user queries from production RAG system
- Metrics — recall@10, p95 latency, cost/1M queries, operational burden
Results Summary
| Database | Recall@10 | p95 Latency | Cost/Month (2M vectors) |
|---|---|---|---|
| Qdrant (self-hosted) | 98.2% | 12ms | $150 |
| Pinecone (managed) | 97.8% | 18ms | $420 |
| Weaviate (self-hosted) | 97.5% | 22ms | $180 |
Recommendation
Qdrant wins on performance and cost for teams comfortable with self-hosting. Pinecone wins on operational simplicity. Weaviate’s multi-tenancy support is best for SaaS use cases.