Single Query: 1.17x (140μs vs 164μs)
Pool (100 concurrent): 1.46x (16.2ms vs 23.6ms)
HTTP/2 Batch: 4.00x (4.8ms vs 19ms)

📊 Single Query Performance

1000 sequential searches - Fair comparison

| Driver | Latency/Query | Throughput | Result |
|---|---|---|---|
| QAIL gRPC | 140.3μs | 7,126 ops/sec | 1.17x faster |
| Official Client | 164.0μs | 6,096 ops/sec | baseline |
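
The Throughput and Result columns follow from the latency column. A minimal sketch of that arithmetic, assuming throughput is simply the reciprocal of mean latency (the function names are illustrative and not part of QAIL):

```rust
/// Convert a mean per-query latency in microseconds to operations per second.
/// Assumes strictly sequential queries, so throughput = 1 / latency.
fn ops_per_sec(latency_us: f64) -> f64 {
    1_000_000.0 / latency_us
}

/// Speedup of a candidate over a baseline, given both latencies in the same unit.
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}
```

With the reported latencies, `speedup(164.0, 140.3)` is about 1.17, and `ops_per_sec` lands within a few ops/sec of the table values; the small gap is what you would expect if the table was computed from unrounded latencies.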

🔧 Key Optimizations

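  • Buffer pooling: buffers are handed off with .split() instead of being copied with .clone()
  • Direct h2 transport (no Tonic overhead)
  • Pre-computed protobuf tags
  • A single unsafe memcpy writes all 1536 floats of a vector in 1 operation, rather than one write per float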
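
The last two items can be illustrated together. The following is a rough sketch, not QAIL's actual code: it writes a packed protobuf float field using a tag byte computed at compile time and one bulk byte copy. The field number 1 and the names `VECTOR_TAG`, `put_varint`, and `encode_packed_floats` are assumptions made for illustration. The bulk copy is only valid on little-endian hosts, because protobuf stores floats as little-endian, so the sketch falls back to a per-element loop elsewhere.

```rust
/// Precomputed tag byte for a hypothetical field number 1 with wire type 2
/// (length-delimited): (field_number << 3) | wire_type = 0x0A.
const VECTOR_TAG: u8 = (1 << 3) | 2;

/// Append `value` as a protobuf base-128 varint.
fn put_varint(buf: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        buf.push(((value as u8) & 0x7F) | 0x80);
        value >>= 7;
    }
    buf.push(value as u8);
}

/// Append `vector` as a packed repeated float field: tag, byte length, payload.
fn encode_packed_floats(buf: &mut Vec<u8>, vector: &[f32]) {
    buf.push(VECTOR_TAG);
    put_varint(buf, (vector.len() * 4) as u64);

    if cfg!(target_endian = "little") {
        // SAFETY: f32 has no padding bytes, u8 has alignment 1, and the byte
        // length equals the in-memory size of the slice.
        let bytes = unsafe {
            std::slice::from_raw_parts(
                vector.as_ptr() as *const u8,
                std::mem::size_of_val(vector),
            )
        };
        // One bulk copy instead of one write per float.
        buf.extend_from_slice(bytes);
    } else {
        // Portable fallback: convert each float to little-endian bytes.
        for v in vector {
            buf.extend_from_slice(&v.to_le_bytes());
        }
    }
}
```

For a 1536-dimensional vector this produces 1 tag byte, a 2-byte length varint (6144), and 6144 payload bytes.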

🚀 HTTP/2 Pipelining (Batch)

50 queries sent concurrently over a single connection

| Approach | Total Time | Per Query | Result |
|---|---|---|---|
| HTTP/2 Pipelined | 4.8ms | 95μs | 4.00x faster |
| Sequential | 19.0ms | 380μs | baseline |
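
The Per Query column is the batch total divided by the 50 queries. A small sketch of that amortization, using a hypothetical helper name:

```rust
/// Amortized per-query cost in microseconds for a batch that took `total_ms`
/// milliseconds to complete `queries` requests.
fn per_query_us(total_ms: f64, queries: u32) -> f64 {
    total_ms * 1_000.0 / f64::from(queries)
}
```

`per_query_us(19.0, 50)` gives 380μs, matching the table. `per_query_us(4.8, 50)` gives 96μs rather than 95μs, so the 95μs and 4.00x figures were presumably derived from an unrounded total of roughly 4.75ms.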

💡 HTTP/2 multiplexing wins!

All 50 requests are sent concurrently, which suits RAG pipelines that issue many searches at once

Reproduce Results

git clone https://github.com/qail-io/qail
cd qail/qdrant

# Start Qdrant
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

# Seed data
python3 examples/seed_qdrant.py

# Run benchmarks
cargo run --example fair_benchmark --release
cargo run --example batch_benchmark --release