research project

opaque

privacy-preserving vector search engine

Search encrypted vector databases without the server seeing your query, your data, or your results. Sub-second latency on million-scale datasets with 99.8% recall on commodity 8-vCPU AWS EC2.

464mson 1M vectors (8 vCPU AWS)
99.8%Recall@10
0 bitsleaked to server
Encrypted vector search visualization

Every vector search query leaks information. Your embeddings encode meaning — medical questions, legal research, proprietary data. The server sees your query. The embedding provider understands what you're storing and what you're searching for.

Existing solutions force a choice: trust the server with everything, or run everything locally. Opaque is a third option — the server computes on encrypted data and learns nothing.

Three-layer architecture: HE centroid scoring, decoy bucket fetching, local AES decryptionthree-layer privacy architecture
01
Homomorphic encryption
Query encrypted with CKKS. Server scores centroids on ciphertext — never sees the query or results.
hides: Query content + similarity scores
02
AES-256-GCM blob encryption
All vectors encrypted at rest. Even a full server breach yields only encrypted garbage.
hides: Vector data + database contents
03
Decoy cluster fetching
Real requests mixed with random decoys. Server can't distinguish real from fake.
hides: Access patterns + which clusters matter
04
Threshold key committee
3-of-5 distributed decryption. No single party holds the full key.
hides: Key ownership + single point of compromise

SIFT 1M numbers measured on AWS EC2 (8 vCPU Intel Ice Lake), matching Pinecone p1.x1 / Qdrant 8-core pod tier. Reproducible from scratch in a fresh AWS account via deploy/bench-cpu/run_bench.sh. Other datasets measured on Apple M4 Pro. Recall computed against brute-force cosine similarity ground truth. Full privacy pipeline active for every query.

DatasetVectorsDimLatencyRecall@10
SIFT1Mprobe-8 ε=2.5, m6i.2xlarge (8 vCPU) — recommended1M128-dim464ms99.8%
SIFT1Mprobe-16 ε=2.5, m6i.2xlarge — max recall1M128-dim652ms100.0%
SIFT1MPQ-M8 probe-8 ε=2.5, m6i.2xlarge1M128-dim409ms98.4%
SIFT1MPQ-M8 probe-8 ε=2.71, m6i.2xlarge — fastest tier1M128-dim406ms98.4%
SIFT2MPQ-M8 probe-16, M4 Pro2M128-dim814ms98.0%
GIST100KPQ-M32 probe-8, M4 Pro100K960-dim497ms98.0%
GloVe100Kprobe-16, M4 Pro100K300-dim207ms91.2%
SIFT100Kprobe-16, M4 Pro100K128-dim160ms100%

Comparison with published private vector search systems. Different systems make different privacy-performance tradeoffs.

SystemLatencyRecallScaleApproach
Opaque (8 vCPU AWS)464ms99.8%1MCKKS HE + AES + decoys
Compass (OSDI '25)~600-900mshigh8.8MORAM + HNSW
RemoteRAG (ACL '25)670ms100%1MPHE + differential privacy
PPMI (arXiv '25)951ms>99%1MCKKS + AES-256
Pacmann (ICLR '25)~3.1s~90%100MPIR + graph ANN
SANNS (USENIX '20)~1.4s (72t)~90%10MHE + ORAM + garbled circuits
464ms1M vectors on 8 vCPU AWS
99.8%Recall@10 with full privacy
100%Recall@10 at 652ms (probe-16)
19.2xPQ speedup on 960-dim
6.2%of dataset scanned per query
0 bitsleaked to the server
Minimal crypto, maximum speed
Only centroid scoring runs under HE — one batched CKKS operation (~48ms). Everything else is fast local computation. This is why Opaque is fast where others aren't.
Scans 6% of the data, finds 99.8%
K-means clustering with redundant assignment and multi-probe selection. Search a tiny fraction of the database while missing almost nothing.
No single point of trust
Threshold CKKS splits the decryption key across a 3-of-5 committee with near-zero overhead. Compromise one node, learn nothing.
Tested on real datasets at scale
Benchmarked on SIFT1M, SIFT2M, GIST (960-dim), and GloVe embeddings — not just synthetic data. All recall numbers are against brute-force ground truth.
Optimization pipeline — from seconds to sub-second
Jan 2026
Initial prototype
Python + LightPHE. 6.5s per query. Proved the concept, not the performance.
Jan 2026
Go rewrite with Lattigo
367x faster encryption. CKKS scheme, k-means clustering, AES-256 data encryption.
Feb 2026
SIMD slot packing
Packed 64 centroids into a single ciphertext. 62x speedup on HE scoring.
Feb 2026
172ms on 100K vectors
Multi-probe, redundant assignment, worker pools. 95% Recall@10 with full privacy.
Mar 2026
Product quantization under encryption
19.2x speedup on high-dimensional GIST. Sub-second queries on 1M+ vectors.
Mar 2026
Threshold CKKS
3-of-5 distributed key committee. 0-10% latency overhead. No single point of trust.
Mar 2026
GPU acceleration research
89x speedup on batch dot product (Tesla T4). NTT domain bridge for Lattigo-HEonGPU interop.
Apr 2026
Multi-million scale
2M vectors with 98% recall at 814ms. Sub-second private search at multi-million scale.
Apr 2026
Production-tier AWS validation
SIFT 1M benchmarked on commodity AWS EC2 — sub-500ms private search on 1M vectors. Search latency saturates at 8 vCPU; scale out horizontally for throughput.
Go 1.25core runtime
Lattigo v5CKKS homomorphic encryption
gRPCclient-server protocol
HEonGPUGPU-accelerated HE (CUDA)
AES-256-GCMvector encryption at rest
i built private vector searchDeep dive into the early prototype — from 6.5s in Python to 172ms in Go. Covers the core architecture, CKKS scheme selection, and the first round of optimizations.

Paper in progress. For questions, collaboration, or early access — reach out.