research project

opaque

privacy-preserving vector search engine

Search encrypted vector databases without the server seeing your query, your data, or your results. Sub-second latency on million-scale datasets with 99.8% recall on commodity 8-vCPU AWS EC2.

464ms on 1M vectors (8 vCPU AWS)

99.8% Recall@10

0 bits leaked to server

the problem

Every vector search query leaks information. Your embeddings encode meaning — medical questions, legal research, proprietary data. The server sees your query. The embedding provider understands what you're storing and what you're searching for.

Existing solutions force a choice: trust the server with everything, or run everything locally. Opaque is a third option — the server computes on encrypted data and learns nothing.

architecture

three-layer privacy architecture

Homomorphic encryption

Query encrypted with CKKS. Server scores centroids on ciphertext — never sees the query or results.

hides: Query content + similarity scores
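To make the centroid-scoring step concrete, here is a plaintext simulation of how many centroids can be scored with one slot-wise multiplication plus a few rotations, which is the shape of computation a packed CKKS ciphertext supports. Nothing here is encrypted and none of it is Opaque's actual code: the slot layout, the function names, and the power-of-two dimension requirement are assumptions made for the sketch. Each Python list stands in for the slot vector of one ciphertext.

```python
# Plaintext simulation of SIMD-packed centroid scoring (illustrative only).
# Assumed layout: centroid i occupies slots [i*dim, (i+1)*dim).

def rotate(slots, k):
    """Cyclic left rotation by k slots, mirroring an HE slot rotation."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def pack_centroids(centroids):
    """Lay the centroids end to end in one slot vector."""
    return [x for c in centroids for x in c]


def pack_query(query, num_centroids):
    """Repeat the query once per centroid so both layouts line up slot for slot."""
    return list(query) * num_centroids


def packed_scores(query, centroids):
    """Return one dot product per centroid using a multiply and log2(dim) rotations."""
    if not centroids:
        return []
    dim = len(query)
    if dim == 0 or dim & (dim - 1) != 0:
        raise ValueError("this sketch assumes dim is a power of two")
    if any(len(c) != dim for c in centroids):
        raise ValueError("every centroid must have the query's dimension")

    q = pack_query(query, len(centroids))
    c = pack_centroids(centroids)
    acc = [a * b for a, b in zip(q, c)]  # one slot-wise multiplication

    step = dim // 2
    while step >= 1:  # rotate-and-sum inside each dim-sized block
        acc = [a + b for a, b in zip(acc, rotate(acc, step))]
        step //= 2

    # The score for centroid i has accumulated in slot i*dim.
    return [acc[i * dim] for i in range(len(centroids))]
```

In an encrypted setting the packed query would be a ciphertext and the final slot extraction would happen on the client after decryption; the multiply and rotations are the only parts the server would run.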

AES-256-GCM blob encryption

All vectors encrypted at rest. Even a full server breach yields only encrypted garbage.

hides: Vector data + database contents

Decoy cluster fetching

Real requests mixed with random decoys. Server can't distinguish real from fake.

hides: Access patterns + which clusters matter
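A minimal sketch of the decoy idea, assuming decoys are drawn uniformly from the clusters the client does not need and that the combined request is shuffled before it is sent. The function name `build_fetch_plan` and its parameters are hypothetical, and the real system may choose the number and distribution of decoys differently.

```python
import secrets


def build_fetch_plan(real_ids, num_clusters, num_decoys):
    """Return the real cluster IDs mixed with random decoys, in shuffled order."""
    real = set(real_ids)
    if any(i < 0 or i >= num_clusters for i in real):
        raise ValueError("cluster ID out of range")

    # Decoys come only from clusters the client does not actually need.
    candidates = [i for i in range(num_clusters) if i not in real]
    if num_decoys > len(candidates):
        raise ValueError("not enough clusters left to draw decoys from")

    rng = secrets.SystemRandom()  # OS-backed randomness, not a seeded PRNG
    decoys = rng.sample(candidates, num_decoys)

    plan = list(real) + decoys
    rng.shuffle(plan)  # request order must not reveal which IDs are real
    return plan
```

The client would fetch every cluster in the plan, decrypt only the real ones, and discard the rest.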

Threshold key committee

3-of-5 distributed decryption. No single party holds the full key.

hides: Key ownership + single point of compromise
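The 3-of-5 property can be illustrated with Shamir secret sharing over a prime field. This is a swapped-in textbook technique used only to show what "any 3 of 5 can reconstruct, fewer cannot" means; threshold CKKS shares a lattice secret key and combines partial decryptions, which this sketch does not attempt. The prime, the function names, and the integer secret are all assumptions.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; shares live in the field of integers mod PRIME


def split_secret(secret, threshold=3, num_shares=5):
    """Split an integer secret into points on a random degree-(threshold-1) polynomial."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # coeffs[0] is the secret; the remaining coefficients are uniformly random.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from (x, y) shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % PRIME
                den = (den * (xj - xm)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any three of the five shares passed to `recover_secret` yield the original value, while two shares leave every candidate secret equally likely.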

benchmarks

SIFT 1M numbers measured on AWS EC2 (8 vCPU Intel Ice Lake), matching Pinecone p1.x1 / Qdrant 8-core pod tier. Reproducible from scratch in a fresh AWS account via deploy/bench-cpu/run_bench.sh. Other datasets measured on Apple M4 Pro. Recall computed against brute-force cosine similarity ground truth. Full privacy pipeline active for every query.

Dataset	Configuration	Vectors	Dim	Latency	Recall@10
SIFT1M	probe-8, ε=2.5, m6i.2xlarge (8 vCPU), recommended	1M	128	464ms	99.8%
SIFT1M	probe-16, ε=2.5, m6i.2xlarge, max recall	1M	128	652ms	100.0%
SIFT1M	PQ-M8, probe-8, ε=2.5, m6i.2xlarge	1M	128	409ms	98.4%
SIFT1M	PQ-M8, probe-8, ε=2.71, m6i.2xlarge, fastest tier	1M	128	406ms	98.4%
SIFT2M	PQ-M8, probe-16, M4 Pro	2M	128	814ms	98.0%
GIST100K	PQ-M32, probe-8, M4 Pro	100K	960	497ms	98.0%
GloVe100K	probe-16, M4 Pro	100K	300	207ms	91.2%
SIFT100K	probe-16, M4 Pro	100K	128	160ms	100%
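Recall@10 in the table is measured against brute-force cosine-similarity ground truth. A minimal sketch of that metric follows; the function names are hypothetical and the benchmark harness itself may compute it differently (for example with vectorized math rather than pure Python).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def brute_force_top_k(query, vectors, k=10):
    """Ground truth: indices of the k vectors most similar to the query."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(returned_ids, truth_ids, k=10):
    """Fraction of the true top-k neighbors that appear in the returned top-k."""
    truth = set(truth_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(returned_ids[:k])) / len(truth)
```

A reported figure such as 99.8% would be this value averaged over all benchmark queries.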

vs. published systems

Comparison with published private vector search systems. Different systems make different privacy-performance tradeoffs.

System	Latency	Recall	Scale	Approach
Opaque (8 vCPU AWS)	464ms	99.8%	1M	CKKS HE + AES + decoys
Compass (OSDI '25)	~600-900ms	high	8.8M	ORAM + HNSW
RemoteRAG (ACL '25)	670ms	100%	1M	PHE + differential privacy
PPMI (arXiv '25)	951ms	>99%	1M	CKKS + AES-256
Pacmann (ICLR '25)	~3.1s	~90%	100M	PIR + graph ANN
SANNS (USENIX Security '20)	~1.4s (72t)	~90%	10M	HE + ORAM + garbled circuits

by the numbers

464ms: 1M vectors on 8 vCPU AWS

99.8%: Recall@10 with full privacy

100%: Recall@10 at 652ms (probe-16)

19.2x: PQ speedup on 960-dim

6.2%: of dataset scanned per query

0 bits: leaked to the server

why it matters

Minimal crypto, maximum speed

Only centroid scoring runs under HE — one batched CKKS operation (~48ms). Everything else is fast local computation. This is why Opaque is fast where others aren't.

Scans 6% of the data, finds 99.8% of the true neighbors

K-means clustering with redundant assignment and multi-probe selection. Search a tiny fraction of the database while missing almost nothing.
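A small sketch of the two ideas named above, under stated assumptions: redundant assignment places each vector in its `copies` nearest clusters, and multi-probe selection scans the `probes` best-scoring clusters instead of only the single best. The function names, the use of squared Euclidean distance for assignment, and dot-product scoring for probe selection are illustrative choices, not a description of Opaque's internals.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_redundant(vectors, centroids, copies=2):
    """Map cluster index -> vector IDs, placing each vector in its nearest `copies` clusters."""
    clusters = {c: [] for c in range(len(centroids))}
    for vid, v in enumerate(vectors):
        nearest = sorted(
            range(len(centroids)), key=lambda c: sq_dist(v, centroids[c])
        )
        for c in nearest[:copies]:
            clusters[c].append(vid)
    return clusters


def select_probes(query, centroids, probes=8):
    """Multi-probe: indices of the `probes` centroids scoring highest against the query."""
    ranked = sorted(
        range(len(centroids)),
        key=lambda c: dot(query, centroids[c]),
        reverse=True,
    )
    return ranked[:probes]


def candidate_ids(query, centroids, clusters, probes=8):
    """Union of vector IDs in the probed clusters; only these are scanned."""
    found = set()
    for c in select_probes(query, centroids, probes):
        found.update(clusters[c])
    return found
```

Redundant assignment costs extra storage, but a neighbor that sits near a cluster boundary can still be found when only one of its clusters is probed, which is how a small scanned fraction can coexist with high recall.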

No single point of trust

Threshold CKKS splits the decryption key across a 3-of-5 committee with near-zero overhead. Compromise one node, learn nothing.

Tested on real datasets at scale

Benchmarked on SIFT1M, SIFT2M, GIST (960-dim), and GloVe embeddings — not just synthetic data. All recall numbers are against brute-force ground truth.

Optimization pipeline — from seconds to sub-second

timeline

Jan 2026

Initial prototype

Python + LightPHE. 6.5s per query. Proved the concept, not the performance.

Jan 2026

Go rewrite with Lattigo

367x faster encryption. CKKS scheme, k-means clustering, AES-256 data encryption.

Feb 2026

SIMD slot packing

Packed 64 centroids into a single ciphertext. 62x speedup on HE scoring.

Feb 2026

172ms on 100K vectors

Multi-probe, redundant assignment, worker pools. 95% Recall@10 with full privacy.

Mar 2026

Product quantization under encryption

19.2x speedup on high-dimensional GIST. Sub-second queries on 1M+ vectors.

Mar 2026

Threshold CKKS

3-of-5 distributed key committee. 0-10% latency overhead. No single point of trust.

Mar 2026

GPU acceleration research

89x speedup on batch dot product (Tesla T4). NTT domain bridge for Lattigo-HEonGPU interop.

Apr 2026

Multi-million scale

2M vectors with 98% recall at 814ms. Sub-second private search at multi-million scale.

Apr 2026

Production-tier AWS validation

SIFT 1M benchmarked on commodity AWS EC2 — sub-500ms private search on 1M vectors. Search latency saturates at 8 vCPU; scale out horizontally for throughput.

built with

Go 1.25: core runtime

Lattigo v5: CKKS homomorphic encryption

gRPC: client-server protocol

HEonGPU: GPU-accelerated HE (CUDA)

AES-256-GCM: vector encryption at rest

i built private vector search

Deep dive into the early prototype, from 6.5s in Python to 172ms in Go. Covers the core architecture, CKKS scheme selection, and the first round of optimizations.

Paper in progress. For questions, collaboration, or early access — reach out.

