How an inference provider can prove they're not serving a quantized model

16 hours ago

Tinfoil's Modelwrap provides cryptographic guarantees that specific, untampered model weights are served, verifiable by clients on each request.
Modelwrap consists of a public commitment to model weights, a binding mechanism to the inference server, and a client-side verification process.
Attestation in secure hardware enclaves measures the state at launch, not at runtime, so data loaded after boot (such as model weights) needs an additional verification mechanism.
Modelwrap uses Merkle trees for efficient data verification and dm-verity for kernel-level enforcement of cryptographic commitments on every read.
The system supports both public and private models, allowing verification without exposing proprietary weights.
Benchmarks show a storage overhead of about 0.8% and manageable build times; the runtime overhead falls mainly on initial model loading.
Modelwrap is open-source, enabling users to verify deployments or generate commitments for private weights in their own enclaves.
