How an inference provider can prove they're not serving a quantized model
16 hours ago
- #secure-inference
- #cryptographic-attestation
- #model-verification
- Tinfoil's Modelwrap provides cryptographic guarantees that specific, untampered model weights are served, verifiable by clients on each request.
- Modelwrap consists of a public commitment to model weights, a binding mechanism to the inference server, and a client-side verification process.
- Attestation in secure hardware enclaves measures launch state but not runtime state, so a separate mechanism is needed to verify data, such as model weights, that is loaded after boot.
- Modelwrap uses Merkle trees for efficient data verification and dm-verity for kernel-level enforcement of cryptographic commitments on every read.
- The system supports both public and private models, allowing verification without exposing proprietary weights.
- Benchmarks show a storage overhead of about 0.8% and manageable build times; the runtime cost falls mainly on initial model loading.
- Modelwrap is open-source, enabling users to verify deployments or generate commitments for private weights in their own enclaves.
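
To make the Merkle-tree point above concrete, here is a minimal sketch of committing to a weights file by a single root hash and then checking one block against that root, which is the kind of check dm-verity performs on reads. It is not Modelwrap's code and does not reproduce dm-verity's on-disk format: the 4096-byte block size, the binary tree layout, the domain-separation prefixes, and every function name are assumptions made for illustration.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration only


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut the file into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def build_levels(blocks: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, leaf hashes first, root level last."""
    level = [_h(b"\x00" + b) for b in blocks]  # 0x00 prefix marks a leaf
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd level: pair last node with itself
        level = [_h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks a parent
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(levels: List[List[bytes]]) -> bytes:
    """The public commitment: one 32-byte hash for the whole file."""
    return levels[-1][0]


def prove(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # mirrors the self-pairing in build_levels
        proof.append(level[sibling])
        index //= 2
    return proof


def verify_block(block: bytes, index: int, proof: List[bytes], root: bytes) -> bool:
    """Recompute the root from one block and its proof, then compare."""
    node = _h(b"\x00" + block)
    for sibling in proof:
        if index % 2 == 0:
            node = _h(b"\x01" + node + sibling)
        else:
            node = _h(b"\x01" + sibling + node)
        index //= 2
    return node == root


# Stand-in for a weights file: five blocks, the last one partial.
weights = bytes(i % 251 for i in range(4 * BLOCK_SIZE + 100))
blocks = split_blocks(weights)
levels = build_levels(blocks)
root = merkle_root(levels)

# A read of block 2 is accepted only if it hashes up to the committed root.
ok = verify_block(blocks[2], 2, prove(levels, 2), root)
tampered = verify_block(b"\xff" + blocks[2][1:], 2, prove(levels, 2), root)
```

Checking a single block needs only a logarithmic number of hashes, not a rehash of the whole file, which is why the check can run on every read. As a rough sanity check on the storage figure, if blocks are 4 KiB and digests are 32 bytes (both assumptions, since the summary states neither), the leaf hashes alone add 32/4096, or about 0.78%, which is in line with the reported 0.8%.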