Hasty Briefs


Show HN: We made our own inference engine for Apple Silicon

10 months ago
  • #AI
  • #Apple Silicon
  • #Inference Engine
  • High-performance inference engine for AI models on Apple Silicon.
  • Simple, high-level API with hybrid architecture (GPU kernels or MPSGraph).
  • Unified model configurations for easy addition of new models.
  • Traceable computations for verifying correctness against a source-of-truth implementation.
  • Utilizes unified memory on Apple devices.
  • Add the uzu dependency via Cargo.toml.
  • Create an inference Session with a model path and configuration.
  • Supports CLI mode with commands like run and serve.
  • Uses its own model format; export models using lalamo.
  • Prebuilt Swift framework (uzu-swift) available for SPM.
  • Licensed under MIT License.
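The Cargo.toml and Session bullets above can be sketched roughly as follows. This is a hypothetical sketch, not the confirmed uzu API: the version number, and the names `Session::new`, `run`, and the token-limit parameter are all assumptions; consult the uzu repository for the real interface.

```toml
# Hypothetical dependency entry; check the uzu repository for the actual version.
[dependencies]
uzu = "0.1"
```

```rust
// Hypothetical sketch of creating an inference Session; type and method
// names here are assumptions, not the confirmed uzu API.
use std::path::PathBuf;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Path to a model exported with lalamo (uzu uses its own model format).
    let model_path = PathBuf::from("path/to/model");

    // Create a session for the model; configuration is assumed to default.
    let mut session = uzu::Session::new(model_path)?;

    // Run generation on a text prompt with an assumed token limit parameter.
    let output = session.run("Tell me about Apple Silicon", 128)?;
    println!("{}", output);
    Ok(())
}
```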
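The CLI bullet mentions `run` and `serve` commands; a plausible invocation might look like the following. The command names come from the summary, but the binary name, paths, and argument order are assumptions:

```shell
# Hypothetical CLI usage; `run` and `serve` are named in the summary,
# but the model path argument here is a placeholder.
uzu run path/to/model     # generate text against a local model
uzu serve path/to/model   # expose the model over a local server
```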
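For the prebuilt Swift framework, an SPM consumer would declare uzu-swift as a package dependency. A minimal sketch of a Package.swift manifest follows; the repository URL, version, platforms, and product name are all assumptions, not confirmed by the summary:

```swift
// swift-tools-version:5.9
// Hypothetical manifest; the uzu-swift URL, version, and product name are assumptions.
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        // Prebuilt uzu-swift framework distributed via SPM (URL assumed).
        .package(url: "https://github.com/trymirai/uzu-swift.git", from: "0.1.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                .product(name: "uzu", package: "uzu-swift") // product name assumed
            ]
        )
    ]
)
```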