- ANEMLL is an open-source project for porting Large Language Models (LLMs) to tensor processors, starting with the Apple Neural Engine (ANE).
- Provides a pipeline from model conversion to inference for LLMs on ANE, enabling on-device inference for privacy and security.
- Offers five main components: LLM Conversion Tools, Swift Reference Implementation, Python Sample Code, iOS/macOS Sample Applications, and ANEMLL-BENCH.
- Supports models like LLAMA 3.1 (1B and 8B), DeepSeek, and DeepHermes distilled models, with plans to add more.
- Includes sample applications and tools for Swift and Python, with a TestFlight app available for testing.
- Requires macOS Sequoia on a Mac with an Apple Neural Engine, 16GB of RAM, Python 3.9, and the Xcode Command Line Tools (for the Core ML compiler).
- Currently optimized for Meta's LLaMA 3.2 models, among others, and developed with community contributions.
- Licensed under the MIT License; community contributions are welcome.
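
The requirements listed above can be checked before attempting a conversion. The following is a minimal sketch, not part of ANEMLL itself: the function names `check_prerequisites` and `current_environment` are hypothetical, and it assumes that "macOS Sequoia" means major version 15 or later, that Python must be exactly 3.9, and that finding `xcrun` on the `PATH` is a reasonable proxy for the Xcode Command Line Tools. It does not verify RAM size or the presence of an ANE.

```python
import platform
import shutil
import sys


def check_prerequisites(system, macos_version, python_version, has_xcrun):
    """Return a list of problems; an empty list means every check passed.

    All thresholds here are assumptions drawn from the summary above,
    not values read from the ANEMLL project.
    """
    problems = []

    if system != "Darwin":
        problems.append("macOS is required")
    else:
        # macOS Sequoia is major version 15. Some older Python builds
        # report "10.16" on newer systems, so treat this as advisory.
        major = int(macos_version.split(".")[0]) if macos_version else 0
        if major < 15:
            problems.append(f"macOS 15 (Sequoia) expected, found {macos_version!r}")

    # Assumption: the stated "Python 3.9" means exactly the 3.9 series.
    if tuple(python_version[:2]) != (3, 9):
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python 3.9 expected, found {found}")

    # Assumption: xcrun on PATH stands in for the Xcode Command Line Tools.
    if not has_xcrun:
        problems.append("xcrun not found; install the Xcode Command Line Tools")

    return problems


def current_environment():
    """Collect the inputs for check_prerequisites from the running system."""
    return {
        "system": platform.system(),
        "macos_version": platform.mac_ver()[0],  # empty string off macOS
        "python_version": tuple(sys.version_info[:3]),
        "has_xcrun": shutil.which("xcrun") is not None,
    }


if __name__ == "__main__":
    issues = check_prerequisites(**current_environment())
    if issues:
        for issue in issues:
            print("missing:", issue)
    else:
        print("all checked prerequisites look satisfied")
```

Keeping the checks in a function that takes plain values, separate from the code that inspects the machine, makes the logic easy to test on any platform.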