Show HN: Cactus – Ollama for Smartphones
10 months ago
- #AI
- #Cross-platform
- #Local Deployment
- Cross-platform framework for deploying LLM/VLM/TTS models locally in apps.
- Supports Flutter and React-Native for cross-platform development.
- Compatible with any GGUF model from Hugging Face (e.g., Qwen, Gemma, Llama, DeepSeek).
- Runs LLMs, VLMs, embedding models, and TTS models efficiently.
- Supports models from FP32 down to 2-bit quantization for device efficiency.
- Supports MCP tool-calls so models can act on the device (e.g., setting reminders, searching the gallery).
- Falls back to cloud models for complex tasks or when on-device inference fails.
- Includes chat templates with Jinja2 support and token streaming.
- Provides installation and usage examples for Flutter and React-Native.
- Offers cloud fallback modes: local, localfirst, remotefirst, remote.
- Backend written in C/C++ for broad device compatibility (phones, TVs, laptops, etc.).
- Includes build and setup instructions for Flutter, React-Native, and C/C++.
- Encourages contributions with guidelines for bug fixes and feature additions.
- Benchmarks provided for model performance across various devices.
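
The four cloud-fallback modes listed above could work roughly like this; this is a minimal sketch, and the function and parameter names are illustrative, not Cactus's actual API:

```typescript
// Hypothetical sketch of the four fallback modes (local, localfirst,
// remotefirst, remote); names are illustrative, not Cactus's real API.
type Mode = "local" | "localfirst" | "remotefirst" | "remote";

async function complete(
  mode: Mode,
  runLocal: () => Promise<string>,   // on-device inference
  runRemote: () => Promise<string>,  // cloud inference
): Promise<string> {
  switch (mode) {
    case "local":
      return runLocal();                 // device only; errors propagate
    case "remote":
      return runRemote();                // cloud only
    case "localfirst":
      try { return await runLocal(); }   // prefer on-device,
      catch { return runRemote(); }      // fall back to cloud on failure
    case "remotefirst":
      try { return await runRemote(); }  // prefer cloud,
      catch { return runLocal(); }       // fall back to device when offline
  }
}
```

`localfirst` keeps data on the device by default and only pays for a network round-trip when local inference errors out.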
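
The quantization range (FP32 down to 2-bit) translates directly into memory footprint. A back-of-the-envelope estimate, assuming bits-per-weight only (real GGUF quant formats add per-block scale overhead):

```typescript
// Rough weight-memory estimate: parameters * bits-per-weight / 8 bytes.
// Illustrative only; actual GGUF quants carry extra per-block metadata.
function modelSizeGiB(params: number, bitsPerWeight: number): number {
  return (params * bitsPerWeight) / 8 / 2 ** 30;
}

// e.g. a 1.5B-parameter model:
//   FP32  ≈ 5.6 GiB  — too large for most phones
//   4-bit ≈ 0.7 GiB  — comfortable on a mid-range device
//   2-bit ≈ 0.35 GiB — fits even constrained devices (at some quality cost)
```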
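
Token streaming typically surfaces as an async iterator the UI consumes to render partial output. A self-contained sketch (the stream source here is faked; a real runtime would yield tokens as they are decoded):

```typescript
// Illustrative token stream: yields one token at a time, as a real
// on-device runtime would during generation.
async function* fakeTokenStream(tokens: string[]): AsyncGenerator<string> {
  for (const t of tokens) {
    yield t; // a real backend yields here as each token is decoded
  }
}

// UI-side consumer: append tokens to the visible message as they arrive.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let out = "";
  for await (const tok of stream) {
    out += tok;
  }
  return out;
}
```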
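
The MCP tool-call bullet implies a dispatch step in the app: the model emits a tool name plus JSON arguments, and the host routes the call to a handler. A hypothetical sketch (the tool names and handler signatures are invented for illustration):

```typescript
// Hypothetical tool-call dispatch; "set_reminder" and its handler are
// invented examples, not part of Cactus or any MCP server.
type ToolCall = { name: string; args: Record<string, unknown> };

const tools: Record<string, (args: Record<string, unknown>) => string> = {
  set_reminder: (a) => `reminder set: ${a.text}`, // illustrative handler
};

function dispatch(call: ToolCall): string {
  const handler = tools[call.name];
  if (!handler) throw new Error(`unknown tool: ${call.name}`);
  return handler(call.args); // result is fed back to the model as context
}
```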