- Lightweight CLI, API, and ChatGPT-like UI: an alternative to Open WebUI for accessing multiple LLMs offline.
- All data is kept private in browser storage, with support for both local models and API providers.
- Configuration via llms.json for providers, models, and default settings (a sketch of a possible layout follows this list).
- Automatic routing of requests to available providers that support the requested model, with failover.
- Supports multiple providers, including OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, and Mistral.
- OpenAI-compatible API for seamless integration with existing clients (see the client example after this list).
- Built-in analytics UI for monitoring costs, requests, and token usage.
- Easy provider management, with the ability to enable, disable, and configure each provider.
- CLI for quick interactions, plus a server mode exposing the HTTP API.
- Multi-modal support for text, images (with auto-resize and conversion), audio, and documents (PDF); an image request example follows this list.
- Custom chat templates for different modalities and auto-discovery of Ollama models.
- Unified model naming across providers and support for 160+ LLMs.
- GitHub OAuth for securing the web UI and API endpoints, with the option to restrict access to specific users.
- Docker and Docker Compose support for easy deployment.
- Detailed CLI options for model selection, system prompts, and raw JSON responses.
- Verbose logging and custom log prefixes for debugging.
- Health checks and multi-architecture support (linux/amd64, linux/arm64).
- Extensive documentation for setup, configuration, and troubleshooting.
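
As a rough illustration of the llms.json mentioned above, here is a hypothetical sketch of how providers, models, and defaults could be laid out. Every key name and value below is an assumption for illustration only; consult the llms.json shipped with the project for the real schema.

```json
{
  "_comment": "Hypothetical schema, for illustration only; not the project's actual format",
  "defaults": {
    "model": "llama3.3",
    "temperature": 0.7
  },
  "providers": {
    "ollama": {
      "enabled": true,
      "base_url": "http://localhost:11434",
      "models": ["llama3.3", "qwen2.5"]
    },
    "openrouter": {
      "enabled": false,
      "api_key": "$OPENROUTER_API_KEY",
      "models": ["anthropic/claude-3.5-sonnet"]
    }
  }
}
```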
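Because the server exposes an OpenAI-compatible API, any existing OpenAI client should work against it unchanged. A minimal sketch using the official `openai` Python package; the base URL, port, API key, and model name are assumptions for illustration, not values documented by this project.

```python
# Sketch: calling the server's OpenAI-compatible endpoint with the official
# openai client. Base URL, port, api_key, and model are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed address of the local server
    api_key="unused-or-your-token",       # may be ignored, or an OAuth token if auth is enabled
)

resp = client.chat.completions.create(
    model="llama3.3",  # unified model name; the server routes it to an available provider
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what an OpenAI-compatible API is."},
    ],
)
print(resp.choices[0].message.content)
```

The point of the compatibility layer is that only `base_url` changes: existing tooling built for the OpenAI API keeps working while the server handles provider routing and failover behind the scenes.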
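For multi-modal requests, OpenAI-compatible servers conventionally accept the standard content-part message format, so an image can be sent inline as a base64 data URL. A sketch under the same assumptions as above (local server on port 8000, vision-capable model name chosen for illustration):

```python
# Sketch: sending an image using the standard OpenAI content-part format.
# Server address and model name are illustrative assumptions.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="gemini-2.5-flash",  # any vision-capable model the server can route
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```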