Phlox: A full-featured AI platform you own
16 hours ago
- #RAG
- #AI Platform
- #Self-Hosted
- Phlox is a self-hostable AI platform that supports streaming chat, agentic tool use, document RAG, code execution, and per-user cost accounting.
- It acts as an OpenAI-compatible gateway and works with several model providers, including AWS Bedrock, local models served through Ollama, and other OpenAI-compatible endpoints.
- Features include conversation history management, markdown with code highlighting, Mermaid diagrams, LaTeX math, and the ability to edit or regenerate messages.
- The platform offers a tool-using loop with filesystem, shell, Python/Node execution, document search, planning, sub-agents, memory, and checkpoints in a sandboxed workspace.
- Agent runs can pause for user approval before a sensitive tool executes, and run state is persisted so it survives disconnects.
- Document RAG accepts uploads of PDF, DOCX, TXT, Markdown, and code files, and uses hybrid dense and sparse search with Qdrant, reranking, and citations, all of which can run offline.
- Per-user API keys let clients call Phlox through the OpenAI SDK. Token usage and cost are tracked per message, and admins get chargeback views and CSV export.
- Monthly cost caps can be set per user or department, with warnings and blocks on priced models when limits are reached, while free local models remain accessible.
- Authentication includes local accounts (bcrypt + JWT) or Microsoft Entra ID SSO, with strict per-user isolation and user/admin roles.
- Code execution can run in ephemeral Podman/Docker containers or local subprocesses, with tools having auto/ask/deny policies.
- The backend is a FastAPI application that handles LLM orchestration, the agent harness, MCP, RAG, and more, while the frontend is built with React + Vite.
- Setup requires Python with uv, Node, and a model provider; in development, the backend and frontend are started with separate commands.
- Phlox is open source under Apache 2.0 and can be cloned and run with a model provider. It also includes themes, logging, and comprehensive documentation.
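
The monthly cost cap behavior described above (warn, then block priced models, while free local models stay available) could be sketched roughly as follows. This is an illustration only, not Phlox's actual code: the `Budget` and `Decision` types, the `check_request` function, and the 80% warning threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


@dataclass
class Budget:
    monthly_cap_usd: float    # cap set per user or per department
    spent_usd: float          # spend recorded so far this month
    warn_ratio: float = 0.8   # assumed warning threshold, not taken from Phlox


def check_request(budget: Budget, model_price_per_1k_tokens: float) -> Decision:
    """Decide whether a request to a model may proceed under the cap."""
    # Free models (for example, local ones) are never blocked by the cap.
    if model_price_per_1k_tokens == 0:
        return Decision.ALLOW
    # Priced models are blocked once the cap has been reached.
    if budget.spent_usd >= budget.monthly_cap_usd:
        return Decision.BLOCK
    # Close to the cap: let the request through but flag a warning.
    if budget.spent_usd >= budget.warn_ratio * budget.monthly_cap_usd:
        return Decision.WARN
    return Decision.ALLOW
```

Keeping the free-model check first means a user who has exhausted their budget can still work with local models, which matches the behavior the summary describes.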
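
The auto/ask/deny tool policies mentioned above can be pictured as a small gate in front of every tool call. The sketch below is hypothetical: the tool names, the `TOOL_POLICIES` table, the `run_tool` function, and the choice to treat unknown tools as "ask" are assumptions for illustration and do not describe Phlox's real interfaces.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional, Tuple


class Policy(Enum):
    AUTO = "auto"   # run without asking
    ASK = "ask"     # pause until the user approves
    DENY = "deny"   # never run


# Hypothetical policy table; the tool names are made up for this example.
TOOL_POLICIES: Dict[str, Policy] = {
    "document_search": Policy.AUTO,
    "shell": Policy.ASK,
    "delete_file": Policy.DENY,
}


def run_tool(
    name: str,
    action: Callable[[], Any],
    approve: Callable[[str], bool],
    policies: Dict[str, Policy] = TOOL_POLICIES,
) -> Tuple[str, Optional[Any]]:
    """Apply the tool's policy, then run the action if it is permitted."""
    # Assumption: tools without an explicit policy fall back to "ask".
    policy = policies.get(name, Policy.ASK)
    if policy is Policy.DENY:
        return ("denied", None)
    if policy is Policy.ASK and not approve(name):
        return ("rejected", None)
    return ("ok", action())
```

Here `approve` stands in for whatever mechanism surfaces the approval prompt to the user; in a real system that step would suspend the run and resume it once a decision arrives.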