Show HN: AgentReady – Drop-in proxy that cuts LLM token costs 40-60%
- #proxy
- #LLM
- #optimization
- AgentReady Proxy cuts LLM token costs by 40-60% by compressing prompts before they reach the model.
- Features deterministic rule-based compression (~5ms overhead, no secondary LLM call).
- Removes filler words, verbose phrases, redundant connectors, duplicate lines, and excess whitespace.
- Preserves semantic meaning, code blocks, URLs, numbers, dates, and sentence structure.
- Supports multiple languages: English, Italian, French, German, Spanish.
- Easy integration with OpenAI SDK via base_url swap or monkey-patching.
- Offers three compression levels: light (10-20%), standard (20-40%), aggressive (35-55%).
- Free during beta; pay-per-token after beta, priced so the token savings still come out ahead.
- Works with GPT-4, Claude, Gemini; minimal impact on output quality (<2% delta on BLEU/ROUGE scores).
- Secure: upstream API keys are forwarded directly, never stored or logged.
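The rule-based compression described above can be sketched in a few lines. This is an illustrative toy, not AgentReady's actual rule set: the filler-word list and the fenced-code detection are assumptions chosen to mirror the bullets (drop fillers, duplicate lines, excess whitespace; leave code blocks untouched).

```python
import re

# Hypothetical filler list for illustration; the real proxy's rules are not published.
FILLERS = re.compile(r"\b(?:basically|actually|really|very|just)\b", re.IGNORECASE)

def compress(prompt: str) -> str:
    """Toy rule-based prompt compression: remove filler words, exact duplicate
    lines, and runs of whitespace, while passing fenced code blocks through
    unchanged. A sketch of the approach, not the product's implementation."""
    out, seen, in_code = [], set(), False
    for line in prompt.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code       # toggle at each fence
            out.append(line)
            continue
        if in_code:
            out.append(line)            # never rewrite code blocks
            continue
        line = FILLERS.sub("", line)                     # drop filler words
        line = re.sub(r"[ \t]{2,}", " ", line).strip()   # collapse whitespace
        if line and line in seen:
            continue                                     # drop duplicate lines
        seen.add(line)
        out.append(line)
    return "\n".join(out)
```

Because the rules are deterministic string transforms, the ~5ms overhead claim is plausible: there is no model inference in the loop.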
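The base_url swap mentioned in the integration bullet might look like the sketch below. The proxy endpoint is a placeholder assumption (the post does not give the real URL), and the SDK call itself is shown in comments since it needs a live key.

```python
# Hypothetical proxy endpoint -- substitute the real AgentReady base URL.
PROXY_URL = "https://proxy.agentready.example/v1"

def proxied_client_kwargs(api_key: str) -> dict:
    """Build kwargs for OpenAI(...) so requests route through the proxy.
    Per the post, the upstream key is forwarded and never stored or logged."""
    return {"base_url": PROXY_URL, "api_key": api_key}

# Usage (requires the openai package and a real key):
# from openai import OpenAI
# client = OpenAI(**proxied_client_kwargs("sk-..."))
# resp = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": "..."}],
# )
```

Since the proxy speaks the same API shape, no other code changes should be needed; the monkey-patching path presumably reassigns the client's base URL after construction instead.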