Show HN: AgentReady – Drop-in proxy that cuts LLM token costs 40-60%
- #proxy
- #LLM
- #optimization
- AgentReady Proxy cuts LLM token costs by 40-60% by compressing prompts before they reach the model.
- Features deterministic rule-based compression (~5ms overhead, no secondary LLM call).
- Removes filler words, verbose phrases, redundant connectors, duplicate lines, and excess whitespace.
- Preserves semantic meaning, code blocks, URLs, numbers, dates, and sentence structure.
- Supports multiple languages: English, Italian, French, German, Spanish.
- Easy integration with OpenAI SDK via base_url swap or monkey-patching.
- Offers three compression levels: light (10-20%), standard (20-40%), aggressive (35-55%).
- Free during beta; pay-per-token after beta, priced so the token savings still come out ahead.
- Works with GPT-4, Claude, Gemini; minimal impact on output quality (<2% delta on BLEU/ROUGE scores).
- Secure: upstream API keys are forwarded directly, never stored or logged.
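The rule-based compression described above can be sketched in a few lines. This is an illustrative toy, not AgentReady's actual rule set: the filler-word list and the fenced-code detection are assumptions chosen to mirror the bullets (drop fillers, duplicate lines, excess whitespace; leave code blocks untouched).

```python
import re

# Hypothetical filler list for illustration; the real proxy's rules are not published.
FILLERS = re.compile(r"\b(?:basically|actually|really|very|just)\b", re.IGNORECASE)

def compress(prompt: str) -> str:
    """Toy rule-based prompt compression: remove filler words, exact duplicate
    lines, and runs of whitespace, while passing fenced code blocks through
    unchanged. A sketch of the approach, not the product's implementation."""
    out, seen, in_code = [], set(), False
    for line in prompt.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code       # toggle at each fence
            out.append(line)
            continue
        if in_code:
            out.append(line)            # never rewrite code blocks
            continue
        line = FILLERS.sub("", line)                     # drop filler words
        line = re.sub(r"[ \t]{2,}", " ", line).strip()   # collapse whitespace
        if line and line in seen:
            continue                                     # drop duplicate lines
        seen.add(line)
        out.append(line)
    return "\n".join(out)
```

Because the rules are deterministic string transforms, the ~5ms overhead claim is plausible: there is no model inference in the loop.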
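The base_url swap mentioned in the integration bullet might look like the sketch below. The proxy endpoint is a placeholder assumption (the post does not give the real URL), and the SDK call itself is shown in comments since it needs a live key.

```python
# Hypothetical proxy endpoint -- substitute the real AgentReady base URL.
PROXY_URL = "https://proxy.agentready.example/v1"

def proxied_client_kwargs(api_key: str) -> dict:
    """Build kwargs for OpenAI(...) so requests route through the proxy.
    Per the post, the upstream key is forwarded and never stored or logged."""
    return {"base_url": PROXY_URL, "api_key": api_key}

# Usage (requires the openai package and a real key):
# from openai import OpenAI
# client = OpenAI(**proxied_client_kwargs("sk-..."))
# resp = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": "..."}],
# )
```

Since the proxy speaks the same API shape, no other code changes should be needed; the monkey-patching path presumably reassigns the client's base URL after construction instead.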