We decreased our LLM costs with Opus
3 hours ago
- #AI Agents
- #Cost Optimization
- #CI Logs
- Use a cheap model (Haiku) as a triager that filters out the 80% of CI failures that are duplicates, so the expensive model (Opus) does not run unnecessarily.
- Let agents pull the context they need through a SQL interface instead of pushing all logs into the prompt, which avoids biasing the investigation and enables targeted queries.
- Implement a hierarchical setup in which Opus plans investigations and spawns Haiku sub-agents for specific tasks, optimizing both cost and focus.
- Maintain context hygiene by discarding sub-agent contexts after use and using structured summaries to keep the orchestrator's context clean.
- Leverage semantic search and exact matching to detect duplicate failures, improving accuracy in identifying known issues.
- Generalize the architecture to other high-volume event data, such as security logs or IoT telemetry, so that expensive models focus on novel events.
- Continuously tune the system: add reassessment layers that verify insights, and adjust sub-agent boundaries for cost efficiency.
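
The triage and duplicate-detection points above can be illustrated with a minimal sketch. It is not the post's implementation: the names `normalize`, `fingerprint`, `Triager`, and `triage` are hypothetical, the 0.85 threshold is an arbitrary assumption, and `difflib.SequenceMatcher` is only a standard-library stand-in for whatever semantic search the real system uses (most likely embedding-based). No model is called; the sketch only shows the routing decision, where a failure reaches the expensive model only if neither the exact match nor the similarity check recognizes it.

```python
import hashlib
import re
from difflib import SequenceMatcher


def normalize(log: str) -> str:
    """Mask volatile tokens (hex ids, numbers) so reruns of one failure look alike."""
    log = re.sub(r"0x[0-9a-f]+|\b[0-9a-f]{8,}\b", "<id>", log.lower())
    log = re.sub(r"\d+", "<n>", log)
    return " ".join(log.split())


def fingerprint(log: str) -> str:
    """Stable hash of the normalized log, used for exact duplicate matching."""
    return hashlib.sha256(normalize(log).encode()).hexdigest()


class Triager:
    """Hypothetical cheap first stage: label each failure before any escalation."""

    def __init__(self, threshold: float = 0.85):
        # Assumed similarity cutoff; a real system would tune this value.
        self.threshold = threshold
        self.known: dict[str, str] = {}  # fingerprint -> normalized text

    def classify(self, log: str) -> str:
        norm = normalize(log)
        fp = fingerprint(log)
        # Stage 1: exact match on the normalized fingerprint.
        if fp in self.known:
            return "duplicate_exact"
        # Stage 2: fuzzy match (stand-in for semantic search over known failures).
        for seen in self.known.values():
            if SequenceMatcher(None, norm, seen).ratio() >= self.threshold:
                return "duplicate_similar"
        # Neither stage matched: remember the failure and flag it as novel.
        self.known[fp] = norm
        return "novel"


def triage(logs: list[str], triager: Triager) -> list[str]:
    """Return only the logs that should be escalated to the expensive model."""
    return [log for log in logs if triager.classify(log) == "novel"]


if __name__ == "__main__":
    failures = [
        "build 1841 ERROR test_login failed: connection refused to db-primary",
        "build 1907 ERROR test_login failed: connection refused to db-primary",
        "build 1910 ERROR test_login failed: connection refused to db-replica",
        "OutOfMemoryError: Java heap space in gradle daemon",
    ]
    for log in triage(failures, Triager()):
        print("escalate:", log)
```

Character-level similarity is a crude proxy: it can merge failures that differ in a single meaningful word, which is one reason a semantic index or a cheap model is a better fit for the second stage in practice.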