We gave terabytes of CI logs to an LLM
4 hours ago
- #LogAnalysis
- #CI/CD
- #ClickHouse
- Agent traced a flaky test by scanning hundreds of millions of log lines in seconds.
- Agent uses a SQL interface to query job metadata (63% of queries) and raw log lines (37%).
- Analyzed 8,534 agent sessions comprising 52,312 queries, roughly 6.1 queries per session on average.
- Agent starts broad with job metadata queries, then drills into raw logs for detailed investigations.
- ClickHouse stores 5.31 TiB of data (uncompressed) in 154 GiB (compressed) with a 35:1 compression ratio.
- Denormalized metadata columns in ClickHouse compress efficiently, making queries fast.
- Query latency grows with rows scanned: about 10ms for scans under 1K rows, up to 31s for scans over 1B rows.
- Ingestion pipeline throttles GitHub API requests to stay within rate limits while maintaining fresh data.
- Durable execution engine (Inngest) handles rate limits by suspending and resuming workflows without crashes.
- Mendral (YC W26) automates CI log analysis to identify failures and changes.
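The broad-then-narrow pattern described above (metadata first, raw logs second) can be sketched with an in-memory SQLite database standing in for ClickHouse; the table and column names here are invented for illustration, not Mendral's actual schema:

```python
import sqlite3

# Hypothetical schema standing in for the job-metadata and raw-log-line
# tables described above (SQLite used only so the sketch is runnable).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE jobs (job_id INTEGER, workflow TEXT, status TEXT);
CREATE TABLE log_lines (job_id INTEGER, line_no INTEGER, line TEXT);
INSERT INTO jobs VALUES (1, 'ci', 'failed'), (2, 'ci', 'passed');
INSERT INTO log_lines VALUES
  (1, 1, 'collecting tests'),
  (1, 2, 'FAILED test_login - TimeoutError'),
  (2, 1, 'all tests passed');
""")

# Step 1: broad metadata query to shortlist failing jobs.
failed = db.execute(
    "SELECT job_id FROM jobs WHERE status = 'failed'").fetchall()

# Step 2: drill into the raw log lines of each suspect job.
for (job_id,) in failed:
    hits = db.execute(
        "SELECT line FROM log_lines WHERE job_id = ? AND line LIKE '%FAILED%'",
        (job_id,)).fetchall()
```

The cheap metadata pass keeps the expensive raw-log scan confined to a handful of jobs, which is why most agent queries can stay on the metadata side.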
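The quoted compression figures are internally consistent, as a quick arithmetic check shows:

```python
# Sanity-check the storage numbers quoted above.
uncompressed_gib = 5.31 * 1024   # 5.31 TiB expressed in GiB
compressed_gib = 154
ratio = uncompressed_gib / compressed_gib
print(f"{ratio:.1f}:1")          # close to the quoted 35:1
```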
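The GitHub API throttling mentioned above is commonly implemented as a token bucket; this is a minimal sketch of that pattern (the class and defaults are illustrative, not Mendral's code — 5,000 requests/hour is GitHub's documented limit for authenticated REST calls):

```python
import time

class TokenBucket:
    """Token-bucket throttle: tokens refill continuously at a fixed
    rate; each API request spends one token, so sustained throughput
    stays within the upstream rate limit."""

    def __init__(self, capacity=5000, refill_per_sec=5000 / 3600):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def try_acquire(self):
        # Refill based on elapsed time, capped at bucket capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

When `try_acquire()` returns `False`, the pipeline waits (or suspends, per the next bullet) rather than burning through the rate limit.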
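The durable-execution idea behind the Inngest bullet — suspend on a rate limit, resume later without redoing finished work — boils down to memoizing each step's result in persistent storage. A generic sketch of that pattern (Inngest's real API differs; all names here are invented):

```python
class RateLimited(Exception):
    """Raised when the API asks us to back off; the engine suspends here."""

class DurableRun:
    """Minimal durable-execution sketch: each step's result is memoized
    in a store, so re-running the workflow after a suspension skips
    completed steps instead of repeating them."""

    def __init__(self, store):
        self.store = store  # persisted step results, e.g. a DB row

    def step(self, name, fn):
        if name in self.store:     # completed before the suspension
            return self.store[name]
        result = fn()              # may raise RateLimited mid-workflow
        self.store[name] = result  # checkpoint before moving on
        return result

# Usage: a workflow that hits a rate limit on its first attempt.
calls = {"fetch": 0}

def workflow(run):
    jobs = run.step("list_jobs", lambda: ["job1", "job2"])
    def fetch():
        calls["fetch"] += 1
        if calls["fetch"] == 1:
            raise RateLimited
        return len(jobs)
    return run.step("fetch_logs", fetch)

store = {}
try:
    workflow(DurableRun(store))        # first attempt: suspends
except RateLimited:
    pass
result = workflow(DurableRun(store))   # resume: "list_jobs" is skipped
```

On the resumed run, `list_jobs` is served from the store rather than re-executed, which is what lets the engine ride out rate limits without crashes or duplicated work.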