Our agent found a bug with WireGuard in Google Kubernetes Engine
6 hours ago
- #debugging
- #distributed-systems
- #infrastructure
- Users experienced random errors like project opening failures, cloning timeouts, and connection resets.
- AI agent analysis of logs revealed anetd pods crashing due to a concurrent map-access panic in Google's WireGuard integration code.
- Disabling transparent node-to-node encryption temporarily stopped crashes, but soon after, random Valkey connection failures emerged.
- Packet capture revealed an MTU mismatch (1420 vs 1500 bytes) due to inconsistent node configurations after disabling WireGuard.
- Restarting all nodes resolved the MTU issue, and Google later patched the WireGuard concurrency bug.