LiteLLM (YC W23): Founding Reliability Engineer – $200K-$270K and 0.5-1.0% equity
3 hours ago
- #AI Infrastructure
- #Performance Optimization
- #Reliability Engineering
- LiteLLM is an open-source AI gateway with 36K+ GitHub stars, routing millions of LLM API calls daily for organizations such as NASA, Adobe, Netflix, Stripe, and Nvidia.
- The company is seeking its first dedicated reliability hire to ensure production stability, with responsibilities split between operational reliability (60%) and deep performance engineering (40%).
- Key challenges include memory management in Python async services, database scalability, and maintaining compatibility across 100+ AI provider APIs.
- The role involves owning production reliability, performance engineering, and observability, including on-call duties, incident response, and hot-path optimization.
- Ideal candidates have 2+ years of Python production experience, expertise in async Python, PostgreSQL, and Kubernetes, and prior on-call experience.
- Strong signals include experience with proxies or API gateways, work at infrastructure companies like Meta or Cloudflare, and contributions to open-source projects.
- LiteLLM offers significant scale and impact, open-source visibility, ownership as the first reliability hire, and meaningful equity in a fast-growing startup.