Show HN: LLMs consume 5.4x less mobile energy than ad-supported web search
- #Sustainable Computing
- #AI Energy Efficiency
- #Web Search Energy
- A 2023 comparison misleadingly claimed that an LLM response used ten times more energy than a Google search, but it ignored optimizations and full-system costs.
- In 2025, LLM inference energy dropped sharply: a Google Gemini prompt used 0.24 Wh (a 33x reduction in one year), and a ChatGPT query about 0.34 Wh.
- Modern mobile web pages are large (median 2.56 MB), and transmitting them over 4G networks consumes more energy (e.g., 0.44 Wh per page) than LLM inference alone.
- Search sessions often require visiting multiple ad-heavy web pages, adding network, device, and ad-tech energy costs, while LLMs deliver synthesized answers in one interaction.
- Ad-supported webpages incur significant energy overheads: client-side ad rendering increases device power by 15–44%, and server-side programmatic auctions add ~0.10–0.25 Wh per page.
- For complex synthesis tasks on mobile, LLM sessions are 4–9 times more energy-efficient than search sessions, mainly due to smaller payloads, no ad tax, and faster completion times.
- The efficiency advantage shrinks on Wi-Fi (where network energy drops) and can reverse for reasoning models (which can use 1–5 Wh per query) or agentic workflows.
- Under the Jevons paradox, efficiency gains may increase total demand, but per-task energy remains lower for LLMs in mobile synthesis scenarios.
- Policy implications: prioritize mobile LLM use for synthesis tasks, audit ad-tech footprints, avoid overusing reasoning models, and consider full-system energy in regulations.
- Future research needs: empirical data on hallucination rates, independent energy benchmarks, Scope 3 lifecycle assessments, and economic impacts on publishers.
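The headline ratio can be reproduced as back-of-envelope arithmetic from the figures above. A minimal sketch, with the caveat that the page count per search session (3), the LLM response payload (0.1 MB), and the ad-auction midpoint (0.175 Wh) are illustrative assumptions, not figures from the post:

```python
# Back-of-envelope session-energy comparison using the post's figures.
# Assumptions (not from the post): 3 pages per search session, 0.1 MB
# LLM response payload, midpoint (0.175 Wh) of the 0.10-0.25 Wh ad-auction range.

# Implied 4G transmission rate: 0.44 Wh for a median 2.56 MB page (~0.17 Wh/MB).
WH_PER_MB_4G = 0.44 / 2.56

# LLM session: one Gemini prompt (0.24 Wh inference) plus a small text payload.
llm_session_wh = 0.24 + 0.1 * WH_PER_MB_4G

# Search session: assumed 3 ad-supported pages, each paying network + ad-auction costs.
pages = 3
search_session_wh = pages * (0.44 + 0.175)

ratio = search_session_wh / llm_session_wh
print(f"search/LLM energy ratio ≈ {ratio:.1f}x")  # lands inside the post's 4-9x range
```

With these assumptions the ratio comes out around 7x; varying pages-per-session from 2 to 4 roughly spans the 4–9x range the post reports.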