Hasty Briefs


Show HN: LLMs consume 5.4x less mobile energy than ad-supported web search

7 hours ago
  • #Sustainable Computing
  • #AI Energy Efficiency
  • #Web Search Energy
  • A widely cited 2023 comparison claimed that an LLM response used ten times the energy of a Google search, but it ignored subsequent inference optimizations and full-system costs.
  • In 2025, LLM inference energy dropped dramatically: Google's Gemini prompt used 0.24 Wh (a 33x reduction in a year), and ChatGPT used about 0.34 Wh.
  • Modern mobile web pages are large (median 2.56 MB), and transmitting them over 4G networks consumes more energy (e.g., 0.44 Wh per page) than LLM inference alone.
  • Search sessions often require visiting multiple ad-heavy web pages, adding network, device, and ad-tech energy costs, while LLMs deliver synthesized answers in one interaction.
  • Ad-supported webpages incur significant energy overheads: client-side ad rendering increases device power by 15–44%, and server-side programmatic auctions add ~0.10–0.25 Wh per page.
  • For complex synthesis tasks on mobile, LLM sessions are 4–9 times more energy-efficient than search sessions, mainly due to smaller payloads, no ad tax, and faster completion times.
  • The efficiency advantage diminishes on Wi-Fi (network energy drops) and reverses for reasoning models (which can use 1–5 Wh per query) or agentic workflows.
  • The Jevons paradox means efficiency gains may increase overall demand, but per-task efficiency is still higher for LLMs in mobile synthesis scenarios.
  • Policy implications: prioritize mobile LLM use for synthesis tasks, audit ad-tech footprints, avoid overusing reasoning models, and consider full-system energy in regulations.
  • Future research needs: empirical data on hallucination rates, independent energy benchmarks, Scope 3 lifecycle assessments, and economic impacts on publishers.
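
The headline 4–9x figure can be sanity-checked from the numbers quoted above. A minimal back-of-envelope sketch, assuming a search session visits three ad-supported pages (the page count is an assumption, not from the post) and that the LLM response's own network payload is negligible next to inference:

```python
# Back-of-envelope model using figures quoted in the summary.
# Assumptions (not from the post): 3 pages per search session;
# LLM response payload energy is negligible vs. inference energy.

LLM_INFERENCE_WH = 0.34        # ChatGPT prompt, 2025 figure
PAGE_NETWORK_WH = 0.44         # median 2.56 MB mobile page over 4G
AD_AUCTION_WH = (0.10, 0.25)   # server-side programmatic auction, per page
PAGES_PER_SESSION = 3          # assumed

def search_session_wh(pages: int, ad_wh: float) -> float:
    """Total network + ad-tech energy for an ad-supported search session."""
    return pages * (PAGE_NETWORK_WH + ad_wh)

def ratio(ad_wh: float) -> float:
    """Search-session energy relative to a single LLM prompt."""
    return search_session_wh(PAGES_PER_SESSION, ad_wh) / LLM_INFERENCE_WH

low, high = ratio(AD_AUCTION_WH[0]), ratio(AD_AUCTION_WH[1])
print(f"search/LLM energy ratio: {low:.1f}x to {high:.1f}x")
# Lands at roughly 4.8x-6.1x, inside the 4-9x range the post reports;
# client-side ad rendering (the 15-44% device-power overhead) and longer
# completion times, omitted here, would push the ratio higher.
```

This deliberately omits the client-side rendering overhead and data-center search cost on the web side, and datacenter cooling on the LLM side, so it is a consistency check on the quoted ranges rather than an independent estimate.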