Show HN: LLMs consume 5.4x less mobile energy than ad-supported web search
- #Sustainable Computing
- #AI Energy Efficiency
- #Web Search Energy
- A 2023 comparison misleadingly claimed that an LLM response used ten times more energy than a Google search, but it ignored optimizations and full-system costs.
- In 2025, LLM inference energy dropped sharply: a Google Gemini prompt used 0.24 Wh (a 33x reduction in one year), and a ChatGPT query about 0.34 Wh.
- Modern mobile web pages are large (median 2.56 MB), and transmitting them over 4G networks consumes more energy (e.g., 0.44 Wh per page) than LLM inference alone.
- Search sessions often require visiting multiple ad-heavy web pages, adding network, device, and ad-tech energy costs, while LLMs deliver synthesized answers in one interaction.
- Ad-supported webpages incur significant energy overheads: client-side ad rendering increases device power by 15–44%, and server-side programmatic auctions add ~0.10–0.25 Wh per page.
- For complex synthesis tasks on mobile, LLM sessions are 4–9 times more energy-efficient than search sessions, mainly due to smaller payloads, no ad tax, and faster completion times.
- The efficiency advantage shrinks on Wi-Fi (where network energy drops) and can reverse for reasoning models (which can use 1–5 Wh per query) or agentic workflows.
- Under the Jevons paradox, efficiency gains may increase total demand, but per-task energy remains lower for LLMs in mobile synthesis scenarios.
- Policy implications: prioritize mobile LLM use for synthesis tasks, audit ad-tech footprints, avoid overusing reasoning models, and consider full-system energy in regulations.
- Future research needs: empirical data on hallucination rates, independent energy benchmarks, Scope 3 lifecycle assessments, and economic impacts on publishers.
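The headline ratio can be reproduced as back-of-envelope arithmetic from the figures above. A minimal sketch, with the caveat that the page count per search session (3), the LLM response payload (0.1 MB), and the ad-auction midpoint (0.175 Wh) are illustrative assumptions, not figures from the post:

```python
# Back-of-envelope session-energy comparison using the post's figures.
# Assumptions (not from the post): 3 pages per search session, 0.1 MB
# LLM response payload, midpoint (0.175 Wh) of the 0.10-0.25 Wh ad-auction range.

# Implied 4G transmission rate: 0.44 Wh for a median 2.56 MB page (~0.17 Wh/MB).
WH_PER_MB_4G = 0.44 / 2.56

# LLM session: one Gemini prompt (0.24 Wh inference) plus a small text payload.
llm_session_wh = 0.24 + 0.1 * WH_PER_MB_4G

# Search session: assumed 3 ad-supported pages, each paying network + ad-auction costs.
pages = 3
search_session_wh = pages * (0.44 + 0.175)

ratio = search_session_wh / llm_session_wh
print(f"search/LLM energy ratio ≈ {ratio:.1f}x")  # lands inside the post's 4-9x range
```

With these assumptions the ratio comes out around 7x; varying pages-per-session from 2 to 4 roughly spans the 4–9x range the post reports.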