I prompted ChatGPT, Claude, Perplexity, and Gemini and watched my Nginx logs
7 hours ago
- #Web Analytics
- #Log Analysis
- #AI Crawlers
- AI-driven traffic reaches a site in two distinct ways: provider-side fetches (made by dedicated bots) and real clickthrough visits from human users.
- ChatGPT and Claude perform provider-side origin retrieval using specific user-agents (ChatGPT-User/1.0 and Claude-User/1.0), often in bursts without referrers.
- Perplexity can perform direct origin retrieval using Perplexity-User/1.0, but it may also rely on its own index; logs alone cannot confirm that it fetches the origin for every prompt.
- Google and Gemini have no distinct retrieval user-agent; AI Overviews draw on the Googlebot index, so provider-side fetches are not observable in logs.
- Provider fetch tracking should focus on vendor-documented retrieval user-agents, while real visits show up as normal browsers with AI referrers; mixing in training or indexing bots creates noise.
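
The split described above can be sketched as a small log classifier. This is a minimal sketch, not a definitive implementation: it assumes the default Nginx `combined` log format, matches the three retrieval user-agent tokens named in the points above, and treats a short list of referrer hosts as AI clickthroughs. The referrer host list and the names `classify`, `RETRIEVAL_UA_TOKENS`, and `AI_REFERRER_HOSTS` are assumptions made for illustration; check them against what your own logs actually contain.

```python
import re
from urllib.parse import urlparse

# Retrieval user-agent tokens named in the summary above.
RETRIEVAL_UA_TOKENS = ("ChatGPT-User", "Claude-User", "Perplexity-User")

# Assumption: referrer hosts that indicate an AI clickthrough.
# Adjust this list to match the referrers you actually observe.
AI_REFERRER_HOSTS = ("chatgpt.com", "claude.ai", "perplexity.ai", "gemini.google.com")

# Nginx "combined" format:
# ip - user [time] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)


def classify(line: str) -> str:
    """Label one access-log line as provider_fetch, ai_clickthrough, other, or unparsed."""
    m = COMBINED_RE.match(line)
    if m is None:
        return "unparsed"

    # Provider-side fetch: a documented retrieval user-agent token is present.
    ua = m.group("ua")
    if any(token in ua for token in RETRIEVAL_UA_TOKENS):
        return "provider_fetch"

    # Real visit: an ordinary browser whose referrer is an AI product.
    host = urlparse(m.group("referer")).hostname or ""
    if any(host == h or host.endswith("." + h) for h in AI_REFERRER_HOSTS):
        return "ai_clickthrough"

    # Everything else, including training and indexing bots, stays out of both buckets.
    return "other"
```

Keeping training and indexing crawlers in the `other` bucket is deliberate: it prevents them from inflating either the provider-fetch or the clickthrough counts.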