Hasty Briefs (beta)

Facebook's Fascination with My Robots.txt

2 days ago
  • #Facebook
  • #Web Crawling
  • #Robots.txt
  • Facebook has been repeatedly accessing the author's /robots.txt file on their self-hosted Forgejo instance for the past 4 days.
  • The requests are coming from Meta's IP ranges and use the user-agent 'facebookexternalhit/1.1'.
  • Only the robots.txt file is being accessed, with no other files or paths requested.
  • Facebook's documentation states that this crawler gathers metadata for links shared on its platforms, but the author doubts the site is shared often enough to explain the traffic.
  • The author wonders whether this is a bug or misconfiguration on Meta's end, and questions how much bandwidth and energy such repetitive requests consume globally.
  • Compared with earlier AI bot traffic, these requests are mostly benign, but they remain an odd and interesting observation.
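
The observations above (one user-agent, one path, one set of source networks) can be checked against a web server access log. The following is a minimal sketch, not the author's method: it assumes the server writes the common "combined" log format, which the summary does not state, and the names `LOG_LINE`, `EXAMPLE_NETWORKS`, `in_networks`, `summarize`, and `SAMPLE` are hypothetical. The networks are reserved documentation ranges used as placeholders; real use would substitute Meta's published ranges.

```python
import ipaddress
import re
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Placeholder networks (reserved documentation ranges), NOT Meta's real ranges.
EXAMPLE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def in_networks(ip_text, networks):
    """Return True if ip_text parses as an IP address inside any given network."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    return any(ip in net for net in networks)


def summarize(lines, agent_substring="facebookexternalhit", networks=EXAMPLE_NETWORKS):
    """Count requested paths for one crawler and tally hits from outside the networks."""
    paths = Counter()
    outside = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or agent_substring not in match["agent"]:
            continue  # unparseable line or a different client
        paths[match["path"]] += 1
        if not in_networks(match["ip"], networks):
            outside += 1
    return paths, outside


# Invented sample lines for illustration only; not taken from the author's logs.
SAMPLE = [
    '192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "facebookexternalhit/1.1"',
    '192.0.2.11 - - [01/Jan/2025:00:00:02 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "facebookexternalhit/1.1"',
    '198.51.100.7 - - [01/Jan/2025:00:00:03 +0000] "GET /explore HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    path_counts, outside_count = summarize(SAMPLE)
    for path, count in path_counts.most_common():
        print(f"{count:6d}  {path}")
    print(f"requests from outside the listed networks: {outside_count}")
```

If the only key in the returned counter is `/robots.txt` and the outside count is zero, the log matches the pattern described in the summary. Matching on the user-agent alone is weak evidence, since the header is trivially spoofed, which is why the sketch pairs it with a source-network check.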