Gone but Not Forgotten: Recovering the Dead Web

4 hours ago
  • #web-archiving
  • #link-rot
  • #digital-preservation
  • 38% of webpages that existed in 2013 were no longer accessible a decade later, and 25% of all pages that existed at some point between 2013 and 2023 are now dead.
  • The Wayback Machine rescues around 15% of dead pages from the Pew dataset, reducing overall vanished URLs from 26% to 10% for that sample.
  • Other studies report varying link-rot rates: Ahrefs finds 66.5% of links dead over nine years, while a 2021 analysis of New York Times links finds that 25% of deep links have rotted.
  • The ODU study of 27.3 million URLs indicates 65% dead by 2023, but all sampled URLs are archived by the Wayback Machine.
  • Key terms include 'Rescued' (dead on the live web but archived) and 'Endangered' (alive but unarchived, and so at risk of vanishing).
  • Limitations in archiving include resource constraints, JavaScript-heavy pages, bot blocking, and paywalls, but initiatives like IndexNow aim to improve link discovery.
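
The 'Rescued' and 'Endangered' categories above amount to a two-by-two classification on whether a URL is alive and whether it is archived. A minimal Python sketch of that classification follows. The function names and the sample data are hypothetical, and the labels "preserved" (alive and archived) and "vanished" (dead and unarchived) are assumptions added to fill the two cases the summary does not name.

```python
from collections import Counter


def classify(alive: bool, archived: bool) -> str:
    """Label a URL by its live-web status and its archive status."""
    if alive:
        # "preserved" is an assumed label; "endangered" comes from the summary.
        return "preserved" if archived else "endangered"
    # "rescued" comes from the summary; "vanished" is an assumed label.
    return "rescued" if archived else "vanished"


def summarize(records):
    """Return the share of each label for (alive, archived) pairs."""
    counts = Counter(classify(alive, archived) for alive, archived in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical sample: one URL in each of the four states.
sample = [(True, True), (True, False), (False, True), (False, False)]
print(summarize(sample))
```

In practice the two booleans would come from a live HTTP check and an archive lookup; both are left out here so the sketch stays self-contained.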