Hasty Briefs (beta)

Show HN: Webclone.js – A simple tool to clone websites

9 days ago
  • #command-line-tool
  • #offline-browsing
  • #web-archiving
  • WebClone.js is a command-line tool for creating offline archives of websites.
  • It crawls sites, saves pages and assets, rewrites links, and downloads videos.
  • Developed to address limitations of traditional tools like wget.
  • Features include full website archiving, link rewriting, and video downloading.
  • Supports authentication via interactive login or cookie files.
  • Highly configurable with options for crawl depth, concurrency, and scope.
  • Uses puppeteer-extra to drive the browser, for stealth against bot detection and more robust crawling.
  • Supports lazy-loading content by auto-scrolling pages.
  • Requires Node.js 18+; yt-dlp and ffmpeg are optional and needed only for video downloads.
  • Installation involves cloning the repo and installing dependencies.
  • Usage examples include basic archiving, private site archiving, and video downloading.
  • Command-line options allow for detailed configuration.
  • Open to contributions; released under the MIT License.
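
The link-rewriting step mentioned above is what makes an archive browsable offline: every link that points at a saved page or asset has to become a relative path on disk. The following is a minimal Python sketch of that general idea, not WebClone.js's actual code (the tool itself runs on Node.js). The function names `local_path` and `rewrite_link`, the `host/path` directory layout, and the choice to ignore query strings are all assumptions made for illustration.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def local_path(url: str) -> str:
    """Map an absolute URL to a file path inside the archive (hypothetical layout)."""
    parts = urlsplit(url)
    path = parts.path
    # Directory-style URLs are stored as index.html (an assumption of this sketch).
    if path == "" or path.endswith("/"):
        path += "index.html"
    # Query strings are ignored here for simplicity.
    return parts.netloc + path


def rewrite_link(page_url: str, href: str, archived: set[str]) -> str:
    """Return a relative on-disk link if the target was archived, else the absolute URL."""
    target = urljoin(page_url, href)          # resolve relative hrefs against the page
    base, _, fragment = target.partition("#")  # keep any #fragment for later
    if base not in archived:
        return target                          # not saved: keep pointing at the live site
    page_dir = posixpath.dirname(local_path(page_url))
    relative = posixpath.relpath(local_path(base), page_dir)
    return relative + ("#" + fragment if fragment else "")


if __name__ == "__main__":
    saved = {
        "https://example.com/",
        "https://example.com/docs/intro.html",
        "https://example.com/assets/site.css",
    }
    page = "https://example.com/docs/intro.html"
    for href in ["../assets/site.css", "/#top", "https://other.org/x"]:
        print(href, "->", rewrite_link(page, href, saved))
```

Links to anything outside the archived set are left as absolute URLs, so they still work when the reader is online. A real crawler would also need to handle query strings, redirects, and links embedded in CSS or scripts.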