Show HN: WebClone.js – A simple tool to clone websites
9 days ago
- #command-line-tool
- #offline-browsing
- #web-archiving
- WebClone.js is a command-line tool for creating offline archives of websites.
- It crawls a site, saves pages and assets, rewrites links to point at the saved copies, and downloads videos.
- Developed to address limitations of traditional tools such as wget.
- Supports authentication via interactive login or cookie files.
- Highly configurable with options for crawl depth, concurrency, and scope.
- Uses puppeteer-extra for stealth and robustness.
- Handles lazy-loaded content by auto-scrolling each page.
- Requires Node.js 18+; yt-dlp and ffmpeg are optional and needed only for video downloads.
- Installation involves cloning the repo and installing dependencies.
- Usage examples include basic archiving, private site archiving, and video downloading.
- Command-line options allow for detailed configuration.
- Open to contributions; released under the MIT License.
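
To make the link-rewriting step concrete, here is a minimal sketch of how an archiver might map URLs to local files and rewrite an `href` to a relative path. It is not taken from WebClone.js: the helper names `urlToLocalPath` and `rewriteLink`, the `hostname/path` directory layout, and the decision to ignore query strings are all assumptions made for illustration. It uses only the Node.js standard library.

```javascript
const path = require("node:path");

// Hypothetical helper: map a page or asset URL to a path inside the archive.
// Assumed layout: <hostname>/<pathname>. Query strings are ignored here.
function urlToLocalPath(rawUrl) {
  const url = new URL(rawUrl);
  let pathname = url.pathname;
  if (pathname.endsWith("/")) {
    // Directory-style URLs are stored as index.html.
    pathname += "index.html";
  } else if (path.posix.extname(pathname) === "") {
    // Extensionless paths are assumed to be HTML pages.
    pathname += ".html";
  }
  return path.posix.join(url.hostname, pathname);
}

// Hypothetical helper: rewrite one href found on pageUrl so that it points
// at the local copy. archivedUrls is a Set of absolute URLs (no fragment)
// that were actually saved.
function rewriteLink(pageUrl, href, archivedUrls) {
  let target;
  try {
    target = new URL(href, pageUrl); // resolve relative hrefs against the page
  } catch {
    return href; // unparseable: leave untouched
  }
  if (target.protocol !== "http:" && target.protocol !== "https:") {
    return href; // mailto:, javascript:, data:, and so on
  }
  const fragment = target.hash;
  target.hash = "";
  if (!archivedUrls.has(target.href)) {
    return href; // not archived: keep the original link
  }
  const fromDir = path.posix.dirname(urlToLocalPath(pageUrl));
  const toFile = urlToLocalPath(target.href);
  return path.posix.relative(fromDir, toFile) + fragment;
}

const archived = new Set([
  "https://example.com/about",
  "https://example.com/docs/api/index.html",
]);
console.log(rewriteLink("https://example.com/", "/about", archived));
console.log(
  rewriteLink("https://example.com/docs/guide/", "../api/index.html#intro", archived)
);
```

Links to pages outside the archived set are returned unchanged, so they keep pointing at the live site instead of breaking. A real tool would also need to handle query strings, filename collisions, and links inside CSS and `srcset` attributes.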