Mirroring a Website
Introduction
Dedicated applications exist for mirroring websites, but a single wget command can do the same job.
Usage
For example, to mirror mozilla.org, run this command:
wget --random-wait -r -p -e robots=off -U mozilla http://www.mozilla.org
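- -r: Download recursively, following the links found on each page
- -p: Download everything needed to display each page (images, stylesheets, etc.)
- -e robots=off: Ignore the site's robots.txt file
- -U mozilla: Send "mozilla" as the User-Agent string, so that wget identifies itself as a browser
- --random-wait: Randomize the pause between requests to reduce the risk of being blacklisted; the pause varies between 0.5 and 1.5 times the value given with --wait, so it is normally combined with that option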
Other useful parameters:
- --limit-rate=20k: Limit the download speed to 20 KB/s
- -b: Run wget in the background, so that it keeps running even if you log out (like nohup)
- -o $HOME/wget_log.txt: Write log messages to $HOME/wget_log.txt instead of the terminal
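As a rough sketch, the mirroring command and the optional parameters above could be wrapped in a small shell function. The names mirror_site, LOG_FILE and DRY_RUN are illustrative assumptions and not part of wget; with DRY_RUN=1 the function only prints the command it would run.

```shell
# Hypothetical wrapper around the command from this page.
# mirror_site, LOG_FILE and DRY_RUN are illustrative names, not wget features.

# Log file location; can be overridden from the environment.
LOG_FILE="${LOG_FILE:-$HOME/wget_log.txt}"

mirror_site() {
    url="$1"

    # Rebuild the positional parameters as the full wget command line:
    # the mirroring options plus --limit-rate, -b and -o.
    set -- wget --random-wait -r -p -e robots=off -U mozilla \
        --limit-rate=20k -b -o "$LOG_FILE" "$url"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command instead of executing it.
        printf '%s\n' "$*"
    else
        # Real run: wget detaches into the background and writes to LOG_FILE.
        "$@"
    fi
}
```

Calling DRY_RUN=1 mirror_site http://www.mozilla.org shows the assembled command without downloading anything; without DRY_RUN the download starts in the background and progress goes to the log file.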
Last updated 20 Sep 2009, 15:51 CEST.