Mirroring a Website
Introduction
Dedicated applications exist for mirroring websites, but why use them when a single wget command can do the same thing?
Usage
To mirror mozilla.org, for example, run wget with these options:
- -p: Download everything needed to display each page (images, stylesheets, etc.)
- -e robots=off: Ignore the site's robots.txt file
- -U mozilla: Send "mozilla" as the User-Agent string, so wget identifies itself as a browser
- --random-wait: Vary the pause between requests (0.5 to 1.5 times the --wait value, in seconds) to avoid blacklists
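Combined, the options above might form a command like the following sketch. The -r (recursive) flag and the URL argument are assumptions, since neither appears in the list; wrapping the command in a function named mirror_site is purely for illustration.

```shell
# Sketch only: -r (recursive download) and the URL argument are
# assumptions; the remaining options come from the list above.
mirror_site() {
    wget -r -p -e robots=off -U mozilla --random-wait "$1"
}
```

Calling mirror_site https://www.mozilla.org/ would then start the download. By default, wget saves a recursive download under a directory named after the host.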
Other useful parameters:
- --limit-rate=20k: Limit the download speed to 20 kilobytes per second
- -b: Run wget in the background, so it keeps running even if you log out (like nohup)
- -o $HOME/wget_log.txt: Write wget's messages to the log file $HOME/wget_log.txt
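As a rough illustration, these extra parameters could be added to a mirroring command as shown below. The -r and -p flags, the URL argument, and the function name mirror_site_background are assumptions made for the sketch; the rate limit and log path are the ones listed above.

```shell
# Sketch only: -r, -p and the URL argument are assumptions; the rate
# limit, -b and the log path come from the list above.
mirror_site_background() {
    wget -r -p --limit-rate=20k -b -o "$HOME/wget_log.txt" "$1"
}
```

Because of -b, a call such as mirror_site_background https://www.mozilla.org/ returns right away; progress can be followed with tail -f "$HOME/wget_log.txt".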