Introduction

There are dedicated applications for mirroring websites, but why use them when a single wget command can do the same thing?

Usage

To mirror mozilla.org, for example, use this command:

  wget --random-wait -r -p -e robots=off -U mozilla http://www.mozilla.org
  
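  • -r: Download recursively, following the links found on each page
  • -p: Download everything needed to display each page (images, stylesheets, etc.)
  • -e robots=off: Ignore the site's robots.txt file
  • -U mozilla: Send "mozilla" as the User-Agent string, so the request does not identify itself as wget
  • --random-wait: Randomize the wait between requests (a variation of the --wait interval) to make blacklisting less likely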
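
The same command can be written with wget's long option names, which some readers find easier to follow. The sketch below is only an illustration: the function name mirror_site and the --wait=2 value are assumptions, not part of the original command. The --wait option is added because the wget manual describes --random-wait as varying the interval set with --wait.

```shell
#!/bin/sh
# mirror_site is a hypothetical helper name; the options are the long
# forms of the flags used in the command above.
mirror_site() {
    # --wait=2 is an assumed base delay (in seconds) for --random-wait to vary.
    wget --wait=2 --random-wait \
         --recursive \
         --page-requisites \
         --execute robots=off \
         --user-agent=mozilla \
         "$1"
}

# Usage (starts a real download):
#   mirror_site http://www.mozilla.org
```

Wrapping the command in a function keeps the options in one place when several sites are mirrored the same way.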

Other useful parameters:

  • --limit-rate=20k: Limit the download speed to 20 KB/s
  • -b: Go to the background after startup, so wget continues to run even if you log out (like nohup)
  • -o $HOME/wget_log.txt: Write log messages to the given file
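
As a rough sketch, the parameters above can be combined with the mirroring command into one rate-limited background job. The variable names url and log are placeholders chosen for this example. The script only prints the assembled command, so it can be checked before anything is downloaded.

```shell
#!/bin/sh
# url and log are illustrative placeholders; adjust them as needed.
url="http://www.mozilla.org"
log="$HOME/wget_log.txt"

# Collect the options as positional parameters so the log path
# and URL stay correctly quoted.
set -- --random-wait -r -p -e robots=off -U mozilla \
       --limit-rate=20k -b -o "$log" "$url"

# Dry run: print the command. Remove "echo" to start the mirror.
echo wget "$@"
```

Once the job is running, its progress can be followed with tail -f on the log file.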

Last updated 20 Sep 2009, 15:51 CEST.