Mirroring a Website

Introduction

Dedicated applications exist for mirroring websites, but a single wget command can do the same job.

Usage

To mirror mozilla.org for example, use this command:

wget --random-wait -r -p -e robots=off -U mozilla http://www.mozilla.org
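  • -r: Download recursively, following links from the starting page
  • -p: Download everything needed to display each page (images, stylesheets, etc.)
  • -e robots=off: Ignore the site's robots.txt file
  • -U mozilla: Send "mozilla" as the User-Agent string, so the requests identify themselves as coming from that browser
  • --random-wait: Randomize the pause between requests (0.5 to 1.5 times the --wait value) to reduce the chance of being blacklisted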
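
The wget manual describes --random-wait as varying the pause between 0.5 and 1.5 times the value given with --wait, so it is usually paired with that option. The sketch below shows one way to do this. The 2-second base delay and the variable names base_wait and mirror_cmd are assumptions for illustration, not part of the original command. The command is only printed, so it can be reviewed before being run.

```shell
#!/bin/sh
# Assumed base delay in seconds; with --random-wait, each pause is
# documented to vary between 0.5 and 1.5 times this value.
base_wait=2

# Build the command as a string so it can be inspected first.
mirror_cmd="wget --wait=$base_wait --random-wait -r -p -e robots=off -U mozilla http://www.mozilla.org"

# Print the command; nothing is downloaded at this point.
echo "$mirror_cmd"
```

To start the mirror, copy the printed line into the shell.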

Other useful parameters:

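  • --limit-rate=20k: Limit the download speed to 20 KB/s
  • -b: Go to the background right after startup, so wget continues to run even if you log out (like nohup)
  • -o $HOME/wget_log.txt: Write log messages to $HOME/wget_log.txt instead of the terminal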
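
These options can be combined with the mirroring command in a single invocation. The following is a minimal sketch assuming a POSIX shell. The function name mirror_site and the DRY_RUN variable are hypothetical helpers introduced here, not wget features.

```shell
#!/bin/sh
# mirror_site URL [LOGFILE]
# Hypothetical helper (not part of wget) that combines the options from the text.
# Set DRY_RUN to any non-empty value to print the command instead of running it.
mirror_site() {
  url=$1
  log_file=${2:-$HOME/wget_log.txt}   # default log path taken from the text
  # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
  ${DRY_RUN:+echo} wget --random-wait -r -p -e robots=off -U mozilla \
    --limit-rate=20k -b -o "$log_file" "$url"
}

# Dry run: shows the full command without contacting the server.
DRY_RUN=1 mirror_site http://www.mozilla.org
```

Calling mirror_site without DRY_RUN starts the download in the background. Progress can then be followed with tail -f on the log file.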