How to Wget - The basics of this great Downloader

Wget is a powerful, console-based network downloader. There are plenty of good GUI download managers, but they are of no use when you are working in a console. And once you see what wget is capable of, you might well start using it for your day-to-day downloads, even in graphical environments.

So, let's begin with the basic syntax.

[shredder12]$ wget http://linuxers.org

For a web page URL, the above command downloads that page (index.html in this case). If the URL points to a media file (image, audio, video) or any other direct download link, wget downloads that file. Everything is saved in the current directory.

Wget how to: Download multiple files

To download multiple files or web pages at once, just list the URLs one after another, like this:

[shredder12]$ wget http://foo.bar/file1 http://foo.bar/file2

If you have a whole list of downloads, the best method is to store the URLs in a file, one per line, and give that file to wget as input. Use the --input-file or -i option to do so.

[shredder12]$ wget -i list.txt

The file list.txt should contain the list of URLs, for example:

http://foo.bar/file.tgz
http://10.2.3.4/downloads/somefile.ext
ftp://foo.bar/file.ext
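
If the URLs follow a pattern, the list file can be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical base URL and made-up file names (file1.tgz to file3.tgz) that do not refer to any real server:

```shell
# Hypothetical base URL, for illustration only.
base="http://foo.bar/downloads"

# Start with an empty list.txt, then append one URL per line.
: > list.txt
for n in 1 2 3; do
    printf '%s/file%s.tgz\n' "$base" "$n" >> list.txt
done

# Show the generated list.
cat list.txt
```

Running wget -i list.txt afterwards would then fetch the three files in order.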

Wget how to: Download files of a particular extension only (e.g all .pdf or all .jpeg files)

If you have used bash, you know that *.pdf matches all the PDF files in a directory. The same convention can be used with wget too.

[shredder12]$ wget ftp://foo.bar/downloads/*.pdf

This should download all the PDF files available in that directory. Note that this works only with FTP URLs.
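
Two practical caveats can be sketched as shell functions. The function names below are made up for this example and the addresses are placeholders. The first quotes the pattern so that the local shell passes it to wget untouched. The second is one possible workaround for HTTP, where wildcards are not supported: follow the links on a page one level deep and keep only the files ending in .pdf, using wget's -r, -l, -np, -nd and -A options.

```shell
# fetch_ftp_pdfs HOST/PATH
# Quoting keeps the local shell from expanding *.pdf against local files;
# wget itself performs the wildcard match on the FTP server.
fetch_ftp_pdfs() {
    wget "ftp://$1/*.pdf"
}

# fetch_http_pdfs URL
# -r -l 1 : recurse, but only one level deep
# -np     : never ascend to the parent directory
# -nd     : save files in the current directory instead of
#           recreating the site's directory tree
# -A pdf  : keep only files whose names end in "pdf"
fetch_http_pdfs() {
    wget -r -l 1 -np -nd -A pdf "$1"
}
```

For example: fetch_ftp_pdfs foo.bar/downloads, or fetch_http_pdfs http://foo.bar/docs/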

Wget how to: Set the name of output file

Use the --output-document=new_filename or -O new_filename option to set the name of the output file.

[shredder12]$ wget -O software.tgz http://foo.bar/file.tgz

This will download file.tgz and save it as software.tgz. The -O option is not just for renaming a file, though. If you download multiple files with this option, they are not saved as separate files but concatenated into a single one.
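
A special case worth knowing: when the file name given to -O is a single dash, wget writes the download to standard output, so it can be piped into another program. A minimal sketch, where list_tarball is a made-up helper name and the URL in the usage line is a placeholder:

```shell
# list_tarball URL
# Stream a gzip-compressed tarball to stdout (-O -) without saving it,
# silence wget's progress messages (-q), and let tar list the contents.
list_tarball() {
    wget -q -O - "$1" | tar -tzf -
}
```

Usage: list_tarball http://foo.bar/file.tgz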

Wget how to: Set the connection timeout

I mainly use wget in scripts that download multiple files, and I definitely don't want a single file to become the bottleneck. In such cases it's better to specify a timeout, which tells wget to give up on an attempt if the server does not respond within that period. This is done with the -T or --timeout flag, which sets the DNS lookup, connection, and read (idle) timeouts all at once. It is mostly used together with the --tries flag described below.

[shredder12]$ wget -T 5 http://foo.bar/file.tgz

Wget will time out if the DNS lookup or the connection takes longer than 5 seconds, or if no data arrives for 5 seconds during the download. You can even use decimal values, say 2.5.

Wget how to: Set the number of retries before wget gives up on a download

Similar to the timeout flag, this is useful to prevent a single download from becoming the bottleneck. If a download fails a given number of times, wget should give up and move on to the next one. This is done using the --tries or -t option.

[shredder12]$ wget --tries=5 http://foo.bar/file.tgz

If the download keeps failing, wget will abort it after 5 tries. The default is 20. Use 0 (or inf) for infinite retries.
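
The timeout and retry flags are typically combined in a script. Here is a minimal sketch of that idea, not a definitive implementation: fetch_all is a made-up function name, and the values 5 and 3 are arbitrary choices for illustration. It reads URLs from a file, tries each one with a timeout and a retry limit, and carries on to the next URL when one fails.

```shell
# fetch_all FILE
# Download every URL listed in FILE (one per line). A failed download is
# reported on stderr and skipped. The function returns non-zero if any
# download failed.
fetch_all() {
    failed=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue      # skip blank lines
        # -T 5: 5-second network timeout, -t 3: at most 3 tries per URL
        if ! wget -T 5 -t 3 "$url"; then
            echo "failed: $url" >&2
            failed=$((failed + 1))
        fi
    done < "$1"
    [ "$failed" -eq 0 ]
}
```

Usage: fetch_all list.txt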

Wget how to: Run wget in background and log all the output in a file

If you want wget to go to the background after startup, use the --background or -b flag. Since the output can no longer be printed to the terminal, the messages are logged to a file. You can either specify the log file or let the messages go to the file named wget-log, which is the default behaviour.

[shredder12]$ wget -b http://linuxers.org
Continuing in background, pid 15278.
Output will be written to `wget-log'.

The --output-file or -o option is used to specify the log file.

[shredder12]$ wget -b -o logfile http://foo.bar/file.tgz

Wget how to: Control Wget's output

Use the -v or --verbose option to print all the available data; this is the default behaviour. To turn off the output completely, use the -q or --quiet option. To turn off verbose mode but still get error messages and basic information, use the -nv or --no-verbose option.
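
In a script, a compact log is often more useful than the full verbose output. A minimal sketch, assuming a made-up helper name log_fetch: it combines -nv with -a (--append-output), which works like -o but appends to the log file instead of overwriting it.

```shell
# log_fetch LOGFILE URL...
# Download the given URLs with terse (-nv) messages, appending them
# to LOGFILE so that repeated runs keep their history.
log_fetch() {
    log=$1
    shift
    wget -nv -a "$log" "$@"
}
```

Usage: log_fetch downloads.log http://foo.bar/file1 http://foo.bar/file2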

2 Comments

eriqk (not verified)
September 1st, 2010 05:57 pm
You forgot a very useful option: with the -c flag wget allows you to resume a download. Comes in very handy when downloading big files or downloading from a site with a shaky connection (or both, of course).
Jeremie (not verified)
September 1st, 2010 09:59 pm
Avoid using wildcards in commands without escaping them if they are not meant to list local files. Replace: [shredder12]$ wget ftp://foo.bar/downloads/*.pdf By: [shredder12]$ wget ftp://foo.bar/downloads/\*.pdf That way, my ZSh shell won't show errors for those things. PS: your captcha is awful, I can barely read it !
