Introduction
wget is a non-interactive command-line tool for downloading files over HTTP, HTTPS, and FTP. It’s available on virtually every Linux/Unix system and is indispensable for scripting, automation, and server-side downloads. Unlike a browser, wget works without a GUI, can resume interrupted downloads, and handles recursive site mirroring.
Basic Usage
# Download a single file to the current directory
wget https://example.com/file.tar.gz
# Download and save with a specific filename
wget -O myfile.tar.gz https://example.com/file.tar.gz
# Download to a specific directory
wget -P /tmp/downloads https://example.com/file.tar.gz
# Download quietly (no progress output)
wget -q https://example.com/file.tar.gz
# Download and print to stdout (pipe to another command)
wget -qO- https://example.com/data.json | jq .
Downloading Multiple Files
From a Loop (Shell Script)
#!/bin/bash
# Download a list of files from the same base URL
base_url="https://example.com/releases/"
files="app-1.0.tar.gz app-1.1.tar.gz app-1.2.tar.gz"
for file in $files; do
echo "Downloading: ${base_url}${file}"
wget -c -N -P downloads/ "${base_url}${file}"
done
Flags used:
-c: continue/resume an interrupted download
-N: only download if the remote file is newer than the local copy
-P downloads/: save files to the downloads/ directory
From a URL List File
# urls.txt: one URL per line
# https://example.com/file1.tar.gz
# https://example.com/file2.tar.gz
# https://example.com/file3.tar.gz
wget -i urls.txt -P downloads/
Parallel Downloads with xargs
# Download 4 files in parallel
cat urls.txt | xargs -n 1 -P 4 wget -q -P downloads/
Resuming Downloads
# Resume an interrupted download
wget -c https://example.com/large-file.iso
# Useful for large files: if the connection drops, just re-run the same command
Recursive Download / Site Mirroring
# Mirror an entire website
wget --mirror --convert-links --adjust-extension \
--page-requisites --no-parent \
https://example.com/
# Flags explained:
# --mirror = recursive + timestamps + infinite depth
# --convert-links = rewrite links for offline browsing
# --adjust-extension = add .html to files without extensions
# --page-requisites = download CSS, images, JS needed to display pages
# --no-parent = don't go up to parent directories
# Limit recursion depth
wget -r -l 2 https://example.com/docs/
Authentication
# HTTP Basic Auth
wget --user=username --password=secret https://example.com/protected/file.zip
# Or use a .netrc file (more secure: keeps credentials out of shell history)
# ~/.netrc:
# machine example.com login username password secret
wget --netrc https://example.com/protected/file.zip
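Because .netrc stores credentials in plain text, it should be readable only by its owner. A minimal sketch of setting one up (the login and password values are placeholders):

```shell
# Create ~/.netrc with placeholder credentials for example.com
cat > "$HOME/.netrc" <<'EOF'
machine example.com
  login username
  password secret
EOF

# Credentials are stored in plain text, so make the file owner-only
chmod 600 "$HOME/.netrc"
```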
# FTP with credentials
wget ftp://username:[email protected]/file.tar.gz
Rate Limiting and Politeness
# Limit download speed to 500KB/s
wget --limit-rate=500k https://example.com/large-file.iso
# Wait 1 second between requests (for recursive downloads)
wget -r --wait=1 https://example.com/
# Random wait between 0.5 and 1.5 seconds
wget -r --wait=1 --random-wait https://example.com/
Handling Redirects and HTTPS
# wget follows redirects by default (up to 20); --max-redirect caps the limit
wget --max-redirect=5 https://example.com/redirect-me
# Skip SSL certificate verification (use with caution)
wget --no-check-certificate https://self-signed.example.com/file
# Use a custom CA certificate
wget --ca-certificate=/path/to/ca.crt https://example.com/file
Custom Headers and User Agent
# Set a custom User-Agent
wget --user-agent="Mozilla/5.0 (compatible; MyBot/1.0)" https://example.com/
# Add custom headers
wget --header="Authorization: Bearer mytoken" \
--header="Accept: application/json" \
https://api.example.com/data
# Send a POST request
wget --post-data="key=value&other=data" https://example.com/api/endpoint
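For JSON APIs, combine --post-data with a Content-Type header, since wget otherwise labels the body as application/x-www-form-urlencoded. A sketch against a hypothetical endpoint:

```shell
# POST a JSON body to a hypothetical API endpoint.
# --post-data implies the POST method; the --header line overrides
# wget's default Content-Type of application/x-www-form-urlencoded.
wget -qO- \
  --header="Content-Type: application/json" \
  --post-data='{"name": "demo", "value": 42}' \
  https://api.example.com/items \
  || echo "Request failed (hypothetical endpoint)"
```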
Logging and Output Control
# Save output log to a file
wget -o download.log https://example.com/file.tar.gz
# Append to an existing log
wget -a download.log https://example.com/file.tar.gz
# Quiet mode, but keep the progress bar
wget -q --show-progress https://example.com/file.tar.gz
# Verbose output (useful for debugging)
wget -v https://example.com/file.tar.gz
# Spider mode: check if URLs exist without downloading
wget --spider https://example.com/file.tar.gz
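Spider mode's exit status makes it usable as an existence check in scripts: zero means reachable, non-zero means not. A sketch (the url_exists helper name is my own):

```shell
#!/bin/bash
# Use --spider's exit status to test whether a URL exists
# before committing to a download.
url_exists() {
  # Exits 0 if the server answers successfully, non-zero otherwise
  wget -q --spider "$1"
}

if url_exists "https://example.com/file.tar.gz"; then
  echo "URL is reachable"
else
  echo "URL is missing or unreachable"
fi
```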
Practical Scripts
Download and Verify Checksum
#!/bin/bash
set -e
URL="https://example.com/app-2.0.tar.gz"
EXPECTED_SHA256="abc123..."
wget -q -O app.tar.gz "$URL"
echo "$EXPECTED_SHA256 app.tar.gz" | sha256sum --check
echo "Download verified successfully"
Batch Download with Retry
#!/bin/bash
# Download files with automatic retry on failure
BASE_URL="https://example.com/data/"
FILES="dataset-01.csv dataset-02.csv dataset-03.csv"
OUTPUT_DIR="./data"
mkdir -p "$OUTPUT_DIR"
for file in $FILES; do
wget \
--tries=3 \
--retry-connrefused \
--waitretry=5 \
--timeout=30 \
-c \
-P "$OUTPUT_DIR" \
"${BASE_URL}${file}" \
&& echo "OK: $file" \
|| echo "FAILED: $file"
done
Mirror a Documentation Site
#!/bin/bash
# Download a documentation site for offline reading
SITE="https://docs.example.com/"
OUTPUT="./offline-docs"
wget \
--mirror \
--convert-links \
--adjust-extension \
--page-requisites \
--no-parent \
--directory-prefix="$OUTPUT" \
--wait=0.5 \
--random-wait \
--limit-rate=1m \
"$SITE"
echo "Site mirrored to $OUTPUT"
wget vs curl
Both tools download files, but they have different strengths:
| Feature | wget | curl |
|---|---|---|
| Recursive download | Yes (-r) | No |
| Resume downloads | Yes (-c) | Yes (-C -) |
| Multiple protocols | HTTP, HTTPS, FTP | HTTP, HTTPS, FTP, SFTP, SCP, and more |
| Output to stdout | Yes (-O -) | Yes (default) |
| REST API calls | Limited | Excellent |
| Progress bar | Yes | Yes |
| Scripting | Good | Excellent |
Use wget for downloading files and mirroring sites. Use curl for API calls and when you need more protocol flexibility.
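In scripts that may run on either tool, the two are easy to bridge. A sketch of a fetch-to-stdout helper that prefers curl and falls back to wget (the fetch function name is my own):

```shell
#!/bin/bash
# Fetch a URL to stdout with whichever tool is available.
# Prefers curl (stdout by default), falls back to wget.
fetch() {
  if command -v curl >/dev/null 2>&1; then
    curl -fsSL "$1"   # -f: fail on HTTP errors, -s: silent, -L: follow redirects
  else
    wget -qO- "$1"    # -q: quiet, -O-: write to stdout
  fi
}

# Usage: fetch https://example.com/data.json | jq .
```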
Common Options Reference
| Option | Description |
|---|---|
| -O file | Save to specific filename |
| -P dir | Save to directory |
| -c | Continue/resume download |
| -N | Only download if newer |
| -q | Quiet mode |
| -v | Verbose mode |
| -r | Recursive download |
| -l N | Recursion depth limit |
| -i file | Read URLs from file |
| --limit-rate=N | Limit download speed |
| --wait=N | Wait N seconds between requests |
| --tries=N | Number of retries |
| --timeout=N | Timeout in seconds |
| --user-agent=S | Set User-Agent string |
| --no-check-certificate | Skip SSL verification |