
wget Command Guide: Downloading Files from the Command Line

Introduction

wget is a non-interactive command-line tool for downloading files over HTTP, HTTPS, and FTP. It’s available on virtually every Linux/Unix system and is indispensable for scripting, automation, and server-side downloads. Unlike a browser, wget works without a GUI, can resume interrupted downloads, and handles recursive site mirroring.
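Before relying on wget in a script, it can help to confirm it is actually installed; a minimal guard (the fallback message is just an example):

```shell
# Check whether wget is on PATH before using it in automation
if command -v wget >/dev/null 2>&1; then
    wget --version | head -n 1
else
    echo "wget not installed" >&2
fi
```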

Basic Usage

# Download a single file to the current directory
wget https://example.com/file.tar.gz

# Download and save with a specific filename
wget -O myfile.tar.gz https://example.com/file.tar.gz

# Download to a specific directory
wget -P /tmp/downloads https://example.com/file.tar.gz

# Download quietly (no progress output)
wget -q https://example.com/file.tar.gz

# Download and print to stdout (pipe to another command)
wget -qO- https://example.com/data.json | jq .

Downloading Multiple Files

From a Loop (Shell Script)

#!/bin/bash
# Download a list of files from the same base URL

base_url="https://example.com/releases/"
files="app-1.0.tar.gz app-1.1.tar.gz app-1.2.tar.gz"

for file in $files; do
    echo "Downloading: ${base_url}${file}"
    wget -c -N -P downloads/ "${base_url}${file}"
done

Flags used:

  • -c – continue/resume an interrupted download
  • -N – only download if the remote file is newer than the local copy
  • -P downloads/ – save files to the downloads/ directory

From a URL List File

# urls.txt – one URL per line
# https://example.com/file1.tar.gz
# https://example.com/file2.tar.gz
# https://example.com/file3.tar.gz

wget -i urls.txt -P downloads/
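When the URLs follow a pattern, the list file can be generated instead of written by hand; a sketch using made-up version numbers:

```shell
# Build urls.txt from a version pattern (the versions here are hypothetical)
for v in 1.0 1.1 1.2; do
    printf 'https://example.com/releases/app-%s.tar.gz\n' "$v"
done > urls.txt

# Then feed it to wget as above: wget -i urls.txt -P downloads/
```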

Parallel Downloads with xargs

# Download 4 files in parallel
xargs -n 1 -P 4 wget -q -P downloads/ < urls.txt
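The same xargs pattern can be sanity-checked without touching the network by substituting echo for wget (demo only; output order may vary across the 4 workers):

```shell
# Dry-run the parallel pattern: echo stands in for wget
printf 'https://example.com/a\nhttps://example.com/b\nhttps://example.com/c\n' \
    | xargs -n 1 -P 4 echo "would fetch:"
```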

Resuming Downloads

# Resume an interrupted download
wget -c https://example.com/large-file.iso

# Useful for large files – if the connection drops, just re-run the same command

Recursive Download / Site Mirroring

# Mirror an entire website
wget --mirror --convert-links --adjust-extension \
     --page-requisites --no-parent \
     https://example.com/

# Flags explained:
# --mirror              = recursive + timestamps + infinite depth
# --convert-links       = rewrite links for offline browsing
# --adjust-extension    = add .html to files without extensions
# --page-requisites     = download CSS, images, JS needed to display pages
# --no-parent           = don't go up to parent directories

# Limit recursion depth
wget -r -l 2 https://example.com/docs/

Authentication

# HTTP Basic Auth
wget --user=username --password=secret https://example.com/protected/file.zip

# Or use a .netrc file (more secure โ€” keeps credentials out of shell history)
# ~/.netrc:
# machine example.com login username password secret
wget --netrc https://example.com/protected/file.zip
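Because ~/.netrc stores a plaintext password, it should be readable only by its owner; a sketch that writes a local example file instead (the machine/login/password values are placeholders, and ./netrc-example is used here to avoid clobbering a real ~/.netrc):

```shell
# Create a .netrc-style file with owner-only permissions
cat > netrc-example <<'EOF'
machine example.com login username password secret
EOF
chmod 600 netrc-example
ls -l netrc-example
```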

# FTP with credentials
wget ftp://username:[email protected]/file.tar.gz

Rate Limiting and Politeness

# Limit download speed to 500KB/s
wget --limit-rate=500k https://example.com/large-file.iso

# Wait 1 second between requests (for recursive downloads)
wget -r --wait=1 https://example.com/

# Random wait between 0.5 and 1.5 seconds
wget -r --wait=1 --random-wait https://example.com/

Handling Redirects and HTTPS

# wget follows redirects by default; --max-redirect caps how many (default 20)
wget --max-redirect=5 https://example.com/redirect-me

# Skip SSL certificate verification (use with caution)
wget --no-check-certificate https://self-signed.example.com/file

# Use a custom CA certificate
wget --ca-certificate=/path/to/ca.crt https://example.com/file

Custom Headers and User Agent

# Set a custom User-Agent
wget --user-agent="Mozilla/5.0 (compatible; MyBot/1.0)" https://example.com/

# Add custom headers
wget --header="Authorization: Bearer mytoken" \
     --header="Accept: application/json" \
     https://api.example.com/data

# Send a POST request
wget --post-data="key=value&other=data" https://example.com/api/endpoint

Logging and Output Control

# Save output log to a file
wget -o download.log https://example.com/file.tar.gz

# Append to an existing log
wget -a download.log https://example.com/file.tar.gz

# Quiet mode, but keep the progress bar
wget -q --show-progress https://example.com/file.tar.gz

# Show only errors and one summary line per file (--no-verbose)
wget -nv https://example.com/file.tar.gz

# Verbose output (useful for debugging)
wget -v https://example.com/file.tar.gz

# Spider mode โ€” check if URLs exist without downloading
wget --spider https://example.com/file.tar.gz
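--spider sets wget's exit status, so it slots neatly into shell conditionals; a sketch (the command guard and short timeout keep it from failing or hanging where wget or the network is unavailable):

```shell
# Use --spider's exit status to test whether a URL is reachable
url="https://example.com/file.tar.gz"
if command -v wget >/dev/null 2>&1 \
   && wget -q --tries=1 --timeout=5 --spider "$url"; then
    echo "reachable: $url"
else
    echo "not reachable (or wget missing): $url"
fi
```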

Practical Scripts

Download and Verify Checksum

#!/bin/bash
set -e

URL="https://example.com/app-2.0.tar.gz"
EXPECTED_SHA256="abc123..."

wget -q -O app.tar.gz "$URL"

echo "$EXPECTED_SHA256  app.tar.gz" | sha256sum --check
echo "Download verified successfully"
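The two-space "HASH  FILENAME" format that sha256sum --check expects can be tried out locally before wiring it into a download script; a self-contained sketch:

```shell
# Demonstrate sha256sum --check against a locally created file
printf 'hello\n' > sample.txt
expected=$(sha256sum sample.txt | awk '{print $1}')

# Note the two spaces between the hash and the filename
echo "$expected  sample.txt" | sha256sum --check
```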

Batch Download with Retry

#!/bin/bash
# Download files with automatic retry on failure

BASE_URL="https://example.com/data/"
FILES="dataset-01.csv dataset-02.csv dataset-03.csv"
OUTPUT_DIR="./data"

mkdir -p "$OUTPUT_DIR"

for file in $FILES; do
    wget \
        --tries=3 \
        --retry-connrefused \
        --waitretry=5 \
        --timeout=30 \
        -c \
        -P "$OUTPUT_DIR" \
        "${BASE_URL}${file}" \
        && echo "OK: $file" \
        || echo "FAILED: $file"
done

Mirror a Documentation Site

#!/bin/bash
# Download a documentation site for offline reading

SITE="https://docs.example.com/"
OUTPUT="./offline-docs"

wget \
    --mirror \
    --convert-links \
    --adjust-extension \
    --page-requisites \
    --no-parent \
    --directory-prefix="$OUTPUT" \
    --wait=0.5 \
    --random-wait \
    --limit-rate=1m \
    "$SITE"

echo "Site mirrored to $OUTPUT"

wget vs curl

Both tools download files, but they have different strengths:

Feature              wget               curl
Recursive download   Yes (-r)           No
Resume downloads     Yes (-c)           Yes (-C -)
Multiple protocols   HTTP, HTTPS, FTP   HTTP, HTTPS, FTP, SFTP, SCP, and more
Output to stdout     Yes (-O -)         Yes (default)
REST API calls       Limited            Excellent
Progress bar         Yes                Yes
Scripting            Good               Excellent

Use wget for downloading files and mirroring sites. Use curl for API calls and when you need more protocol flexibility.

Common Options Reference

Option                    Description
-O file                   Save to a specific filename
-P dir                    Save to a directory
-c                        Continue/resume a download
-N                        Only download if the remote file is newer
-q                        Quiet mode
-v                        Verbose mode
-r                        Recursive download
-l N                      Limit recursion depth to N
-i file                   Read URLs from a file
--limit-rate=N            Limit download speed
--wait=N                  Wait N seconds between requests
--tries=N                 Number of retries
--timeout=N               Timeout in seconds
--user-agent=S            Set the User-Agent string
--no-check-certificate    Skip SSL certificate verification
