Move Large Files Safely with Linux Built-in Tools

Transferring a massive file, such as a 20GB database export, between servers can be daunting. Network interruptions, disk space constraints, and compatibility issues often get in the way. While many specialized tools exist, most of these transfers can be handled with nothing but the built-in utilities found on almost every Linux and macOS system.

In this tutorial, we will walk through a robust workflow to compress, split, download, verify, and reassemble large files safely.

The Scenario

Imagine you have a large database dump named big_dump.sql (e.g., 20GB). Your goal is to move it from a remote server to your local machine (or another server) while ensuring:

Reduced Transfer Size: Through compression.
Reliability: By splitting the file into manageable chunks.
Integrity: Verifying that no data was corrupted during transit.

Step 1: Compress the File

The first step is to reduce the file size. gzip is the most universal tool for this.

gzip big_dump.sql

This will replace big_dump.sql with a compressed version: big_dump.sql.gz.
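If disk space allows, it can be safer to keep the original until the transfer has been verified. The sketch below is one way to do that, run on a tiny stand-in file in a temporary directory (the sample contents are made up for illustration). `gzip -c` writes the compressed data to standard output, so big_dump.sql stays in place, and `gzip -t` checks that the archive is readable.

```shell
# Work in a throwaway directory with a tiny stand-in for the real dump.
workdir=$(mktemp -d)
printf 'CREATE TABLE t (id INT);\n' > "$workdir/big_dump.sql"

# -c sends the compressed stream to stdout, leaving the original untouched.
gzip -c "$workdir/big_dump.sql" > "$workdir/big_dump.sql.gz"

# -t tests the archive's internal integrity without extracting it.
if gzip -t "$workdir/big_dump.sql.gz"; then
  echo "archive OK"
fi
```

Keeping both files means you need room for the original plus the compressed copy, so this variant only fits when the disk is not already close to full.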

Step 2: Generate a Checksum

Before splitting or moving the file, generate a "health check" fingerprint. This lets you verify the file's integrity at the destination.

md5sum big_dump.sql.gz > big_dump.sql.gz.md5

Step 3: Split into Manageable Parts

Large files are prone to transfer failures. By splitting the file into 100MB chunks, you make the download more resilient. If one chunk fails, you only need to re-download that specific part.

mkdir -p split_parts
split -b 100M --suffix-length=3 --numeric-suffixes=1 big_dump.sql.gz split_parts/part_

This creates files like part_001, part_002, etc., in the split_parts directory.
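Re-downloading a single chunk only helps if you can tell which chunk is bad. One option, sketched here on random stand-in data with much smaller 100K chunks, is to record a checksum for each part next to the parts themselves. The manifest name parts.md5 is an assumption for illustration, and the split options are the GNU coreutils ones used above.

```shell
# Stand-in data: 300000 random bytes instead of a real 20GB archive.
workdir=$(mktemp -d)
cd "$workdir"
head -c 300000 /dev/urandom > big_dump.sql.gz

mkdir -p split_parts
split -b 100K --suffix-length=3 --numeric-suffixes=1 big_dump.sql.gz split_parts/part_

# On the source side: write one checksum line per part into a manifest.
(cd split_parts && md5sum part_* > parts.md5)

# On the destination side, after download: every part should report OK.
(cd split_parts && md5sum -c parts.md5)
```

Any part that `md5sum -c` reports as FAILED is the one to fetch again; the manifest travels along with the parts because it lives in the same directory.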

Step 4: Download the Parts

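You can now download the entire directory using scp, rsync, or even a simple HTTP server. Copy the checksum file from Step 2 as well, since you will need it in Step 6.

scp -r user@server:/path/to/split_parts .
scp user@server:/path/to/big_dump.sql.gz.md5 split_parts/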
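Long transfers sometimes drop partway through, so it can help to wrap the download in a small retry loop. The `retry` function and the `RETRY_DELAY` variable below are hypothetical helpers written for this sketch, not standard tools. The demo calls use `true` and `false` so the snippet runs without a remote server.

```shell
# retry MAX CMD...: run CMD until it succeeds, giving up after MAX attempts.
retry() {
  max=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; defaults to 5 seconds.
    sleep "${RETRY_DELAY:-5}"
  done
}

RETRY_DELAY=0
retry 3 true && echo "succeeded"
retry 2 false || echo "failed after retries"
```

In practice the wrapped command could be something like `retry 5 rsync -avP user@server:/path/to/split_parts/ split_parts/`, where the host and path are the same placeholders used above. Since rsync skips parts that already arrived intact, each retry only continues where the previous one stopped.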

Step 5: Reassemble the File

Once all parts are on your local machine, use cat to stitch them back together in order.

cd split_parts
cat part_* > big_dump.sql.gz
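Because `cat part_*` simply concatenates whatever matches, a missing chunk in the middle goes unnoticed until the checksum fails. A minimal sketch of a pre-check, assuming the part_001, part_002, ... naming from Step 3, is shown below. `check_parts` is a hypothetical helper, and it cannot detect a missing final part, which only the checksum in Step 6 will catch.

```shell
# check_parts DIR: succeed only if DIR holds part_001, part_002, ... with no gaps.
check_parts() {
  expected=1
  for f in "$1"/part_*; do
    # If the glob matched nothing, $f is the literal pattern and does not exist.
    [ -e "$f" ] || { echo "no parts found in $1" >&2; return 1; }
    want=$(printf 'part_%03d' "$expected")
    if [ "${f##*/}" != "$want" ]; then
      echo "missing $want (found ${f##*/})" >&2
      return 1
    fi
    expected=$((expected + 1))
  done
  echo "$((expected - 1)) parts, no gaps"
}

# Demo with three empty stand-in parts in a temporary directory.
demo=$(mktemp -d)
touch "$demo/part_001" "$demo/part_002" "$demo/part_003"
check_parts "$demo" && cat "$demo"/part_* > "$demo/big_dump.sql.gz"
```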

Step 6: Verify Integrity

Compare the MD5 checksum of the reassembled file with the original checksum you generated in Step 2.

# On macOS
md5 big_dump.sql.gz

# On Linux
md5sum big_dump.sql.gz

# Compare with original
cat big_dump.sql.gz.md5

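If the hash values match, the file arrived intact. Only the hash itself needs to match: md5sum prints the hash followed by the file name, while macOS md5 prints the file name first and the hash last.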
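Comparing two long hex strings by eye is error-prone, so the comparison can be scripted. The sketch below assumes the checksum file was written by md5sum (hash first, then the file name) and uses a hypothetical helper, `file_md5`, that falls back to `md5 -q` where md5sum is not available. It runs on a small stand-in file in a temporary directory.

```shell
# file_md5 FILE: print only the MD5 hash of FILE, on Linux or macOS.
file_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | cut -d ' ' -f 1
  else
    md5 -q "$1"   # macOS/BSD: -q prints just the hash
  fi
}

# Stand-in for the reassembled file and the checksum file from Step 2.
workdir=$(mktemp -d)
cd "$workdir"
printf 'sample data\n' | gzip -c > big_dump.sql.gz
echo "$(file_md5 big_dump.sql.gz)  big_dump.sql.gz" > big_dump.sql.gz.md5

# The first field of the checksum file is the expected hash.
expected=$(cut -d ' ' -f 1 big_dump.sql.gz.md5)
actual=$(file_md5 big_dump.sql.gz)

if [ "$expected" = "$actual" ]; then
  echo "checksum matches"
else
  echo "checksum mismatch: re-download the parts" >&2
fi
```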

Step 7: Final Extraction and Restore

Now you can uncompress the file and proceed with your work.

gunzip big_dump.sql.gz

If it's a database dump, you can restore it:

mysql -u root -p dbname < big_dump.sql
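When local disk space is tight, the dump can be streamed into the database client instead of being extracted to disk first. The sketch below shows the shape of that pipeline on a two-line stand-in dump. `restore_cmd` is a hypothetical placeholder that only counts lines so the example runs without a database; in real use you would pipe into the mysql command shown above.

```shell
# Tiny stand-in dump, compressed the same way as in Step 1.
workdir=$(mktemp -d)
printf 'CREATE TABLE t (id INT);\nINSERT INTO t VALUES (1);\n' \
  | gzip -c > "$workdir/big_dump.sql.gz"

# Hypothetical stand-in for: mysql -u root -p dbname
restore_cmd() { wc -l; }

# gunzip -c decompresses to stdout and leaves the .gz file on disk.
lines=$(gunzip -c "$workdir/big_dump.sql.gz" | restore_cmd)
echo "streamed $lines lines"
```

The trade-off is that the uncompressed .sql file never exists locally, so a failed restore means streaming from the .gz file again rather than rerunning against an extracted copy.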

Pro Tips for Large Transfers

Task	Command	Why?
Resume Downloads	`rsync -avP`	Automatically resumes interrupted transfers.
View Progress	`pv file`	Shows a progress bar while processing files (pv is usually a separate package, not installed by default).
Cleanup	`rm part_*`	Always clean up chunks after successful reassembly.
Check Space	`df -h`	Ensure you have enough disk space before starting.
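The disk space check can also be scripted rather than read off `df -h` by hand. This is a minimal sketch, assuming a hypothetical helper `has_space` that compares the free space reported by `df -Pk` with a requirement in kilobytes. The doubling in the demo is a rough allowance for holding the parts and the reassembled file at the same time.

```shell
# has_space DIR NEEDED_KB: succeed if DIR's filesystem has at least NEEDED_KB free.
has_space() {
  # df -P prints one POSIX-format line per filesystem; with -k, column 4 is free KB.
  avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge "$2" ]
}

# Demo: size a small stand-in file with du, then double it.
sample=$(mktemp)
head -c 4096 /dev/zero > "$sample"
needed_kb=$(( $(du -k "$sample" | cut -f 1) * 2 ))

if has_space "$(dirname "$sample")" "$needed_kb"; then
  echo "enough space"
else
  echo "not enough space" >&2
fi
```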

Why This Method Works

This method holds up well in day-to-day DBA and DevOps work because it is:

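Universal: Works on most servers (Ubuntu, CentOS, Alpine, macOS, WSL), though the long split options shown in Step 3 are GNU coreutils syntax and may need adjusting on BSD or BusyBox systems.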
Tool-Agnostic: No need to install third-party software.
Corruption-Safe: Checksums detect data that was corrupted in transit.
Cloud-Friendly: Ideal for moving data between cloud providers.

Summary

Moving large files doesn't have to be stressful. By leveraging gzip, split, and md5sum, you create a transfer process that is both resilient to failure and easy to verify. It is a widely available and dependable way to move large data sets between machines.
