I want to walk you through a coding project I worked on recently. The goal was simple automate file compression and decompression using Python. As I dug into the task, I realized it could be enhanced with some additional functionality, error handling, and practical utilities to make the script versatile for real-world applications. I’ll share the enhanced code and explain every part of it in detail. By the end, you’ll have a reusable tool for managing compressed files!
Automate File Compression
File compression is a powerful way to save disk space and improve file transfer speeds. Gzip, a popular compression tool, is often used for compressing text files, logs, or other data-heavy files. However, manually compressing or decompressing files can get tedious, especially when you have multiple files to process. That’s where Python comes in. With just a few lines of code, you can create an efficient script to handle file compression and decompression for you.
The Enhanced Python Code
Here’s the complete code for automating file compression and decompression, along with explanations of its key features:
codeimport gzip
import shutil
import os
def compress_file(input_file, output_file, compression_level=9):
"""
Compresses a file using gzip.
Args:
input_file (str): Path to the input file.
output_file (str): Path to the output compressed file.
compression_level (int): Gzip compression level (1-9). Default is 9 (maximum compression).
Raises:
FileNotFoundError: If the input file does not exist.
"""
if not os.path.exists(input_file):
raise FileNotFoundError(f"Input file '{input_file}' does not exist.")
with open(input_file, 'rb') as f_in:
with gzip.open(output_file, 'wb', compresslevel=compression_level) as f_out:
shutil.copyfileobj(f_in, f_out)
print(f"File '{input_file}' compressed to '{output_file}' with compression level {compression_level}.")
def decompress_file(input_file, output_file):
"""
Decompresses a gzip file.
Args:
input_file (str): Path to the input compressed file.
output_file (str): Path to the output decompressed file.
Raises:
FileNotFoundError: If the input file does not exist.
"""
if not os.path.exists(input_file):
raise FileNotFoundError(f"Input file '{input_file}' does not exist.")
with gzip.open(input_file, 'rb') as f_in:
with open(output_file, 'wb') as f_out:
shutil.copyfileobj(f_in, f_out)
print(f"File '{input_file}' decompressed to '{output_file}'.")
def get_file_size(file_path):
"""
Returns the size of the file in bytes.
Args:
file_path (str): Path to the file.
Returns:
int: File size in bytes.
Raises:
FileNotFoundError: If the file does not exist.
"""
if not os.path.exists(file_path):
raise FileNotFoundError(f"File '{file_path}' does not exist.")
return os.path.getsize(file_path)
# Example usage
if __name__ == "__main__":
try:
input_filename = 'example.txt' # Replace with your file name
compressed_filename = 'example.txt.gz'
decompressed_filename = 'example_decompressed.txt'
# Compress the file
compress_file(input_filename, compressed_filename, compression_level=5)
# Print file sizes
print(f"Original file size: {get_file_size(input_filename)} bytes")
print(f"Compressed file size: {get_file_size(compressed_filename)} bytes")
# Decompress the file
decompress_file(compressed_filename, decompressed_filename)
# Print decompressed file size
print(f"Decompressed file size: {get_file_size(decompressed_filename)} bytes")
except FileNotFoundError as e:
print(e)
except Exception as e:
print(f"An error occurred: {e}")
Key Features and Enhancements
- Compression Level Control:
- The
compress_file
function includes an optionalcompression_level
argument, allowing you to set the gzip compression level (1 = fastest, 9 = maximum compression). By default, it’s set to 9 for maximum efficiency.
- The
- Decompression Functionality:
- The
decompress_file
function lets you reverse the process and extract the original file from a.gz
archive. This makes the script useful for both compressing and decompressing files.
- The
- Error Handling:
- The script checks if the input file exists before performing any operation. If the file is missing, it raises a
FileNotFoundError
with a clear message.
- The script checks if the input file exists before performing any operation. If the file is missing, it raises a
- File Size Utility:
- The
get_file_size
function calculates the size of any file, making it easy to compare the sizes of original, compressed, and decompressed files.
- The
- Practical Example Block:
- The
if __name__ == "__main__"
block provides a complete workflow:- Compress a file.
- Print the sizes of the original and compressed files.
- Decompress the file.
- Print the size of the decompressed file for verification.
- The
- Console Feedback:
- The script provides feedback messages during compression and decompression, making it easy to understand what’s happening.
How to Use This Code
- Save the script in a
.py
file (e.g.,file_compressor.py
). - Replace the
input_filename
in the example usage with the path to the file you want to compress. - Run the script in a terminal or IDE. It will compress the file, print the sizes, and decompress it.
Example Output
Here’s an example of what you’ll see in the console when you run the script:
codeFile 'example.txt' compressed to 'example.txt.gz' with compression level 5.
Original file size: 1024 bytes
Compressed file size: 128 bytes
File 'example.txt.gz' decompressed to 'example_decompressed.txt'.
Decompressed file size: 1024 bytes
Real-World Applications
- Log File Management: Compress large log files before archiving them.
- Data Backup: Save disk space by compressing files before storing them.
- File Transfers: Reduce file sizes to speed up uploads and downloads.
Final Thoughts
This project was a fantastic way to explore Python’s gzip
and shutil
modules while building a practical tool. Whether you’re dealing with large log files, backups, or data transfers, this script can save you time and effort.