How do I Calculate a Hash of a File Using Python

I am excited to share with you one of my favorite little projects: a simple Python script that calculates the SHA-256 hash of a file. Have you ever needed to verify the integrity of a file or ensure it hasn’t been tampered with? I certainly have, and I found that cryptographic hashing is a powerful way to do just that. Whether you’re validating downloads, auditing data, or simply exploring the world of cryptography, this tool can be incredibly handy. Let’s dive into the code and break it down step by step!

The Code

Below is the complete Python script that computes a file’s hash. I encourage you to copy and paste this into a .py file or even a Jupyter notebook, then run it with your file of choice.

import hashlib

def file_hash(file_path, algo="sha256"):
    """
    Calculate the hash of a file using the specified algorithm.
    Default algorithm is SHA-256.
    """
    h = hashlib.new(algo)
    with open(file_path, "rb") as f:
        # Read the file in chunks to handle large files efficiently
        while chunk := f.read(8192):
            h.update(chunk)
    return h.hexdigest()

# Example usage
file_path = "clcoding.txt"
print("SHA-256 Hash:", file_hash(file_path))

Importing the hashlib Library

I started by importing Python’s built-in hashlib library. This library offers a suite of cryptographic hash functions, which means I can create hash objects for various algorithms like SHA-256, MD5, or SHA-1 without any additional installations. It’s one of those fantastic features of Python’s standard library that makes coding fun and efficient.
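To get a feel for what hashlib offers on a given installation, a short sketch like the one below can help. It lists the algorithms every Python build is guaranteed to ship and hashes a small in-memory byte string with three of them. The sample bytes are only an illustration, not part of the original script.

```python
import hashlib

# Algorithms that every Python build is guaranteed to provide
print(sorted(hashlib.algorithms_guaranteed))

# Hash a small byte string with three different algorithms
data = b"hello"  # illustrative input only
for name in ("md5", "sha1", "sha256"):
    digest = hashlib.new(name, data).hexdigest()
    print(f"{name}: {len(digest)} hex characters")
```

The digest length differs per algorithm: 32 hex characters for MD5, 40 for SHA-1, and 64 for SHA-256.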

Defining the file_hash Function

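The heart of the script is the file_hash function. This function accepts two parameters:

file_path: the path to the file you want to hash.
algo: the name of the hash algorithm to use, which defaults to "sha256".

Inside the function, here’s what I do:

Create a hash object for the requested algorithm with hashlib.new(algo).
Open the file in binary mode ("rb") so the raw bytes are read unchanged.
Read the file in 8192-byte chunks, feeding each one to the hash object with h.update(chunk).
Return the final digest as a hexadecimal string with h.hexdigest().

The chunked read is the key part: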

while chunk := f.read(8192):
    h.update(chunk)

This approach ensures that even if your file is several gigabytes in size, it won’t overwhelm your system’s memory.
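One caveat: the := (walrus) operator in that loop needs Python 3.8 or newer. On an older interpreter, the same chunked read can be sketched with the two-argument form of iter(). The name file_hash_compat and the chunk_size parameter are hypothetical, added here only to keep this variant apart from the original function.

```python
import hashlib
import os
import tempfile

def file_hash_compat(file_path, algo="sha256", chunk_size=8192):
    """Chunked file hash without the walrus operator (hypothetical helper)."""
    h = hashlib.new(algo)
    with open(file_path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Self-check: the chunked result should match hashing the same bytes in one go
payload = b"x" * 20000  # spans more than two 8192-byte chunks
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    assert file_hash_compat(tmp.name) == hashlib.sha256(payload).hexdigest()
finally:
    os.remove(tmp.name)
```

Both versions produce the same digest, because a hash object gives the same result whether the data arrives in one call to update() or many.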

Example Usage

In the example, the script calculates the SHA-256 hash of a file named clcoding.txt. You can replace "clcoding.txt" with any file path you wish to check. Running the code will print the hash to your console, giving you a quick and reliable way to verify file integrity.
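Printing the hash is only half of an integrity check; the other half is comparing it with a value you trust, such as a checksum published next to a download. Here is a minimal sketch of that comparison. The verify_file helper is hypothetical, and the demo writes its own throwaway file so the expected value is known in advance.

```python
import hashlib
import hmac
import os
import tempfile

def file_hash(file_path, algo="sha256"):
    """Same chunked hash as the script above, repeated so this block runs alone."""
    h = hashlib.new(algo)
    with open(file_path, "rb") as f:
        while chunk := f.read(8192):
            h.update(chunk)
    return h.hexdigest()

def verify_file(file_path, expected_hex, algo="sha256"):
    """Return True if the file's digest matches expected_hex (hypothetical helper)."""
    actual = file_hash(file_path, algo)
    # Published checksums vary in case and trailing whitespace, so normalise first
    return hmac.compare_digest(actual, expected_hex.strip().lower())

# Demo with a temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
expected = hashlib.sha256(b"hello world").hexdigest()
print(verify_file(tmp.name, expected))   # True
print(verify_file(tmp.name, "0" * 64))   # False
os.remove(tmp.name)
```

I used hmac.compare_digest for the comparison out of habit; for a plain download check, an ordinary == would also work.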

Why Use SHA-256?

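I chose SHA-256 for this project because it’s one of the most widely used cryptographic hash functions. It belongs to the SHA-2 family, produces a 256-bit digest (64 hexadecimal characters), and, unlike MD5 and SHA-1, has no publicly known practical collision attacks.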

Customize Your Hashing

One of the great things about this script is its flexibility. If you need a different algorithm, simply change the algo parameter. For instance:

file_hash(file_path, "md5")
file_hash(file_path, "sha1")
file_hash(file_path, "sha512")

Just remember that although MD5 and SHA-1 are often faster, both have known collision attacks and are not recommended for security-critical applications.
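Since algo is just a string, a typo such as "sha257" only shows up when hashlib.new rejects it. If you would rather fail early with a clearer message, one option is to check the name against hashlib.algorithms_available first. The check_algorithm helper below is a hypothetical addition, not part of the original script.

```python
import hashlib

def check_algorithm(algo):
    """Raise a clear error for names hashlib does not know (hypothetical helper)."""
    if algo not in hashlib.algorithms_available:
        raise ValueError(f"Unsupported hash algorithm: {algo!r}")
    return algo

print(check_algorithm("sha512"))
try:
    check_algorithm("sha257")  # deliberate typo
except ValueError as exc:
    print(exc)
```

Note that algorithms_available also lists variable-length algorithms such as shake_128, whose hexdigest() needs a length argument, so they would not work with file_hash as written.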

Handling Large Files

I paid special attention to how the script handles large files. By reading the file in chunks (8192 bytes at a time), the script remains memory-efficient. This means you can compute the hash for files that are several gigabytes in size without running into memory issues. If you ever need to adjust the chunk size, feel free to tweak that number to suit your file sizes and system capabilities.
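If you are on Python 3.11 or newer, the standard library can also do the chunked reading for you through hashlib.file_digest. The sketch below uses it when available and otherwise falls back to a manual loop with a larger 1 MiB buffer. The function name file_hash_builtin and the buffer size are my own assumptions for illustration.

```python
import hashlib
import os
import tempfile

def file_hash_builtin(file_path, algo="sha256"):
    """Hash a file with hashlib.file_digest when available (hypothetical helper)."""
    with open(file_path, "rb") as f:
        if hasattr(hashlib, "file_digest"):  # added in Python 3.11
            return hashlib.file_digest(f, algo).hexdigest()
        # Fallback for older versions: manual loop with a 1 MiB buffer
        h = hashlib.new(algo)
        while chunk := f.read(1024 * 1024):
            h.update(chunk)
        return h.hexdigest()

# Demo on a throwaway file of about 3 MB
payload = os.urandom(3_000_000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    print(file_hash_builtin(tmp.name) == hashlib.sha256(payload).hexdigest())
finally:
    os.remove(tmp.name)
```

Whichever chunk size you pick, the resulting digest is the same; only memory use and read overhead change.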

Final Thoughts

I believe that calculating file hashes is an essential skill for anyone working with data or interested in security. This script is more than just a piece of code; it’s a tool that adds an extra layer of trust and security to your workflow. Whether you’re a developer, a security enthusiast, or a data professional, knowing how to verify file integrity is invaluable.
