How to Extract Text From Images Using Python

Have you ever wanted to extract text from an image automatically? Maybe you have a scanned document, a screenshot, or even a photograph containing important text that you need to process. I will show you how to use Python and Tesseract OCR to extract text from images efficiently.

We will start with a simple script and then enhance it with additional functionality, including GUI-based file selection, pre-processing for better accuracy, and multi-language support. By the end of this tutorial, you will have a fully functional OCR tool that saves the extracted text in multiple formats.

Explanation of the Basic Code

Import Required Libraries

We need two main libraries:

from PIL import Image
import pytesseract
  • PIL (Pillow): Used for image processing, such as opening and manipulating image files.
  • pytesseract: A Python wrapper for Google’s Tesseract OCR engine, which extracts text from images.
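
Because pytesseract only wraps the Tesseract program, the engine itself has to be installed on the machine as well. Here is a minimal sketch of a pre-flight check, assuming the executable is named tesseract and is expected on the PATH. The helper name find_tesseract is hypothetical and not part of pytesseract:

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("Tesseract not found on PATH; install it before running the OCR script.")
else:
    print(f"Using Tesseract at: {path}")
```

If Tesseract is installed somewhere that is not on the PATH, pytesseract lets you point to it by setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable.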

Define the Image Path

image_path = "image.png"  # Replace with your image file

This is the file path to the image that contains the text you want to extract.

Open and Process the Image

image = Image.open(image_path)

This line opens the image using PIL so that we can process it.

Extract Text from the Image

extracted_text = pytesseract.image_to_string(image)

This function runs Optical Character Recognition (OCR) on the image and returns any readable text as a string.

Print the Extracted Text

print("Extracted Text from Image:\n")
print(extracted_text)

The extracted text will be displayed in the console.
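
Raw OCR output is rarely tidy: it often contains trailing spaces, runs of blank lines, and sometimes a form-feed character at the end. A small clean-up step before printing or saving can help. The sketch below is one possible approach, not something pytesseract provides. The function name clean_ocr_text and the sample string are made up for illustration:

```python
def clean_ocr_text(raw_text):
    """Strip form feeds, trailing spaces, and runs of blank lines from OCR output."""
    lines = raw_text.replace("\x0c", "").splitlines()
    cleaned = []
    for line in lines:
        line = line.rstrip()
        # Skip leading blank lines and keep at most one blank line in a row
        if line == "" and (not cleaned or cleaned[-1] == ""):
            continue
        cleaned.append(line)
    # Drop a blank line left over at the very end
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return "\n".join(cleaned)


sample = "Invoice 42  \n\n\n\nTotal: 10.00\n\x0c"
print(clean_ocr_text(sample))
```

In the basic script you could call clean_ocr_text(extracted_text) right after the OCR step and print or save the result instead of the raw string.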

Save the Extracted Text to a File

with open("extracted_text.txt", "w", encoding="utf-8") as text_file:
    text_file.write(extracted_text)

print("Text extracted and saved to 'extracted_text.txt'")

This saves the extracted text into a text file (extracted_text.txt).

Enhancing the Script with More Practical Functionality

While the basic script works well, we can improve it in several ways:

  • GUI-based file selection (so the user can pick an image without modifying the script).
  • Image pre-processing (grayscale conversion and thresholding for better OCR accuracy).
  • Language selection (support for multiple languages in OCR).
  • Saving the text in multiple formats (.txt and .csv).
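
To make the pre-processing idea concrete: binary thresholding turns every grayscale pixel into pure black or pure white, depending on whether it exceeds a cutoff. Here is a small pure-Python sketch of that rule on a toy 3x3 "image", assuming a cutoff of 150. The function binary_threshold is illustrative only, and a real script would use an image library rather than nested lists:

```python
def binary_threshold(pixels, cutoff=150, max_value=255):
    """Map each grayscale value to max_value if it is above cutoff, else to 0."""
    return [
        [max_value if value > cutoff else 0 for value in row]
        for row in pixels
    ]


# Toy grayscale image: 0 is black, 255 is white
gray = [
    [12, 200, 149],
    [151, 150, 90],
    [255, 30, 180],
]

for row in binary_threshold(gray):
    print(row)
```

Note that a pixel exactly equal to the cutoff becomes black, because the comparison is strictly "greater than". A cutoff that suits one image may erase faint text in another, so the value is worth tuning per source.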

Enhanced Python Code

import csv
import os
import tkinter as tk
from tkinter import filedialog

import cv2
import pytesseract


# Load the image as grayscale and binarize it for cleaner OCR input
def preprocess_image(image_path):
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # Read directly as grayscale
    if image is None:  # cv2.imread returns None instead of raising when it cannot read the file
        raise ValueError(f"Could not read image: {image_path}")
    # Pixels brighter than 150 become white (255), all others become black (0)
    _, processed_img = cv2.threshold(image, 150, 255, cv2.THRESH_BINARY)
    return processed_img


# Run OCR on the preprocessed image
def extract_text(image_path, lang="eng"):
    processed_img = preprocess_image(image_path)
    # pytesseract accepts NumPy arrays directly, so no temporary file is needed
    return pytesseract.image_to_string(processed_img, lang=lang)


# Let the user pick an image, run OCR, and save the results
def main():
    root = tk.Tk()
    root.withdraw()  # Hide the main window

    # Tk expects file patterns separated by spaces; semicolons only work on Windows
    file_path = filedialog.askopenfilename(
        title="Select an Image",
        filetypes=[("Image Files", "*.png *.jpg *.jpeg")],
    )
    root.destroy()  # The dialog is finished, so release the hidden window

    if not file_path:
        print("No file selected.")
        return

    # The matching Tesseract language data must be installed for the chosen code
    lang_choice = input("Enter language code (default 'eng' for English, 'spa' for Spanish, 'fra' for French, etc.): ").strip() or "eng"

    extracted_text = extract_text(file_path, lang_choice)

    if extracted_text.strip():
        print("\nExtracted Text:\n")
        print(extracted_text)

        # Save as .txt file next to the source image
        text_filename = os.path.splitext(file_path)[0] + "_extracted.txt"
        with open(text_filename, "w", encoding="utf-8") as text_file:
            text_file.write(extracted_text)

        print(f"\nText extracted and saved to '{text_filename}'")

        # Save as .csv file; csv.writer quotes any commas or quotation marks in the text
        csv_filename = os.path.splitext(file_path)[0] + "_extracted.csv"
        with open(csv_filename, "w", encoding="utf-8", newline="") as csv_file:
            writer = csv.writer(csv_file)
            writer.writerow(["Extracted Text"])
            writer.writerow([extracted_text.replace("\n", " ").strip()])  # Store in a single row

        print(f"CSV file saved as '{csv_filename}'")

    else:
        print("No text found in the image.")


if __name__ == "__main__":
    main()
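
One detail worth checking when saving OCR output as CSV: recognized text often contains commas and quotation marks, and these have to be quoted for the text to stay in a single cell. Here is a minimal sketch using Python's standard csv module and an in-memory buffer instead of a real file. The helper name text_to_csv and the sample text are made up for illustration:

```python
import csv
import io


def text_to_csv(extracted_text, header="Extracted Text"):
    """Return CSV content with one header row and the text in a single cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow([header])
    # Collapse line breaks and repeated whitespace into single spaces
    writer.writerow([" ".join(extracted_text.split())])
    return buffer.getvalue()


sample = 'Total: 1,250 USD\nNote: "paid"'
csv_content = text_to_csv(sample)
print(csv_content)

# Reading the content back yields two rows with one cell each
rows = list(csv.reader(io.StringIO(csv_content)))
print(rows)
```

Writing the raw string with a plain write() call would instead split "Total: 1,250 USD" into two columns as soon as a spreadsheet or CSV parser reads the file.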

Final Thoughts

I hope this guide helps you build your own OCR tool using Python. With the ability to extract text from images, this project has a variety of real-world applications, such as digitizing documents, automating data entry, and processing text from scanned receipts.

You can further improve this project by integrating it into a web app using Flask or Django, adding AI models for handwritten text recognition, or automating bulk image processing.
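
As a starting point for bulk processing, here is a rough sketch that walks one folder and writes a .txt file next to each image. It assumes the OCR step is passed in as a function, so the loop stays independent of any particular OCR library. The names find_images, ocr_folder, and IMAGE_SUFFIXES are hypothetical:

```python
from pathlib import Path

# File extensions treated as images (compared in lowercase)
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_images(folder):
    """Return the image files directly inside folder, sorted by name."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def ocr_folder(folder, ocr_func):
    """Run ocr_func on each image and write the result next to it as a .txt file."""
    written = []
    for image_path in find_images(folder):
        text = ocr_func(image_path)
        output_path = image_path.with_suffix(".txt")
        output_path.write_text(text, encoding="utf-8")
        written.append(output_path)
    return written
```

With the libraries from this tutorial, a call could look like ocr_folder("scans", lambda p: pytesseract.image_to_string(Image.open(p))), where "scans" is a placeholder folder name.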
