How to Extract Text from Images with Python?

Last Updated : 4 Oct, 2025

OCR (Optical Character Recognition) is a technique used to convert text from images into editable and searchable digital text. For example, you can scan a printed page and turn it into editable text on your computer. In this article, we’ll use Python and the pytesseract library to extract text from images.

Installation

pytesseract is a Python wrapper around the Tesseract OCR engine. Install it together with Pillow, which provides the PIL module used to open images:

pip install pytesseract pillow

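Note: pytesseract only calls the Tesseract engine, so Tesseract itself must be installed separately on every platform. On Windows, this means installing the tesseract.exe binary. During installation, you'll choose (or be given) an install path, commonly one of: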

C:\Program Files\Tesseract-OCR\tesseract.exe

C:\Users\<username>\AppData\Local\Programs\Tesseract-OCR\tesseract.exe

Make sure to update your code with the correct path based on your system.
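Rather than hard-coding the path, the executable can be looked up at run time. The sketch below is one way to do that, assuming the two Windows locations listed above plus whatever is on the system PATH. The function name find_tesseract and its candidates parameter are illustrative and not part of pytesseract.

```python
import shutil
from pathlib import Path

# The common Windows install locations listed above; adjust for your system.
DEFAULT_CANDIDATES = [
    Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe"),
    Path.home() / "AppData" / "Local" / "Programs" / "Tesseract-OCR" / "tesseract.exe",
]


def find_tesseract(candidates=None):
    """Return the path to the Tesseract executable, or None if it is not found."""
    # Prefer an executable that is already on PATH (typical on Linux and macOS).
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the known install locations.
    for candidate in (DEFAULT_CANDIDATES if candidates is None else candidates):
        if Path(candidate).is_file():
            return str(candidate)
    return None


if __name__ == "__main__":
    path = find_tesseract()
    print(path or "Tesseract not found; install it or set the path manually.")
```

When the function returns a path, it can be assigned to pytesseract.pytesseract.tesseract_cmd before any OCR call is made.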

Steps to Extract Text from Images

1. Import the required libraries:

from PIL import Image
import pytesseract

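2. Set the path to the Tesseract executable (needed only if Tesseract is not on your system PATH; replace <username> with your Windows user name):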

pytesseract.pytesseract.tesseract_cmd = r"C:\Users\<username>\AppData\Local\Programs\Tesseract-OCR\tesseract.exe"

3. Open the image using PIL:

image = Image.open("example_image.png")

4. Convert the image to grayscale, which often improves OCR accuracy:

gray_image = image.convert("L")

5. Extract text using pytesseract:

text = pytesseract.image_to_string(gray_image)

6. Clean the extracted text by removing the form-feed character (\x0c) that Tesseract appends as a page break, along with any leading or trailing whitespace:

clean_text = text.replace("\x0c", "").strip()
print(clean_text)
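The six steps above can be combined into one reusable function. This is a minimal sketch: the names extract_text and clean_ocr_text are illustrative, and the optional tesseract_cmd argument is assumed to be needed only when the executable is not on PATH.

```python
def clean_ocr_text(text):
    """Drop form-feed characters and surrounding whitespace from OCR output."""
    return text.replace("\x0c", "").strip()


def extract_text(image_path, tesseract_cmd=None):
    """Run steps 1-6 on a single image file and return the cleaned text."""
    # Imported inside the function so clean_ocr_text stays usable on its own.
    from PIL import Image
    import pytesseract

    # Step 2: override the executable path only when one is supplied.
    if tesseract_cmd:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    # Steps 3-5: open the image, convert it to grayscale, and run OCR.
    # The context manager closes the file once OCR is done.
    with Image.open(image_path) as image:
        gray_image = image.convert("L")
        text = pytesseract.image_to_string(gray_image)

    # Step 6: clean the raw output.
    return clean_ocr_text(text)


# Example call (assumes example_image.png exists in the working directory):
# print(extract_text("example_image.png"))
```

Keeping the cleaning step in its own function makes it easy to extend later, for example to collapse repeated blank lines.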

Examples

Example 1:

Image for demonstration:

An image of white text with black background

Code:
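A minimal sketch of how the steps above might be applied to an image like this one. It assumes the image is saved as white_on_black.png (a hypothetical filename) and adds one step that is not in the list above: inverting the grayscale image with ImageOps.invert, because Tesseract's documentation recommends dark text on a light background for version 4 and later. Whether inversion helps for a particular image is best checked by comparing both results. The function names are illustrative.

```python
def prepare_light_on_dark(image):
    """Return a grayscale, inverted copy so that light text becomes dark text."""
    from PIL import ImageOps  # imported lazily; requires Pillow

    gray_image = image.convert("L")
    return ImageOps.invert(gray_image)


def ocr_light_on_dark(image_path):
    """OCR an image that has light text on a dark background."""
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        prepared = prepare_light_on_dark(image)
        text = pytesseract.image_to_string(prepared)
    # Remove the trailing form feed and surrounding whitespace.
    return text.replace("\x0c", "").strip()


# Example call; "white_on_black.png" is a hypothetical filename:
# print(ocr_light_on_dark("white_on_black.png"))
```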

Output:

now children state should after above same long made such
point run take call together few being would walk give

Example 2:

Image for demonstration:

Code:
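A minimal sketch for an image like this one, assuming it contains a single line of text (as the one-word output suggests) and is saved as logo.png, a hypothetical filename. It passes a Tesseract page segmentation mode through the config argument of image_to_string; --psm 7 asks Tesseract to treat the image as a single text line. The helper names psm_config and ocr_single_line are illustrative.

```python
def psm_config(psm):
    """Build the config string that selects a Tesseract page segmentation mode."""
    if not 0 <= psm <= 13:
        raise ValueError("Tesseract page segmentation modes range from 0 to 13")
    return f"--psm {psm}"


def ocr_single_line(image_path, lang="eng"):
    """OCR an image that is assumed to hold one line of text."""
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        text = pytesseract.image_to_string(
            image.convert("L"),
            lang=lang,
            config=psm_config(7),  # 7 = treat the image as a single text line
        )
    # Remove the trailing form feed and surrounding whitespace.
    return text.replace("\x0c", "").strip()


# Example call; "logo.png" is a hypothetical filename:
# print(ocr_single_line("logo.png"))
```

The default segmentation mode works for many images, so this setting is only worth trying when short text is misread or missed.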

Output:

Geeksforgeeks

URL: https://www.geeksforgeeks.org/python/how-to-extract-text-from-images-with-python/