Building a Math Application with LangChain Agents

Last Updated : 30 Sep, 2025

This math application, built with LangChain agents, makes it easier to solve math problems from images. Instead of typing out a question from a book, worksheet or handwritten notes, we upload an image of it; the web app reads the text directly from the image with OCR and then solves it step by step, giving us the final answer.

Let's look at the step-by-step implementation of this web app:

Step 1: Install Dependencies

Install all the libraries required for OCR, LLM reasoning and the Gradio app.
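The exact install command is not reproduced here. As a rough sketch, assuming the packages implied by the imports in the later steps (gradio, pillow, pytesseract, langchain and langchain-openai; the package list is an assumption), the following check reports which modules are still missing. Note that pytesseract is only a wrapper, so the Tesseract OCR engine itself has to be installed separately on the system.

```python
import importlib.util

# Import names assumed from the libraries this walkthrough lists. The matching
# pip packages would be roughly:
#   pip install gradio pillow pytesseract langchain langchain-openai
REQUIRED_MODULES = ["gradio", "PIL", "pytesseract", "langchain", "langchain_openai"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```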

Step 2: Environment Setup

Set up the environment with an OpenAI API key. We can also use a Gemini API key instead.

Refer to this article for using OpenAI's API Key: Fetching OpenAI API Key
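One common way to do this is to expose the key through the `OPENAI_API_KEY` environment variable, which the OpenAI client libraries read by default. The helper below is a minimal sketch; the function name `configure_api_key` is hypothetical and not part of any library.

```python
import os
from getpass import getpass


def configure_api_key(key=None, env_var="OPENAI_API_KEY"):
    """Store the API key in an environment variable and return it.

    Uses the explicit `key` if given, then an already-set environment
    variable, and only prompts interactively as a last resort.
    """
    key = key or os.environ.get(env_var)
    if not key:
        # Prompt instead of hard-coding the secret in the script.
        key = getpass(f"Enter {env_var}: ")
    os.environ[env_var] = key
    return key
```

Calling `configure_api_key()` once before creating the model keeps the key out of the source code.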

Step 2 (continued): Import Libraries

  • gradio: Builds the web interface.
  • PIL.Image: Handles image loading and processing.
  • pytesseract: Extracts text from images (OCR).
  • ChatOpenAI: Connects to an OpenAI LLM for reasoning.
  • initialize_agent, Tool: Set up AI agents and tools in LangChain.
  • PromptTemplate: Formats prompts for the AI model.

Step 3: Optical Character Recognition (OCR) and Math Solver

  • extract_text(image): Extracts text from the uploaded image using OCR.
  • llm setup: Initializes the GPT-4 model for math reasoning.
  • math_prompt: Defines the template to ask the model to solve math step by step.
  • solve_math(query): Sends the extracted question to GPT-4 and returns the final answer.
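The four pieces above could look roughly like the sketch below. It is an illustration under assumptions, not the article's original code: the prompt wording, the `build_llm` helper and the `temperature=0` setting are made up for this example, the import paths (`langchain_openai`, `langchain_core`) differ between LangChain versions, and the model is called directly instead of through an agent. Third-party imports are placed inside the functions so that the module loads even before everything is installed.

```python
# Prompt wording is an assumption; adjust it to taste.
MATH_TEMPLATE = (
    "You are a careful math tutor. Solve the following problem step by step, "
    "then state the final answer on its own line.\n\n"
    "Problem: {question}"
)


def extract_text(image):
    """Run Tesseract OCR on a PIL image and return the stripped text."""
    import pytesseract  # requires the Tesseract binary on the system

    return pytesseract.image_to_string(image).strip()


def build_llm(model="gpt-4", temperature=0):
    """Create the chat model (hypothetical helper; reads OPENAI_API_KEY)."""
    from langchain_openai import ChatOpenAI  # older versions: langchain.chat_models

    return ChatOpenAI(model=model, temperature=temperature)


def solve_math(query, llm=None):
    """Format the question with the prompt template and return the model's reply."""
    from langchain_core.prompts import PromptTemplate  # older versions: langchain.prompts

    prompt = PromptTemplate.from_template(MATH_TEMPLATE)
    llm = llm or build_llm()
    response = llm.invoke(prompt.format(question=query))
    return response.content
```

Passing `llm` as an optional argument makes it possible to reuse one model instance across requests or to swap in a different chat model.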

Step 4: Processing Pipeline

  • Takes an image as input.
  • Uses OCR (extract_text) to extract text from the image.
  • Checks whether any text was detected; if not, returns a warning.
  • Sends the extracted text to solve_math to get the final answer.
  • Returns a formatted string showing both the extracted question and the final answer.
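These steps can be sketched as a single function. In this hedged version the OCR and solver functions are passed in as arguments (a design choice made for this example so that the flow can be tried with stand-ins); the name `process_image`, the warning text and the output layout are assumptions.

```python
NO_TEXT_WARNING = "No text detected in the image. Please upload a clearer image."


def process_image(image, extract_text, solve_math):
    """OCR the image, solve the extracted question and format both for display.

    `extract_text` maps an image to a string and `solve_math` maps a question
    string to an answer string.
    """
    question = extract_text(image)
    if not question or not question.strip():
        # Nothing readable was found, so skip the LLM call entirely.
        return NO_TEXT_WARNING
    answer = solve_math(question)
    return f"Extracted Question:\n{question}\n\nFinal Answer:\n{answer}"


# Trying the flow with stand-in functions instead of real OCR and a real LLM:
print(process_image(object(), lambda img: "2 + 2", lambda q: "4"))
```

Returning early when no text is found avoids spending an API call on an empty question.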

Step 5: Gradio UI

  • Takes a user-uploaded image, extracts the text via OCR and sends it to the LLM to solve the math question.
  • Displays both the extracted question and the final answer in the output box.
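A minimal Gradio interface along these lines is sketched below. The `build_demo` helper, the title and the labels are assumptions made for this example; `process_fn` stands for whatever function implements the pipeline from the previous step (PIL image in, result string out).

```python
def build_demo(process_fn):
    """Wrap a function that maps a PIL image to a result string in a Gradio app."""
    import gradio as gr  # imported here so the module loads without Gradio

    return gr.Interface(
        fn=process_fn,
        inputs=gr.Image(type="pil", label="Upload Question Image"),
        outputs=gr.Textbox(label="Result", lines=10),
        title="Math Application with LangChain Agents",
        description="Upload an image of a math question to get a step-by-step solution.",
    )


# Usage, assuming `my_pipeline` is the image-to-text pipeline function:
# demo = build_demo(my_pipeline)
# demo.launch()
```

With `type="pil"`, Gradio passes the upload to the function as a PIL image, which is the format pytesseract expects.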

Output:

  1. Upload the question image by dragging it in or selecting it from local files.
  2. Upload the question image by taking a picture with the webcam.
  3. Upload the question image by pasting the image URL from the clipboard.
  4. The sample question.
  5. The sample question while uploading.
  6. The output.

You can download source code from here.
