URL: https://thenewstack.io/tutorial-using-a-pre-trained-onnx-model-for-inferencing/

Tutorial: Using a Pre-Trained ONNX Model for Inferencing

In this tutorial, we will explore how to use an existing ONNX model for inferencing.
Jul 10th, 2020 10:21am by Janakiram MSV
Feature image by DavidRockDesign from Pixabay.
This post is the second in a series of introductory tutorials on the Open Neural Network Exchange (ONNX). Read part one here.

In the previous part of this series, I introduced the Open Neural Network Exchange (ONNX) and the ONNX Runtime as an interoperable toolkit and platform for machine learning and deep learning models.

In this tutorial, we will explore how to use an existing ONNX model for inferencing. In about 30 lines of code, including the preprocessing of the input image, we will run inference with the MNIST model to predict the digit shown in an image.

The objective of this tutorial is to familiarize you with the ONNX file format and runtime.

Setting up the Environment

To complete this tutorial, you need Python 3.x running on your machine. We will start by creating a Python 3 virtual environment to isolate the tutorial's dependencies from the main Python environment on the machine.

python3 -m venv onnx_mnist
source onnx_mnist/bin/activate

With the virtual environment in place, let’s install the Python modules needed by our program. The following command will install ONNX, ONNX Runtime, and OpenCV in your environment.

pip install onnx onnxruntime opencv-python

Let’s download and extract the pre-trained MNIST model, which was trained with the Microsoft Cognitive Toolkit (CNTK), from the ONNX Model Zoo.

wget https://www.cntk.ai/OnnxModels/mnist/opset_7/mnist.tar.gz
tar xvzf mnist.tar.gz

The above command results in a new directory called mnist that has the model and the test data serialized into ProtoBuf files. We are not going to use the test data for the tutorial.

We can now examine the model through the Netron tool by opening the model.onnx file.

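The MNIST model from the ONNX Model Zoo uses max pooling to downsample the feature maps produced by its convolutions, as shown in the graph from Netron. Max pooling has no trainable weights of its own; it only keeps the largest value in each window.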

[Image: Netron graph of the MNIST model]

The model has two convolutional layers, two maxpool layers, one dense layer, and an output layer that can classify one of the 10 values representing the labels used in the MNIST dataset.
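
To make the role of the maxpool layers concrete, here is a minimal NumPy sketch of max pooling. It assumes a generic 2x2 window with stride 2 on a single 2-D feature map, which is an illustration only and not necessarily the exact pooling configuration stored in model.onnx. The function name max_pool_2x2 is hypothetical.

```python
import numpy as np

def max_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """Downsample a 2-D feature map by keeping the max of each 2x2 block."""
    h, w = feature_map.shape
    # Drop a trailing row/column if the size is odd so the blocks tile evenly.
    trimmed = feature_map[: h - h % 2, : w - w % 2]
    # View the map as a grid of 2x2 blocks, then reduce each block to its maximum.
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# A small hand-made feature map (illustrative values only).
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
])

pooled = max_pool_2x2(feature_map)
print(pooled.shape)  # height and width are halved
print(pooled)
```

Note that the function contains no learnable parameters: the output depends only on the input values, which is why pooling reduces the spatial size without adding weights to the model.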


Writing Inference Code for Prediction

We will now write code for performing inference on the pre-trained MNIST model.

Let’s start by importing the right Python modules.

import json
import sys
import os
import time
import numpy as np
import cv2
import onnx
import onnxruntime
from onnx import numpy_helper

Notice that we are using ONNX, ONNX Runtime, and the NumPy helper modules related to ONNX.

The ONNX module helps in parsing the model file while the ONNX Runtime module is responsible for creating a session and performing inference.

Next, we will initialize some variables to hold the path of the model files and command-line arguments.

model_dir = "./mnist"
model = model_dir + "/model.onnx"
path = sys.argv[1]
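
If you want the script to fail with a clear message when the image argument is missing, one possible approach is sketched below. The helper parse_image_path and the file names predict.py and digit.png are hypothetical and only serve as an illustration.

```python
from pathlib import Path

def parse_image_path(argv: list) -> Path:
    """Return the image path from argv, or exit with a usage message."""
    if len(argv) != 2:
        raise SystemExit(f"usage: {argv[0]} <image-file>")
    return Path(argv[1])

# Build the model path from the directory created by extracting the archive.
model_path = Path("./mnist") / "model.onnx"

# In a real script you would pass sys.argv; a fixed list is used here
# so the sketch runs on its own.
image_path = parse_image_path(["predict.py", "digit.png"])
print(model_path.name, image_path)
```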

In the next step, we will load the image and preprocess it with OpenCV.

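# Preprocess the image
img = cv2.imread(path)  # OpenCV loads color images in BGR channel order
if img is None:
    sys.exit("Could not read image: " + path)
# Convert to grayscale; the luma weights are listed in B, G, R order
img = np.dot(img[..., :3], [0.114, 0.587, 0.299])
img = cv2.resize(img, dsize=(28, 28), interpolation=cv2.INTER_AREA)
# Reshape to the (batch, channel, height, width) layout
img = img.reshape(1, 1, 28, 28)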

The above code snippet converts the image to grayscale, resizes it to 28x28 pixels, and reshapes it into a 1x1x28x28 array (batch, channel, height, width). This array will be used as the input to the model.
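
The grayscale and reshape steps can be checked without OpenCV. The sketch below uses a synthetic 28x28 image as a stand-in for the array that cv2.imread and cv2.resize would produce, which is an assumption made so the example runs on its own. Because OpenCV returns channels in BGR order, the luma weights are listed in that order here.

```python
import numpy as np

# Synthetic 28x28 color image in BGR channel order (stand-in for a loaded image).
bgr = np.zeros((28, 28, 3), dtype=np.uint8)
bgr[..., 2] = 255  # pure red: B=0, G=0, R=255

# Luma weights in B, G, R order to match the channel layout.
weights_bgr = np.array([0.114, 0.587, 0.299])
gray = bgr @ weights_bgr  # weighted sum over the channel axis, shape (28, 28)

# Add batch and channel axes to get the (N, C, H, W) layout.
tensor = gray.reshape(1, 1, 28, 28)

print(gray.shape, tensor.shape)
print(float(gray[0, 0]))  # red contributes 255 * 0.299
```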

We will now convert the image into a NumPy array of type float32.

data = json.dumps({'data': img.tolist()})
data = np.array(json.loads(data)['data']).astype('float32')
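
The JSON round trip in the snippet above is not required for the conversion itself. As a small sketch, using a synthetic array as a stand-in for the preprocessed image (an assumption for illustration), you can verify that a direct astype call yields the same float32 array:

```python
import json
import numpy as np

# Stand-in for the preprocessed image: float64 values, shape 1x1x28x28.
img = np.arange(28 * 28, dtype=np.float64).reshape(1, 1, 28, 28) / 3.0

# Round trip through JSON, as in the tutorial code.
via_json = np.array(json.loads(json.dumps({'data': img.tolist()}))['data']).astype('float32')

# Direct cast without serialization.
direct = img.astype(np.float32)

print(via_json.dtype, via_json.shape)
print(np.array_equal(via_json, direct))
```

The JSON form is mainly useful when the image has to travel over the network, for example to a scoring service; for local inference the direct cast does the same job.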

We are now ready to pass the data to the model for inference.

# Create an inference session and look up the input and output node names
session = onnxruntime.InferenceSession(model, None)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
print(input_name)
print(output_name)

We need to use the same names as the input and output nodes of the neural network. You can retrieve them with the session.get_inputs() and session.get_outputs() methods. The output from the above snippet matches the input and output node names shown by Netron.

[Image: input and output node names of the model in Netron]

Let’s pass the input to the session and print the prediction.

result = session.run([output_name], {input_name: data})
prediction=int(np.argmax(np.array(result).squeeze(), axis=0))
print(prediction)

We apply NumPy’s argmax function to get the index of the highest score in the output, which is the predicted digit.
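
The sketch below walks through the same argmax step on a made-up result, and additionally applies a softmax to turn the scores into values that sum to 1. The score values are hypothetical, and the softmax step assumes the model returns unnormalized scores rather than probabilities; it is not part of the tutorial code.

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Convert raw scores to probabilities that sum to 1."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical stand-in for the value returned by session.run:
# a list holding one array of shape (1, 10), one score per digit.
result = [np.array([[-1.2, 0.3, 8.5, 2.1, -0.7, 0.0, 1.4, -3.3, 4.2, 0.9]],
                   dtype=np.float32)]

scores = np.array(result).squeeze()  # shape (10,)
prediction = int(np.argmax(scores))  # index of the largest score
probabilities = softmax(scores)
confidence = float(probabilities[prediction])

print(prediction)
print(confidence)
```

Since softmax preserves the ordering of the scores, the predicted digit is the same with or without it; the extra step only helps when you want to report a confidence value.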

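Try running the code by passing the path to an image of a handwritten digit as the first command-line argument. The script prints the predicted digit, and for clearly written digits the prediction is usually correct.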

Here is the complete code for your reference:

import json
import sys
import os
import time
import numpy as np
import cv2
import onnx
import onnxruntime
from onnx import numpy_helper

model_dir = "./mnist"
model = model_dir + "/model.onnx"
path = sys.argv[1]

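# Preprocess the image
img = cv2.imread(path)  # OpenCV loads color images in BGR channel order
if img is None:
    sys.exit("Could not read image: " + path)
# Convert to grayscale; the luma weights are listed in B, G, R order
img = np.dot(img[..., :3], [0.114, 0.587, 0.299])
img = cv2.resize(img, dsize=(28, 28), interpolation=cv2.INTER_AREA)
# Reshape to the (batch, channel, height, width) layout
img = img.reshape(1, 1, 28, 28)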

data = json.dumps({'data': img.tolist()})
data = np.array(json.loads(data)['data']).astype('float32')
session = onnxruntime.InferenceSession(model, None)
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
#print(input_name)
#print(output_name)

result = session.run([output_name], {input_name: data})
prediction=int(np.argmax(np.array(result).squeeze(), axis=0))
print(prediction)

In the next part of this tutorial, we will learn how to export a PyTorch model and convert it into a TensorFlow saved model file. Stay tuned.

Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.

Janakiram MSV (Jani) is a practicing architect, research analyst, and advisor to Silicon Valley startups. He focuses on the convergence of modern infrastructure powered by cloud-native technology and machine intelligence driven by generative AI. Before becoming an entrepreneur, he spent...