URL: https://thenewstack.io/4-reasons-your-ai-agent-needs-code-interpreter/

4 Reasons Your AI Agent Needs Code Interpreter - The New Stack


2024-05-13 10:00:27
4 Reasons Your AI Agent Needs Code Interpreter
contributed,
AI / Large Language Models

We will see code interpreters powering even more AI agents and apps as a part of the new ecosystem being built around LLMs, where a code interpreter represents a crucial part of an agent’s brain.
May 13th, 2024 10:00am by Vasek Mlejnsky

Building AI agents is hard. You’ll struggle with hallucinations, with keeping agents on track and with steering them toward the right tools.

One way to overcome these problems is to give agents code-execution capabilities.

Here are some reasons why your AI agent should have a code interpreter.

1. Extra Skills

Agents with code interpreters gain powers like performing a statistical analysis of CSV files or plotting charts.
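As a rough illustration of the kind of code such an agent might write and run for a CSV analysis request, here is a minimal sketch using only the Python standard library. The sample data, the column name `close` and the helper `summarize_column` are made up for this example and do not come from any particular agent.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE_CSV = """date,close
2024-05-06,10.0
2024-05-07,12.0
2024-05-08,11.0
2024-05-09,15.0
"""


def summarize_column(csv_text: str, column: str) -> dict:
    """Return basic descriptive statistics for one numeric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


summary = summarize_column(SAMPLE_CSV, "close")
print(summary)
```

The point is that the numbers come from executing code against the file, not from the model estimating them in text.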

When you ask different agents for the same thing, it becomes evident how much those with an underlying code interpreter differ. The following tasks are almost impossible to finish without running code:

  • Analyze NVIDIA stock and predict its development.
  • Play a game of poker with me.
  • Book me a flight.

See how Perplexity (an agent without a code interpreter) deals with a data analysis task. Even when provided a data file, the agent cannot finish the task — the best it can do is provide advice on what code I should run.

[Image: Perplexity’s response to the data analysis task]

Here is how ChatGPT with an underlying code interpreter would deal with the same task…

[Image: ChatGPT’s response to the same task]

… including the installation of new packages and generating a chart.

[Image: ChatGPT installing packages and generating a chart]

Note that the end users don’t need to be aware that the app carries out coding tasks behind the scenes since the primary objective (like “book me a flight”) often doesn’t revolve around coding.

2. Complex Reasoning

Large language models (LLMs) are great at generating text but struggle with reasoning and complex thinking.

Google’s team drew an interesting parallel with the famous book “Thinking, Fast and Slow” by Daniel Kahneman. The ability to execute code equips agents with slow thinking (effortful, logical and calculating), whereas fast thinking (intuitive and automatic) describes how agents act without a code interpreter.

In their analogy, agents relying purely on LLMs can be thought of as operating without slow thinking, quickly producing text without deeper thought. Below is an example of how even simple tasks might require a systematic approach and cannot be answered by intuition alone.

[Image: example of a simple task that requires a systematic approach]
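To make the “slow thinking” idea concrete, here is a small sketch of two tasks that look trivial but are easy to get wrong by intuition and easy to get right by computation. These are hypothetical examples chosen for illustration, not the task from the original screenshot.

```python
from datetime import date


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def days_between(start: date, end: date) -> int:
    """Number of days from start to end, accounting for leap years."""
    return (end - start).days


# Hypothetical questions an agent could answer by running code
# instead of guessing.
r_count = count_letter("strawberry", "r")
elapsed = days_between(date(2024, 1, 1), date(2024, 5, 13))
print(r_count, elapsed)  # prints "3 133"
```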

3. Reducing LLM Hallucinations

A recent paper found that LLMs hallucinate on multistep tasks even when given reasoning prompts. Following up on the paper’s findings, a software engineer demonstrated that a code-interpreter-style LLM engine reduces hallucinations by an order of magnitude: in his tests, the GPT-4 hallucination rate dropped from <10% to <1%.

Agents with code interpreters can handle uploads and downloads, write code to look up data in source files and derive conclusions from the results instead of reasoning freestyle as simpler agents usually do.
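A minimal sketch of that lookup pattern might look like the following, assuming the source file is a small CSV. The data and the `lookup` helper are invented for illustration. The key design choice is that a missing record yields `None`, so the agent can report that the data is absent instead of inventing a value.

```python
import csv
import io
from typing import Optional

# Made-up source data standing in for a user-provided file.
SOURCE_CSV = """product,units_sold
widget,120
gadget,75
"""


def lookup(csv_text: str, key_column: str, key: str, value_column: str) -> Optional[str]:
    """Return the value for the first row whose key_column equals key, else None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[key_column] == key:
            return row[value_column]
    return None  # no matching record: nothing to report, so do not guess


found = lookup(SOURCE_CSV, "product", "gadget", "units_sold")
missing = lookup(SOURCE_CSV, "product", "gizmo", "units_sold")
print(found, missing)  # prints "75 None"
```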

Other ways to battle LLM hallucinations include RAG, fine-tuning and increasing the size of LLM context windows.

4. Testing

Another big challenge is LLM code generation. When an agent can not only generate but also run code, it’s able to test its own output and iterate on it.
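The generate, run and iterate loop can be sketched roughly as follows. This is a simplified illustration, not any specific product’s implementation: the list `ATTEMPTS` stands in for successive LLM outputs, and `passes_tests` and `first_passing` are hypothetical helpers. It uses `exec` for brevity, which is not safe for untrusted code.

```python
from typing import Callable, Iterable, Optional


def passes_tests(source: str, tests: Callable[[dict], None]) -> bool:
    """Execute generated source in a fresh namespace and run the tests against it."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # assumption: trusted toy code only
        tests(namespace)         # tests raise AssertionError on failure
        return True
    except Exception:
        return False


def first_passing(candidates: Iterable[str], tests: Callable[[dict], None]) -> Optional[str]:
    """Return the first candidate that passes, mimicking an iterate-until-green loop."""
    for source in candidates:
        if passes_tests(source, tests):
            return source
    return None


# Hypothetical attempts: a buggy first draft, then a corrected one.
ATTEMPTS = [
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a + b\n",
]


def check_add(ns: dict) -> None:
    assert ns["add"](2, 3) == 5
    assert ns["add"](-1, 1) == 0


accepted = first_passing(ATTEMPTS, check_add)
print(accepted == ATTEMPTS[1])  # prints "True"
```

In a real agent, a failed attempt’s error message would typically be fed back to the model to produce the next candidate.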

Building with Code Interpreters

I think we will see code interpreters powering even more AI agents and apps as a part of the new ecosystem being built around LLMs, where a code interpreter represents a crucial part of an agent’s brain. For inspiration to build, see popular open source products like Open Interpreter or AutoGen.

There are still challenges to overcome, such as finding a secure and optimal way to run the LLM-generated code, which can be solved by executing the processes in an isolated cloud environment.
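As a local stand-in for that idea, the sketch below runs generated code in a separate Python process with a timeout. This is only an illustration of the separation between the agent and the code it runs: a child process is not a real sandbox, and the `run_generated_code` helper is hypothetical. An isolated cloud environment would add the actual security boundary.

```python
import subprocess
import sys


def run_generated_code(source: str, timeout_s: float = 5.0) -> dict:
    """Run source in a child Python process and capture its output.

    Note: this limits runtime and keeps the code out of the agent's own
    process, but it does NOT restrict file or network access.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", source],  # -I: Python's isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


result = run_generated_code("print(2 + 2)")
print(result["ok"], result["stdout"].strip())  # prints "True 4"
```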

Vasek Mlejnsky is the CEO and co-founder of E2B — the open-source cloud runtime for AI agents. Over 200,000 agents have run on E2B. E2B has raised $3M pre-seed led by Kaya, Sunflower Capital investors like CEOs of Vercel, Supabase,...