
URL: https://zapier.com/blog/ai-models-on-zapier/



Which AI models can you automate on Zapier? (Opus 4.8, Gemini 3.5 Flash, and more)

By Steph Spector · June 10, 2026

New AI models launch practically every week, and keeping up with which ones to use for specific workflows is a job in itself. Consider this article your living reference.

At Zapier, we run every model through AutomationBench. It's our benchmark for testing how well models carry out multi-step workflows, not just static prompts.

Below, I'll walk through every major AI provider available on Zapier, the models you can plug into your Zap workflows today, and what each one is best for based on Zapier's AutomationBench. You'll also learn about direct AI integrations with hundreds of other AI apps—and how easy it is to automate AI with our built-in tool, AI by Zapier.

Zapier is the most connected AI orchestration platform, integrating with thousands of apps from partners like Google, Salesforce, and Microsoft. Use forms, data tables, and logic to build secure, automated, AI-powered systems for your business-critical workflows across your organization's technology stack.


AutomationBench, Zapier's benchmarking tool

As you scroll through the content below for OpenAI, Anthropic, and Gemini models, you'll notice a "best for" section based on AutomationBench. That's Zapier's benchmarking tool for measuring AI model efficacy.

The Zapier team built AutomationBench to determine which models to deploy on our platform. We couldn't find an AI benchmark that measured whether an AI model could do the messy, complicated work businesses actually rely on. Realizing that gap existed in the market, we made it public.

Every measured task is modeled on real workflow patterns we noticed on our platform. (No PII was used in the process, though.) To make scoring meaningful, we complicated those tasks to reflect the friction that shows up in real business environments. That included adding irrelevant data, hiding key info behind tool calls, introducing ambiguity about where the right info could be found, using similar naming conventions to create plausible wrong answers, and enforcing strict business policy rules with overriding priorities.

To show you what we mean by "complicated," here's an example task used for testing purposes:

There’s a scheduling conflict on February 20, 2026 at 2:00 PM — a Zoom meeting and a Google Calendar event overlap. Check the meeting priority policy in the spreadsheet to determine which one wins, then reschedule the loser by prepending [RESCHEDULED] to its topic/title. Post a summary to #ops-updates on Slack noting which meeting won and which was rescheduled, including both the Zoom meeting ID and Calendar event ID.

When it comes to scoring models, we don't evaluate how an agent completes the task. It doesn't matter which tools are called or in what order. We only look at the end state: whether the job got done and whether the agent caused any unintended side effects. This means a model that costs more but gets the job done will score higher than a cheaper one that doesn't.
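To make the end-state idea concrete, here's a minimal Python sketch of what a check for the example task above might look like. It's an illustration only, not AutomationBench's actual scoring code: the `EndState` fields, the `score_task` function, and the sample meeting names and IDs are all hypothetical. The point is that the check only reads the final state, never the list of tool calls.

```python
from dataclasses import dataclass, field


@dataclass
class EndState:
    """Hypothetical snapshot of the connected apps after the agent finishes."""
    zoom_topic: str
    calendar_title: str
    slack_messages: dict[str, list[str]] = field(default_factory=dict)
    unexpected_changes: list[str] = field(default_factory=list)


def score_task(state: EndState, loser: str, zoom_id: str, event_id: str) -> bool:
    """Pass only if the job is done and nothing else was touched.

    `loser` is "zoom" or "calendar": whichever meeting the priority policy
    says should be rescheduled. Which tools the agent called, and in what
    order, is deliberately not an input.
    """
    tag = "[RESCHEDULED]"
    zoom_tagged = state.zoom_topic.startswith(tag)
    calendar_tagged = state.calendar_title.startswith(tag)

    # Only the losing meeting may carry the tag.
    if loser == "zoom":
        renamed_correctly = zoom_tagged and not calendar_tagged
    else:
        renamed_correctly = calendar_tagged and not zoom_tagged

    # The Slack summary must mention both IDs in a single post.
    posts = state.slack_messages.get("#ops-updates", [])
    summary_posted = any(zoom_id in p and event_id in p for p in posts)

    no_side_effects = not state.unexpected_changes
    return renamed_correctly and summary_posted and no_side_effects


# Made-up final state in which the Zoom meeting lost and was renamed.
good = EndState(
    zoom_topic="[RESCHEDULED] Vendor sync",
    calendar_title="Board prep",
    slack_messages={
        "#ops-updates": ["Board prep (evt_42) won; Zoom meeting 9001 was rescheduled."]
    },
)
print(score_task(good, loser="zoom", zoom_id="9001", event_id="evt_42"))
```

In this sketch, an agent that renamed the wrong meeting, skipped the Slack post, or edited an unrelated record would fail, no matter how sensible its individual steps looked.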

Here are the top five models from our leaderboard. Percentages represent the share of workflow tasks each AI model was able to fully complete.

| Rank | Model | Provider | Score |
| --- | --- | --- | --- |
| 1 | Fable 5.0 (Max) | Anthropic | 17.4% |
| 2 | Fable 5.0 (XHigh) | Anthropic | 16.0% |
| 3 | Claude Opus 4.8 (XHigh) | Anthropic | 15.5% |
| 4 | Claude Opus 4.8 (Max) | Anthropic | 15.4% |
| 5 | Gemini 3.5 Flash (Medium) | Google | 14.5% |

OpenAI (ChatGPT) models

OpenAI's model lineup is the broadest on Zapier, spanning everything from budget-friendly mini models to advanced reasoning engines and specialized tools for transcription and image generation.

Best for: Sales and Marketing workflows. GPT-5.5 (XHigh) tops the leaderboard for both Sales and Marketing, making it the only model outside Anthropic's lineup to lead a domain this round.

What's new: OpenAI recently launched GPT-5.5, built for the most complex professional work: coding, research, data analysis, and autonomous multi-step tasks across tools. In Zapier's early testing through AutomationBench, GPT-5.5 became the first model to break 10%.

| Model | Best for | Inputs | Outputs | Context window | Output pricing (per 1M tokens) |
| --- | --- | --- | --- | --- | --- |
| GPT-5.5 Pro | Problems that need the deepest reasoning and highest reliability, where getting it right matters more than speed or cost | Text, images | Text | 1 million tokens | $180 |
| GPT-5.5 | Complex professional work, including coding, research, data analysis, and autonomous multi-step tasks across tools | Text, images | Text | 1 million tokens | $30 |
| GPT-5.4 nano | High-volume, repeatable tasks where speed and cost matter most, like classification, data extraction, and ranking | Text, images | Text | 400,000 tokens | $1.25 |
| GPT-5.4 mini | Complex, multi-step workflows that need fast reasoning across different content types and tools | Text, images | Text | 400,000 tokens | $4.50 |
| GPT-5.4 | Complex, multi-step professional workflows that need deep reasoning and planning | Text, images, audio | Text | 1,050,000 tokens | $15 |
| GPT-5.3 | Fast, context-aware chat and search | Text, images | Text | 128,000 tokens | $14 |
| GPT-5.2 | Advanced coding and agentic tasks with reliable multi-step reasoning | Text, images | Text | 128,000 tokens | $14 |
| GPT-5 mini* | Affordable reasoning and logic for well-defined tasks | Text, images | Text | 400,000 tokens | $2 |
| GPT-5 nano* | Very affordable reasoning and logic for summaries, classification, and other lightweight tasks | Text | Text | 400,000 tokens | $0.40 |
| GPT-4o mini* | Multimodal on a budget | Text, images, audio | Text | 128,000 tokens | $0.60 |
| GPT-4o | Multimodal tasks, especially live, human-like voice and vision interaction | Text, images, audio, video | Text | 128,000 tokens | $10 |
| GPT-4.1 mini | Balancing power, performance, and affordability for general-purpose workloads | Text, images | Text | 1,047,576 tokens | $1.60 |
| GPT-4.1 | Complex tasks that don't require advanced reasoning, with very long context windows | Text, images | Text | 1,047,576 tokens | $8 |
| GPT-4.1 nano* | Simple tasks where speed and price matter more than raw capability | Text | Text | 1,047,576 tokens | $0.40 |
| o4-mini | Fast, cost-efficient reasoning | Text, images | Text | 200,000 tokens | $4.40 |
| o3-mini | Lightweight, lower-cost alternative to o3 for reasoning-heavy tasks | Text | Text | 200,000 tokens | $4.40 |
| o3 | Advanced reasoning and logic | Text, images | Text | 200,000 tokens | $8 |
| GPT Image 1.5 | State-of-the-art image generation | Text, images | Text, images | N/A | $10 |
| GPT Image 1 | Image generation | Text, images | Images | N/A | $40 |

*Can be used for free in AI by Zapier

Note: You'll see additional models inside the OpenAI integration on Zapier. OpenAI sometimes retires models from its product while keeping them in the API, deprecating them from the API on a separate schedule. We recommend building new workflows on the models listed above, but you can also see the complete list below.
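To get a feel for what the "Output pricing (per 1M tokens)" column means for a real workflow, here's a rough Python sketch. The three prices come from the table above; the workload (10,000 runs a month at 500 output tokens per run) and the `monthly_output_cost` helper are assumptions made up for illustration. It only counts output tokens, since input pricing isn't listed here, so treat the result as a partial estimate rather than a full bill.

```python
# Output prices in USD per 1M tokens, copied from the table above.
OUTPUT_PRICE_PER_MILLION = {
    "GPT-5.5": 30.00,
    "GPT-5.4 mini": 4.50,
    "GPT-5.4 nano": 1.25,
}


def monthly_output_cost(model: str, runs_per_month: int, output_tokens_per_run: int) -> float:
    """Estimate monthly output-token spend in USD for a single AI step."""
    total_tokens = runs_per_month * output_tokens_per_run
    return total_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION[model]


# Assumed workload: 10,000 runs a month, 500 output tokens per run.
for model in OUTPUT_PRICE_PER_MILLION:
    print(f"{model}: ${monthly_output_cost(model, 10_000, 500):.2f}")
# GPT-5.5: $150.00
# GPT-5.4 mini: $22.50
# GPT-5.4 nano: $6.25
```

For a high-volume classification step, that spread is the practical argument for starting with a nano or mini model and moving up only if quality falls short.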


Anthropic (Claude) models

Anthropic's Claude models are known for strong writing quality, careful instruction-following, and a safety-first design philosophy. Claude is a popular choice for tasks like drafting long-form content, analyzing documents, and powering customer-facing chatbots that need a natural, conversational tone.

Best for: Operations, Support, Finance, and HR workflows. Fable 5.0 (Max) tops Operations, Finance, and HR, while Claude Opus 4.8 (XHigh) leads Support. Opus 4.8 (Max) also takes second place in Sales.

What's new: Opus 4.8 is now available on Zapier. Like the model before it, it excels at complex reasoning and agentic coding. But where Opus 4.7 would decline sensitive tasks outright, this new model is more persistent. As proof, Opus 4.8 scored roughly six times higher than Opus 4.7 on HR workflows, which tend to involve exactly the kind of sensitive, policy-adjacent tasks that tripped up Opus 4.7.

| Model | Best for | Inputs | Outputs | Context window | Output pricing (per 1M tokens) |
| --- | --- | --- | --- | --- | --- |
| Fable 5.0 | The most demanding reasoning and long-horizon agentic work | Text, images | Text | 1 million tokens | $50 |
| Sonnet 4.6 | Coding, agents, and enterprise workflows; best balance of price and performance | Text, images | Text | 1 million tokens | $15 |
| Opus 4.8 | Complex reasoning, agentic coding, long-running tasks | Text, images | Text | 1 million tokens | $25 |
| Opus 4.7 | Complex reasoning, agentic coding | Text, images | Text | 1 million tokens | $25 |
| Opus 4.6 | Complex reasoning, coding, long-horizon tasks | Text, images | Text | 1 million tokens | $25 |
| Haiku 4.5 | High-volume, latency-sensitive, cost-efficient tasks | Text, images | Text | 200,000 tokens | $5 |
| Sonnet 4.5 | Complex agents, coding; highest general intelligence | Text, images | Text | 200,000 tokens | $15 |
| Opus 4.1 | Complex reasoning, analysis, creative tasks | Text, images | Text | 200,000 tokens | $75 |
| Sonnet 4 | Balanced coding and workflows | Text, images | Text | 200,000 tokens | $15 |
| Haiku 3 | Fast, simple, cost-effective classification tasks | Text, images | Text | 200,000 tokens | $1.25 |


Gemini (Google AI Studio) models

The Gemini family stands out for its massive context windows, competitive pricing, and strong multimodal capabilities across text, images, audio, and video. Gemini models are a great fit for processing long documents, research-heavy workflows, and tasks where keeping costs low matters.

Best for: High-volume workflows where cost matters. Gemini 3.5 Flash (Medium) lands fifth overall at 14.5%—within three points of the top score, at roughly a quarter of the cost per task ($0.87 vs. $3.67). Gemini 3.5 Flash (Low) is the least expensive model on the leaderboard at $0.65 per task.
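The comparison above is easy to verify with a few lines of Python. The scores and per-task costs are the figures quoted in this article; the sketch assumes that $3.67 is the per-task cost of the top-scoring model, which is how the sentence reads. The "cost per completed task" figure at the end isn't a leaderboard metric. It's a back-of-the-envelope derivation that divides the average cost per attempt by the completion rate.

```python
top_score, flash_score = 17.4, 14.5   # % of tasks fully completed
top_cost, flash_cost = 3.67, 0.87     # USD per task attempted

gap_points = top_score - flash_score  # percentage-point gap to the top score
cost_ratio = flash_cost / top_cost    # Flash cost as a share of the top model's cost

# Derived, not from the leaderboard: average spend per fully completed task.
top_per_success = top_cost / (top_score / 100)
flash_per_success = flash_cost / (flash_score / 100)

print(f"Score gap: {gap_points:.1f} points")
print(f"Cost ratio: {cost_ratio:.0%}")
print(f"Cost per completed task: ${top_per_success:.2f} vs. ${flash_per_success:.2f}")
```

Under these assumptions, the gap is 2.9 points and the cost ratio is about 24%, which matches the "within three points" and "roughly a quarter" claims.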

What's new: Gemini 3.5 Flash is now available on Zapier. It excels at step coordination and strict policy adherence, the kind of work where other models tend to drift. But it can struggle with following strict output formats and with decisions that depend on math it has to do on its own.

| Model | Best for | Inputs | Outputs | Context window | Output pricing (per 1M tokens) |
| --- | --- | --- | --- | --- | --- |
| Gemini 3.5 Flash | Sub-agent deployment, multi-step workflows, and long-horizon tasks at scale | Text, images, audio, video, PDF | Text | 1 million tokens | $9 |
| Gemini 3.1 Pro | Complex reasoning, high-stakes coding, and massive data synthesis | Text, images, audio, video, PDF | Text, code, reasoning | 1 million tokens | $30 |
| Gemini 3 Flash | High-speed automation, real-time chat, and cost-effective scaling | Text, images, audio, video | Text, code | 1 million tokens | $0.30 |
| Gemini 3 Pro | Balanced professional workflows and creative content generation | Text, images, audio, video | Text, code | 1 million tokens | $3.75 |
| Gemini 2.0 Flash Lite* | Basic classification, ultra-low latency tasks, and simple extraction | Text, images | Text | 1 million tokens | $0.15 |
| Gemini 2.0 Flash* | Legacy support for high-throughput 2.0-era applications | Text, images, audio | Text, code | 1 million tokens | $0.30 |
| Gemini 2.5 Pro | Detailed multimodal analysis with high accuracy for older pipelines | Text, images, audio, video | Text, code | 2 million tokens | $10.50 |
| Gemini 2.5 Flash | Transition-tier speed for multimodal processing | Text, images, audio, video | Text, code | 1 million tokens | $0.90 |
| Nano Banana Pro | Professional-grade high-fidelity image generation and editing | Text, images | Images | N/A | $0.05 |

*Can be used for free in AI by Zapier

Note: You'll see additional models inside the Google (Gemini) integration on Zapier. We recommend building new workflows on the models listed above, but you can also see the complete list below.


What is AI by Zapier?

AI by Zapier is our built-in integration that lets you add AI steps directly to any Zap. It comes with several OpenAI and Google models out of the box, no account required, plus a prompt optimizer. But the real value is in how easy it is to swap models inside AI by Zapier without breaking your existing workflows.

When you're configuring an AI by Zapier step, you can select the model you want from a dropdown menu with just a couple of clicks. That's handy when an AI provider releases a model that leapfrogs the one you're currently using, or you're handing off a Zap template to a team that prefers automating with another model. Whoever manages the Zap can swap in their preferred model without having to fuss over deleting the original step and reconfiguring a new one from scratch.

Here's a snapshot of the models available through AI by Zapier today:

| Provider | Models |
| --- | --- |
| OpenAI (ChatGPT) | GPT-5.5 Pro, GPT-5.5, GPT-5.4 nano, GPT-5.4 mini, GPT-5.4, GPT-5.2, GPT-5, GPT-5 mini*, GPT-5 nano*, GPT-4o mini*, GPT-4.1 nano*, o3, o3-mini, o1 |
| Anthropic (Claude) | Opus 4.8, Opus 4.7, Opus 4.6, Haiku 4.5, Opus 4.5, Sonnet 4.6 |
| Google (Gemini) | Gemini 3.5 Flash, Gemini 3.1 Pro, Gemini 3 Pro, Gemini 2.5 Pro, Gemini 2.5 Flash Lite*, Gemini 2.5 Flash*, Gemini 2.0 Flash Lite*, Gemini 2.0 Flash |
| Azure OpenAI | Uses the AI models you've already set up in your own Azure OpenAI account. The exact models depend on what your Azure admin has turned on. |
| Amazon Bedrock | Uses the AI models your company has access to in Amazon Bedrock. The exact models depend on what's enabled in your AWS account and region. |

*Can be used for free in AI by Zapier


Other AI apps on Zapier

These aren't the only providers in town. Zapier also integrates directly with hundreds of specialized AI apps, including:

  • —xAI's conversational model, with real-time access to web and X data and a more irreverent tone than most assistants.

  • —A cost-efficient model with strong coding and reasoning chops, popular for technical workflows on a budget.

  • —Models with strong instruction-following and multilingual performance, ranging from fast lightweight options to larger frontier models.

  • —A single integration that gives you access to models from dozens of providers, so you can mix and match without managing multiple connections.

  • —Not a model, but a hardware-accelerated inference engine. Use it when speed is the priority and you need near-instant response.

  • —Specializes in speech-to-text and audio intelligence, including transcription, speaker detection, and sentiment analysis.

  • —Google's enterprise AI platform, ideal for teams already in the Google Cloud ecosystem who need more control and customization.

With Zapier, you're never locked into a single model or provider. You can take advantage of every app's unique strengths and experiment with whatever fits your workflow best. Browse the full list in our ever-growing directory of AI apps.

Connect to the latest AI models on Zapier

Whether you're just getting started with AI automation or you're deep into building multi-step workflows, Zapier gives you the flexibility to use the AI tools and models that actually fit your needs. The landscape keeps evolving, and so will this guide. Bookmark it and check back whenever a new model drops.

This article was originally published in March 2026. It was most recently updated in June 2026.


Steph Spector

In the fifth grade, Steph defeated the school bully in a bongo drum contest, her greatest achievement to date. Between writing about AI and automation for Zapier, she provides executive writing coaching from her home in Austin, Texas. To say hi, visit stephspector.com.
