👁 Blank white background with no objects or features visible.

TrueFoundry recognized in Gartner Hype Cycle for Platform Engineering 2026. Read the full report →

Join our VAR & VAD ecosystem — deliver enterprise AI governance across LLMs, MCPs & Agents. Become a Partner →

Book Demo

👁 Three horizontal black bars of varying lengths on a white background, menu or list icon symbol.

👁 bg

👁 Blank white background with no objects or features visible in the empty space provided entirely.

Go back

👁 TrueFoundry Logo

Try TrueFoundry — Live, Right Now

Get instant access to a live TrueFoundry environment. Deploy models, route LLM traffic, and explore the full platform — your sandbox is ready in seconds, no credit card required.

9.9

👁 Red star symbol on white background, a five-pointed star icon in a blurry coral color.
👁 C2 logo with stylized orange letter and arrow symbol on a white background.

Loved by Enterprises and Startups

👁 Cargill logo with stylized gray swoosh above the company name on a white background.
👁 MAVENIR logo with stylized text and underline on the letter M in black on white background.
👁 Whatfix software logo with stylized letter W and trademark symbol on white background.
👁 Wadhwani AI logo featuring a stylized starburst design on a clean white background.
👁 Games logo with stylized sunburst design on white background.
👁 Grey Aviso logo featuring a stylized triangle with a dot on a white background.
👁 Aviva logo displayed on a white background with dark grey text and distinctive dot design element.
👁 JanitorAI Logo

Benchmarking the TrueFoundry LLM Gateway: it's blazing fast ⚡

👁 Image

By Srihari Radhakrishna

Published: January 19, 2026

👁 Image

Built for Speed: ~10ms Latency, Even Under Load

Blazingly fast way to build, track and deploy your models!

Handles 350+ RPS on just 1 vCPU — no tuning needed
Production-ready with full enterprise support

Get Started with Truefoundry Now Talk to the Expert

TrueFoundry LLM Gateway provides a unified OpenAI compatible interface to various LLM providers like Anthropic, OpenAI, Bedrock, Gemini and many others
TrueFoundry LLM Gateway scales seamlessly to 350 RPS on a single replica of 1 unit CPU while using 270 MB of memory. We compared with another gateway product, LiteLLM, on a similar setup and LiteLLM failed to scaled beyond 50 RPS
TrueFoundry LLM Gateway only adds an extra latency of 3-5 ms, while LiteLLM adds between 15-30 ms per request.

Why does your org need an LLM Gateway?

An LLM Gateway provides a unified interface to manage your organisation's LLM usage:

Unified API: Access multiple LLM providers through a single OpenAI compatible interface, no code changes needed
API Key Security: Secure, centralised credential management
Governance & Control: Set limits, access controls, and content filtering
Rate Limiting: Prevent abuse and ensure fair usage
Observability: Track usage, costs, latency and performance
Load Balancing: Route requests across providers automatically
Cost Management: Monitor spending and set budget alerts
Audit Trails: Log all LLM interactions for compliance

How fast is TrueFoundry LLM Gateway?

Load Test Setup

For our load testing experiment, we setup a deployed this fake OpenAI endpoint service using TrueFoundry. The service would simulate OpenAI request and response format without actually producing tokens.

We also deployed the TrueFoundry LLM Gateway and LiteLLM Proxy Server, both running of a single replica with 1 unit CPU and 1 GB memory.

👁 Image

We added our fake OpenAI provider into both TrueFoundry and LiteLLM gateways. While load testing, we made requests to the fake OpenAI server in 3 different ways:

Setup 1: Directly without using any proxy or gateway
Setup 2: Through the TrueFoundry LLM Gateway deployed on 1 unit CPU and 1 GB memory
Setup 3: Through the LiteLLM Proxy Server deployed on 1 unit CPU and 1 GB memory

RPS	10 RPS	50 RPS	200 RPS	300 RPS
OpenAI direct (Setup 1)	73 ms	73 ms	73 ms	73 ms
TrueFoundry LLM Gateway (Setup 2)	76 ms (+3 ms)	76 ms (+3 ms)	76 ms (+3 ms)	77 ms (+4 ms)
LiteLLM Proxy (Setup 3)	88 ms (+15 ms)	99 ms (+26 ms)	Could not scale to 200 RPS	Could not scale to 300 RPS

Observations

TrueFoundry Gateway adds only extra 3 ms in latency upto 250 RPS and 4 ms at RPS > 300
TrueFoundry LLM Gateway was able to scale without any degradation in performance until about 350 RPS (1 vCPU, 1 GB machine) before the CPU utilisation reached 100% and latencies started getting affected. With more CPU or more replicas, the LLM Gateway can scale to tens of thousands of requests per second.
LiteLLM on the same machine was not able to scale beyond 40-50 RPS before reaching CPU limit

More metrics

Setup 1: Direct OpenAI endpoint calling

👁 Image

Stats @ 200 RPS

👁 Image

Stats @ 300 RPS

👁 Image

Response Time v/s RPS

‍Setup 2: TrueFoundry LLM Gateway

👁 Image

Stats @ 200 RPS

👁 Image

Stats @ 300 RPS

👁 Image

Response Time v/s RPS

Setup 3: LiteLLM

👁 Image

Stats @ ~58 RPS

👁 Image

Response times v/s RPS

Speed features of LLM Gateway

Near-Zero Overhead: Just 3-5 ms added latency
Optimised Backend: Built with performant Node.js framework
Config Caching: Config is stored in memory for quick look up
Smart Routing: Minimal processing overhead
Edge Ready: Deploy close to your apps
High Capacity: A t2.2xlarge AWS instance (43$ per month on spot) machine can scale upto ~3000 RPS with no issues.

👁 Image

Edge Deployment of TrueFoundry LLM Gateway

Supported Providers

Below is a comprehensive list of popular LLM providers that is supported by TrueFoundry LLM Gateway:

Provider	Streaming Supported
GCP	✅
AWS	✅
Azure OpenAI	✅
Self Hosted Models on TrueFoundry	✅
OpenAI	✅
Cohere	✅
AI21	✅
Anthropic	✅
Anyscale	✅
Together AI	✅
DeepInfra	✅
Ollama	✅
Palm	✅
Perplexity AI	✅
Mistral AI	✅
Groq	✅
Nomic	✅

TrueFoundry AI Gateway delivers ~3–4 ms latency, handles 350+ RPS on 1 vCPU, scales horizontally with ease, and is production-ready, while LiteLLM suffers from high latency, struggles beyond moderate RPS, lacks built-in scaling, and is best for light or prototype workloads.

Built for Speed: ~10ms Latency, Even Under Load

Schedule your Demo Now