![]() |
VOOZH | about |
TrueFoundry recognized in Gartner Hype Cycle for Platform Engineering 2026. Read the full report →
Join our VAR & VAD ecosystem — deliver enterprise AI governance across LLMs, MCPs & Agents. Become a Partner →
Get instant access to a live TrueFoundry environment. Deploy models, route LLM traffic, and explore the full platform — your sandbox is ready in seconds, no credit card required.
Blazingly fast way to build, track and deploy your models!
We conducted a webinar with the team at Palo Alto Networks with 400+ people joining from different regions and functions. Our goal at TrueFoundry was always to create infrastructure that supports your GenAI stack and we decided to pick up security as our next milestone. In this blog, we distill down all important key factors that we discussed and learnt during the webinar and boil it down to some key factors which will help you make decisions on scaling AI initiatives across your company. It’s a quick 6-7 min read, and also serves as a practical guide on AI Security that you can share with security leaders, platform teams, and AI builders.
Traditional security assumed that if an adversary couldn’t pass your firewall, they can't reach your sensitive data. Today, an engineer can paste snippets of source code into a chat to “fix a bug” and walk sensitive IP straight out the door, without any firewall being touched. In this world, context and identity become the new perimeters: who is asking, what are they asking for, and what data and tools are being pulled into that context?
It’s a fundamental shift in the threat model. Instead of blocking ports or hardening endpoints alone, we have to secure the language interface to systems and data, and everything that language can cause the system to do.
Four failure modes show up first for most teams:
If you want to learn more, OWASP's AI Top 10 is a solid map. In practice we have seen these four areas are where most programs begin.
We ran quick polls during the session. The top concern, by a wide margin: data leakage/exfiltration, followed by prompt injection and lack of observability into model and agent behavior. That aligns with what we see in enterprise rollouts: if you secure inputs and outputs first, and then make every interaction traceable, you have mostly covered what's needed for AI security
Here’s the short checklist we’ve seen work across industries:
Now, LLMs no longer just talk—they can act. With the Model Context Protocol (MCP), agents discover tools and perform actions across GitHub, Jira, Slack, cloud APIs, internal systems, and more. That makes MCP a huge accelerator for developer productivity and a new class of risk. A poorly scoped agent can close 500 tickets overnight, exfiltrate data, or delete a branch because a sneaky instruction slipped in during a retrieval step.
Now you can see how the shift goes from “wrong answer” to “unauthorized actions”. Tools multiply both value and risk, so changes must be made to accommodate these.
Key Metrics for Evaluating Gateway
| Criteria | What should you evaluate ? | Priority | TrueFoundry |
|---|---|---|---|
| Latency | Adds <10ms p95 overhead for time-to-first-token? | Must Have | ✅ Supported |
| Data Residency | Keeps logs within your region (EU/US)? | Depends on use case | ✅ Supported |
| Latency-Based Routing | Automatically reroutes based on real-time latency/failures? | Must Have | ✅ Supported |
| Key Rotation & Revocation | Rotate or revoke keys without downtime? | Must Have | ✅ Supported |
| Key Rotation & Revocation | Rotate or revoke keys without downtime? | Must Have | ✅ Supported |
| Key Rotation & Revocation | Rotate or revoke keys without downtime? | Must Have | ✅ Supported |
| Key Rotation & Revocation | Rotate or revoke keys without downtime? | Must Have | ✅ Supported |
| Key Rotation & Revocation | Rotate or revoke keys without downtime? | Must Have | ✅ Supported |
When MCP enters the picture, elevate your baseline in these areas:
Agents are most useful when they can call tools to get real work done. MCP standardizes how agents discover and invoke those tools. But power demands control. Our recommended pattern is a two-layer design:
Putting control in one place gives platform teams a great developer experience (one endpoint, unified SDKs, self-serve onboarding). Putting validation in its own layer gives security teams deep hooks and evidence (guardrail verdicts, DLP masking, URL/code inspection, and posture checks) without blocking developer velocity.
Our CTO Abhishek walkthroughs a simple model for MCP authorization within TrueFoundry's AI Gateway:
This “3-layer authN/Z” pattern removes one-off secrets, keeps attribution, and prevents over-privileged bots from doing irreversible damage.
Example: A PR-review agent can list PRs and leave comments, but AI gateway policy denies branches.delete even if a prompt tries “also delete main after summarizing.” The denial—and the identity behind the attempt—are logged.
We also covered what are virtual MCP servers. In short, don’t hand agents the entire Jira/GitHub/Confluence catalogs. Compose a Virtual MCP that exposes just what the workflow needs—for instance: Jira.create_issue, GitHub.create_pr, and Confluence.search_docs as a single Engineering MCP endpoint. You get cleaner prompts, faster approvals, and a smaller blast radius by design.
The best MCP gateways act like a policy-aware switchboard between agents and tools:
If you compare “no mcp gateway” vs. “with mcp gateway,” the difference is stark: fewer bespoke connections, central authentication flows, centralized credential management, enterprise-grade audit trails, and a vetted server catalog instead of a sprawl of unmanaged scripts.
You can tune safety profiles per application to balance latency with protection, and all decisions are logged for audit purposes.
We demonstrated an agent that lists open pull requests for a teammate and nudges them on Slack. The trace shows each step—tool calls, model reasoning, timing—and the guardrail decisions along the way. The dashboard highlights which tools consume the most calls, latency spikes, and any loop-like behavior. That’s the difference between hoping an agent is safe and knowing it is. You can watch it in the youtube link, we have attached later in the blog.
“Why not just use a normal API gateway?”
Because AI workloads add new requirements: token translation per tool, tool-level RBAC, input/output guardrails, rich auditing/traceability, and smart tool routing. A classic gateway knows nothing about prompts, traces, or model/tool semantics.
“Can we monitor existing tools like ChatGPT or cloud IDEs?”
If the tool lets you set a proxy or custom model endpoint, route it through the gateway and enforce guardrails/observability there. If it does not expose that control, you cannot add transparent inspection in the middle.
“How do we avoid harmful fine-tuning?”
Watch your data: exclude PII and secrets. Validate provenance for any base model you ingest. Scan model files for hidden behaviors before you train.
“Do developers get SDKs?”
Standard MCP clients work out of the box. Prisma AIRS exposes a server and SDK, and the gateway provides copy-paste snippets for Python/TypeScript and popular agent frameworks.
If you missed it—or want your platform and security teams to catch up—here’s the recording:
YouTube: https://youtu.be/hWNV2v3C8SA
We’re offering a one-month trial of the MCP Gateway—available as SaaS or self-hosted. If you want a 30-minute architecture review to calibrate costs, latency, and guardrail profiles, we’re happy to help. You’ll get:
Huge thanks to the Palo Alto Networks team for joining us and for pushing the state of AI security forward.
Enterprise AI security is the multi-layered framework of protocols, tools, and policies designed to protect AI models and corporate data from specialized threats like prompt injection and data exfiltration. It involves implementing strict access controls and real-time monitoring to ensure that model interactions remain safe and compliant. TrueFoundry centralizes these requirements into a single control plane, allowing organizations to manage security across various models and internal systems without increasing architectural complexity.
Maintaining enterprise AI security is critical for preventing the accidental exposure of sensitive PII and intellectual property to external model providers. Without a dedicated security layer, organizations risk regulatory non-compliance and vulnerability to adversarial attacks that can compromise autonomous agents. TrueFoundry addresses these concerns by deploying the entire AI stack within your private VPC, ensuring that data residency is maintained and every tool invocation is fully auditable and governed.
Within a holistic enterprise AI security strategy, MCP gateway runtime guardrails act as a proactive defense layer that inspects every interaction in real-time. These guardrails automatically redact sensitive information and block malicious inputs before they reach the model or internal backend systems. TrueFoundry’s gateway uses these guardrails to enforce fine-grained permissions on tool usage, providing a layer of dynamic protection that ensures agents operate within safe, predefined boundaries.
TrueFoundry AI Gateway delivers ~3–4 ms latency, handles 350+ RPS on 1 vCPU, scales horizontally with ease, and is production-ready, while LiteLLM suffers from high latency, struggles beyond moderate RPS, lacks built-in scaling, and is best for light or prototype workloads.
Product
Company
Resources