URL: https://thenewstack.io/the-case-for-a-federated-data-access-layer-with-graphql/

⇱ The Case for a Federated Data Access Layer with GraphQL - The New Stack



The Case for a Federated Data Access Layer with GraphQL

While data catalog and SQL federation have a role to play, GraphQL leads to faster time to value across a larger and heterogeneous set of backends.
Sep 13th, 2022 7:25am by Anant Jhingran
StepZen sponsored this post.
The problem of data being siloed across multiple systems, yet applications wanting one view of the data, has been a universal one for decades. Traditionally, there are three parts to effectively achieving one view of the data:
  • A metadata view of enterprise data sources. Data catalogs or integration hubs typically specialize in these.
  • A uniform approach to accessing all the data sources, leveraging the metadata.
  • A runtime that supports translations and executions across all the backends.

Federated Data Access

Anant Jhingran
Anant is the founder and CEO of StepZen, a startup with a new approach for simplifying the way developers access the data they need to power digital experiences.
Gartner and others are beginning to describe these parts and the best practices surrounding them as data meshes or data fabrics. The exact terminology is not relevant (and no two people will agree on what any one term means), but what is important is that this is viewed as a “federated data access tier.” There are many technologies that federate over some subset of data sources. For example, Trino is a popular choice for federating over SQL backends. And there are specialized technologies for data catalogs, such as Alation and Informatica’s Integration Hub. In this article, we assert that GraphQL lays claim to an important piece of the puzzle. It’s not that GraphQL has all the capabilities that a Trino has or that its understanding of metadata is as powerful as a data catalog. But there are three very specific reasons why GraphQL is an excellent federated data access tier:
  • GraphQL can handle any backend, not just relational databases. It is based on JSON data merging, and JSON is much better than a table representation of databases as a universal representation of a wide variety of data sources.
  • GraphQL is designed for ease of consumption. A GraphQL query is easy to write, with no complex outer joins, equality clauses, case statements, etc. Of course, ease of consumption means that it does not have the full flexibility of SQL, but we believe that the tradeoffs it makes are exactly right — they support a better developer experience and democratize access to the data.
  • Perhaps most importantly, good GraphQL servers balance the three parts needed to achieve one view of the data (metadata, query and runtime) in an iterative manner. You are not building a data catalog before a use case has emerged, or designing query specs for some future, ideal state. As you bring in more backends, you expand your metadata, expand your query scope, and always have a working runtime. The federated data layer can itself be a federation of federated data layers (turtles all the way down), which leads to organizational efficiencies.
Let’s elaborate on each of these.
StepZen enables developers to easily build and deploy a single GraphQL API that gets the data they need from multiple backends. The API delivers the right data reliably, irrespective of backend protocols, schemas and authentications. We manage the API so that developers manage zero infrastructure.

Handling Heterogeneous Backends

Let us say that a frontend developer wants a “view” of a logged-in customer and all their orders and each order’s delivery status. Customer data may come from a microservice that returns the data in a JSON format. Order data might come from a database returning flat tabular data. And delivery status might come from a SOAP service that returns XML format. One could denormalize the data from each into its tabular structure and then join across, producing a massive denormalized table that the frontend developer must parse and restructure into a nested format. Alternatively, one could convert all data into a JSON format and stitch things together, leading to efficiencies in data generation and ease for the frontend developer. When data comes from heterogeneous backends and the frontend applications need JSON, it is better to treat all backends as JSON producers and the middleware as a JSON stitcher. And this is the view of GraphQL and why it is natural for these use cases.
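To make the "JSON stitcher" idea concrete, here is a minimal Python sketch. It is not how StepZen or any particular GraphQL server works internally; the payloads, field names and helper functions are all hypothetical stand-ins for the microservice, database and SOAP service described above. Each backend is first turned into a JSON-like structure, and only then are the pieces nested together.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads standing in for the three backends (assumptions, not real services).
CUSTOMER_JSON = '{"id": 1, "name": "John Doe", "email": "john.doe@example.com"}'
ORDER_ROWS = [  # flat tabular rows: (id, customerid, carrier)
    (10, 1, "UPS"),
    (11, 1, "FedEx"),
    (12, 2, "UPS"),
]
DELIVERY_XML = """
<deliveries>
  <delivery orderid="10"><status>delivered</status></delivery>
  <delivery orderid="11"><status>in transit</status></delivery>
</deliveries>
"""


def customer_from_json(payload):
    """The microservice already returns JSON, so parsing is all that is needed."""
    return json.loads(payload)


def orders_from_rows(rows, customer_id):
    """Turn flat database rows into a list of JSON-like objects for one customer."""
    columns = ("id", "customerid", "carrier")
    return [dict(zip(columns, row)) for row in rows if row[1] == customer_id]


def deliveries_from_xml(payload):
    """Turn the XML response into a lookup of order id -> delivery object."""
    root = ET.fromstring(payload.strip())
    return {
        int(node.get("orderid")): {"status": node.findtext("status")}
        for node in root.findall("delivery")
    }


def stitch_customer_view(customer_payload, order_rows, delivery_payload):
    """Nest orders under the customer and a delivery under each order."""
    customer = customer_from_json(customer_payload)
    deliveries = deliveries_from_xml(delivery_payload)
    customer["orders"] = [
        {**order, "delivery": deliveries.get(order["id"])}
        for order in orders_from_rows(order_rows, customer["id"])
    ]
    return customer


view = stitch_customer_view(CUSTOMER_JSON, ORDER_ROWS, DELIVERY_XML)
print(json.dumps(view, indent=2))
```

The frontend receives one nested document and never sees a wide denormalized table that it would have to regroup by customer and order.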

Ease of Consumption

The power of the federated data tier is that it abstracts away the backends. What good is it if those abstractions add cognitive complexity for the frontend developers? Compare this:
select
  row_to_json(cod) as customers
from
  (
    select
      c.*,
      json_agg(row_to_json(od.*)) as orders
    from
      customer c
      left outer join (
        select
          o.*,
          json_agg(row_to_json(d.*)) as deliveries
        from
          orders o
          left outer join (
            select * from delivery
          ) d on (d.orderid = o.id)
        group by
          o.id,
          o.customerid,
          o.carrier
      ) od on (c.id = od.customerid)
    where
      c.email = 'john.doe@example.com'
    -- the outer json_agg also needs a grouping; this assumes customer.id is the primary key
    group by
      c.id
  ) cod

With a three-level query in GraphQL:
{
  customer(email: "john.doe@example.com") {
    name
    orders {
      carrier
      delivery {
        status
      }
    }
  }
}

Clearly, the latter GraphQL query is much easier and more intuitive for frontend developers. (The author has been doing SQL for a very long time but still has to look up complex queries, and the `json_agg` syntax is so complicated and non-standard.) GraphQL has been touted as reducing data transfer. It does. But that is only one reason for its popularity. It is very intuitive and enables the frontend developers to ask for, and get, exactly what they want.
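The "ask for, and get, exactly what they want" property can be illustrated with a few lines of Python. This is only a sketch of the idea of a selection set, not an implementation of GraphQL execution: the `select` function, the dict-based selection format and the sample record are all hypothetical.

```python
def select(data, selection):
    """Return only the fields named in `selection`.

    `selection` maps a field name to None (a leaf field) or to a nested
    selection dict, mirroring the nesting of a GraphQL query.
    """
    if data is None:
        return None
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data.get(field)
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


# Hypothetical record holding more fields than the frontend needs.
record = {
    "id": 1,
    "name": "John Doe",
    "email": "john.doe@example.com",
    "orders": [
        {"id": 10, "customerid": 1, "carrier": "UPS",
         "delivery": {"status": "delivered", "signed_by": "J. Doe"}},
        {"id": 11, "customerid": 1, "carrier": "FedEx", "delivery": None},
    ],
}

# Same shape as the three-level query: customer { name orders { carrier delivery { status } } }
selection = {"name": None, "orders": {"carrier": None, "delivery": {"status": None}}}

shaped = select(record, selection)
print(shaped)
```

The response has the same shape as the request, and fields the query did not name (`email`, `signed_by` and the ids) never reach the client.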

Iterative Approach to Federation

With any cross-enterprise initiative, you have to be very practical. Building a metadata catalog is well and good, but is it putting the cart before the horse? Good GraphQL implementations (such as StepZen’s) pull in just enough metadata to ensure that queries can be scattered to various backends correctly and gathered (stitched) correctly. As more backends are brought in, the metadata information gets richer, but so do the query possibilities. Because the runtime keeps up with the metadata, the system is always up and running and provides the usefulness of the data and query at all times. Furthermore, by having stitching naturally built into it, GraphQL enables federation across teams. So an enterprise can have an e-commerce team, a marketing team and a supply chain team each building its own federated data layer, and yet the enterprise can federate over these federated layers. The only difference is that for the enterprisewide federated layer the backends would be GraphQL backends, not database or REST or SOAP backends.
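The iterative, layered approach can be sketched in Python as follows. Everything here is an assumption made for illustration: the `FederatedLayer` class, its method names and the sample data are hypothetical and do not describe StepZen's API or the GraphQL specification. The sketch shows two things: a layer that is queryable from the first backend onward, and a layer that can serve as a backend of a larger layer.

```python
class FederatedLayer:
    """Toy federated layer: a field name maps to a resolver for one backend."""

    def __init__(self, name):
        self.name = name
        self._resolvers = {}  # field name -> callable(key) returning JSON-like data

    def add_backend(self, field, resolver):
        """Registering a backend grows the metadata and the query scope together."""
        self._resolvers[field] = resolver

    def fields(self):
        """The metadata known so far: just the field names."""
        return sorted(self._resolvers)

    def query(self, key, fields):
        """Scatter the request to each backend, then gather into one object."""
        unknown = [f for f in fields if f not in self._resolvers]
        if unknown:
            raise KeyError(f"{self.name} has no backend for: {unknown}")
        return {f: self._resolvers[f](key) for f in fields}

    def as_backend(self, fields):
        """Expose this whole layer as a single backend of a larger layer."""
        return lambda key: self.query(key, fields)


# Hypothetical team-level data.
ORDERS = {1: [{"id": 10, "carrier": "UPS"}]}
DELIVERIES = {1: [{"orderid": 10, "status": "delivered"}]}
CAMPAIGNS = {1: ["fall-sale"]}

# The e-commerce team starts with one backend and is already queryable.
ecommerce = FederatedLayer("ecommerce")
ecommerce.add_backend("orders", lambda cid: ORDERS.get(cid, []))
first_result = ecommerce.query(1, ["orders"])

# Later a second backend arrives; metadata and query scope expand together.
ecommerce.add_backend("deliveries", lambda cid: DELIVERIES.get(cid, []))

marketing = FederatedLayer("marketing")
marketing.add_backend("campaigns", lambda cid: CAMPAIGNS.get(cid, []))

# The enterprise layer federates over the team layers rather than raw backends.
enterprise = FederatedLayer("enterprise")
enterprise.add_backend("commerce", ecommerce.as_backend(["orders", "deliveries"]))
enterprise.add_backend("marketing", marketing.as_backend(["campaigns"]))

result = enterprise.query(1, ["commerce", "marketing"])
print(result)
```

In this sketch the enterprise layer does not know or care whether a backend is a database, a REST service or another federated layer, which is what allows each team to evolve its own layer independently.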

Summary

While we are still in the early innings of truly enterprisewide federation architectures, we are very bullish on GraphQL providing a major underpinning of this advancement. We believe that while individual technologies like data catalog and SQL federation have a role to play, the GraphQL approach leads to faster time to value across a much larger and heterogeneous set of backends. Of course, not all GraphQL implementations are the same. At StepZen, having learned a lot from databases, we have taken a declarative approach to how this layer is built. This approach enables the team building the federated layer to focus on the needs of the frontend developers and the capabilities of the backend systems and leave the hard job to the middleware.
TNS owner Insight Partners is an investor in: Pragma.