URL: https://thenewstack.io/how-to-overcome-challenges-in-an-api-centric-architecture/

API Management / Networking / Operations / Software Development

How to Overcome Challenges in an API-Centric Architecture

Although an API-led approach is key to enabling agility, several hurdles can arise from the wide use and reuse of APIs.
Jan 9th, 2023 9:00am by Srinath Perera
WSO2 sponsored this post.

This is the second in a two-part series. For an overview of a typical architecture, how it can be deployed and the right tools to use, please refer to Part 1.

Most APIs impose usage limits, such as a maximum number of requests per month, and rate limits, such as a maximum of 50 requests per minute. A third-party API can be used by many parts of the system. Handling subscription limits requires the system to track all API calls and raise alerts when a limit is about to be reached.

Often, increasing the limit requires human involvement, so alerts need to be raised well in advance. The system must track API usage data persistently so that the counts survive service restarts and failures. Also, if the same API is used by multiple applications, collecting those counts and making decisions based on them needs careful design.
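To make the tracking requirement concrete, here is a minimal sketch of a persistent usage counter. It assumes SQLite as the store and an 80% alert threshold; the class name `UsageTracker`, the table layout and the `monthly_limits` mapping are all hypothetical choices for illustration, not part of any particular product.

```python
import sqlite3
from datetime import datetime, timezone


class UsageTracker:
    """Counts calls per API per calendar month in SQLite so counts survive restarts."""

    def __init__(self, db_path, monthly_limits, alert_ratio=0.8):
        self.monthly_limits = monthly_limits  # hypothetical, e.g. {"sms": 10_000}
        self.alert_ratio = alert_ratio        # assumed threshold: alert at 80% of quota
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS usage ("
            "api TEXT, month TEXT, calls INTEGER, PRIMARY KEY (api, month))"
        )
        self.conn.commit()

    def record_call(self, api, now=None):
        """Record one call and return (calls_this_month, should_alert)."""
        now = now or datetime.now(timezone.utc)
        month = now.strftime("%Y-%m")
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR IGNORE INTO usage (api, month, calls) VALUES (?, ?, 0)",
                (api, month),
            )
            self.conn.execute(
                "UPDATE usage SET calls = calls + 1 WHERE api = ? AND month = ?",
                (api, month),
            )
        (calls,) = self.conn.execute(
            "SELECT calls FROM usage WHERE api = ? AND month = ?", (api, month)
        ).fetchone()
        should_alert = calls >= self.alert_ratio * self.monthly_limits[api]
        return calls, should_alert
```

If several applications share one subscription, they would all need to write to the same store (or report to one service) for the count to be meaningful, which is part of the careful design mentioned above.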

Rate limits are more complicated. If rate limiting is left to individual developers, they will invariably add sleep statements, which solve the problem in the short term but lead to complicated issues in the long run when the timing changes. A better approach is to use a concurrent data structure that limits rates. Even then, if the same API is used by multiple applications, controlling rates is more complicated.
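As one possible shape for such a data structure, the sketch below implements a thread-safe sliding-window limiter. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and the sliding window is only one of several reasonable algorithms (a token bucket is a common alternative).

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls in any window of period_seconds, across threads."""

    def __init__(self, max_calls, period_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self.clock = clock            # injectable so the limiter can be tested
        self.calls = deque()          # timestamps of calls inside the current window
        self.lock = threading.Lock()

    def _reserve_or_wait_time(self):
        """Reserve a slot and return 0.0, or return the seconds until one frees up."""
        now = self.clock()
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()      # drop calls that have left the window
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return 0.0
        return self.period - (now - self.calls[0])

    def try_acquire(self):
        """Non-blocking: True if the call may proceed now."""
        with self.lock:
            return self._reserve_or_wait_time() == 0.0

    def acquire(self):
        """Blocking: wait until the call may proceed."""
        while True:
            with self.lock:
                wait = self._reserve_or_wait_time()
            if wait == 0.0:
                return
            time.sleep(wait)          # sleep outside the lock so other threads can check


# Usage: one shared limiter per third-party API, e.g. 50 requests per minute.
limiter = SlidingWindowLimiter(max_calls=50, period_seconds=60)
limiter.acquire()  # call this before each outgoing request
```

The waiting still happens, but it lives in one place instead of being scattered through application code as ad hoc sleeps. This sketch only coordinates threads within a single process; it does not help when several applications share the same limit.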

One option is to assign each application a portion of the rate limit, but the downside is that some capacity will be wasted: while some applications are waiting for capacity, others might be idling. The most practical solution is to send all calls through an outgoing proxy that can handle all limits.

Apps that use external APIs will almost always run into this challenge. Even internal APIs will have the same challenge if they are used by many applications. If an API is only used by one application, there is little point in making that an API. It may be a good idea to try to provide a general solution that handles subscription and rate limits.

Founded in 2005, WSO2 enables the composable enterprise. Our open source, API-first, and decentralized approach helps developers and architects to be more productive and rapidly build digital products to meet demand.

Overcoming High Latencies and Tail Latencies

Given a series of service calls, tail latencies are the latencies of the few calls that take the longest to finish. If tail latencies are high, some of the requests will take too long or time out. If API calls happen over the internet, tail latencies get even worse. When we build apps combining multiple services, each service adds latency, so the risk of timeouts increases significantly as more services are combined.

Tail latency has been widely discussed, and we will not repeat that discussion here. However, it is worth exploring this area if you plan to run APIs under high-load conditions. See [1], [2], [3], [4] and [5] for more information.
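A small calculation shows how quickly this compounds. The sketch below assumes, for simplicity, that each service call independently has a 1% chance of landing in the slow tail (above the 99th percentile); real latencies are often correlated, so treat the numbers as illustrative only.

```python
def fraction_hitting_tail(num_calls, percentile=0.99):
    """Probability that at least one of num_calls exceeds the given latency percentile,
    assuming the calls are independent."""
    return 1 - percentile ** num_calls


for n in (1, 5, 10, 50):
    print(f"{n:>2} calls: {fraction_hitting_tail(n):.1%} of requests see a tail latency")
```

Under this assumption, a request that fans out to 10 services hits at least one slow call roughly 10% of the time, and one that touches 50 services does so roughly 40% of the time.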

But why is this a problem? If the APIs we expose do not provide service-level agreement (SLA) guarantees (such as a 99th-percentile latency of less than 700 milliseconds), it would be impossible for downstream apps that use our APIs to provide any guarantees. Unless everyone can stick to reasonable guarantees, the whole API economy will come crashing down. Newer API specifications, such as the Australian Open Banking specification, define latency limits as part of the specification.

There are several potential solutions. If the use case allows it, the best option is to make tasks asynchronous. If you are calling multiple services, a synchronous request can easily take too long, and often it is better to set the right expectations by promising to deliver the results when they are ready rather than forcing the end user to wait for the request to complete.
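One common way to do this is to accept the request, hand back a job ID immediately and let the client check on it later. The sketch below assumes an in-memory job table and a thread pool; `JobService`, `submit` and `status` are hypothetical names, and a production system would more likely use a durable queue.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobService:
    """Runs slow tasks in the background and lets callers poll for the result."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}  # job_id -> Future (in-memory only; lost on restart)

    def submit(self, task, *args):
        """Start the task and return a job ID right away."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(task, *args)
        return job_id

    def status(self, job_id):
        """Report whether the job is pending, done or failed."""
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "pending"}
        if future.exception() is not None:
            return {"state": "failed", "error": str(future.exception())}
        return {"state": "done", "result": future.result()}
```

An HTTP front end could map `submit` to a request that returns the job ID and `status` to a polling endpoint, or replace polling with a callback when the result is ready.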

When service calls do not have side effects (such as search), there is a second option: latency hedging, where we start a second call when the wait time exceeds the 80th percentile and respond when one of them has returned. This can help control the long tail.
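A hedged call can be sketched with `asyncio` as follows. The function names `p80` and `hedged_call` are hypothetical, and the sketch assumes `make_call` is a side-effect-free coroutine function and that recent latency samples are available to estimate the 80th percentile.

```python
import asyncio
import statistics


def p80(latencies):
    """Estimate the 80th-percentile latency from recent samples (needs at least two)."""
    return statistics.quantiles(latencies, n=5)[3]


async def hedged_call(make_call, hedge_after_seconds):
    """Start make_call(); if it has not finished after hedge_after_seconds,
    start a second identical call and return whichever finishes first."""
    primary = asyncio.ensure_future(make_call())
    done, _ = await asyncio.wait({primary}, timeout=hedge_after_seconds)
    if done:
        return primary.result()  # fast path: no hedge needed

    backup = asyncio.ensure_future(make_call())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()            # stop waiting on the slower call
    return done.pop().result()   # simplification: re-raises if the first finisher failed
```

Because only calls slower than the 80th percentile trigger a second request, the extra load on the backend stays at roughly 20% under these assumptions, while the slowest responses are cut off.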

The third option is to complete as much work as possible in parallel: instead of waiting for each response before making the next service call, we start as many service calls as possible at the same time. This is not always possible because some service calls might depend on the results of earlier ones. Also, code that calls multiple services in parallel, collects the results and combines them is much more complex than code that calls them one after the other.
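With `asyncio`, the independent calls can be grouped and awaited together, while dependent calls still run afterward. In this sketch, `load_dashboard` and the three `fetch_*` coroutine functions passed into it are hypothetical stand-ins for real service clients.

```python
import asyncio


async def load_dashboard(user_id, fetch_profile, fetch_orders, fetch_shipping):
    """Combine three service calls, running the independent ones in parallel."""
    # The profile and order lookups do not depend on each other, so start both at once.
    profile, orders = await asyncio.gather(
        fetch_profile(user_id), fetch_orders(user_id)
    )
    # The shipping lookup needs the order list, so it has to wait for it.
    shipping = await fetch_shipping(orders)
    return {"profile": profile, "orders": orders, "shipping": shipping}
```

Here the total latency is roughly the slower of the first two calls plus the third call, rather than the sum of all three.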

When a timely response is needed, you are at the mercy of your dependent APIs. Unless caching is possible, an application can’t work faster than any of its dependent services. When the load increases, if the dependent endpoint can’t scale while keeping the response times within the SLA, we will experience higher latencies. If the dependent API can be kept within the SLA, we can get more capacity by paying more for a higher level of service or by buying multiple subscriptions. When that is possible, keeping within the latency becomes a capacity planning problem, where we have to keep enough capacity to manage the risk of potential latency problems.

Another option is to have multiple API options for the same function. For example, if you want to send an SMS or email, there are multiple options. However, it is not the same for many other services. It is possible that as the API economy matures, there will be multiple competing options for many APIs. When multiple options are available, the application can send more traffic to the API that responds faster, giving it more business.
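One way to shift traffic toward the faster provider is to weight each provider by the inverse of its recently observed latency. The sketch below is only an illustration of that idea: `LatencyWeightedRouter`, the smoothing factor `alpha` and the initial latency estimate are all assumptions.

```python
import random


class LatencyWeightedRouter:
    """Picks among equivalent providers, favoring those that have been responding faster."""

    def __init__(self, providers, initial_latency=1.0, alpha=0.2):
        self.alpha = alpha  # weight given to the newest latency sample
        self.latency = {name: initial_latency for name in providers}

    def record(self, provider, seconds):
        """Fold a new latency measurement into the provider's moving average."""
        old = self.latency[provider]
        self.latency[provider] = (1 - self.alpha) * old + self.alpha * seconds

    def weights(self):
        """Share of traffic per provider, proportional to 1 / average latency."""
        inverse = {name: 1.0 / max(value, 1e-6) for name, value in self.latency.items()}
        total = sum(inverse.values())
        return {name: value / total for name, value in inverse.items()}

    def choose(self, rng=random):
        """Randomly pick a provider according to the current weights."""
        weights = self.weights()
        names = list(weights)
        return rng.choices(names, weights=[weights[n] for n in names])[0]
```

Choosing randomly by weight, rather than always picking the fastest provider, keeps some traffic flowing to the slower ones so their latency estimates stay current.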

If our API has one client, then things are simple. We can let the client use the API as far as our system allows. However, if we are supporting multiple clients, we need to reduce the possibility of one client slowing down the others. This is the same reason other APIs have rate limits. We should also define rate limits in our API’s SLA. When a client sends too many requests too fast, we should reject its requests with a status code such as HTTP 429 (Too Many Requests), or 503 (Service Unavailable) when the service as a whole is overloaded. Doing this tells the client that it must slow down. This process is called backpressure: we communicate to upstream clients that the service is overloaded, and that message is eventually passed on to the end user.
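On the serving side, a per-client limit check might look like the sketch below. It is framework-neutral and returns a status code and headers; `ClientRateLimiter` and `check` are hypothetical names. It uses HTTP 429 (Too Many Requests), which RFC 6585 defines for this situation, together with a `Retry-After` header; a 503 could be substituted where that is the convention.

```python
import math
import time
from collections import defaultdict, deque


class ClientRateLimiter:
    """Allows each client at most max_requests per period_seconds."""

    def __init__(self, max_requests, period_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.period = period_seconds
        self.clock = clock
        self.history = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def check(self, client_id):
        """Return (status_code, headers) for an incoming request from client_id."""
        now = self.clock()
        calls = self.history[client_id]
        while calls and now - calls[0] >= self.period:
            calls.popleft()                # forget requests outside the window
        if len(calls) < self.max_requests:
            calls.append(now)
            return 200, {}
        # Tell the client how long to back off before retrying.
        retry_after = math.ceil(self.period - (now - calls[0]))
        return 429, {"Retry-After": str(retry_after)}
```

This sketch is not thread-safe and keeps its state in one process; a deployment with several gateway instances would need a shared store for the counters.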

If we are overloaded without any single user sending requests too fast, we need to scale up. If we can’t scale up, we still need to reject some requests. It is important to note that rejecting requests in this case makes our system unavailable, while rejecting requests in the earlier case, where one client is exceeding its SLA, does not count as unavailable time.

Cold start times (the time for the container to boot up) and service requests are other latency sources. A simple solution is to keep a replica running at all times; this is acceptable for high-traffic APIs. However, if you have many low-traffic APIs, this could be expensive. In such cases, you can predict the traffic and warm up the container beforehand (using heuristics, AI or both). Another option is to optimize the startup time of the servers to allow for fast bootup.

Latency, scale and high availability are closely linked. Even a well-tuned system would need to scale to keep the system running within acceptable latency. If our APIs need to reject valid requests due to load, the API will be unavailable from the user’s point of view.

Managing Transactions across Multiple APIs

If we can run all the code in a single runtime (such as a JVM), we can commit it as one transaction. For example, pre-microservices-era monolithic applications could handle most transactions directly with the database. However, as we break the logic across multiple services (and hence multiple runtimes), we cannot carry a single database transaction across multiple service invocations without doing additional work.

One solution has been programming language-specific transaction implementations provided by an application server (such as Java transactions). Another is to use Web Services atomic transactions (WS-AtomicTransaction) if your platform supports them. Yet another has been to use a workflow system (such as Apache ODE or Camunda) that supports transactions. You can also use queues and combine database transactions and queue system transactions into a single transaction through a transaction manager like Atomikos.

This topic has been discussed in detail under microservices, and we will not repeat those discussions here. Please refer to [6], [7] and [8] for more details.

Finally, with API-based architectures, troubleshooting is likely more involved. It is important to have enough tracing and logs to help you find out whether an error is happening on our side of the system or the side of third-party APIs. Also, we need clear data we can share in case help is needed from a third-party API to isolate and fix the problem.
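As a starting point, every outgoing call can be wrapped so that it logs a unique call ID, the API name, the outcome and the elapsed time. The sketch below uses only the standard library; `traced_call`, the `outbound` logger name and the log format are hypothetical, and a real deployment would more likely use a distributed tracing system.

```python
import logging
import time
import uuid

logger = logging.getLogger("outbound")


def traced_call(api_name, func, *args, **kwargs):
    """Call func and log a correlation ID, outcome and latency for the outgoing call."""
    call_id = str(uuid.uuid4())  # can be shared with the API provider when asking for help
    start = time.monotonic()
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        elapsed_ms = (time.monotonic() - start) * 1000
        logger.error(
            "call_id=%s api=%s outcome=error error=%r elapsed_ms=%.1f",
            call_id, api_name, exc, elapsed_ms,
        )
        raise                    # the caller still sees the original failure
    elapsed_ms = (time.monotonic() - start) * 1000
    logger.info(
        "call_id=%s api=%s outcome=ok elapsed_ms=%.1f", call_id, api_name, elapsed_ms
    )
    return result
```

With records like these, an error that shows up with a matching outbound log entry points toward the third-party API, while an error without one points toward our own side of the system.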

I would like to thank Frank Leymann, Eric Newcomer and others for their thoughtful feedback, which significantly shaped these posts.

Srinath Perera is the chief architect at WSO2. He is a scientist, author, software architect and programmer who works on distributed systems. He is a member of the Apache Software Foundation and a key architect behind several widely used projects...
TNS owner Insight Partners is an investor in: Camunda, Pragma.