URL: https://thenewstack.io/hey-service-where-is-your-data/

Hey Service, Where Is Your Data? - The New Stack


2021-04-27 10:08:01
Software Development

Hey Service, Where Is Your Data?

Even if you use a highly scalable database, you need to be mindful of your data architecture to meet your scaling needs.
Apr 27th, 2021 10:08am by Lee Atchison
Lead image via Pixabay.
Cockroach Labs sponsored this post.
Lee Atchison
Lee is a recognized industry thought leader in cloud computing, and the author of the best-selling book Architecting for Scale, published by O’Reilly Media. Lee has 34 years of industry experience, including eight years at New Relic and seven years at Amazon and AWS.
When you build a large, multiservice application, deciding where to put the data is just as important as deciding how to architect the application itself. It is an essential but often overlooked part of designing service and microservice-based architectures. Does the data reside with the service? Is it shared with other services? Is it in a shared central database? Whether you are building a new application or migrating an existing one to a service-based architecture, it is critically important to be mindful of where you store data, and the rest of the application state, within your system.

Stateless Services

Not all services use stored data. Many services maintain no state information at all: all the data they need to perform their work is passed in when the service is called, or is referenced from another source. Take, for example, a service that performs a simple mathematical calculation: it takes two latitude/longitude coordinate pairs and determines the distance between the two points. A call to this service might look like this:
find-distance?(start:(48.590870, -122.937424),end:(37.333041, -121.932043)) -> (miles)
This service takes two sets of coordinates and converts them into a distance. This service performs calculations, but other than the data passed into the service (the coordinates), no additional data is required. The service does not need to maintain any state information.
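As a minimal sketch of such a stateless service, the function below computes a great-circle distance with the haversine formula. The article does not say how the distance is calculated, so the formula, the function name `find_distance`, and the Earth radius constant are all assumptions made for illustration.

```python
import math

# Mean Earth radius in miles; an assumption of this sketch.
EARTH_RADIUS_MILES = 3958.8


def find_distance(start, end):
    """Return the great-circle distance in miles between two (lat, long) pairs.

    Everything the function needs arrives in its arguments, and nothing is
    remembered between calls, which is what makes it stateless.
    """
    lat1, lon1 = map(math.radians, start)
    lat2, lon2 = map(math.radians, end)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula for the central angle between the two points.
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


# The same coordinates used in the example call above.
miles = find_distance((48.590870, -122.937424), (37.333041, -121.932043))
```

Because the result depends only on the inputs, any number of copies of this function can run behind a load balancer without coordinating with one another.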

Figure 1. A stateless service

Stateless services offer a huge advantage for scaling. Because they hold no state, it is usually easy to add server capacity to scale the service, both vertically and horizontally. A service that does not maintain state gives you maximum flexibility in how and when you scale it. Additionally, certain caching techniques on the frontend of the service become possible, because the cache does not need to concern itself with service state. This caching lets you handle higher load with fewer resources. Not all services can be made stateless, obviously, but for those that can, it is a considerable scalability advantage.
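One way to picture the caching point is in-process memoization: because a stateless call with the same inputs always yields the same output, the result can be stored and reused safely. The sketch below uses Python's `functools.lru_cache` as a stand-in for a frontend cache. The article does not name a caching technique, so the function name `cached_find_distance`, the cache size, and the haversine calculation are assumptions.

```python
import math
from functools import lru_cache

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


@lru_cache(maxsize=10_000)
def cached_find_distance(start, end):
    """Stateless distance calculation; results are memoized by argument value."""
    lat1, lon1 = map(math.radians, start)
    lat2, lon2 = map(math.radians, end)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


origin = (48.590870, -122.937424)
destination = (37.333041, -121.932043)

first = cached_find_distance(origin, destination)   # computed
second = cached_find_distance(origin, destination)  # served from the cache
info = cached_find_distance.cache_info()
```

The cache never needs invalidating here, since no stored state can change the answer. A stateful service would not allow this so easily.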

Stateful Services

A stateful service is one that requires data (application state) to be retained during the life of the application, with multiple requests to the service using that data. Take, for example, a service that tracks the location of delivery vehicles in a fleet. One call to such a service tells it where a specific vehicle is located:
set-vehicle-location(vehicle-id: 133928, location: (48.590870, -122.937424))
Then, a specific vehicle can be located by requesting its location:
get-vehicle-location(vehicle-id: 133928) -> (lat,long)
Or, find the vehicle that’s closest to a given location:
locate-a-nearby-vehicle(location: (37.483577, -122.225983)) -> (vehicle-id)
This service performs a useful function, but to implement these commands, the service would have to maintain data — a list of vehicles and their current location. This data is stored in a database and used by the service to perform its operations. This is a stateful service.
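The three calls above could be sketched as follows, with an in-memory SQLite database standing in for whatever database the service actually uses. The class name, table layout, and brute-force nearest-vehicle search are hypothetical choices made for this example, not details from the article.

```python
import math
import sqlite3

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def _distance_miles(start, end):
    """Haversine distance between two (lat, long) pairs, in miles."""
    lat1, lon1 = map(math.radians, start)
    lat2, lon2 = map(math.radians, end)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


class VehicleLocationService:
    """Stateful service: answers depend on data stored by earlier requests."""

    def __init__(self, db_path=":memory:"):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS vehicle_location ("
            "vehicle_id INTEGER PRIMARY KEY, lat REAL NOT NULL, lon REAL NOT NULL)"
        )

    def set_vehicle_location(self, vehicle_id, location):
        lat, lon = location
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO vehicle_location (vehicle_id, lat, lon) "
                "VALUES (?, ?, ?)",
                (vehicle_id, lat, lon),
            )

    def get_vehicle_location(self, vehicle_id):
        """Return (lat, long) for the vehicle, or None if it is unknown."""
        return self._db.execute(
            "SELECT lat, lon FROM vehicle_location WHERE vehicle_id = ?",
            (vehicle_id,),
        ).fetchone()

    def locate_a_nearby_vehicle(self, location):
        """Return the id of the closest known vehicle, or None if there are none."""
        rows = self._db.execute(
            "SELECT vehicle_id, lat, lon FROM vehicle_location"
        ).fetchall()
        if not rows:
            return None
        # Brute-force scan; fine for a sketch, too slow for a large fleet.
        nearest = min(rows, key=lambda r: _distance_miles(location, (r[1], r[2])))
        return nearest[0]


service = VehicleLocationService()
service.set_vehicle_location(133928, (48.590870, -122.937424))
location = service.get_vehicle_location(133928)
```

Unlike the distance calculation, adding more copies of this service does not help on its own, because every copy has to read and write the same stored data.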

Figure 2. A stateful service

Stateful services are harder to scale because it is not simply a matter of adding CPU power to make the service grow to handle more requests. You also have to consider where the data is stored and how you scale the database that is holding the data. This complicates the ability to scale a service.
Cockroach Labs makes CockroachDB, the most highly evolved cloud-native, distributed SQL database on the planet. It helps companies of all sizes — and the apps they develop — scale fast, survive disaster, and thrive everywhere.

Where to Store Data

When you are building services that require data, it might seem obvious to store data in as few services and systems as possible, making as many services as possible stateless. This might lead you to keep all data together in a centralized location, on the theory that doing so reduces the number of services that store data. As a scaling strategy, nothing could be further from the truth. Instead, it’s important to localize your data as much as possible when building a service-based architecture. Have each service and data store manage only the data it needs to perform its job. In the above example, store the data that specifies where the vehicles are located in the vehicle location service. This tends to spread your application data across a larger number of services, putting the data closer to the services that require it. Localizing data this way provides a few benefits:
  • Reduced size of individual datasets. Because your data is split across multiple services, each dataset is individually smaller. A smaller dataset means less load on each database, which makes that database easier to scale. This is called functional partitioning: you are splitting your data along functional lines rather than by the size of the dataset.
  • Localized access. When you access data in a database or data store, you often access all the data within a given record or set of records. Often, much of that data is not needed for a given interaction. By splitting your data into multiple smaller datasets, you reduce the amount of unneeded data returned by each of your queries.
  • Optimized access methods. By splitting your data into different datasets, you can choose the type of data store most appropriate for each one. Does a particular dataset need a relational data store? Or is a simple key/value data store acceptable?

Keeping your data associated with the services that consume it creates a more scalable solution and an easier-to-manage architecture, and allows your data requirements to expand more easily as your application grows.
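The benefits listed above can be illustrated with a small sketch of functional partitioning, in which two services each own only their own data and each picks the store that suits it. Only the vehicle location service comes from the article. The `DriverAssignmentService`, its schema, and the choice of a dictionary versus SQLite are hypothetical, added here to show the contrast.

```python
import sqlite3


class VehicleLocationService:
    """Owns vehicle positions only; a simple key/value store is enough."""

    def __init__(self):
        self._locations = {}  # vehicle_id -> (lat, long)

    def set_vehicle_location(self, vehicle_id, location):
        self._locations[vehicle_id] = location

    def get_vehicle_location(self, vehicle_id):
        return self._locations.get(vehicle_id)


class DriverAssignmentService:
    """Hypothetical second service; owns driver records in a relational store."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE driver ("
            "driver_id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
            "vehicle_id INTEGER UNIQUE)"
        )

    def assign_driver(self, driver_id, name, vehicle_id):
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT OR REPLACE INTO driver (driver_id, name, vehicle_id) "
                "VALUES (?, ?, ?)",
                (driver_id, name, vehicle_id),
            )

    def get_driver_for_vehicle(self, vehicle_id):
        row = self._db.execute(
            "SELECT name FROM driver WHERE vehicle_id = ?", (vehicle_id,)
        ).fetchone()
        return row[0] if row else None


# Each service is the only code that touches its own dataset.
locations = VehicleLocationService()
drivers = DriverAssignmentService()
locations.set_vehicle_location(133928, (48.590870, -122.937424))
drivers.assign_driver(1, "Example Driver", 133928)
```

A location lookup never reads driver records, and the two stores can be scaled, or replaced with a different kind of data store, independently of each other.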

Architecting Your Data with Your Services

Architecting a large, highly scalable web application is a complex task. Sometimes, you have to make decisions that seem wrong but ultimately improve the scalability, and hence the availability, of your application or service. Determining your data architecture is one of those tasks. When architecting the structure of your application and the services that make it up, it is essential to consider the data needs and requirements of those services. Scaling your data storage and access is hard, and your data architecture can dramatically affect your data scalability. Even if you use a highly scalable database, such as Amazon DynamoDB or Cockroach Labs’ CockroachDB, you need to be mindful of your data architecture to meet your scaling needs. A free two-chapter excerpt of O’Reilly’s Architecting for Scale is available for download.
TNS owner Insight Partners is an investor in: Pragma.