When designing any sort of software, your choice of data format, types and structure shapes many later design decisions, for better or worse. When designing Cerbos, we decided early on to use structured, type-safe and serializable data, because we wanted to expose a predictable interface to the application, with predictable data types that could be used by other services written in a myriad of languages.
The technology we chose to deliver this was Protocol Buffers (protobuf), along with gRPC. All the core features, including the policies, API, data storage, data interchange and even the test cases, are defined in protobuf.
Protocol buffers are a language- and platform-neutral mechanism for serializing structured data. Unlike the more widely known JSON format, protobufs have a compact, binary wire format and mandatory schemas for every message.
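To make the size difference concrete, here is a minimal hand-rolled sketch of the protobuf wire format for a small message, compared with the same data as JSON. The message shape (an `id` in field 1 and a `name` in field 2) and the helper names are assumptions made up for illustration. Real code would use classes generated by the protobuf compiler rather than encoding bytes by hand.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_varint_field(field_number: int, value: int) -> bytes:
    """Encode a field key (number + wire type 0) followed by a varint value."""
    key = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(key) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode a field key (number + wire type 2), a length, then UTF-8 bytes."""
    key = (field_number << 3) | 2  # wire type 2 = length-delimited
    payload = text.encode("utf-8")
    return encode_varint(key) + encode_varint(len(payload)) + payload


# Hypothetical message: id = 150 (field 1), name = "alice" (field 2).
binary = encode_varint_field(1, 150) + encode_string_field(2, "alice")
as_json = json.dumps({"id": 150, "name": "alice"}).encode("utf-8")

print(binary.hex())               # 08960112 05616c696365, without the space
print(len(binary), len(as_json))  # 10 bytes of protobuf vs. 28 bytes of JSON
```

Field names never appear on the wire, only field numbers. That accounts for much of the saving, and it is also why a schema is mandatory for making sense of the bytes.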
The schema language supports a wide range of data types, nested messages, unions and enumerations. These schemas are used by the protobuf compiler to generate code for the developer’s programming language of choice. The generated code contains type-safe, native data types and structures that are equivalent to the abstract types defined in the schema along with specialized utility functions such as optimized encode/decode functions for each type.
gRPC is an RPC framework that uses protobufs for data exchange. It uses HTTP/2 as the transport mechanism, which lets it take advantage of all the speed, security and efficiency features provided by the HTTP/2 spec for interservice (or even interprocess) communication. gRPC also benefits from code generation, which makes RPC calls resemble native function calls in the programming language.
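As a rough illustration of how gRPC carries protobuf payloads, each message inside an HTTP/2 stream is sent with a five-byte prefix: a one-byte compression flag followed by a four-byte big-endian length. The sketch below shows only that framing step. The function names are hypothetical, and the HTTP/2 layer, headers, trailers and compression are left out entirely.

```python
import struct
from typing import List


def frame_message(payload: bytes) -> bytes:
    """Prefix an uncompressed payload with the 5-byte gRPC message header."""
    compressed_flag = 0  # this sketch never compresses
    return struct.pack(">BI", compressed_flag, len(payload)) + payload


def unframe_messages(buffer: bytes) -> List[bytes]:
    """Split a buffer of back-to-back framed messages into their payloads."""
    messages = []
    offset = 0
    while offset < len(buffer):
        if len(buffer) - offset < 5:
            raise ValueError("truncated message header")
        flag, length = struct.unpack_from(">BI", buffer, offset)
        offset += 5
        if flag != 0:
            raise ValueError("compressed messages are not handled in this sketch")
        if offset + length > len(buffer):
            raise ValueError("truncated message body")
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages


# Two encoded messages sent one after another, as on a streaming RPC.
stream = frame_message(b"\x08\x96\x01") + frame_message(b"\x08\x01")
print(unframe_messages(stream))  # [b'\x08\x96\x01', b'\x08\x01']
```

Because every message announces its own length, many messages can share one stream, which is what streaming RPCs rely on.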
We were intimately familiar with protobufs and gRPC from our previous roles building highly available, data-intensive, internet-facing API services and large-scale data-processing systems. Protobufs’ concise and lightweight interface definition language (IDL) allowed us to create flexible schemas describing the data we were working with and evolve them over time by adding new fields or deprecating old ones while maintaining backward and forward compatibility.
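The compatibility property comes from how fields are identified on the wire: each value is tagged with a field number and a wire type, so a reader can skip anything it does not recognize. The sketch below is a simplified decoder, not the real protobuf library. It handles only varint and length-delimited fields, and the schemas (`id`, `name`, `active`) are hypothetical. It shows an old reader ignoring a newly added field and a new reader coping with a message that lacks it.

```python
from typing import Dict, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_known_fields(buf: bytes, known: Dict[int, str]) -> Dict[str, object]:
    """Decode fields listed in `known` (field number -> name); skip the rest."""
    decoded: Dict[str, object] = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, nested messages)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in known:
            decoded[known[field_number]] = value
        # Unknown field numbers are consumed and dropped, not treated as errors.
    return decoded


OLD_SCHEMA = {1: "id", 2: "name"}
NEW_SCHEMA = {1: "id", 2: "name", 3: "active"}  # field 3 added later

old_message = bytes.fromhex("089601") + b"\x12\x05alice"
new_message = old_message + bytes.fromhex("1801")  # field 3 = 1

print(decode_known_fields(new_message, OLD_SCHEMA))  # old reader, new data
print(decode_known_fields(old_message, NEW_SCHEMA))  # new reader, old data
```

Both calls yield `{'id': 150, 'name': b'alice'}`. The old reader skips field 3, and the new reader simply finds no value for `active` and would fall back to a default.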
The efficient binary encoding helped us save bandwidth and move messages between different processing pipelines quickly. (At the scale of dealing with billions of messages, even a few bytes shaved off each message makes a massive difference.) Because encoded protobufs are language-agnostic, they were an ideal format for exchanging data between applications written in different languages, such as API services written in Go and data-processing pipelines written in Java or Python.
They were also a good choice for storing data with loosely known schemas, such as cache entries, while keeping that data accessible through type-safe, native structures in the programming language of choice. Using gRPC for services was a natural extension of our use of protobufs. Again, it allowed us to build fast and efficient API services that exchanged data in binary protobuf encoding and worked over HTTP/2, providing advanced features like bi-directional streaming and connection multiplexing.
Despite all the positives listed above, using protobufs in a nontrivial way used to be quite painful. The polyglot system described above required a shared set of protobuf schemas with both local imports and third-party imports, such as Google protobufs. Every time there was an update to the schemas, we had to generate code for multiple programming languages, package, version, and distribute them to various package registries.
We had to build our own custom tooling and CI jobs to properly version the changes; download external dependencies, protobuf code generators and compiler toolchains for each programming language; generate packaging projects; and, finally, build and upload the packages to the appropriate package registry. There were some community-built tools, such as ProtoEasy and ProtoTool, that helped address some of the pain points, such as downloading dependencies and language generators, but none of them addressed all aspects of the process.
We loved gRPC for its streaming capabilities, speed and efficiency — resource usage of our services was measurably lower compared to previous JSON-based REST services, and they had much better latency and throughput — but it was difficult to expose pure gRPC services externally without a translation layer like grpc-gateway in front. Streaming — especially bi-directional streaming — was out of the question with a translation layer, so we were losing some functionality as well.
Most cloud providers did not support HTTP/2 over their load balancers (support is still quite spotty, in fact). Even if that limitation could be worked around using TCP load balancers, we still had the problem of generating and distributing gRPC client code to external customers. Even our own JavaScript code running in browsers could not access our gRPC services. The introduction of grpc-web was a partial solution to this problem, but the project was in very early stages with limited functionality and required extra infrastructure such as an Envoy proxy to do the translation, which was not ideal.
The protobuf/gRPC ecosystem has significantly improved over the past few years. More organizations are investing in and adopting the technology. Projects like etcd, CockroachDB and Vitess are examples of large-scale, critical infrastructure built on top of protobufs and gRPC. Almost all popular service meshes, proxy servers and load balancers now have native support for gRPC services. Frameworks like Dapr use gRPC to provide language-agnostic, standardized component building blocks for application development. (Cerbos is quite similar in that we aim to provide a plug-n-play access control solution for any application.)
Many great tools and utilities, such as Buf, grpcurl, ghz and others, have made developing and working with protobufs and gRPC a much more pleasant experience. Buf deserves a special mention here because it has solved almost all of the annoyances and pain points associated with protobuf development mentioned earlier in this article. (We are not affiliated with Buf in any way; we are just a bunch of happy users with deep appreciation for an awesome product.)
When we first started building Cerbos, we had a clear set of principles that we wanted to follow.
Given the above requirements, it was a natural choice to pick protobufs and gRPC for some of the most fundamental portions of the Cerbos product. Today, almost all of the core data structures and most of our extensive test suite are defined using protobufs, and the primary API of Cerbos is a gRPC API.
The main API surface of Cerbos is implemented using gRPC. On the server side, we use interceptors to implement many important features like request validation, audit log capture, metrics collection, distributed trace propagation, error handling and authentication. Grpc-gateway is used to provide a REST+JSON translation layer for the benefit of humans and languages that don’t have a gRPC implementation. Some RPCs, such as the one for retrieving audit log entries, are built as streaming RPCs that can efficiently stream large volumes of data to clients with backpressure.
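The interceptor pattern can be sketched in a few lines of plain Python. This is not the Cerbos implementation or the API of any gRPC library. The request and response dictionaries, the `validate` and `audit` interceptors and the `check_handler` function are all hypothetical stand-ins, meant only to show how cross-cutting concerns wrap a handler without changing it.

```python
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[Dict], Dict]
Interceptor = Callable[[Dict, Handler], Dict]


def bind(interceptor: Interceptor, next_handler: Handler) -> Handler:
    """Wrap next_handler so that every call passes through the interceptor."""
    def call(request: Dict) -> Dict:
        return interceptor(request, next_handler)
    return call


def chain(interceptors: List[Interceptor], handler: Handler) -> Handler:
    """Compose interceptors so that the first one listed runs outermost."""
    wrapped = handler
    for interceptor in reversed(interceptors):
        wrapped = bind(interceptor, wrapped)
    return wrapped


audit_log: List[Tuple[Optional[str], str]] = []


def validate(request: Dict, next_handler: Handler) -> Dict:
    """Reject requests without a principal before they reach the handler."""
    if not request.get("principal"):
        return {"status": "INVALID_ARGUMENT"}
    return next_handler(request)


def audit(request: Dict, next_handler: Handler) -> Dict:
    """Record who called and what the outcome was, including rejections."""
    response = next_handler(request)
    audit_log.append((request.get("principal"), response["status"]))
    return response


def check_handler(request: Dict) -> Dict:
    """Hypothetical business logic: only alice is allowed."""
    return {"status": "OK", "allowed": request["principal"] == "alice"}


server = chain([audit, validate], check_handler)
print(server({"principal": "alice"}))  # {'status': 'OK', 'allowed': True}
print(server({}))                      # {'status': 'INVALID_ARGUMENT'}
print(audit_log)                       # [('alice', 'OK'), (None, 'INVALID_ARGUMENT')]
```

Placing `audit` outside `validate` means rejected requests are still recorded. Ordering is one of the design decisions an interceptor chain makes explicit.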
Almost all of our client SDKs use the gRPC API, primarily for the speed and efficiency gains. But, almost as importantly, using gRPC allows us to generate type-safe client stubs with built-in, low-level plumbing (HTTP/2 transports with support for mTLS, Unix domain sockets, etc.) for most popular programming languages. This base layer gives us a solid foundation on top of which we can add a thin convenience layer that provides idiomatic language constructs for working with Cerbos.
Most popular service meshes and load balancers provide tracing, retries, circuit breaking and load balancing of gRPC requests, which gives users full control over how their services are configured to communicate with Cerbos services or sidecars in their environment. It also saves us from having to reinvent the wheel for those complex resiliency features and lets us rely instead on battle-tested implementations built by experts.
Our protobuf/gRPC workflow makes extensive use of Buf, an almost magical tool that makes working with protobufs an absolute pleasure. We use the Buf CLI and GitHub Actions to format, lint, detect breaking changes and generate code from protobufs. Buf automatically downloads the dependencies and plugins required for the build, which saves us the pain of having to manage them manually.
On each successful build, Cerbos protobuf definitions are automatically uploaded to the Buf Schema Registry (BSR), which allows us to effortlessly distribute the service and schema definitions for anyone to use. The BSR eliminates the need to maintain copies of the protobuf definitions in each SDK repository. With a single command, developers can pull down the latest definitions from the BSR and regenerate client code. Buf’s managed mode and remote plugins come in extremely handy during this process for customizing the output and managing toolchains.
The decision to invest in protobufs and gRPC has had a massive positive impact on our productivity and velocity. Even with a small (four-person) team, we have managed to build a fast, lean, feature-rich product and a plethora of tools, SDKs and demos that would have taken much more time and effort to build without the convenience, safety and productivity provided by the protobuf/gRPC ecosystem. Going forward, we have plenty of exciting, new Cerbos features in the pipeline that will be built using the same proven and reliable technical foundation.