URL: https://thenewstack.io/scylladbs-take-on-webassembly-for-user-defined-functions/


ScyllaDB’s Take on WebAssembly for User-Defined Functions

We're adding helper libraries for Rust and C++, which will make writing a user-defined function no harder than writing a regular native function in any language.
Dec 8th, 2022 9:34am by Piotr Sarna
ScyllaDB sponsored this post.
WebAssembly, also known as Wasm, is a binary format for representing executable code, designed to be easily embedded into other projects. It turns out that Wasm is also a perfect candidate for user-defined functions (UDFs) on the backend, thanks to its ease of integration, performance and popularity. ScyllaDB, the database for data-intensive applications that require high throughput and low latency, supports user-defined functions expressed in WebAssembly, based on an open source runtime written natively in Rust called Wasmtime. In fact, we recently added Rust support to our build system to make future integrations even smoother. This article provides a look inside how and why we integrated with WebAssembly.

Choosing the Right Engine

WebAssembly is a format for executable code designed first and foremost to be portable and embeddable. As its name suggests, it’s a good fit for web applications. Since it’s quite fast, it’s also generally a good choice for an embedded language. One of WebAssembly’s core features is isolation: each module is executed in a sandboxed environment separate from the host application. Such a limited-trust environment is highly desirable for an embedded language because it vastly reduces the risk of somebody running malicious code from within your project. Wasm is a binary format, but it also specifies a human-readable text format called WebAssembly Text format, or WAT.
To integrate WebAssembly into a project, you need to pick an engine. The most popular engine is Google’s V8, which is implemented in C++ with support for JavaScript and a very rich feature set. Unfortunately, it’s also quite heavy and not very easy to integrate with asynchronous frameworks like Seastar, an open source C++ framework for high-performance server applications on modern hardware, which is a building block of ScyllaDB. Fortunately, there’s also Wasmtime, a smaller (but not small!) project implemented in Rust. It supports WebAssembly but not JavaScript, which makes it more lightweight. It also has good support for asynchronous environments and offers C++ bindings, making it a good fit for injecting into ScyllaDB for a proof-of-concept implementation. ScyllaDB selected Wasmtime because it’s lighter than V8 and has the potential to be async-friendly. Though we currently use the existing C++ bindings provided by Wasmtime, we plan to implement this whole integration layer in Rust and then compile it directly into ScyllaDB.

Coding in WebAssembly

So how would you create a WebAssembly program?

WebAssembly Text Format

First, modules can be coded directly in WebAssembly text format. It’s not the most convenient way, at least for me, due to Wasm’s limited type system and specific syntax with lots of parentheses. But it’s possible, of course. All you need in this case is a text editor. Being in love with Lisp wouldn’t hurt either.
```wat
(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (then (return (local.get $n)))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
)
```

C++

C and C++ enthusiasts can compile their language of choice to Wasm with the clang compiler.
```cpp
int fib(int n) {
    if (n < 2) {
        return n;
    }
    return fib(n - 1) + fib(n - 2);
}
```
```sh
clang -O2 --target=wasm32 --no-standard-libraries -Wl,--export-all -Wl,--no-entry fib.c -o fib.wasm
wasm2wat fib.wasm > fib.wat
```

The binary interface is well defined, and the resulting binaries are well optimized. The code is compiled to WebAssembly through LLVM’s intermediate representation, which makes many optimizations possible.

Rust

Rust can also produce Wasm output within its ecosystem, and a `wasm32` target is already supported in Cargo, the official Rust build toolchain.
```rust
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn fib(n: i32) -> i32 {
    if n < 2 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```
```sh
rustup target add wasm32-unknown-unknown
cargo build --target wasm32-unknown-unknown
wasm2wat target/wasm32-unknown-unknown/debug/fib.wasm > fib.wat
```
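
`wasm_bindgen` is aimed mainly at JavaScript hosts. For a standalone runtime such as Wasmtime, a plain C-ABI export is usually enough. The following is a minimal sketch of that variant, not taken from the original post: it assumes the crate is built as a `cdylib` (via `crate-type = ["cdylib"]` in `Cargo.toml`), and it uses `i64` purely as an illustrative choice.

```rust
// Sketch only: a C-ABI export that needs no wasm-bindgen glue.
// Assumes the crate is built as a `cdylib` for wasm32-unknown-unknown.

/// Recursive Fibonacci, mirroring the examples above.
/// `#[no_mangle]` keeps the exported symbol name as plain `fib`
/// (the 2024 edition spells this attribute `#[unsafe(no_mangle)]`).
#[no_mangle]
pub extern "C" fn fib(n: i64) -> i64 {
    if n < 2 {
        n
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```

Because the function is ordinary Rust, it can also be unit-tested natively before being compiled to Wasm.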

AssemblyScript

There’s also AssemblyScript, a TypeScript-like language that compiles directly to WebAssembly. AssemblyScript is especially nice for quick experiments because it’s a scripting language. It’s also the only language that was actually invented and designed with WebAssembly as a compilation target in mind.
```assemblyscript
export function fib(n: i32): i32 {
  if (n < 2) {
    return n
  }
  return fib(n - 1) + fib(n - 2)
}
```
```sh
asc fib.ts --textFile fib.wat --optimize
```

User-Defined Functions

Why does ScyllaDB need WebAssembly? Our first use case involves user-defined functions (UDFs). UDFs are a Cassandra Query Language (CQL) feature that allows you to define a function in a given language, then call that function when querying the database. The database itself applies the function to the arguments, and only then is the result returned to the client. UDFs also make it possible to express nested calls and other more complex operations. Here’s how you can use a user-defined function in CQL:
```cql
cassandra@cqlsh:ks> SELECT id, inv(id), mult(id, inv(id)) FROM t;

 id | ks.inv(id) | ks.mult(id, ks.inv(id))
----+------------+-------------------------
  7 |   0.142857 |                       1
  1 |          1 |                       1
  0 |   Infinity |                     NaN
  4 |       0.25 |                       1

(4 rows)
```
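
The row with `id = 0` is worth a second look: it yields `Infinity` and then `NaN` rather than an error. The sketch below reproduces that behavior natively. The bodies of `inv` and `mult` are assumptions (the query output does not show how the UDFs were defined), and `row` is a hypothetical helper that stands in for the nested call.

```rust
// Sketch only: native stand-ins for the `inv` and `mult` UDFs queried above.
// Their bodies are assumptions; the UDF definitions are not shown in the text.

/// Multiplicative inverse. Under IEEE 754, 1.0 / 0.0 is +infinity, not an error.
pub fn inv(x: f64) -> f64 {
    1.0 / x
}

/// Plain product. 0.0 * infinity has no meaningful value, so the result is NaN.
pub fn mult(a: f64, b: f64) -> f64 {
    a * b
}

/// Hypothetical helper: evaluates `inv(id)` and the nested `mult(id, inv(id))` for one row.
pub fn row(id: i32) -> (f64, f64) {
    let inverse = inv(f64::from(id));
    (inverse, mult(f64::from(id), inverse))
}
```

Calling `row(0)` gives an infinite first component and a NaN second component, matching the third row of the output above.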

UDFs are cool enough by themselves, but a more important purpose is enabling user-defined aggregates (UDAs). UDAs are custom accumulators that combine data from multiple database rows into potentially complex outputs. UDAs consist of two functions: one for accumulating the result for each argument, and another for finalizing and transforming the result into the output type. The code example below shows an aggregate that computes the average length of all requested strings. The functions below are coded in Lua, which is yet another language that ScyllaDB supports. First, let’s create all the building blocks: functions for accumulating partial results and transforming the final result:
```cql
CREATE FUNCTION accumulate_len(acc tuple<bigint,bigint>, a text)
	RETURNS NULL ON NULL INPUT
	RETURNS tuple<bigint,bigint>
	LANGUAGE lua as 'return {acc[1] + 1, acc[2] + #a}';
CREATE OR REPLACE FUNCTION present(res tuple<bigint,bigint>)
	RETURNS NULL ON NULL INPUT
	RETURNS text
	LANGUAGE lua as
	'return "The average string length is " .. res[2]/res[1] .. "!"';
```

Next, let’s combine them all into a user-defined aggregate:
```cql
CREATE OR REPLACE AGGREGATE avg_length(text)
	SFUNC accumulate_len
	STYPE tuple<bigint,bigint>
	FINALFUNC present INITCOND (0,0);
```

Here’s how you can use the aggregate after it’s created:
```cql
cassandra@cqlsh:ks> SELECT * FROM words;

 word
------------
 monkey
 rhinoceros
 dog
(3 rows)

cassandra@cqlsh:ks> SELECT avg_length(word) FROM words;

 ks.avg_length(word)
-----------------------------------------------
 The average string length is 6.3333333333333!
(1 rows)
```

One function accumulates partial results by storing the total sum of all lengths and the total number of strings. The finalizing function divides one by the other to return the result. In this case, the result is in the form of rendered text. The potential here is quite large: user-defined aggregates make database queries more powerful, for instance, by gathering complex statistics or transforming whole partitions into different formats.
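
The accumulate-then-finalize split is essentially a fold followed by a final transformation. Here is a rough native sketch of the same aggregate. The function names mirror the CQL above, but the Rust signatures and the `avg_length` driver are assumptions made for illustration, not how ScyllaDB executes aggregates internally.

```rust
// Sketch only: the accumulate/finalize split of a user-defined aggregate,
// modeled as a fold. Names mirror the CQL above; the signatures are assumptions.

/// Aggregate state: (number of strings seen, total length), like tuple<bigint,bigint>.
pub type State = (i64, i64);

/// State function (SFUNC): called once per row to fold a value into the state.
pub fn accumulate_len(acc: State, a: &str) -> State {
    (acc.0 + 1, acc.1 + a.len() as i64)
}

/// Final function (FINALFUNC): called once to turn the state into the output.
pub fn present(res: State) -> String {
    format!("The average string length is {}!", res.1 as f64 / res.0 as f64)
}

/// Hypothetical driver: INITCOND, then SFUNC per row, then FINALFUNC.
pub fn avg_length(words: &[&str]) -> String {
    let init: State = (0, 0); // INITCOND (0,0)
    present(words.iter().fold(init, |acc, w| accumulate_len(acc, w)))
}
```

Calling `avg_length(&["monkey", "rhinoceros", "dog"])` folds the three words into the state `(3, 19)` before finalizing. Rust and Lua format floating-point numbers differently, so the number of printed digits may not match the cqlsh output exactly.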

Enter WebAssembly

To create a user-defined function in WebAssembly, we first need to write the function in, or compile it to, Wasm text format. The function body is then simply registered in a CQL `CREATE FUNCTION` statement. That’s it!
```cql
CREATE FUNCTION ks.fib (input bigint)
RETURNS NULL ON NULL INPUT
RETURNS bigint LANGUAGE xwasm
AS '(module
  (func $fib (param $n i64) (result i64)
    (if
      (i64.lt_s (local.get $n) (i64.const 2))
      (then (return (local.get $n)))
    )
    (i64.add
      (call $fib (i64.sub (local.get $n) (i64.const 1)))
      (call $fib (i64.sub (local.get $n) (i64.const 2)))
    )
  )
  (export "fib" (func $fib))
  (global (;0;) i32 (i32.const 1024))
  (export "_scylla_abi" (global 0))
  (data $.rodata (i32.const 1024) "\\01")
)'
```
```cql
cassandra@cqlsh:ks> SELECT n, fib(n) FROM numbers;
 n | ks.fib(n)
---+-----------
 1 |         1
 2 |         1
 3 |         2
 4 |         3
 5 |         5
 6 |         8
 7 |        13
 8 |        21
 9 |        34
(9 rows)
```

Note that the declared language here is xwasm, which stands for “experimental Wasm.” Support for this language is still experimental in ScyllaDB. The current design document is maintained here, and you’re welcome to take a look at it: https://github.com/scylladb/scylladb/blob/master/docs/dev/wasm.md

Our Roadmap

ScyllaDB’s WebAssembly support is in active development; here are some of our top goals.

Helper Libraries for Rust and C++

Writing functions directly in WAT format is not trivial because ScyllaDB expects the functions to follow our application binary interface (ABI) specification. To hide these details from developers, we’re in the process of implementing helper libraries for Rust and C++, which seamlessly provide ScyllaDB bindings. With our helper libraries, writing a user-defined function will be no harder than writing a regular native function in your language of choice.

Rewriting the User-Defined Functions Layer in Rust

We currently rely on Wasmtime’s C++ bindings to expose a Wasm runtime for user-defined functions to run on. These C++ bindings have certain limitations, though. Specifically, they lack support for asynchronous operations, which is present in Wasmtime’s original Rust implementation. The choice is abundantly clear: let’s rewrite it in Rust! Our precise plan is to move the entire user-defined functions layer to Rust, where we can fully utilize Wasmtime’s potential. With such an implementation, we’ll be able to run user-defined functions asynchronously, with strict latency guarantees. We’ll only provide a thin compatibility layer between Seastar and Rust’s async model to enable polling Rust futures directly from ScyllaDB. The rough idea for binding Rust futures straight into Seastar is explained here. We already added Rust support to our build system. The next step is to rewrite the user-defined functions engine as a native Rust implementation, and then we can compile it right into ScyllaDB.
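
To make "polling Rust futures directly" a little more concrete, here is a toy sketch of a host loop that drives a Rust future by hand, using only the standard library. It is not ScyllaDB's Seastar integration: `NoopWaker`, `YieldingCall` and `drive` are hypothetical names, and a real bridge would presumably hand control back to the host's scheduler between polls instead of spinning.

```rust
// Sketch only: a host loop polling a Rust future by hand, using std alone.
// This is NOT ScyllaDB's Seastar integration; all names are hypothetical.
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; a real bridge would reschedule the task on the host instead.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Stand-in for a UDF invocation that yields `remaining` times before finishing.
pub struct YieldingCall {
    pub remaining: u32,
    pub result: i64,
}

impl Future for YieldingCall {
    type Output = i64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i64> {
        if self.remaining == 0 {
            Poll::Ready(self.result)
        } else {
            self.remaining -= 1;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

/// Polls `fut` to completion, returning its output and the number of polls.
/// Between polls, a real host would run other tasks rather than spin.
pub fn drive<F: Future + Unpin>(mut fut: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

For example, `drive(YieldingCall { remaining: 3, result: 42 })` returns `(42, 4)`: three polls that yield, then one that completes. Each `Pending` is a point where the host regains control, which is what makes bounded-latency scheduling possible in principle.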

Keeping Latency Low for User-Defined Functions with WebAssembly

I shared more details about how we integrated WebAssembly and Wasmtime into our project in a latency-friendly manner at the recent P99 CONF, an open source, community-focused conference for engineers who obsess over low latency. The talk, “Keeping Latency Low for User-Defined Functions with WebAssembly,” is available on demand.
Piotr Sarna is a software engineer who is keen on open source projects and the Rust and C++ languages. He previously developed an open source distributed file system and had a brief adventure with the Linux kernel. He's also a...