URL: https://slatedb.io/docs/tutorials/s3/

Connect SlateDB to S3

Learn how to connect SlateDB to Amazon S3 using LocalStack

This tutorial shows you how to connect SlateDB to S3. We’ll use LocalStack to simulate S3.

Let’s start by creating a new Rust project:

Terminal window
cargo init slatedb-playground
cd slatedb-playground

Now add SlateDB and the required dependencies to your Cargo.toml:

Terminal window
cargo add slatedb tokio --features tokio/macros,tokio/rt-multi-thread
cargo add object-store --features object-store/aws
cargo add anyhow

You will need to have LocalStack running. You can install it using Homebrew:

Terminal window
brew install localstack/tap/localstack-cli

Terminal window
localstack start -d

For a more detailed setup, see the LocalStack documentation.

You’ll also need the AWS CLI:

Terminal window
brew install awscli

SlateDB needs an existing bucket to write to, so create one in LocalStack:

Terminal window
# Create S3 bucket
aws --endpoint-url=http://localhost:4566 s3api create-bucket --bucket slatedb --region us-east-1

Paste the following into your src/main.rs file:

main.rs
use slatedb::Db;
use std::sync::Arc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let object_store = Arc::new(
        object_store::aws::AmazonS3Builder::new()
            // These will be different if you are using real AWS
            .with_allow_http(true)
            .with_endpoint("http://localhost:4566")
            .with_access_key_id("test")
            .with_secret_access_key("test")
            .with_bucket_name("slatedb")
            .with_region("us-east-1")
            .build()?,
    );
    let db = Db::open("/tmp/slatedb_s3_compatible", object_store.clone()).await?;
    // Call db.put with a key and a 64 MiB value to trigger an L0 SST flush
    let value: Vec<u8> = vec![0; 64 * 1024 * 1024];
    db.put(b"k1", value.as_slice()).await?;
    db.close().await?;
    Ok(())
}
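
The comment in main.rs notes that the builder values differ on real AWS. One way to avoid editing the source when switching targets is to read them from the environment and fall back to the LocalStack values used here. The following is a minimal sketch using only the standard library; the `S3Settings` struct and the `SLATEDB_S3_*` variable names are hypothetical and are not part of SlateDB or object_store.

```rust
use std::env;

/// Hypothetical settings holder; not part of SlateDB or object_store.
#[derive(Debug, PartialEq)]
struct S3Settings {
    endpoint: String,
    bucket: String,
    region: String,
    allow_http: bool,
}

impl S3Settings {
    /// Builds settings from a lookup function, falling back to the
    /// LocalStack values used in this tutorial.
    fn from_lookup<F: Fn(&str) -> Option<String>>(lookup: F) -> Self {
        // The variable names below are illustrative assumptions.
        let endpoint = lookup("SLATEDB_S3_ENDPOINT")
            .unwrap_or_else(|| "http://localhost:4566".to_string());
        let bucket = lookup("SLATEDB_S3_BUCKET").unwrap_or_else(|| "slatedb".to_string());
        let region = lookup("SLATEDB_S3_REGION").unwrap_or_else(|| "us-east-1".to_string());
        // Plain HTTP is only needed for non-TLS endpoints such as LocalStack.
        let allow_http = endpoint.starts_with("http://");
        S3Settings { endpoint, bucket, region, allow_http }
    }

    /// Reads the same settings from the process environment.
    fn from_env() -> Self {
        Self::from_lookup(|name| env::var(name).ok())
    }
}

fn main() {
    let settings = S3Settings::from_env();
    println!("{:?}", settings);
}
```

The fields would then be passed to the corresponding `with_endpoint`, `with_bucket_name`, `with_region`, and `with_allow_http` calls shown in main.rs. Taking a lookup function rather than calling `env::var` directly keeps the defaults easy to check without touching the real environment.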

Now you can run the code:

Terminal window
cargo run

This will write a 64 MiB value to SlateDB.

Now let’s check the database path in the bucket:

Terminal window
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/
    PRE compacted/
    PRE compactions/
    PRE manifest/
    PRE wal/

There are four folders:

  • manifest: Contains the manifest files. Manifest files define the state of the DB, including the set of SSTs that are part of the DB.
  • wal: Contains the write-ahead log files.
  • compacted: Contains the compacted SST files (may not appear in short examples).
  • compactions: Contains the compaction-state snapshots.
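
To make the layout concrete, here is a small sketch that builds the S3 URI for each of the four prefixes from the bucket name and the path passed to `Db::open`. It only reflects what the listing above shows (the leading `/` of the path does not appear in the object keys); the `prefix_uri` helper is hypothetical and not a SlateDB API.

```rust
/// The four prefixes listed above, in the order `aws s3 ls` printed them.
const PREFIXES: [&str; 4] = ["compacted", "compactions", "manifest", "wal"];

/// Hypothetical helper: builds the S3 URI for one prefix under a database path.
fn prefix_uri(bucket: &str, db_path: &str, prefix: &str) -> String {
    // Strip surrounding slashes so "/tmp/db" and "tmp/db" give the same URI.
    let db_path = db_path.trim_matches('/');
    format!("s3://{}/{}/{}/", bucket, db_path, prefix)
}

fn main() {
    for prefix in PREFIXES {
        println!("{}", prefix_uri("slatedb", "/tmp/slatedb_s3_compatible", prefix));
    }
}
```

Each printed URI can be passed to `aws --endpoint-url=http://localhost:4566 s3 ls` to inspect that prefix.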

Let’s check the wal folder:

Terminal window
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/wal/
2024-09-04 18:05:57         64 00000000000000000001.sst
2024-09-04 18:05:58   67108996 00000000000000000002.sst

Each of these SST files is a write-ahead log (WAL) file, and each WAL file can contain many RowEntry values. They get flushed based on the flush_interval config. The second WAL file is just over 64 MiB (67,108,996 bytes) because it contains the value we wrote.
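
The sizes and names in the listing can be checked with a little arithmetic. The sketch below compares the value size from main.rs with the size reported for the second WAL object, and reproduces the file-name pattern. It assumes, based only on the two names shown above, that WAL file names are the WAL ID zero-padded to 20 digits; `wal_file_name` is a hypothetical helper, not a SlateDB function.

```rust
/// Size of the value written by main.rs: 64 MiB.
const VALUE_BYTES: u64 = 64 * 1024 * 1024;

/// Size of the second WAL object as reported by `aws s3 ls` above.
const WAL_OBJECT_BYTES: u64 = 67_108_996;

/// Assumption: WAL file names are the WAL ID zero-padded to 20 digits,
/// which matches the two names in the listing.
fn wal_file_name(id: u64) -> String {
    format!("{:020}.sst", id)
}

fn main() {
    // Bytes in the WAL object beyond the raw value itself.
    let extra = WAL_OBJECT_BYTES - VALUE_BYTES;
    println!("value bytes: {}", VALUE_BYTES);
    println!("bytes beyond the raw value: {}", extra);
    println!("second WAL file: {}", wal_file_name(2));
}
```

The extra bytes presumably hold the key, per-row metadata, and SST framing; the listing alone does not break them down.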

Finally, let’s check the compacted folder:

Terminal window
% aws --endpoint-url=http://localhost:4566 s3 ls s3://slatedb/tmp/slatedb_s3_compatible/compacted/
2024-09-04 18:05:59   67108996 01J6ZVEZ394GCJT1PHZYY1NZGP.sst

Again, we see the 64 MiB SST file. This is the L0 SST file that was flushed with our value. Over time, the WAL entries will be removed, and the L0 SSTs will be compacted into higher levels.
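
Unlike the WAL files, the compacted file name is not a sequence number. It has the shape of a ULID: 26 Crockford base32 characters, of which the first ten encode a millisecond Unix timestamp. Assuming that is what the name is, the sketch below decodes the creation time from it using only the standard library. The `ulid_timestamp_ms` helper is hypothetical and not part of SlateDB.

```rust
/// Crockford base32 alphabet used by ULIDs (no I, L, O, or U).
const ALPHABET: &[u8; 32] = b"0123456789ABCDEFGHJKMNPQRSTVWXYZ";

/// Decodes the millisecond timestamp from the first 10 characters of a
/// 26-character ULID string. Returns None if the length is wrong or a
/// timestamp character is not in the alphabet.
fn ulid_timestamp_ms(ulid: &str) -> Option<u64> {
    let bytes = ulid.as_bytes();
    if bytes.len() != 26 {
        return None;
    }
    let mut ms: u64 = 0;
    for &b in &bytes[..10] {
        let upper = b.to_ascii_uppercase();
        // Each character contributes 5 bits, most significant first.
        let digit = ALPHABET.iter().position(|&c| c == upper)? as u64;
        ms = ms * 32 + digit;
    }
    Some(ms)
}

fn main() {
    // File name from the listing above, without the ".sst" extension.
    let name = "01J6ZVEZ394GCJT1PHZYY1NZGP";
    match ulid_timestamp_ms(name) {
        Some(ms) => println!("created at {} ms since the Unix epoch", ms),
        None => println!("not a ULID"),
    }
}
```

Because the timestamp comes first, names of this form sort lexicographically by creation time, which is convenient when listing a prefix.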

Copyright © SlateDB. All rights reserved. For details on our trademarks, please visit our Trademark Policy and Trademark List.
Trademarks of third parties are owned by their respective holders and their mention here does not suggest any endorsement or association.