VOOZH about

URL: https://thenewstack.io/duck-db-query-processing-is-king/

⇱ DuckDB: Query Processing Is King - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2025-01-05 07:00:32
DuckDB: Query Processing Is King
sponsor-percona,sponsored-topic,tutorial,
Databases

DuckDB: Query Processing Is King

With in-process, open source DuckDB, you can create an in-memory database that does not save data, or you can use a local file. Here's how to get started.
Jan 5th, 2025 7:00am by David Eastman
👁 Featued image for: DuckDB: Query Processing Is King
Featured image by Curated Lifestyle, from Unsplash+.
One of the ways in which modern databases silently helped rather poor programs during the first dot-com boom of the late ’90s was that they handled asynchronous queries — often when the rest of the system didn’t. This led to some comments from Oracle Co-Founder Larry Ellison, to the Guardian in 2001, that are very interesting in retrospect: “Some of these New Economy companies — take pets.com — it’s good that they’re gone. Selling cat food on the internet was lunacy.” Coming back to today, the attraction of DuckDB, an open source project that released its version 1.0 in June. is that it is an in-process database. So, I build it as part of my executable; I’m not connecting to some totally separate system. It isn’t for selling cat food. The first thing to get your head around is that the data persistence is not the central concern here — just query processing. You can create an in-memory database that does not persist (i.e., save data) at all, or you can use a local file. The purpose of a DuckDB database is likely to be sucking up some data, querying over it, maybe making some transformations, then going away. So for working with DuckDB, we just need a library or plug-in, not a new application or service. If I look at the website’s front page, C# is’t mentioned. However, C# is supported via an open source ADO.NET provider. That this exists already certainly proves the ecosystem is probably already quite healthy.

Design Goals

Let’s contemplate the very first line of the example code below, before we even fire up Visual Studio Code:
var duckDBConnection = new DuckDBConnection("Data Source=file.db");
Obviously, “database” is synonymous with persistence, even though that isn’t the primary purpose for DuckDB. In the example above, we use a file as a persistent data source. The file format is forward and backward compatible, and obviously this is somewhat important for maintenance. But you would probably be unwise to commit to a long-term data strategy with a company that might have a shorter lifetime than your data. But that just underlines the same concept: persistence is not the mainstay here. If we left the data source argument out, or used the keyword :memory:, then we would have an in-memory database. What about concurrency? The design goals of this project clearly point to not attempting to support complex scenarios, although they have simple modes to allow for optimistic concurrency. The idea is to set up the data, whack it with queries, then go.

Getting Started

So let’s get started with Visual Studio Code: 👁 Image
So, we get a fresh Visual Studio Code in a new project folder. I’ve written about setting up Visual Studio Code before; for now, I’ll just say these are the relevant extensions that I have installed for working with C#: 👁 Image
Create a new project using “.NET new project” with the command palette then add DuckDB.NET.Data.Full using “nuget: add package” with the palette again. These all have command-line equivalents, but I’ll stay in the IDE for this post. You should now have an empty project, so let’s add the following into the file “Program.cs”:
using DuckDB.NET.Data;

var duckDBConnection = new DuckDBConnection("Data Source=file.db");
duckDBConnection.Open();

var command = duckDBConnection.CreateCommand();
command.CommandText = "CREATE TABLE integers(foo INTEGER, bar INTEGER);";
var executeNonQuery = command.ExecuteNonQuery();

command.CommandText = "INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL);";
executeNonQuery = command.ExecuteNonQuery();

command.CommandText = "Select count(*) from integers";
var count = command.ExecuteScalar();
Console.WriteLine($"Rows = {count}");

command.CommandText = "SELECT foo, bar FROM integers";
var reader = command.ExecuteReader();

Console.Write($"Columns: ")
for (var index = 0; index < reader.FieldCount; index++)
{
 var column = reader.GetName(index);
 Console.Write($"{column} ");
}

Console.WriteLine();

while (reader.Read())
{
 for (int index = 0; index < reader.FieldCount; index++)
 {
 if (reader.IsDBNull(index))
 {
 Console.WriteLine("NULL");
 continue;
 }

 var val = reader.GetFieldValue<int>(index);
 Console.Write(val);
 Console.Write(" ");
 }
}
Running this, we get the following in the terminal output:
Rows = 3 
Columns: foo bar 
3 4 5 6 7 NULL
Let’s take a look at what happened. First, we know we should have made a persistence file called “file.db”, and this is visible in the bin directory. Indeed, if you try to run this same code twice we get:
Unhandled exception. DuckDB.NET.Data.DuckDBException 0x0000000D): 
Catalog Error: Table with name "integers" already exists!
This is correct, but only because we opened up a persistent database. Comment out the Database Definition Language (DDL) and we can run it as many times as we like. Or change the first line to:
var duckDBConnection = new DuckDBConnection("Data Source=:memory:");
for an in-memory db to run multiple times. We then used some DDL to create the table.
command.CommandText = "CREATE TABLE integers(foo INTEGER, bar INTEGER);"; 
var executeNonQuery = command.ExecuteNonQuery();
I guess, as it isn’t Database Manipulation Language (DML), it is executed as a “nonquery.” Somewhat strange nomenclature, to be sure. When we want a result, we ask specifically for the result type after execution:
command.CommandText = "Select count(*) from integers"; 
var count = command.ExecuteScalar(); 
Console.WriteLine($"Rows = {count}");
But count clearly resolves into something we can print. For detailed results, we get a “Reader” after executing the query. (Yes, these commands could be better named.)
command.CommandText = "SELECT foo, bar FROM integers"; 
var reader = command.ExecuteReader();
Before we use it, we kind of know the Reader holds a result array that we will have to access via the API. We can simply summarise the rest of the code by looking at how we do this.
for (var index = 0; index < reader.FieldCount; index++) { 
 var column = reader.GetName(index); 
 Console.Write($"{column} "); 
}
The FieldCount method just counts the columns, and GetName returns the column names. (We could use GetDataTypeName to get the column type.) Finally, we do a full row-by-row read of the data.
while (reader.Read())
{
 for (int index = 0; index < reader.FieldCount; index++)
 {
 if (reader.IsDBNull(index))
 {
 Console.WriteLine("NULL");
 continue;
 }

 var val = reader.GetFieldValue<int>(index);
 Console.Write(val);
 Console.Write(" ");
 }
}
We see that the iterator is just calling Read() until it returns null. For each row of records we can then just use GetFieldValue to extract the Integer value. This is quicker than using a separate result set or similar, but makes the code less readable as the Reader is holding both temporary and static data simultaneously.

Conclusion

DuckDB directly supports Python, R, Java, Node.js, Go and Rust, so your environment is almost certainly catered for. This should be a useful tool for both testing scenarios and transforming data on the fly. But I also like the idea of using it to gain SQL query support without worrying about the weight of a full database system. But if you are planning on rewriting pets.com, use something else.
Percona is widely recognized as a world-class open source database software, support, and services company for MySQL®, MongoDB®, and PostgreSQL® databases. We are dedicated to helping make your databases and applications run better through a unique combination of expertise and open source software.  
Learn More
The latest from Percona
TRENDING STORIES
David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....
Read more from David Eastman
SHARE THIS STORY
TRENDING STORIES
Oracle is a sponsor of The New Stack.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.