VOOZH about

URL: https://thenewstack.io/how-llms-helped-me-build-an-odbc-plugin-for-steampipe/

⇱ How LLMs Helped Me Build an ODBC Plugin for Steampipe - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2023-09-28 10:50:22
How LLMs Helped Me Build an ODBC Plugin for Steampipe
tutorial,
Data / Large Language Models / Software Development

How LLMs Helped Me Build an ODBC Plugin for Steampipe

Jon Udell uses ChatGPT, Cody and GitHub Copilot to help him build an ODBC Plugin for Steampipe, an extensible SQL interface to cloud APIs.
Sep 28th, 2023 10:50am by Jon Udell
👁 Featued image for: How LLMs Helped Me Build an ODBC Plugin for Steampipe
Image via Pexels

I’d written my first two Steampipe plugins (Hypothesis, Mastodon) in the pre-LLM era, so I was eager to work with my team of assistants on my next project: a plugin for ODBC (Open Database Connectivity).

Steampipe nominally maps APIs to database tables. When you select * from aws_sns_topic, Steampipe winds up calling the AWS ListTopics API. Many Steampipe plugins work like that: a table corresponds to a specific API call.

But some plugins work in a more general way. The net_http_request table of the Net plugin turns Steampipe into an HTTP client. The exec plugin creates a SQL facade over the universe of shell commands, and the Terraform plugin does the same for infrastructure-as-code config files. By broadening the definition of what counts as an API, Steampipe keeps extending its embrace of structured data in all forms.

Databases present another kind of API. Steampipe plugins for databases can’t use fixed schemas, but rather must discover them on the fly. When the plugin SDK added support for dynamic schemas, the CSV plugin was the first to use the feature. So it became one inspiration for an ODBC plugin that would create a SQL facade over any database with an ODBC driver.

Another inspiration was the Postgres plugin from Jose Reyes. (Which, to be clear, is just a small part of his deep dive into Steampipe.) The Postgres plugin enables Steampipe queries against remote Postgres tables

See These Examples? Do the Analogous Thing for ODBC.

That was the dream. Hey, it never hurts to ask, right? But that wasn’t a great use of my team. I wasn’t able to get ChatGPT, Sourcegraph Cody, or GitHub Copilot to extrapolate from the examples to anything close to a working plugin. Instead, as usual, we broke things down into manageable chunks. And as usual that worked well.

Here’s a small example of useful assistance. The plugin requires a config file that defines ODBC data sources and table names. Those definitions are written in HCL. It took some iteration to arrive at a format that would work with the Steampipe config schema. Back-and-forth dialogue with the team helped me arrive at this ODBC connection format.

connection "odbc" {
 plugin = "odbc"

 data_sources = [
 "SQLite:foo",
 "PostgreSQL:jose"
 ]
}

Given that, the LLMs were then able to write the boilerplate code needed for the plugin’s config machinery. Small things like that add up.

Showstopper and Workaround

ODBC is a portal to a universe of data sources. First, you install a driver manager (like unixODBC on Linux), and then you add drivers that talk to SQLite or Postgres or sources that aren’t even databases (they’re portals to other universes of data sources). One such portal is CData, which offers a broad set of ODBC drivers, some that overlap with Steampipe plugins and many that don’t. That sounded like an interesting first test for the plugin so I installed CData drivers for RSS and Slack and began working on getting the plugin to discover their schemas.

When I tried calling the ODBC driver from the plugin’s initialization phase, though, nothing worked; and there were ominous messages in the log about low-level OS signal handling. It was debugging that was beyond me — was it Steampipe? CData? unixODBC? a combination? — but I wanted to make progress if possible. So I tried a few workarounds: guarding the plugin’s use of the ODBC driver with a mutex, fiddling with timing, and — what finally worked — running schema discovery after initialization and caching schemas on the file system. It’s “a bit hacky,” said ChatGPT. But the speed with which I was able to iterate through those options, with its help, made all the difference.

Schema Discovery

Steampipe plugins are written in Go, and they rely heavily on its ecosystem of data-source SDKs. The best option for the ODBC plugin was github.com/alexbrainman/odbc. It works well and supports some introspection, but the most universal method seems to be the dumbest: select one row of data, capture column names, and try to sniff their types. The LLMs (mostly ChatGPT) made quick work of implementing that strategy.

We did discuss its flaws. For example, what if the sample’s first row contains nulls? That isn’t a fatal flaw, though, it just means the column will always be a string type, and the Steampipe query author will have to write where number::int > 1 instead of where number > 1, which isn’t the end of the world. We also agreed that if the plugin survives and matures, it might be worth investing in a way for users of the plugin to provide hints that activate database-specific discovery mechanisms. But meanwhile, the dumb method was good enough to keep moving.

Implementing SQL-to-SQL Pushdown

Here’s a query to find open issues assigned to you.

select
 repository_full_name,
 number
 title
from
 github_my_issue
where
 state = 'OPEN';

If the GitHub plugin didn’t implement pushdown, Steampipe would map the query to the GitHub API that lists all your issues and return a table with all of them. Steampipe’s Postgres engine would then apply the WHERE condition to filter results to just open issues.

Of course, you’d rather push that filtering down into the API where possible. So what actually happens here is that the plugin defines state as an optional key column (aka qualifier or “qual”). When the query includes where state = 'OPEN', the plugin adjusts the API call to include that filter condition.

The same idea works with plugins where the API is SQL. You can see that happening here in the Postgres plugin. The table definition‘s List function sets up all the columns in each discovered schema as optional key columns, so any or all can be mentioned in the Steampipe WHERE clause and pushed down into the WHERE clause handled by the remote Postgres.

It’s the same with the ODBC plugin. You can’t tell just by looking at this query.

select
 name,
 number,
 _metadata
from
 odbc.sqlite_foo
where
 number = 1
+------+--------+-------------------------------------------+
| name | number | _metadata |
+------+--------+-------------------------------------------+
| jon | 1 | {"connection_name":"odbc","dsn":"sqlite"} |
+------+--------+-------------------------------------------+

But under the covers, because the plugin implements pushdown, its debug log shows that the WHERE filter is handled by SQLite, not by Steampipe.

ChatGPT didn’t get this right on the first try. Despite the clear example of the Postgres plugin, it offered a partial solution that correctly adjusted the SQL passed to SQLite but neglected to define the optional key columns. That was an easy fix, and together we implemented the feature much more easily than I’d have done on my own.

Test Strategy

I asked my team to discuss ways to test the plugin, and the responses were pretty good all around. Copilot proposed a reasonable test for the getSchemas function, but after resolving hallucinations there were still issues getting it to run. Logging was problematic, as was mocking the database connection.

Working through these issues was far easier with LLM assistance than it would be otherwise. So easy, in fact, that I lost the thread. It was interesting to learn details about how to interact with the plugin SDK’s logging mechanism, and how to work with mocked connections to ODBC drivers. With the ability to rapidly iterate through solutions, I made rapid progress. But as the test code grew more complex, it all started to feel like too much effort for too little return.

So I made the executive decision to switch to an end-to-end testing strategy: populate various ODBC sources with sample data, and run Steampipe queries against them. I’ve found LLMs to be spectacularly good at generating test data. In this case, for starters, that meant writing a standalone program to populate a SQLite database. All three assistants did that easily, but ChatGPT’s version was the most interesting. Given our discussion of the first-row-sampling strategy, it “knew” the first row should contain nulls.

Post Mortem: Review and Explain

Finally, I invited the team to review the code and explain how it all works. ChatGPT, which had plenty of context, did a great job. Because Cody and Copilot hadn’t participated as much they had less context, and I think that made this a useful test. The ability of LLMs to help you orient to unfamiliar code is a key strength.

Both Cody and Copilot produced useful explanations. Given that both can see the local repo where the code lives, though, I was surprised to see Copilot hallucinate the names of files and functions that Cody got right.

👁 Image

I then asked Cody and Copilot to evaluate the strategy for schema discovery. I’d already discussed this extensively with ChatGPT and had concluded the obvious flaw — that sampling the first row risks finding nulls for some columns — was an acceptable risk for a first version of the plugin that might later be enhanced with database-specific logic.

👁 Image

Cody’s more complete and cogent response noted the key flaw, Copilot’s skimpier response missed it.

In general, I’m finding it helpful to ask LLMs to review both code and prose. When the rubber duck talks back, the responses may or may not be useful and accurate. But either way, the interaction can prompt you to think differently about what you’re trying to do. It feels intrinsically valuable.

TRENDING STORIES
Jon Udell is an author and software developer who explores software tools and technologies and explains them in writing, audio, and video. He is the author of the cult classic Practical Internet Groupware. Past gigs include Lotus, BYTE magazine, Safari...
Read more from Jon Udell
SHARE THIS STORY
TRENDING STORIES
TNS owner Insight Partners is an investor in: Sourcegraph.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.