VOOZH about

URL: https://thenewstack.io/the-future-of-sql-conversational-hands-on-problem-solving/

⇱ The Future of SQL: Conversational Hands-on Problem Solving - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2024-04-23 06:04:38
The Future of SQL: Conversational Hands-on Problem Solving
tutorial,
AI / Data / Software Development

The Future of SQL: Conversational Hands-on Problem Solving

With modern SQL features like JSON and CTEs, a large language model can be a "reasoning partner" that helps accelerate learning and working.
Apr 23rd, 2024 6:04am by Jon Udell
👁 Featued image for: The Future of SQL: Conversational Hands-on Problem Solving
Image via Unsplash+.

If you’re returning to SQL after a long absence, as I did a few years back, there are important changes to know about. First, JSON. Many SQL-oriented databases now support JSON columns for arbitrary tree-structured data. Second, common table expressions (CTEs) that you can use to express a complex query as a pipeline of steps that are simple to understand and verify.

The JSON features can be confusing, for example in Steampipe queries like this one which implicitly joins the table github_my_gist with the expansion of its JSON column files.

select
 file ->> 'language' as language,
 count(*)
from
 github_my_gist g,
 jsonb_array_elements(g.files) file
group by
 language
order by
 count desc;

Exhibit A

The query counts GitHub gists by language, and produces output like this.

| language | count |
|-------------|-------|
| Python | 15 |
| Markdown | 34 |
| JavaScript | 7 |
| null | 7 |

Here’s a different version of the query that produces the same result.

-- cte 1 to unnest the json

with expanded_files as (
 select
 g.id as gist_id,
 jsonb_array_elements(g.files) AS file
 from
 github_my_gist g
),

-- sample cte 1 output

-- | gist_id | file |
-- |---------|------------------------------|
-- | 1 | {"language": "Python"} |
-- | 2 | {"language": "Markdown"} |
-- | 3 | {"language": "JavaScript"} |
-- | 4 | {"language": "Python"} |


-- cte 2 to extract the language

languages AS (
 select
 file ->> 'language' as language
 from
 expanded_files
)

-- sample cte 2 output

-- | language |
-- |-------------|
-- | Python |
-- | Markdown |
-- | JavaScript |
-- | Python |

-- final phase to count languages

select
 language,
 count(*) as count
from
 languages
group by
 language
order by
 count desc;

-- sample final output

-- | Python | 2 |
-- | Markdown | 1 |
-- | JavaScript | 1 |

Exhibit B

Levels of Expertise

If you’re well versed in set-returning JSON functions like Postgres’ jsonb_array_elements, which converts a JSON list into a set of rows, and if you can visualize how that transformation interacts with joins, you can craft powerful queries like exhibit A very concisely.

That economy of expression can be good for experts, but newcomers can struggle to mentally unroll the implied steps of the transformation. By “newcomer” I do not mean novice; I mean not yet an expert in this combination of disciplines. (That includes me, by the way, despite years of engagement with SQL at this level.)

From that perspective, you might want to see the steps spelled out explicitly, as in exhibit B. Creating versions of exhibit B is something I do in our support channel, and wanted to do more easily. So I made a simple GPT for that — and when I say “simple GPT,” I mean something like a simple bash script: a quickly built tool that can save more time and/or effort than it costs to build it.

Because it’s broken down into a pipeline of checkable steps, exhibit B is easier to debug, use confidently and revise safely. You could then collapse it down to exhibit A, which might be more efficient, though that’s not necessarily true.

You could even make both versions available, so experts and nonexperts can each see through their preferred lenses. Arguably that’s another form of accessibility, along with everything else that we mean by accessibility.

Here’s the prompt for this GPT. I used it to ask for the names of the issue templates for a given repo, given this schema and an issue_templates column that looks like this:

+------------------------------------------------------------------------------------------------------------------
| issue_templates
+------------------------------------------------------------------------------------------------------------------
| [
| {
| "body": "**Describe the bug**\nA clear and concise description of what the bug is.\n\n**Steampipe version
| "name": "Bug report",
| "about": "Create a report to help us improve",
| "title": "",
| "filename": "bug_report.md"
| },
| {
| "body": "**Is your feature request related to a problem? Please describe.**\nA clear and concise descript
| "name": "Feature request",
| "about": "Suggest an idea for this project",
| "title": "",
| "filename": "feature_request.md"
| }
| ]
+------------------------------------------------------------------------------------------------------------------

In that context, I literally just asked for a query that (per above) lists the names of issue templates for a given repo. Here’s the concise version of the generated (and documented) query.

SELECT
 template ->> 'name' AS template_name
FROM
 github_repository,
 jsonb_array_elements(issue_templates) AS template
WHERE
 full_name = 'your-repository-full-name'; -- Replace 'your-repository-full-name' with the actual full name of your repository

And here’s the expanded version.

-- CTE to expand the JSON array of templates
WITH expanded_templates AS (
 SELECT
 r.id AS repo_id,
 jsonb_array_elements(r.issue_templates) AS template
 FROM
 github_repository r
 WHERE
 r.full_name = 'your-repository-full-name' -- Replace 'your-repository-full-name' with the actual full name of your repository
),

-- Sample data after CTE 1
-- | repo_id | template |
-- |---------|-----------------------------------------|
-- | 1 | {"name": "Bug report", ...} |
-- | 1 | {"name": "Feature request", ...} |

-- CTE to extract the template names
template_names AS (
 SELECT
 template ->> 'name' AS template_name
 FROM
 expanded_templates
)

-- Sample data after CTE 2
-- | template_name |
-- |------------------|
-- | Bug report |
-- | Feature request |

-- Final selection
SELECT
 template_name
FROM
 template_names;

This approach works well enough for simple cases like these, but not so much for more complex ones like this one, which finds S3 buckets with policies that grant external access. In that situation, you don’t just need Postgres knowledge: You also need to know how AWS policies are constructed, and then you need to work out how to use Postgres joins and JSONB operators to query them. If the GPT initially fails to do that for you, that isn’t the end of the story. Having provided a description of the result you want, along with the schema for a table and a sample of a required JSON column, you’ve set up a context for a conversation with an entity that has seen vastly more SQL patterns and AWS policy patterns than you ever will.

Conversational Hands-on Learning

I keep returning to the theme of choral explanations (#4 on my list of best practices), and it’s especially relevant in the SQL domain where there are just so many ways to write a query.

Exploring the range of possibilities used to be arduous, time-consuming and hard to justify. Now it’s becoming hard to justify not doing that; optimizations (sometimes major ones) can and do emerge.

Arguably it has always required a kind of alien intelligence to grok SQL, not to mention query planners. In conversation with LLMs, we can now rapidly explore the space of possibilities and more easily evaluate how different approaches will perform. How else could I write this query? Why would I do it that way? How will the database handle it? (Maybe you can fluently read and understand query plans but I can’t, and I gratefully accept all the help I can get.)

I routinely ask LLMs these kinds of questions and receive answers that are not theoretical, but are versions of my query — working with my data — that I can immediately try, and that lead to follow-up questions that I can also explore cheaply.

Arguably it has always required a kind of alien intelligence to grok SQL, not to mention query planners.

In one test of my latest GPT, I wondered about translating Postgres idioms to SQLite. Postgres and SQLite JSON patterns are quite different. Holding both sets of patterns in your head, and mentally mapping between them, is only a means to an end. If I’m considering whether it’s feasible to switch databases, I don’t want to invest in a deep understanding of SQLite patterns that I wind up never needing. I just want to know what’s possible.

The GPT, which was nominally about Postgres, was happy to help. All you’re really doing with these GPTs is setting an initial context. At any point, you can steer the conversation wherever you want it to go.

Here’s the SQLite counterpart to the query that counts gists by language.

select
 json_extract(value, '$.language') as language,
 count(*) as count
from
 github_my_gist,
 json_each(github_my_gist.files)
group by
 language
order by
 count desc;

ChatGPT gave it to me instantly, I tested it and it worked. Of course, I then wanted to unroll this compact version to visualize the query in a step-by-step manner. It turns out that you can’t eliminate the join, as far as I can tell. Here is ChatGPT’s explanation.

json_each: This is SQLite’s equivalent to jsonb_array_elements, but it functions slightly differently. It must be used in the FROM clause and typically combined directly with the table it is extracting data from due to the less flexible nature of SQLite’s query planner regarding complex JSON operations.

Is that strictly accurate? I don’t know, but it comports with the behavior I’m seeing, which of course is behavior that ChatGPT enabled me to effortlessly conjure into being. This sort of conversational hands-on learning is the signal I’m following to cut through the noise and hype around AI.

Ultimately I don’t care about SQL or JSON; I want to climb up the ladder of cognition in order to solve problems that yield to data acquisition and analysis. I’m not blind to the dark patterns embodied in the most powerful LLMs, but I’m unable to ignore the boost they can deliver. Many kinds of work require us to reason over information at scale, and not just over your code and documentation, though that’s our focus here. I don’t want radiologists to rely solely on AI, but I do want them to consult entities that have seen far more X-rays and diagnostic interpretations than they ever will. In the realm of infotech, I want wranglers of code and data to make the best possible use of these new reasoning partners.

TRENDING STORIES
Jon Udell is an author and software developer who explores software tools and technologies and explains them in writing, audio, and video. He is the author of the cult classic Practical Internet Groupware. Past gigs include Lotus, BYTE magazine, Safari...
Read more from Jon Udell
SHARE THIS STORY
TRENDING STORIES
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.