VOOZH about

URL: https://thenewstack.io/how-to-use-llms-for-dynamic-documentation/

⇱ How to Use LLMs for Dynamic Documentation - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2023-10-12 05:00:25
How to Use LLMs for Dynamic Documentation
Large Language Models / Open Source / Software Development

How to Use LLMs for Dynamic Documentation

Some explanations should be written by code authors. Others may best be generated on the fly by LLM-assisted code readers.
Oct 12th, 2023 5:00am by Jon Udell
👁 Featued image for: How to Use LLMs for Dynamic Documentation
Image via Unsplash

The other day I revisited a SQL query I’d written a while ago, in order to adapt it to the latest version of the GitHub plugin. The minor adjustments took longer than they should have. Why? I’d forgotten how the query works! And of course, I hadn’t written any comments; it seemed simple at the time, right?

Here’s the original query. It reports recent commits for a set of Steampipe plugins.

with repos as (
 select
 name_with_owner as repo_name,
 'author-date:>' || $2|| ' repo:' || name_with_owner as query
 from
 github_search_repository
 where
 query = $1 || ' in:name ' 
)
select
 r.repo_name,
 c.commit -> 'author' ->> 'name' as author,
 substring(c.commit -> 'committer' ->> 'date' from 1 for 10) as date,
 c.commit ->> 'message' as message
from
 repos r
join
 github_search_commit c 
on
 r.repo_name = c.repository_full_name
 and r.query = c.query 
order by
 date desc

Here’s how it works. If you fill in the $1 and $2 parameters with the values turbot/steampipe and 2023-10-04, the with repos CTE (common table expression) runs this query.

select
 g.name_with_owner as repo_name,
 'turbot/steampipe in name' as query
from
 github_search_repository g
where
 query = 'turbot/steampipe in:name' 

github_search_repository is a table, provided by the GitHub plugin, that encapsulates GitHub’s syntax for searching repositories. When you write a SQL query that uses the table, you’re required to provide a WHERE clause with a qualifier called query and a value that is a piece of query syntax for searching GitHub repositories.

At the moment, this query returns 208 rows like this.

+---------------------------------------------------------------+-----------------------------------------------+
| repo_name | query |
+---------------------------------------------------------------+-----------------------------------------------+
| turbot/steampipe-plugin-ldap | author-date:>2023-10-04 repo:turbot/steampipe |
| turbot/steampipe-plugin-bitbucket | author-date:>2023-10-04 repo:turbot/steampipe |
| turbot/steampipe-plugin-alicloud | author-date:>2023-10-04 repo:turbot/steampipe |
...

The query column contains another piece of GitHub search syntax. In this case, it’s the required qualifier for a different Steampipe table, github_search_commit, which encapsulates the GitHub syntax for searching commits. Here’s a standalone version of a query against this table.

select
 *
from
 github_search_commit
where
 repository_full_name = 'steampipe-plugin-ldap'
 and query = 'author-date:>2023-10-04 repo:turbot/steampipe'

So we’ve got two different qualifiers in play.

used in table example qualifier
the with repos CTE github_search_repository where query = ‘turbot/steampipe in name’
the body of the main query github_search_commit where query = ‘author-date:>2023-10-04 repo:turbot/steampipe’

Now let’s focus on the body of the main query.

select
 r.repo_name,
 c.commit -> 'author' ->> 'name' as author,
 substring(c.commit -> 'committer' ->> 'date' from 1 for 10) as date,
 c.commit ->> 'message' as message
from
 repos r
join
 github_search_commit c 
on
 r.repo_name = c.repository_full_name
 and r.query = c.query 
order by
 date desc

There are no WHERE clauses in play here but the same qualifiers are required; they can also be provided by way of a JOIN.

So there are two levels at which GitHub search syntax is dynamically built.

  1. to find GitHub repositories matching steampipe-plugin
  2. for each of the 200+ repositories thus found, to find commits after a given date

Now that I’ve written out the explanation, it hardly seems simple at all. If you could only write a single comment to help clarify things, what would it be?

The Most Important Single Comment

When I first showed the query to ChatGPT it wasn’t to ask for documentation, but rather to resolve a name conflict I’d introduced when adapting to some changed column names in the new version of the GitHub plugin. Nevertheless, the LLM provided two critical improvements — unrelated to those column changes — that I hadn’t even asked for.

– ‘author-date:>’ || $2|| ‘ repo:’ || g.name_with_owner as query
+ ‘author-date:>’ || $2|| ‘ repo:’ || g.name_with_owner as commit_query– and r.query = c.query
+ and r.query = c.commit_query — use constructed commit_query

That was exactly right. The query column produced by the CTE shouldn’t have shared the same name as the query qualifier used in the body of the query. This wasn’t a syntactic problem but it was a conceptual disaster. Renaming the CTE’s query column to commit_query signals that it’s distinct from the query qualifier: a different kind of thing that powers the github_search_commit query.

This is the kind of tacit knowledge transfer discussed in Learning While Coding: How LLMs Teach You Implicitly. It was a bonus outcome that I didn’t even ask for.

I think this is a powerful prompt:

If you could add just one comment to clarify this uncommented code, what would it be? And why?

I put the question to both Copilot Chat and Sourcegraph Cody.

Copilot

“If I were to add just one comment to the query to clarify how it works, I would add a comment above the `WITH` clause to explain what it does.”

-- This CTE retrieves the top 10 repositories that match the search query

Bzzzt. Wrong answer!

Cody

“If I were to add one explanatory comment to this query, it would be here:

-- Construct dynamic commit search per repo
'author-date:>' || $2|| ' repo:' || g.name_with_owner as query

The reason I would add a comment here is because this line is constructing a custom search query for commits for each repository.

This is a key part of the query — it allows searching commits within the date range specifically for each repo.

Without this dynamic construction, we wouldn’t be able to filter commits effectively per repo.

So adding a quick comment explaining that this line is creating a custom commit search per repo helps clarify why it is needed and how it works with the rest of the query to connect repos to matching commits.”

Exactly right. Well done Cody!

Dynamic Documentation

I understand why programmers often resist writing comments. Laziness? Sure, but sometimes laziness is a virtue. The computer doesn’t care about comments, it only cares about code, so if you can write code in a way that works for the computer while also helping the programmer understand the code, that’s a great start. The code itself is already a maintenance burden; comments that can easily drift from the source of truth become an additional burden.

Still, I’ve always been sympathetic to the idea of literate programming. A year ago, in Literate Programming in Go, I described a project to improve the documentation of the Steampipe plugin SDK. It was interesting to learn that Go source code can embed wiki-like documentation, and we made some headway toward enhancing the otherwise uninspiring docs generated from Go source. But we haven’t made further progress, and I’m keen to take another run at it in the LLM era.

For starters, I’d revisit the overview docs that we wrote to explain the high-level components and concepts. For that, I think Cody will serve us well, once it has indexed github.com/turbot/steampipe-plugin-sdk. For these overview docs, it will be worthwhile to iterate with LLMs to create documentation that will become a permanent part of the code and will be maintained accordingly.

But for function- and line-level comments, I’m now wondering whether a dynamic approach may sometimes (or often!) turn out to be best. I have never found mechanically-generated function-level comments to be particularly useful. But we have new machines to partner with now. Will the comments they can write dynamically be useful enough to avoid baking in function- and line-level documentation that’s liable to drift from the source of truth?

This isn’t an either/or question. We’re entering a phase in which both methods will coexist. But my hunch is that we’re about to see a fascinating new twist on the old idea of literate programming. Some explanations can, will, and should be written by code authors alone, or by those authors in partnership with LLMs. Others can, will, and should be conjured dynamically by code readers who ask LLMs for explanations on the fly.

TRENDING STORIES
Jon Udell is an author and software developer who explores software tools and technologies and explains them in writing, audio, and video. He is the author of the cult classic Practical Internet Groupware. Past gigs include Lotus, BYTE magazine, Safari...
Read more from Jon Udell
SHARE THIS STORY
TRENDING STORIES
TNS owner Insight Partners is an investor in: Sourcegraph.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.