URL: https://thenewstack.io/human-insight-llm-grunt-work-creative-publishing-solution/

Human Insight + LLM Grunt Work = Creative Publishing Solution - The New Stack


Human Insight + LLM Grunt Work = Creative Publishing Solution

A hybrid Google Docs / Markdown workflow, with the help of LLMs, can speed production and unify the review/revise cycle.
Jun 25th, 2024 7:03am by Jon Udell

If you publish product documentation to a website, this might be a familiar pattern. You write an initial draft in Google Docs so your team can review and propose changes. It’s a rich authoring environment in which you can insert images by pasting bits copied from screens, and easily create and rearrange tables. When the team reaches a consensus, you have a nice representation of the web page you’d like to appear on your site.

But how do you make that happen? In this post, we’ll look at a surprisingly simple method that’s proven effective. In our case it’s tuned for a workflow that pushes GitHub repositories into Next.js-based sites hosted by Vercel, so our Markdown text and associated images are copied into those repos. But the method will work for any Markdown-oriented publishing system.

The naive solution is, of course, to just export the Google Doc as HTML. That gets you most of the way there, but the last mile is a slippery slope. Google Docs won’t let you create custom styles to align elements with their counterparts in your published web page. And the images are sourced from googleusercontent, which is convenient (they seem to just magically work), but you probably want those images stored as named files in your publishing system.

So what I’ve seen happen, in several different environments, is manual transfer and reformatting that becomes a tax on the collaborative benefit of Google Docs.

I’d taken a few runs at this problem in the pre-LLM era, so for starters I revisited what my new team of assistants might bring to the party. Feeding an HTML export from Google Docs into Python’s Markdownify got us tantalizingly close, but that last mile really is a slippery slide into the weeds of document conversion.

Then it occurred to me: people are already writing triple backticks for code blocks, single backticks for inline code, ** and * for bold and italic, and Markdown-style links and lists. Why not use that Markdown syntax in Google Docs and pass it through to publishable Markdown? That turned out to be a really fruitful idea that was easy to implement.
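
To make the pass-through idea concrete, here is a minimal sketch. It assumes the exported HTML wraps each paragraph in a <p> element, and it uses Python’s standard-library html.parser as a simplified stand-in for the Markdownify step. The names ParagraphText and passthrough are hypothetical, not taken from the actual pipeline.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the plain text of each <p>, discarding inline wrappers and styles."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None means "not currently inside a <p>"

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.paragraphs.append("".join(self._buffer))
            self._buffer = None


def passthrough(html: str) -> str:
    """Return the document text with any Markdown typed by the authors left untouched."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)


exported = (
    '<p><span style="font-weight: 400">Use **bold** and `code`.</span></p>'
    "<p><span>See [the docs](https://example.com).</span></p>"
)
print(passthrough(exported))
```

The point of the sketch is that nothing needs to be converted: the asterisks, backticks, and link brackets the authors typed are already the Markdown you want to publish.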

Progressive Pattern Simplification

Here’s the syntax I’m using to identify images in Google Docs.

[image: my_image_1 70%]

Here’s how that looks in the exported HTML.

[image: my_image_1 70%]<br></span><span style="overflow: hidden; display: inline-block; margin: 0.00px 0.00px; border: 0.00px solid #000000; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px); width: 620.50px; height: 648.64px;"><img alt="" src="https://lh7-us.googleusercontent.com/FcQFpX3FcEHFx0WmbZlMSKryKxd9hnkdtvSwnlS-v5Y92x6VKfYFhb_b8yKlrx6q8j2gARKB5AGfQRJlMVpz4JbTzdQEg0GiHnreS7U0bD-XehoZT6S_DydXtSgpnlPutG8pso1XBZChKvl0" style="width: 620.50px; height: 648.64px; margin-left: 0.00px; margin-top: 0.00px; transform: rotate(0.00rad) translateZ(0px); -webkit-transform: rotate(0.00rad) translateZ(0px);" title="">

Here’s the Markdownified version of it.

[image: my_image_1 70%] \n \n![](https://lh7-us.googleusercontent.com/FcQFpX3FcEHFx0WmbZlMSKryKxd9hnkdtvSwnlS-v5Y92x6VKfYFhb_b8yKlrx6q8j2gARKB5AGfQRJlMVpz4JbTzdQEg0GiHnreS7U0bD-XehoZT6S_DydXtSgpnlPutG8pso1XBZChKvl0)

Now two things need to happen: the image needs to be downloaded to a file with the indicated name, and the whole string needs to be replaced with this element.

<p><img alt="my_image_1" style={{"width":"70%"}} src="/images/docs/my_image_1.png"/></p>

Since the site is fed by Markdown, why not a Markdown-style link? You can’t specify the optional width attribute in Markdown, so HTML syntax is necessary. (Why the double braces? There’s another level here: for attributes, the publishing system requires JSX syntax.)

Here are the functions that do the work.
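
As a rough illustration of the two steps (not the original functions), here is a minimal sketch in Python. It assumes the marker is immediately followed by the Markdownified image, that images are saved as PNG files, and that they are served from /images/docs. The names IMAGE_PATTERN, image_tag, convert_images and download_images are hypothetical.

```python
import re
import urllib.request
from pathlib import Path

# Long-form (verbose) regex: the marker, then the Markdownified image after it.
IMAGE_PATTERN = re.compile(
    r"""
    \[image:\s*            # literal "[image:" that opens the marker
    (?P<name>[\w-]+)       # file name stem, e.g. my_image_1
    \s+
    (?P<width>\d+%)        # display width, e.g. 70%
    \s*\]                  # closing bracket of the marker
    \s*                    # spaces and newlines between marker and image
    !\[[^\]]*\]            # Markdown image with (usually empty) alt text
    \((?P<url>[^)\s]+)\)   # the image URL in parentheses
    """,
    re.VERBOSE,
)


def image_tag(name: str, width: str, image_dir: str = "/images/docs") -> str:
    """Build the <img> element, with the style attribute in JSX syntax."""
    return (
        f'<p><img alt="{name}" style={{{{"width":"{width}"}}}} '
        f'src="{image_dir}/{name}.png"/></p>'
    )


def convert_images(markdown: str):
    """Return (rewritten markdown, list of (name, url) pairs to download)."""
    downloads = []

    def replace(match):
        downloads.append((match["name"], match["url"]))
        return image_tag(match["name"], match["width"])

    return IMAGE_PATTERN.sub(replace, markdown), downloads


def download_images(downloads, target_dir: str) -> None:
    """Save each image under its indicated name (needs network access)."""
    out = Path(target_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, url in downloads:
        urllib.request.urlretrieve(url, str(out / f"{name}.png"))


sample = "[image: my_image_1 70%] \n \n![](https://example.com/abc123)"
converted, pending = convert_images(sample)
print(converted)
```

Keeping the rewrite separate from the download means the text transformation can be checked without touching the network; download_images would then be called with the returned list and the repo’s image directory.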

My assistants helped with the coding, as usual, but they didn’t come up with the idea, nor would I have expected them to. While large language models (LLMs) can certainly be useful rubber ducks, I think that for now this kind of creative and non-obvious solution will require human insight. Maybe chain-of-thought prompting could have gotten me there, and if so I’d be delighted to learn how.

But meanwhile, I’m just grateful for the coding help. In this case, that included writing the regular expression, testing it, and then revising it to use long form with comments. I’m better at having an idea like this than hammering out the details, so I appreciate the help.

Please, ChatGPT, Don’t Write Code Unless I Ask!

That’s a sentence I never imagined I’d write. But as I worked through this exercise, I became increasingly frustrated by both ChatGPT’s and Claude’s eagerness to dive straight into coding. I asked both to refrain so that we could first discuss strategy. Claude responded pretty well to that request, but ChatGPT was stubbornly determined to write code.

When I mentioned this on Mastodon, Josh Kellendonk helpfully replied.

“I added instructions to my user base prompt not to generate code unless explicitly asked. ChatGPT has been much better behaved since.”

It was news to me that you can do that by way of the “Customize ChatGPT” link on your profile dropdown. And that does seem to have helped. But wow, ChatGPT really wants to code, and I’m finding it’s still a struggle to keep it focused on strategy.

Frictionless Screenshots

Screenshots are fundamental to software documentation. As developers, we use words and pictures of screens together to explain the various states that applications can be in. If the pictures are costly to create and maintain, we’ll use fewer of them than we ideally would. It’s easy enough to grab a region from a screen, and maybe tweak it in an image editor.

But there’s a nontrivial cost for moving that image into the publishing system. You have to name it in the image editor, save it to the right place, and then insert a reference to it in the page where it will display. And when the application’s UX changes, you have to redo those steps.

Reducing that overhead has been a major benefit of this approach. Just capture bits from a screen, paste them into a Google Doc, write a descriptive name (which also serves as alt text), and that’s it. You’re done in seconds and, if you need to revise, that’s just as quick.

I can envision a knock-on benefit too. In “How to Learn Unfamiliar Software Tools with ChatGPT,” we saw how it’s now useful to upload pictures of application states. As the distinction between text and pictures of text fades away, pictures become rich prompts that can augment (or even supersede) verbal descriptions.

I first experienced this powerful mode when learning how to plot an equation in an unfamiliar tool, GeoGebra, by showing ChatGPT screenshots of failed efforts. That was effective, I think, because GeoGebra is a popular tool that’s well-represented in the corpus of documents that LLMs feed on.

As noted in “How AI Can Help Improve Our Documentation,” we’re making effective use of Unblocked, which has ingested our documentation and can answer questions with help from that corpus. It doesn’t yet work with images but, if it gains that capability — and/or as the docs make their way into the public LLMs — screenshots will become a powerful alternate way to ask questions like: How did I land on this screen? What went wrong? Where do I go next? The more application states pictorially represented in the docs, the likelier it will be that a user’s captured screenshot will be an effective prompt.

If things turn out that way, it’ll be very different to what I once envisioned. It’s always bugged me that we have to write docs that say things like: “Click the Save button (at the top right of the screen).” My notion was always that the web platform affords a better solution: links. If an application’s URL surface area were sufficiently rich, you could just cite the link that results from following that instruction.

But there’s a practical limit to how much application state you can represent in a URL. And ultimately the state that matters is the one that’s rendered to the screen. So I think screenshots will play an even more important role than they already do. And less publishing friction means we can have more of them.

Integrated Review and Revision

Although streamlined publishing of screenshots is nice, the biggest win comes from reviewing and revising in Google Docs which, for better and worse, has become the de facto collaboration standard for many of us. Vercel does support comments on drafts, but you can’t just click “Accept” to incorporate a suggestion into a published doc. And it’s hard to keep track of comments in two places.

Accepting suggestions in Google Docs, and thereby automatically updating the published web page, is much cleaner and simpler. Requested changes can be a lot easier to make too. I’d rather add a column to a table in a Google Doc than wrangle Markdown in a text editor!

Of course, Google Docs has its share of bugs. One that’s bitten me on this project: assigning H-level headings in close proximity to normal text. It can be fussy to separate the two, and while it’s nice to see styled headings in a Google Doc, I might fall back to using Markdown headings. But that’s a minor nit, because on the whole this hybrid method has been a big win.

Jon Udell is an author and software developer who explores software tools and technologies and explains them in writing, audio, and video. He is the author of the cult classic Practical Internet Groupware. Past gigs include Lotus, BYTE magazine, Safari...