URL: https://thenewstack.io/from-agi-hype-to-engineering-reality-the-future-of-llms/

From AGI Hype to Engineering Reality: The Future of LLMs - The New Stack


From AGI Hype to Engineering Reality: The Future of LLMs

We explore how LLM providers may address current limitations with ontologies, hub-and-spoke models, and the use of local, personalized AI.
Dec 16th, 2025 9:00am by David Eastman

To see how Large Language Model (LLM) providers will try to improve their services over the next few years, we can start by asking how current limitations will be addressed. While LLMs have been fairly successful in the chat box format, they are expensive in terms of energy usage and have persistent problems with hallucinations. Software developers battle with increasing token usage to achieve more focused results.

There is still a bit of guesswork involved in working out how size vs. training truly affects output, but the problems with energy and hallucinations have put a limit on expansion. So this post looks at the possible directions that LLM providers might choose to move in.

But first we have to check the validity of Yann LeCun’s prediction that LLMs are a dead end. While this might ultimately be true with respect to “artificial general intelligence,” the sheer money and momentum invested in the AI companies ensure that we will still be using LLMs for some time yet. LeCun himself has launched a startup to “continue the Advanced Machine Intelligence research program (AMI) I have been pursuing over the last several years”; but this won’t bear fruit for a while.

Ontologies

Many of the old approaches to AI have been brushed aside by the successes of LLMs, but I still remember when it was assumed that artificial intelligence would be composed of large ontologies — think of these as concept maps, very much like hashtags, to connect ideas within some type of formal structure. Because LLMs train on vast amounts of information, they internalise concepts in a somewhat random manner, yet appear to understand how things relate. But we know that LLMs can help create knowledge graphs; and Retrieval-Augmented Generation (RAG) is one vital method used to keep an LLM response honest, by feeding it with formatted expert knowledge.

One possible approach to fight hallucination would be to focus on maintaining lots of large knowledge graphs in certain subject matters, and sharing these amongst other provider services.
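
As a rough illustration of how a curated knowledge graph could keep a response grounded, here is a minimal sketch. Everything in it is an assumption made for the example: the triples are hand-written from facts mentioned in this post, and the function names (`build_index`, `retrieve_facts`, `grounded_prompt`) are hypothetical. A real system would use a maintained graph store and a proper retrieval step rather than word matching.

```python
from collections import defaultdict

# Hypothetical, hand-written (subject, predicate, object) triples.
# A real knowledge graph would be curated and maintained by subject experts.
TRIPLES = [
    ("MCP", "created_by", "Anthropic"),
    ("ChatGPT", "made_by", "OpenAI"),
    ("Atlas", "is_a", "browser"),
    ("Atlas", "made_by", "OpenAI"),
]


def build_index(triples):
    """Group triples by lower-cased subject for quick lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject.lower()].append((subject, predicate, obj))
    return index


def retrieve_facts(question, index):
    """Return every triple whose subject appears as a word in the question."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    facts = []
    for subject_key in sorted(index):
        if subject_key in words:
            facts.extend(index[subject_key])
    return facts


def grounded_prompt(question, index):
    """Build a prompt that restricts the model to the retrieved facts."""
    facts = retrieve_facts(question, index)
    if not facts:
        return f"Question: {question}\nNo curated facts found; say you are unsure."
    lines = [f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts]
    return (
        "Answer using only these facts:\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )
```

Calling `grounded_prompt("Who made Atlas?", build_index(TRIPLES))` produces a prompt that lists only the two Atlas triples, so the model is steered towards the curated facts instead of whatever it internalised during training.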

The pressure to do this may be regulatory. For example, we’ve seen recently how Australia has put an age restriction on social networks because of the various negative effects of screen addiction on children. So it might be necessary to create the equivalent of the “Children’s Britannica” — a large set of information that doesn’t divulge facts from problematic areas. Maintained by third parties, more regulated information may persuade national governments that LLMs won’t spread biased facts.

Hub and Spoke

Formally sharing large amounts of information might work against the business models of competing providers, but working together could still lead to efficiency savings.

We already have some hope here: the early and slightly surprising universal acceptance of Anthropic’s Model Context Protocol (MCP) as the “USB of the LLM” may tell us that where an idea is good enough, competing providers (such as OpenAI in this case) will take it up.

OpenAI has already outlined a possible distribution model with its Apps SDK and how that might work with its Atlas browser. The idea here is to literally treat local knowledge as a kind of MCP server that the LLM can call on. In this way, OpenAI is taking a shot at replacing the web: it answers general queries with its ChatGPT model, but calls user application servers to get local expert information, in exactly the same way as it uses MCP tools to access your hard drive, for example.
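
To make the hub-and-spoke shape concrete, here is a small sketch of a hub that answers general queries itself but hands off to a registered "spoke" when a query touches that spoke's local knowledge. This is not the MCP protocol or the Apps SDK; the `Spoke` and `Hub` classes, the keyword matching, and the stub functions are all hypothetical and stand in for a real model and real application servers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Spoke:
    """A hypothetical application server holding local expert knowledge."""
    name: str
    keywords: List[str]
    handler: Callable[[str], str]


class Hub:
    """A hypothetical hub: a general model plus a registry of spokes."""

    def __init__(self, general_model: Callable[[str], str]) -> None:
        self.general_model = general_model
        self.spokes: List[Spoke] = []

    def register(self, spoke: Spoke) -> None:
        self.spokes.append(spoke)

    def find_spoke(self, query: str) -> Optional[Spoke]:
        """Return the first spoke with a keyword phrase found in the query."""
        text = query.lower()
        for spoke in self.spokes:
            if any(keyword in text for keyword in spoke.keywords):
                return spoke
        return None

    def answer(self, query: str) -> str:
        spoke = self.find_spoke(query)
        if spoke is None:
            # No local expert claims the query, so the general model answers.
            return self.general_model(query)
        return f"[{spoke.name}] {spoke.handler(query)}"


def fake_general_model(query: str) -> str:
    """Stub standing in for a call to a large hosted model."""
    return f"general answer to: {query}"


def bookshop_lookup(query: str) -> str:
    """Stub standing in for a shop's own application server."""
    return "3 copies in stock"


def build_demo_hub() -> Hub:
    hub = Hub(fake_general_model)
    hub.register(Spoke("bookshop", ["in stock", "opening hours"], bookshop_lookup))
    return hub
```

In a real deployment the model would presumably choose the tool itself from the descriptions each server publishes; the keyword match here only makes the routing visible.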

Local LLMs

Many people already run LLMs locally, and we have shown readers ways to do this over the past few years. While the big bleeding-edge models will remain in the cloud, there are plenty of smaller pre-trained open source models that users can run on their laptops. Running locally is still a bit technical, but apps like Ollama make it much simpler. Of course, the ultimate local machine might well be your phone.

We’ve already seen how agentic CLI systems can choose a quick cheap model for some queries, leaving the more expensive models for “deep thought” or “planning.” This suggests using a local model for smaller queries, while sending harder queries to the bigger models crunching in the cloud.
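
A minimal sketch of that kind of routing might look like the following. The heuristic (a word limit plus a few "hard task" hint words) and the names `choose_tier`, `route`, `HARD_HINTS` and `max_local_words` are assumptions for illustration only; the two models are passed in as plain callables rather than tied to any particular local runner or cloud API.

```python
from typing import Callable, Tuple

# Hypothetical hint words suggesting a query needs deeper reasoning.
HARD_HINTS = frozenset({"plan", "refactor", "architecture", "prove"})


def choose_tier(prompt: str, max_local_words: int = 40) -> str:
    """Return "local" for short, simple prompts and "cloud" otherwise."""
    words = [w.strip(".,?!").lower() for w in prompt.split()]
    if HARD_HINTS.intersection(words):
        return "cloud"
    if len(words) > max_local_words:
        return "cloud"
    return "local"


def route(
    prompt: str,
    local_model: Callable[[str], str],
    cloud_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Send the prompt to the chosen model; return (tier, response)."""
    tier = choose_tier(prompt)
    model = local_model if tier == "local" else cloud_model
    return tier, model(prompt)
```

Matching whole words rather than substrings avoids sending "give me an explanation" to the cloud just because it contains the letters "plan". A production router would more likely use a small classifier, or let the local model decide when to escalate.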

The Life Stream

The other reason to look locally is to pick up the user’s personal context. This begins to make a lot of sense when we see how good Google has historically been at answering user queries, because it knows enough about them to exclude irrelevant results.

It is reasonable to assume that Amazon trains LLMs with information from millions of Alexa speakers, as well as isolating the identity of individual speakers in a family. But a local LLM could just listen to and read all your speech and content in order to fully understand not just your geographical location, but also what interests you in detail.

While the possibly Orwellian consequences of the “life stream apps” did alarm us in the 2010s, we still filled them with our continuous status reports. Agentic CLIs use setup markdown files to give the LLM hints about a project, so analysing a user over time could certainly be more efficient. Socrates is supposed to have said “the unexamined life is not worth living,” and while I doubt he would’ve approved of AI, a moderate amount of recording could certainly give a rich (if personal) graph for an LLM to start working with.

Conclusion

It might need a small “correction” (i.e. crash) in the market before the large providers set to work on improving efficiency, or for investors to move away from chasing “artificial general intelligence.” Perhaps large companies will move together into another hype area to continue the AI momentum and keep their share prices high. But the chances are good that engineering will shift into improving existing investments.

If you use LLMs for software development, you will have a ringside seat for any upcoming changes.

David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....
TNS owner Insight Partners is an investor in: Anthropic, OpenAI.