URL: https://thenewstack.io/what-temperature-means-in-natural-language-processing-and-ai/

⇱ What Temperature Means in Natural Language Processing and AI - The New Stack


2024-01-09 03:00:55
AI / Data

What Temperature Means in Natural Language Processing and AI

In generative AI, "temperature" refers to raised entropy. Here's what that means and why raising the temperature might result in more hallucinations.
Jan 9th, 2024 3:00am by David Eastman
[Featured image: photo by Denny Müller from Unsplash.]

One of the issues that keeps bubbling to the surface with increasing use of ChatGPT is the occasional inclusion of obviously incorrect information within responses, accurately described as hallucinations. Why does this occur, and can it be controlled?

When we were looking at a simple OpenAI API query, we bumped into the variable temperature. Beyond observing that it can be set between 0 and 1 (the OpenAI API itself accepts values up to 2), we merely noted that it controlled “the creativity of the response.” Here’s a lightly technical look at what this means.

Before moving on, we had better briefly remember that when an engineering mind thinks “temperature,” they are not thinking “it’s getting hot in here” so much as “raised entropy.” Consider the extra jiggling about of excited molecules as an increased range of (random) possibilities.

Temperature is not specific to OpenAI; it belongs more to the ideas of natural language processing (NLP). While large language models (LLMs) represent the current peak in text generation for a given context, this basic ability to work out the next word has been available with predictive text on your phone for decades.

To understand where the variations come from, let’s consider how a simplistic model learns from examples.

Consider a model ingesting its first-ever sentence:

To be or not to be.

It understands the sentence as a string of ordered words, with the full stop indicating the end. If this is the only sentence it knows, it won’t be doing any decent predicting. And if you do happen to type “To be … ” then it will only suggest Hamlet’s famous line.

So we will add one more line to the model:

To be young again.

Combining the two, we get the possibility of producing either line after the first “To be.” We recognize the full stop as the end of the phrase, so that can be shared by either option, just like the first two words.

[Figure: The options that might be produced from a model based on the previous two inputs.]

So the orange line represents a variation. Our model now understands two lines.
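As a rough illustration of the two-sentence model, here is a minimal Python sketch. It assumes, as the text does at this point, that each word and the full stop is one token, and it simply records which tokens have followed each prefix. The function names `tokenize` and `build_model` are ours, invented for this example, not part of any library.

```python
from collections import defaultdict

def tokenize(sentence):
    # Treat each word as one token and the full stop as its own token.
    return sentence.replace(".", " .").split()

def build_model(sentences):
    # Map each prefix (the tokens seen so far) to the set of tokens that followed it.
    model = defaultdict(set)
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            model[tuple(tokens[:i])].add(token)
    return model

model = build_model(["To be or not to be.", "To be young again."])

# After "To be" the model now knows two ways to continue.
options_after_to_be = sorted(model[("To", "be")])
print(options_after_to_be)  # ['or', 'young']
```

The shared prefix "To be" is stored once, and the branch point is the only place where the model has a choice to make.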

I should note that I treated each word, and the full stop, as a token or unit to be consumed. But words are not really discrete entities; we know that the words “doing” and “done” are forms of the same word, or that “ships” is the plural of “ship.” We also know that the word “disengage” is the word “engage” with a prefix.

In short, words themselves seem to be made of tokens. Within models driven by the English language, there are roughly 1.3 tokens per word, and this will be different for different languages. The other reason to have a feel for tokens is that this is how GPT models charge you, so price per token is something you need to understand.
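To get a feel for what the 1.3 figure implies, here is a back-of-the-envelope sketch. The ratio is the rough English average quoted above; the price used in the example is purely hypothetical and is not a real OpenAI rate, and a real tokenizer would give a different count for any given text.

```python
TOKENS_PER_WORD = 1.3  # rough English average quoted in the article

def estimate_tokens(text):
    # Approximate the token count from the number of whitespace-separated words.
    return round(len(text.split()) * TOKENS_PER_WORD)

def estimate_cost(text, price_per_1k_tokens):
    # price_per_1k_tokens is supplied by the caller; any value used here is an assumption.
    return estimate_tokens(text) / 1000 * price_per_1k_tokens

prompt = "To be or not to be"
tokens = estimate_tokens(prompt)
cost = estimate_cost(prompt, price_per_1k_tokens=0.002)  # hypothetical price
print(tokens, cost)
```

This is only good for budgeting; the actual count depends on the model's tokenizer.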

What Are the Odds?

Training is the process where tokens and context are learned, until there are multiple options with varying probability of occurring. If we assume our simple model from above has taken in hundreds of examples from text, it will know that “To be frank” and “To be continued” are far more likely to occur than Shakespeare’s 400-year-old soliloquy.

If we were to do a kind of bell curve around the next word after “To be …” we would naturally expect some to be very likely and some to be much less likely. In the diagram below, a block represents a large number of examples. So possible words that don’t appear as options have too few example references.

Let us consider a possible top five:

[Figure: A block of possible options based on the input “To be …”]

If we add up all the blocks, we can express simply enough the chance for any word to be randomly selected. So “continued” would be six chances in 14, or about 43% likely to appear next, whereas “or” would only be one in 14, or about 7%. It is already clearly the case that some words are much less likely to appear.
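The block arithmetic can be sketched in a few lines of Python. Only the counts for “continued” (6), “or” (1) and the total of 14 come from the text; the other three words and their counts are invented here to fill out a plausible top five.

```python
# "continued": 6, "or": 1 and the total of 14 are from the article.
# "frank", "honest", "young" and their counts are assumptions for illustration.
block_counts = {"continued": 6, "frank": 4, "honest": 2, "or": 1, "young": 1}

def to_probabilities(counts):
    # Each word's chance is its share of all the blocks.
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

probabilities = to_probabilities(block_counts)
for word, p in probabilities.items():
    print(f"{word}: {p:.1%}")
```

Picking the next word is then a weighted draw from this table rather than always taking the top entry.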

What if we flattened the curve? This would clearly still express the likely responses as higher probability, but it allows the less common options a better chance to be selected:

[Figure: A flatter curve shows the possible options to follow the input “To be …”]

This has changed the likelihood of “continued” to 36% and moved “or” up to 9%. So the odds have shortened for a wider variety of words.

This is effectively what raising the temperature does. It flattens the curve, giving the less likely responses a boost. If the temperature is zero, then the model may only choose the highest-probability token. Just as a reminder, when you call the OpenAI API directly, you get to set the temperature value directly:

# Replace xx-xxxxXX with your own API key.
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer xx-xxxxXX" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "What is TheNewStack?"}],
    "temperature": 0.7
  }'
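For readers who would rather script this than use curl, here is a sketch of the same request built with Python's standard library. The endpoint, model, message and temperature are copied from the call above; the helper name `build_chat_request` is ours, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_chat_request(api_key, prompt, temperature=0.7):
    # Same JSON body as the curl example, with temperature as a parameter.
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

request = build_chat_request("xx-xxxxXX", "What is TheNewStack?")
# To actually send it you would call urllib.request.urlopen(request),
# which needs a real API key and network access.
```

Keeping temperature as a function argument makes it easy to try the same prompt at several settings.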

Because we might be looking for an interesting and original response, a value of temperature nearer 1 makes sense.
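The curve flattening described earlier can be sketched with the conventional form of temperature scaling: divide each log-probability by the temperature T and renormalize, which is the same as raising each probability to the power 1/T. This is a simplified sketch, not OpenAI's implementation; real models apply the division to raw scores (logits) before the softmax, and treating T = 0 as "always pick the top token" is a common convention we assume here. The starting distribution reuses the 14-block example, with three of the five words invented for illustration.

```python
import math

def apply_temperature(probabilities, temperature):
    # Assumes every probability is greater than zero.
    if temperature == 0:
        # Convention: zero temperature puts all the weight on the most likely token.
        best = max(probabilities, key=probabilities.get)
        return {t: (1.0 if t == best else 0.0) for t in probabilities}
    # Dividing log-probabilities by T equals raising p to the power 1/T.
    scaled = {t: math.exp(math.log(p) / temperature) for t, p in probabilities.items()}
    total = sum(scaled.values())
    return {t: v / total for t, v in scaled.items()}

# 6/14 and 1/14 come from the article; the other entries are assumptions.
base = {"continued": 6 / 14, "frank": 4 / 14, "honest": 2 / 14, "or": 1 / 14, "young": 1 / 14}

for temperature in (0, 0.5, 1, 2):
    adjusted = apply_temperature(base, temperature)
    print(temperature, {t: round(p, 3) for t, p in adjusted.items()})
```

At T = 1 the distribution is unchanged; below 1 it sharpens toward “continued,” and above 1 it flattens so that “or” gets a better chance.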

Now you may well say, “But surely this increases the chances that the model will respond with stuff that isn’t true?” We are then faced with the question of matching the task to the appropriate temperature. This is done by differentiating between “creative” output and “factual” output. If we use too high a temperature with factual material, we are likely to produce the dreaded hallucinations.
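To see why the task matters, here is a small simulation of how often a sampler strays from the most likely token at different temperatures. It is a toy under stated assumptions: the distribution is the 14-block example (three of the five words are invented), the weighting is the usual p to the power 1/T, and `sample_tokens` and `share_of_top_token` are names made up for this sketch.

```python
import random

def sample_tokens(probabilities, temperature, n, seed=0):
    # A seeded generator keeps the run repeatable.
    rng = random.Random(seed)
    tokens = list(probabilities)
    if temperature == 0:
        # Convention: zero temperature always returns the most likely token.
        best = max(tokens, key=probabilities.get)
        return [best] * n
    weights = [probabilities[t] ** (1 / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)

def share_of_top_token(probabilities, temperature, n=10_000):
    # Fraction of draws that picked the single most likely token.
    best = max(probabilities, key=probabilities.get)
    draws = sample_tokens(probabilities, temperature, n)
    return draws.count(best) / n

# 6/14 and 1/14 come from the article; the other entries are assumptions.
base = {"continued": 6 / 14, "frank": 4 / 14, "honest": 2 / 14, "or": 1 / 14, "young": 1 / 14}

for temperature in (0, 0.5, 1, 2):
    print(temperature, share_of_top_token(base, temperature))
```

For factual material you want the share of the top token to stay high, which argues for a low temperature; for creative output, the extra wandering is the point.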

Temperature Veils the Source of Chatbot Responses

The great mission of ChatGPT is to fool you into thinking that AI has “thought” of an answer. It hasn’t. It is doing a much more sophisticated version of the above, with millions of ingested tokens, but it is still entirely guided by pre-constructed LLMs. That is why a response can look authoritative and yet be absolute nonsense.

However, as we see in everyday use, ChatGPT works very well in most cases. This is because for every question you might have, someone has answered it, directly or inadvertently, somewhere on the internet. ChatGPT’s real task is to understand the context of the question and reflect that in the response.

When I read a weather report in my local newspaper, I am not “ripping them off” if I later use that information to answer a friend who wonders if it will be sunny tomorrow. Newspapers are (or were) intended as valid sources of information. But clearly, if I take large parts of text from an expert’s report and reclaim it as my own, this could be fraud.

There will be increasing legal pressure for models not to blurt out responses that make it absolutely obvious where the source material was taken from. And this is why hallucinations are likely to remain, as temperature is used to vary responses and veil their source. Oddly, the same principle was once used to defeat spam detection: adding mistakes to spam email made it difficult to blacklist at first. Gmail overcame this by its sheer size and ability to understand patterns in distribution.

Overall we recognize LLMs as socially positive. Eventually the law will formalize around the do’s and don’ts of the training process. But between now and then, there will be plenty of opportunities for the temperature to rise over LLMs misappropriating other creators’ content.

David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....
TNS owner Insight Partners is an investor in: OpenAI.