VOOZH about

URL: https://thenewstack.io/save-valuable-genai-tokens-with-this-one-simple-trick/

⇱ Save Valuable GenAI Tokens With This One Simple Trick - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2025-02-24 08:00:12
Save Valuable GenAI Tokens With This One Simple Trick
AI / Data / Large Language Models

Save Valuable GenAI Tokens With This One Simple Trick

LLMs are great at text but lousy (and expensive) at business analytics. This TechTalk from the Association for Computing Machinery shows how to get answers, with minimal cost.
Feb 24th, 2025 8:00am by Joab Jackson
👁 Featued image for: Save Valuable GenAI Tokens With This One Simple Trick

Large Language Model (LLM)-based Generative AI services such as OpenAI or Google Gemini are better at some tasks more so than others, advised Wei-Meng Lee, technologist and founder of Developer Learning Solutions, in ACM TechTalk last month, entitled “Unlock Hugging Face: Simplify AI with Transformers, LLMs, RAG, Fine-Tuning.”

For instance, the LLMs are not good at performing analytical tasks, surprisingly enough. And even if you did want to use an LLM for the task, it’d probably be prohibitively expensive, given the size of your data set.

Say you have a CSV file with 20 columns and five million rows. It may include transaction records, as well as customer data. You want to ask a question such as what did this customer purchase on one particular day. How much did you earn this month? Is this easy work for the LLM?

“The thing is, no,” Wei Meng explained. “LLMs are very bad at analytical tasks.”

Certainly, LLMs are very good at text-based questions and extracting information from large, unstructured bodies of text. However, numerical analysis is still a challenge.

But there is a way you can still use LLMs for such tasks.

Tokens and Dollars

The information that the user provides and the answers received when interacting with a GenAI chat service is known as the “context window size.” This is usually measured in tokens.

Roughly speaking, one token roughly equals 3/4ths of an English word. Parts of words can be whole tokens, with the prefixes and suffixes making up their own tokens.

👁 Image

An example of tokenization from a Hugging Face course on building AI Agents.

Services have different context size windows. OpenAI‘s GPT-40-mini has a content window size of 128,000 tokens, or about 96,000 words and associated characters, with the both the question and answer.

So, you must stuff your entire question, along with all the supporting data, into the context window.

“For normal chat, not a problem,” Wei Meng said.

But if you are using really large data sets, this will cost you!

A 20-column five million row CSV value file will chew through that token window rapidly.

Exceed the context window size, and you will get an error message or incur extra fees.

Also shipping your data outside puts the privacy of your data at risk.

Do This Instead

Instead of sending an entire data set, keep the data on your server. Then, formulate the prompt by including a description of the format of the dataset, perhaps with the schema itself, and maybe even a few sample, anonymized, examples.

Then, instead of asking the GenAI to answer your questions, ask the GenAI to generate the code or queries necessary to answer them.

Then, you execute the code in your local environment.

“You do not violate the context window size. You do not sacrifice the privacy of your data,” he said.

In an example, Wei Meng said showed how one could analyze the dataset of all the passengers aboard the ill-fated Titanic voyage, using OpenAI and the LM Studio. A CVS file with the data — which had 891 rows and 12 rows  —  was loaded into a Python DataFrame.

Here is the prompt he then gave OpenAI:


{
'role':'userf',
'content':'''
Here is the schema of my data:
PassengerID,Survived,Pclass,Name,Sex,Age,Sib5p,Parch,Ticket,Fare,Cabin,Embarked
Note that for Survived, 0 means dead, 1 means alive
Return the answer in Python code only
For your info, I have already loaded the CSV file into a dataframe named df
'''

}

In general, the more descriptive the prompt, the better answers you’ll get, Wei Meng advised.

Once all the prompts are loaded in, you can ask your questions, such as

  • What is the proportion of male and female passengers?
  • Can you visualize the survival rate for each passenger class (Pclass)?
  • Can you visualize the survival rate of passengers traveling alone vs. with family?

Note that the answers do not need to be text-based if you have, in this case, the Python visualization tools on hand.

Using a Jupyter Notebook or LM Studio, you could even automate the execution of the query yourself, with results displayed back in the workspace as soon as it is returned.

“The nice thing is that you don’t have to upload the data, or learn data analysis,” he said.

👁 Screenshot of solution illustration.

From the Hugging Face presentation by Wei-Meng Lee.

What Is Hugging Face?

Wei-Meng Lee’s presentation concerned itself chiefly with how to use Hugging Face, a collaborative platform for developers and researchers to use and collaborate on machine learning models, datasets and applications.

In the presentation, Wei-Meng showed how to use Hugging Face’s pre-trained models through the company’s Transformers API. Hugging Face’s pipeline objects can then ease the task of using these models, he then goes on to demonstrate. And he shows how to use the Gradio library to easily run LLM-based Python applications.

“Gradio allows you to build a very nice web frontend with just a couple of lines of code,” Wei-Meng said.

TRENDING STORIES
Joab Jackson is a senior editor for The New Stack, covering cloud native computing and system operations. He has reported on IT infrastructure and development for over 30 years, including stints at IDG and Government Computer News. Before that, he...
Read more from Joab Jackson
SHARE THIS STORY
TRENDING STORIES
TNS owner Insight Partners is an investor in: OpenAI.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.