
URL: https://thenewstack.io/expectations-for-agentic-coding-tools-testing-gemini-cli/

Expectations for Agentic Coding Tools: Testing Gemini CLI - The New Stack



Expectations for Agentic Coding Tools: Testing Gemini CLI

What are the 'quality-of-life' expectations for agentic applications? To learn more, we test Gemini CLI, Google's open source AI terminal app.
Jul 7th, 2025 5:28am by David Eastman
Before I launch Gemini CLI, Google’s open source AI terminal app, let’s look at what the “quality-of-life” expectations are for agentic applications. Now that we have several of these tools (Claude Code, Warp and OpenAI Codex are other examples), we have a better sense of what a developer needs from them.

First, it needs to be easy to get started on the command line in your terminal. Developers are still the primary target audience for agentic apps, so environment variables or flags for options are fine. But getting straight in is vital. For example, connecting your API key to your account can be done via an environment variable or in a web console. Knowing when you are running out of tokens (whether freely given or paid for) is now an important gauge.

When we hit the start button, we need a simple session intro summary so that we know at least the following things:
  • The model in use;
  • The project directory;
  • Any other pertinent permission or account information, or if a working file is being watched.
A working file in the project directory where assumptions based on the project are written and can be tracked (like the CLAUDE.md file) is an important innovation that moves us beyond a session life cycle into a project life cycle.

Permission boundaries have to be respected, and in general we are in the early days regarding when to allow the large language model (LLM) to change files, and where. I’ve argued that forcing vibe coders to use git is a bit malign, but then again, if you fail to plan you are clearly planning to fail.

Showing us an execution plan the LLM will follow to fulfill your request feels good, but has not yet proven to be essential. Without one, though, the exact tactics an LLM will use are opaque. A simple checkbox list will suffice.

A quit session summary showing time, requests and tokens used is great. Full accounts can really only be tracked on a user page.

There are plenty of other features that will creep into the above list, but we need to be aware of backsliding as well as genuinely useful innovations.

Starting up Gemini

As with all cloud-based LLMs, we must show our fealty before we get access to the precious tokens. Go to Google AI Studio to generate a key. Currently you are given 100 requests a day (check the other tier limits here). We can install Gemini CLI via npm at the terminal:
npm install -g @google/gemini-cli
Next, set your API key as an environment variable. I’m doing it here in the command line on my MacBook: [Image]
Then type the command gemini and we are off: [Image]
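Since the screenshots are not reproduced here, the following is a minimal preflight sketch of the two steps above: it checks that the gemini command is on the PATH and that a key is present in the environment. The variable name GEMINI_API_KEY is my assumption about what the CLI reads, and preflight is a hypothetical helper, not part of Gemini CLI.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems that would stop a session from starting."""
    problems = []
    # The npm install step should have put a `gemini` executable on the PATH.
    if which("gemini") is None:
        problems.append("gemini not found on PATH")
    # ASSUMPTION: the key is read from GEMINI_API_KEY; adjust if your setup differs.
    if not env.get("GEMINI_API_KEY", "").strip():
        problems.append("GEMINI_API_KEY is not set")
    return problems
```

Calling preflight() with no arguments inspects the real environment; an empty list means both checks passed.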
As I mentioned in the quality-of-life section above, this does the important thing of pointing at the active model (Gemini 2.5 Pro in this case) as well as reflecting the project directory. The theme selection screen disappears as soon as you press return, but I assume you can bring it back. It takes up quite a lot of space on the introduction screen. Like Claude Code, there is a markdown file (GEMINI.md in this case) for request customization. I won’t use it in this post.

What does “no sandbox” mean? The bad news is that Gemini starts off with no restrictions as to where your AI may roam. I’m afraid that isn’t very sensible, but Gemini gives you fairly straightforward options. The good news is that we can use macOS Seatbelt, which starts off with a sensible policy of restricting access to within the project directory. So I’ll exit this session (type /quit) and we can restart with this basic security. The quit screen provides some of the stats I referred to earlier: [Image]
We can use Seatbelt by just setting an environment variable in this session, then adding a flag: [Image]
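To make the idea of “restricting access to within the project directory” concrete, here is a small sketch of the containment check such a policy implies. It only illustrates the rule: Seatbelt enforces its policy at the operating system level, not like this, and is_inside_project is a hypothetical name of my own.

```python
from pathlib import Path


def is_inside_project(candidate: str, project_dir: str) -> bool:
    """True if candidate, read relative to project_dir, stays within it."""
    root = Path(project_dir).resolve()
    # Joining with an absolute candidate discards root, so absolute paths
    # outside the project are caught by the same comparison.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

A relative file name such as original_cities.json passes, while ../secrets.txt or /etc/passwd does not.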
Now we are good to go, as we have our seatbelt on. As I did with Codex in a recent post, let’s try out the merge of two JSON files. As before, I’m looking for how the structure supports me, as much as the outcome. If you don’t want to read the previous post, imagine I have a city website that uses JSON data. I have a JSON file called original_cities.json:
{
  "cities": [
    {
      "id": "London",
      "text": "London is the capital of the UK",
      "image": "BigBen"
    },
    {
      "id": "Berlin",
      "text": "Great night club scene",
      "image": "Brandonburg Gate",
      "imageintended": "Reichstag"
    },
    {
      "id": "Paris",
      "text": "Held the Olympics of 2024",
      "image": "EifelTower",
    }
  ]
}
The spelling errors and formatting error (extra comma) are intentional; we want to see if we can bait the LLM. I also have another file, called updated_cities.json:
{
  "cities": [
    {
      "id": "London",
      "text": "London is the capital and largest city in Great Britain",
      "image": "BigBen"
    },
    {
      "id": "Berlin",
      "text": "Great night club scene but a small population",
      "image": "BrandenburgGate",
      "imageintended": "Reichstag"
    },
    {
      "id": "Paris",
      "text": "Held the Olympics of 2024",
      "image": "NotreDame"
    },
    {
      "id": "Rome",
      "text": "The Eternal City",
      "image": "TheColleseum"
    }
  ]
}
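The extra comma in original_cities.json matters because a strict parser rejects it outright. As a rough sketch, this is how Python’s standard json module reacts to a cut-down version of the Paris entry, together with a deliberately naive repair. The parse_lenient helper and the BROKEN sample are mine, for illustration only.

```python
import json
import re

# Cut-down version of the Paris entry, keeping the trailing comma.
BROKEN = '{"cities": [{"id": "Paris", "image": "EifelTower", }]}'


def parse_lenient(text: str) -> dict:
    """Parse JSON, retrying once with trailing commas stripped."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Remove a comma that is followed only by whitespace and a closing
        # brace or bracket. Naive: it would also rewrite that same pattern
        # if it appeared inside a string value.
        repaired = re.sub(r",\s*([}\]])", r"\1", text)
        return json.loads(repaired)
```

Calling json.loads(BROKEN) raises json.JSONDecodeError, while parse_lenient(BROKEN) returns the parsed dictionary.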
I want to update the first file with the contents of the second. This simulates slightly out-of-sync working. I have one condition: I want any updated image references (that I may not have yet) copied into a key called “imageintended” so that I don’t use the data and cause a crash. Essentially, all the merge should do is add the Rome entry to the first file and introduce the new image references without overwriting the existing image key.

So my project folder looks like this. Note, I haven’t created a GEMINI.md file: [Image]
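For reference, the merge rule described above is small enough to write down deterministically, which gives something to check an agent’s output against. This is one reading of the rule, not what Gemini or Codex runs: merge_cities is a hypothetical helper, and I assume entries are matched on “id” and that every field other than “image” is simply taken from the updated file.

```python
import copy


def merge_cities(original: dict, updated: dict) -> dict:
    """Merge updated into original without ever overwriting an 'image' value."""
    merged = copy.deepcopy(original)
    by_id = {city["id"]: city for city in merged["cities"]}
    for new in updated["cities"]:
        old = by_id.get(new["id"])
        if old is None:
            # Unknown city (Rome in the article): add it as-is.
            merged["cities"].append(copy.deepcopy(new))
            continue
        for key, value in new.items():
            if key not in ("id", "image"):
                old[key] = value  # e.g. refreshed "text"
        # A changed image goes into "imageintended"; "image" stays untouched.
        if "image" in new and new["image"] != old.get("image"):
            old["imageintended"] = new["image"]
    return merged


original = {"cities": [
    {"id": "London", "text": "London is the capital of the UK", "image": "BigBen"},
    {"id": "Paris", "text": "Held the Olympics of 2024", "image": "EifelTower"},
]}
updated = {"cities": [
    {"id": "London", "text": "London is the capital and largest city in Great Britain", "image": "BigBen"},
    {"id": "Paris", "text": "Held the Olympics of 2024", "image": "NotreDame"},
    {"id": "Rome", "text": "The Eternal City", "image": "TheColleseum"},
]}
result = merge_cities(original, updated)
```

With this cut-down data, Paris keeps “EifelTower” as its image and gains “NotreDame” as imageintended, London only has its text refreshed, and Rome is appended.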
I’ll use the same request I gave to Codex: “please update the JSON file original_cities.json with the contents of the file updated_cities.json but if the ‘image’ field is different, please update or write a new ‘imageintended’ field with the new value instead”

So let’s see what it does. This task may look specific, but is actually a bit vague, which reflects a request from the average human. After getting confused about its project file, it gave me a perfectly good answer: [Image]
Updating text, adding the new entry and not overwriting any values in the “image” key: all done. It didn’t try to fix inconsequential spelling and didn’t get confused by the trailing comma. It was far quicker than Codex as well. I checked the file, and indeed the changes were made. Before it answered, it didn’t quite make a plan, but gave me a fairly basic explanation of what it would do: [Image]
As the outcome was entirely correct, the process didn’t really matter. But only by checking intentions can you really correct LLM “thinking” when it takes the wrong path. I’ll exit to show the final expenditure summary: [Image]

Conclusion

As I said, this isn’t a direct LLM comparison, but Gemini gave me an efficient agentic experience. I’m sure Google can fill in any of the missing quality-of-life features I mentioned (specifically, some running stats on token usage), but it is definitely ready for action right now. There is a growing coterie of agentic terminal applications out there for developers to try, and Gemini CLI is a solid addition to that list.
David has been a London-based professional software developer with Oracle Corp. and British Telecom, and a consultant helping teams work in a more agile fashion. He wrote a book on UI design and has been writing technical articles ever since....
TNS owner Insight Partners is an investor in: OpenAI.