Building a web-based personal knowledge management tool sounded like the perfect test for today’s most capable coding agents. It was not just another landing page or basic dashboard.

I wanted a fully functional app that could handle notes, tags, markdown previews, local storage, search, and visual relationships between connected ideas.

I wanted to see how Claude Code, Codex, and Google Antigravity handled the project beyond simply generating code. Could they plan the architecture, make sensible product decisions, identify problems early, and keep the entire build smooth? Let’s find out.

Setting the rules

Learn about the prompt

To keep the comparison fair, I used the strongest available models in each tool: Gemini 3.5 in Antigravity, Opus 4.8 in Claude Code, and GPT-5.5 with High reasoning in Codex. Each agent received exactly the same prompt, with no extra hints, follow-up guidance, or adjusted requirements.

The task was ambitious. Rather than generating a simple prototype, each agent had to build a complete web-based personal knowledge management application.

The app needed a notes system, organization tools, note linking, and a visual knowledge graph to show relationships between ideas. I also required task management and a quick-capture feature for saving thoughts efficiently.

The prompt also included engineering requirements covering architecture, data handling, responsiveness, and reliability. Testing was also mandatory.
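To make the note-linking and knowledge-graph requirement more concrete, here is a minimal TypeScript sketch of how linked notes could be turned into graph edges. It is not taken from any of the three agents' output. The `Note` and `Edge` shapes, the `[[Title]]` wiki-link syntax, and the function names `extractLinkTitles` and `buildEdges` are all assumptions made for illustration.

```typescript
// Hypothetical note shape; the article does not show any agent's real data model.
interface Note {
  id: string;
  title: string;
  body: string;
}

// One directed connection in the knowledge graph.
interface Edge {
  source: string; // id of the note that contains the link
  target: string; // id of the note being linked to
}

// Collect the titles referenced with an assumed [[Title]] wiki-link syntax.
function extractLinkTitles(body: string): string[] {
  const pattern = /\[\[([^\[\]]+)\]\]/g;
  const titles: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(body);
  while (match !== null) {
    const raw = match[1];
    if (raw !== undefined) {
      titles.push(raw.trim());
    }
    match = pattern.exec(body);
  }
  return titles;
}

// Build the edge list, skipping links to missing notes, self-links, and duplicates.
function buildEdges(notes: Note[]): Edge[] {
  // Prototype-free objects so titles like "constructor" cannot collide with built-ins.
  const idByTitle: Record<string, string> = Object.create(null);
  notes.forEach((note) => {
    idByTitle[note.title.toLowerCase()] = note.id;
  });

  const seen: Record<string, boolean> = Object.create(null);
  const edges: Edge[] = [];
  notes.forEach((note) => {
    extractLinkTitles(note.body).forEach((title) => {
      const target = idByTitle[title.toLowerCase()];
      if (target === undefined || target === note.id) {
        return; // unknown note or self-reference
      }
      const key = note.id + "->" + target;
      if (!seen[key]) {
        seen[key] = true;
        edges.push({ source: note.id, target: target });
      }
    });
  });
  return edges;
}

// Example: two notes that reference each other, plus one dangling link.
const sample: Note[] = [
  { id: "n1", title: "Zettelkasten", body: "Related to [[Backlinks]] and [[Missing note]]." },
  { id: "n2", title: "Backlinks", body: "See [[zettelkasten]] and again [[Zettelkasten]]." },
];
const edges: Edge[] = buildEdges(sample);
// edges holds n1 -> n2 and n2 -> n1; the dangling and repeated links are ignored.
```

Matching titles case-insensitively and dropping dangling links are design choices of this sketch. A real app might instead offer to create the missing note, and it would pass the resulting edge list to whatever graph renderer it uses.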

Codex

Feature-packed but with a below-average UI

Codex did not come close to meeting my expectations. At first glance, the app looked polished, and the overall design gave me a good first impression. Unfortunately, that feeling disappeared as soon as I started using it.

I could create notes, but there was no proper way to open and view them in a full-screen workspace. Several buttons overlapped with other interface elements, and the layout often felt crowded and poorly thought out.

The biggest problem was that Codex tried to do too many things at once. Instead of prioritizing the core note-taking experience, it packed the interface with features without making sure the basic workflows felt smooth.

The iconography also felt basic. Several menus used generic icons based on the first word of the section name, which made the interface look unfinished rather than thoughtfully designed.

I was also surprised by how hot Codex made my Mac run. While it was generating code, I could feel noticeable warmth around the bottom of the machine. Claude Code and Antigravity didn’t produce the same effect during my testing.

Claude Code

A robust attempt, but a few misses

Claude Code was right up there with Google Antigravity in terms of features. It understood the brief well and delivered nearly everything I expected from a web-based PKM tool. Notes, organization, linking, tags, tasks, and the knowledge graph were all present, and the overall experience felt far smoother than what Codex produced.

However, the interface could have made much better use of the available space. Some areas felt unnecessarily compressed, while others had too much empty room.

The layout was usable, but it did not feel as balanced or efficient as Antigravity’s implementation.

My biggest disappointment was Quick Capture. Instead of opening a small, lightweight input window, it launched a new note in the full editor. That technically allowed me to capture the idea, but it missed the feature's purpose.

Despite that flaw, Claude Code delivered a solid and highly capable result. It came close, but it narrowly missed the crown.

Google Antigravity

Ticks all the right boxes and takes the crown

Google Antigravity was easily the biggest surprise of this comparison. Based on my previous experience with Gemini 3.1, I went in expecting a functional but fairly basic implementation. Gemini 3.5 changed that impression.

The first thing that stood out was the interface. It looked aesthetically pleasing, well-balanced, and far more refined than I expected from a first attempt. Antigravity also made excellent use of icons throughout the app.

The note management experience was strong. I could pin important notes, star them for quick access, and glance over tags without digging through multiple menus. The knowledge graph was easy to find and gave me a clear visual overview of how different notes were connected.

My favorite feature, however, was Quick Notes. Antigravity added a small, window-style menu that let me capture ideas without opening the full editor.

It sounds minor on paper, but it can make a major difference when I am trying to save an idea before it disappears.

There were still a few areas that could have been better. The favicon looked basic and did not match the quality of the main interface. The Tasks section could have offered more functionality and better organization.

Still, these are small niggles compared to everything Antigravity got right. It surely takes the crown for me.

Three agents entered, and one took ownership

A capable coding agent needs to understand the product, prioritize the right features, and make thoughtful design decisions. Codex delivered an attractive interface, but its poor UX, overlapping controls, and unfocused execution made it the weakest result.

Claude Code performed much better and came close to winning, but its inefficient use of space and a Quick Capture implementation that opened the full editor held it back.

Google Antigravity was the clear winner. Gemini 3.5 understood both the technical brief and the app's practical purpose. It wasn’t flawless, but Antigravity nailed the basics, and it was the only tool that behaved like a tech lead.
