This is insane! The new Gemini Flash model released yesterday matches o3's accuracy on browser agent tasks while being 2x faster and 4x cheaper.
I ran evaluations all day and could not believe the results. The previous gemini-2.5-flash scored only 71% on this benchmark.
Introducing two new Gemini 2.5 models (Flash and Flash-Lite) that are more intelligent, cost-effective, and token-efficient. You can keep up with our latest models through `gemini-flash-latest` and `gemini-flash-lite-latest`!
