BrowserCode is incredibly good at long-running tasks
It orders pizza for us
Congrats to the @browser_use team for taking the #1 spot on Odysseys, a highly challenging benchmark for long-horizon web agents:
odysseys-website.pages.dev/leaderboard
Odysseys evaluates realistic, multi-hour web workflows that require sustained planning, memory, reasoning, and verification.
