Google's Gemini 2.5: The AI That Clicks, Scrolls, and Fills Forms for You

Posted on October 08, 2025 at 11:26 PM

Google DeepMind has unveiled Gemini 2.5 Computer Use, an AI model that can browse the web, click buttons, and fill out forms, all from a single text prompt. This marks a step beyond traditional AI chatbots toward more interactive, autonomous online agents.


🧠 What Is Gemini 2.5 Computer Use?

Gemini 2.5 Computer Use is a fine-tuned version of Google’s Gemini 2.5 Pro language model. Unlike its predecessors, this model can interact with websites much as a person would: navigating pages, completing forms, and even solving CAPTCHA challenges. Developers can access it through the Gemini API in Google AI Studio or via Google Cloud’s Vertex AI platform.
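
To make the idea of "interacting with a website" concrete, here is a minimal sketch of the kind of loop such an agent might run: ask a model for one action at a time and apply it to the page until the model says it is finished. This is not the Gemini API. The names Action, FakePage, scripted_model, and run_agent are all hypothetical, and the "model" is a hard-coded plan standing in for a real model call.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step proposed by the model (hypothetical schema)."""
    kind: str          # "click", "type", "scroll", or "done"
    target: str = ""   # element the action applies to
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakePage:
    """Stand-in for a browser page; it only records what was done to it."""
    fields: dict = field(default_factory=dict)
    clicked: list = field(default_factory=list)
    scroll_y: int = 0

    def apply(self, action):
        if action.kind == "click":
            self.clicked.append(action.target)
        elif action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "scroll":
            self.scroll_y += 500
        else:
            raise ValueError(f"unsupported action: {action.kind}")


def scripted_model(prompt, page, step):
    """Placeholder for a real model call: returns a fixed plan, step by step."""
    plan = [
        Action("scroll"),
        Action("type", target="email", text="user@example.com"),
        Action("click", target="submit"),
        Action("done"),
    ]
    return plan[min(step, len(plan) - 1)]


def run_agent(prompt, page, model=scripted_model, max_steps=10):
    """Request one action at a time and apply it until the model reports 'done'."""
    history = []
    for step in range(max_steps):
        action = model(prompt, page, step)
        if action.kind == "done":
            break
        page.apply(action)
        history.append(action.kind)
    return history


page = FakePage()
history = run_agent("Sign up for the newsletter", page)
print(history)
```

In a real setup, the placeholder model would be replaced by a call to the actual service and FakePage by a browser automation layer; the max_steps cap is a common safeguard against an agent that never finishes.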


🧪 Hands-On Experience

In initial tests on Browserbase, Gemini 2.5 Computer Use demonstrated impressive capabilities. It successfully navigated Taylor Swift’s official website and provided a summary of featured products. It also completed a Google Search CAPTCHA (“Select all the boxes with a motorcycle”) in seconds. However, it struggled to complete tasks beyond these, which points to areas for future improvement.


⚖️ Performance Benchmarks

Evaluations conducted via Browserbase and Google’s own testing show that Gemini 2.5 Computer Use outperforms competitors in interface control benchmarks. For instance, it scored 65.7% on the Online-Mind2Web benchmark, ahead of Claude Sonnet 4 at 61.0% and OpenAI’s agent-based model at 44.3% (VentureBeat).


🔍 Limitations and Future Prospects

While Gemini 2.5 Computer Use excels in web interaction, it currently cannot create or edit local files, such as documents or spreadsheets. This limitation sets it apart from OpenAI’s ChatGPT Agent and Anthropic’s Claude, which offer broader functionality (VentureBeat).

Looking ahead, Gemini’s integration into Google’s broader ecosystem, including Chrome and Android, could enhance its capabilities. For example, it could assist in tasks like autofilling forms, managing tabs, or interacting with mobile applications.


📚 Glossary

  • Gemini 2.5 Computer Use: Google’s AI model, fine-tuned from Gemini 2.5 Pro, capable of interacting with web interfaces autonomously.
  • CAPTCHA: A security feature designed to differentiate between human users and bots.
  • Browserbase: A platform that allows users to test and compare AI models’ web interaction capabilities.
  • Vertex AI: Google Cloud’s platform for building and deploying machine learning models.

For a deeper dive into Gemini 2.5 Computer Use, visit the original article on VentureBeat: Google’s AI can now surf the web for you, click on buttons, and fill out forms.