Google launches Gemini 3 with claims of superior reasoning and “Generative UI”


Google has announced Gemini 3 Pro, built on its latest AI model, claiming improved reasoning and a new capability called "generative UI" that creates custom interfaces on demand. The release intensifies competition with OpenAI, which rolled out GPT-5 in August, but independent review of Google's performance claims reveals a more nuanced picture of improvements.

Benchmark performance requires context

Google touts Gemini 3 Pro's 1501 score on LMArena, topping the previous leader, Gemini 2.5 Pro, at 1451. The company highlights breakthrough scores on academic benchmarks including 91.9 percent on GPQA Diamond, which tests graduate-level science knowledge, and 76.2 percent on SWE-bench Verified for code generation.
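To put the 50-point LMArena gap in context, the short sketch below converts a rating difference into an expected head-to-head win rate. It assumes LMArena scores behave like standard Elo ratings (logistic curve, base 10, scale of 400) and ignores ties. The function name `elo_expected_score` is a hypothetical helper written for this illustration, not part of any LMArena tooling.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard Elo logistic curve.

    Assumption: the ratings follow the conventional Elo scale (base 10, scale 400).
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores as reported in Google's announcement
gemini_3_pro = 1501
gemini_2_5_pro = 1451

win_prob = elo_expected_score(gemini_3_pro, gemini_2_5_pro)
print(f"Expected win rate for Gemini 3 Pro: {win_prob:.3f}")
```

Under these assumptions, the 50-point lead works out to an expected preference rate of about 57 percent in direct comparisons: a real but modest edge rather than a runaway lead.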

However, Google's own announcement reveals significant limitations. On SimpleQA Verified, a factual accuracy test, Gemini 3 Pro scored 72.1 percent. That is Google's highest yet, but it still means the model fails nearly three in ten basic knowledge questions. On Humanity's Last Exam, which tests PhD-level reasoning, the model achieved just 37.5 percent without tool use. Google characterizes these as "state-of-the-art" results, but the scores demonstrate that even frontier AI models struggle with complex reasoning and factual reliability.

Generative UI: Innovation or gimmick?

The model's most distinctive feature is generative UI, which Ars Technica describes as creating "custom interfaces—for example, a web app that explores the life and work of Vincent Van Gogh." Google claims the system generates "fully customized interactive responses" including web pages, tools, and applications tailored to user prompts.

This capability launches as two experimental modes in the Gemini app. "Visual layout" creates magazine-style presentations with images and interactive filters, while "dynamic view" generates coded interfaces with sliders and checkboxes. Google is rolling out these features selectively, showing users only one experiment at a time to gather feedback.

Enterprise adoption and competitive positioning

Google Cloud announced immediate availability for Gemini 3 Pro through Vertex AI and Gemini Enterprise, targeting developers and business customers. The company secured endorsements from Box, Cursor, GitHub, JetBrains, Replit, Shopify, and Thomson Reuters, with claims of 35 percent accuracy improvements and 50 percent reductions in tool-calling errors.

CNBC reports that the Gemini app now has 650 million monthly active users, compared to ChatGPT's 700 million weekly users reported by OpenAI in August. The metrics are not directly comparable, since one counts monthly users and the other weekly, but they suggest Google remains in competitive range despite entering the market later.

CEO Sundar Pichai told CNBC that Gemini 3 requires users to do "less prompting" for desired results, and Google claims the model is "trading cliché and flattery for genuine insight."

Agentic development platform debuts

Google simultaneously launched Google Antigravity, an integrated development environment designed for AI agents. Available immediately for Windows, Mac, and Linux, the platform allows developers to monitor multiple AI agents working across editor, terminal, and browser environments. Third-party platforms including Cursor, GitHub, and JetBrains are integrating Gemini 3 Pro.