Google launches Gemini 3.5 Flash with 4x speed edge over rival frontier models

The Flash-tier model beats Gemini 3.1 Pro on agentic benchmarks and ships today across Search, Android Studio, and six named enterprise partners.

Alessandro Benigni

PUBLISHED MAY 20, 2026

3 MIN READ

Follow on Google

MAY 20, 2026

Google launches Gemini 3.5 Flash with 4x speed edge over rival frontier models — featured image for AI Insiders

Google shipped Gemini 3.5 Flash on May 19, making it the default model in the Gemini app and AI Mode in Google Search while simultaneously opening access to enterprise buyers and developers through multiple distribution channels. The release is the opening entry in the Gemini 3.5 family; a Pro variant is in internal testing and is expected to ship within a month.

The headline performance number is speed. According to Google’s announcement on the Google Blog, 3.5 Flash produces output tokens four times faster than other frontier models at comparable capability levels. On the Artificial Analysis intelligence-versus-speed index, the model places in the top-right quadrant, a position Google has historically reserved for its Pro-tier releases. The company says it scores 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas, and 84.2% on CharXiv Reasoning for multimodal understanding, all exceeding Gemini 3.1 Pro on the same benchmarks. Those benchmarks are Google’s own or community benchmarks Google selected; the announcement does not include independent third-party evaluation results.

The distribution footprint is wider than any previous Gemini release at launch. Developers can access the model through the Gemini API in Google AI Studio and Android Studio, through the Antigravity agent platform (Google’s agent-first development environment), and through Gemini Enterprise Agent Platform and Gemini Enterprise for corporate buyers. This simultaneous rollout across consumer, developer, and enterprise channels contrasts with the typical pattern of Google’s previous Flash releases, which launched to developers first and reached consumer surfaces weeks later.

Six enterprise partners are cited by name in the announcement: Shopify (parallel subagent data analysis for merchant forecasting), Macquarie Bank (document reasoning for customer onboarding), Salesforce (Agentforce multi-agent task automation), Ramp (invoice OCR through multimodal reasoning), Xero (autonomous 1099 workflow management for small businesses), and Databricks (real-time dataset diagnostics). The specificity of these case studies is notable because Google usually leads I/O model announcements with benchmark tables and follows with partner logos weeks later. Including named production deployments at launch signals a faster enterprise adoption cycle than the Gemini 2.x rollout managed.

The agentic framing is deliberate. Google positions 3.5 Flash explicitly against long-horizon tasks, not conversational use. The Antigravity harness, an orchestration layer for deploying collaborative subagents, ships alongside the model. Google’s blog post describes one demo where two agents synthesized the AlphaZero paper and wrote a fully playable game implementation in six hours. That is a company claim, not an audited result, but the specificity of the timeline is a measurable commitment developers will be able to verify.

The personal agent product, Gemini Spark, also runs on 3.5 Flash. Spark is described as a 24/7 autonomous agent that takes action on the user’s behalf under their direction. It is entering limited trusted-tester rollout now, with a beta planned for Google AI Ultra subscribers in the US within a week. Spark represents Google’s clearest move yet into the ambient, always-on personal assistant category that both OpenAI’s memory layer and Anthropic’s Projects feature are competing for.

On safety, Google says 3.5 was developed under its Frontier Safety Framework and uses interpretability tools to inspect the model’s reasoning before generating responses. The announcement does not disclose the scope or methodology of these evaluations beyond the policy framework reference.

The cost picture is incomplete. Google claims 3.5 Flash often runs at less than half the cost of competing frontier models, but API pricing was not included in the announcement post. Teams evaluating whether to migrate workloads from GPT-4.1 or Claude Sonnet will need to pull the current Gemini API pricing page to model actual cost comparisons before the 3.5 Pro release shifts the competitive baseline again.

Source: Google Blog, published May 19, 2026, authored by Koray Kavukcuoglu.

Google launches Gemini 3.5 Flash with 4x speed edge over rival frontier models

The morning brief for people inside the AI industry.

More in Models

Anthropic Finds a Workspace for Deliberate Thought in Claude

Broadcom Locks In Apple Silicon Deal Through 2031

Tencent Ships Hy3, a 295B Open Model, Free Through July 21