Google's Gemini 3.5 Pro Is Coming in June — 2 Million Token Context and Deep Think Reasoning
Google confirmed Gemini 3.5 Pro for a June 2026 launch with a 2-million-token context window, Deep Think extended reasoning mode, and pricing of $15/$60 per million input/output tokens. It directly targets enterprise agentic workloads where Gemini 3.5 Flash regressed.
Google’s Gemini 3.5 Pro is imminent. Confirmed for a June 2026 launch on June 6, the model arrives with specs that position it as a direct answer to OpenAI’s o3-series and Anthropic’s extended thinking models.
The headline: a 2-million-token context window — the longest of any commercially available frontier model at launch — and a “Deep Think” extended reasoning mode that enables chain-of-thought inference for complex, multi-step problems. Deep Think is Google’s equivalent to OpenAI’s thinking parameter and Anthropic’s extended thinking: it trades latency for accuracy on tasks where the model needs to reason through intermediate steps before committing to an answer.
Pricing is confirmed at approximately $15 per million input tokens and $60 per million output tokens. That’s at the premium end of the market — comparable to Claude Opus 4.8 and OpenAI’s o3. The 2-million-token context window is the key differentiator. Processing an entire large codebase, ingesting a full book, or analyzing months of document history in a single call is achievable within that limit.
Context matters in relation to Gemini 3.5 Flash, which went generally available on May 19 at $1.50 input/$9 output per million tokens. Flash is optimized for speed and cost. Pro fills the accuracy gap. Flash suffered a documented regression on multi-step reasoning benchmarks in its GA release — Deep Think in Pro is explicitly built to address that regression.
The target use case is enterprise agentic workloads: long-running AI agents that maintain state across large context windows, reason through complex business logic, and make tool calls over extended sequences. The 2M token window matters specifically here because agents don’t just process single documents — they accumulate context across many steps and multiple tool-call rounds.
No firm date beyond “June 2026” has been given. Polymarket prediction markets placed a late-June release above 70% probability as of June 6. Google typically releases flagship Gemini updates on a cadence tied to developer conferences and quarterly product cycles; Flash launched in May, making a Pro follow-up in June consistent with that pattern.
The pricing gap between Flash ($1.50/$9) and Pro ($15/$60) is a 10x multiple. Whether Deep Think reasoning quality justifies that premium depends entirely on the workload. For one-shot tasks and high-volume inference, Flash remains the practical choice. For agentic pipelines where the model calls tools hundreds of times per session and accuracy compounds, the Pro tier starts to make economic sense — particularly if it closes the regression Flash introduced on multi-step benchmarks.
Gemini 3.5 Pro is the model the enterprise AI market has been waiting for since Flash launched without the reasoning depth that complex agent workflows need.