---
title: "How Gemini Works: Deep Research, Gems, Canvas, Imagen, Veo, and Live"
description: "Every Gemini feature in depth: Deep Research and Deep Research Max, Gems, Canvas, Audio Overviews, NotebookLM, Workspace integration, Imagen 4, Veo 3.1, Live, Project Astra, Computer Use, and the tier-to-model transparency gap."
url: "https://suprmind.ai/hub/gemini/features/"
published: "2026-05-12T00:10:29+00:00"
modified: "2026-05-12T02:41:34+00:00"
type: page
schema: WebPage
language: en-US
site_name: Suprmind
---

# How Gemini Works: Deep Research, Gems, Canvas, Imagen, Veo, and Live

> Every Gemini feature in depth: Deep Research and Deep Research Max, Gems, Canvas, Audio Overviews, NotebookLM, Workspace integration, Imagen 4, Veo 3.1, Live, Project Astra, Computer Use, and the tier-to-model transparency gap.


Gemini ships more than a dozen distinct user-facing features across five categories: research and reasoning (Deep Research, Deep Research Max), customization (Gems, Canvas), conversational and audio interfaces (Audio Overviews, NotebookLM, Live, Project Astra), workspace integration (Gmail, Docs, Sheets, Slides, Meet), and media generation (Imagen 4, Veo 3.1).

This guide covers what each feature actually does, how it works mechanically, when to use it, when not to, and the documented limitations and transparency gaps. For tier requirements, see the [Gemini Pricing Guide](/hub?page_id=5206). For comparisons against Claude, ChatGPT, Grok, and Perplexity equivalents, see [Gemini vs Other AI Models](/hub/gemini/vs-other-ai/).

Last verified May 10, 2026. Next refresh due August 10, 2026.

## See How Gemini Works with Four Other Frontier AI Models in a Multi-AI Orchestrated Business Discussion









Deep Research and Deep Research Max

## How multi-step research works at the agentic layer.

Deep Research is the feature that turns Gemini from a chat model into a research agent. Activated through a UI toggle in the Gemini app or via the Deep Research model selection in the model picker, it fires an iterative retrieval-augmented-generation loop. The agent decomposes the query into sub-topics, browses up to hundreds of websites iteratively (plus the user’s Gmail, Drive, and Chat if permitted), follows fresh links, summarizes findings in an internal scratchpad, and synthesizes the result into a multi-page cited report.

The output is a structured research document with numbered source citations. Reports can be converted to Audio Overview format (two-host podcast-style audio), to Canvas for further editing, to interactive exploration formats, or to quizzes for retention testing. The conversion options sit at the top of the report when generation completes.

Deep Research Max launched 2026-04-20 as the higher-tier variant. It runs longer iterations, traverses deeper through linked sources, and adds Model Context Protocol (MCP) server integration plus native visualizations to the synthesis stage. The API exposes two model variants as of 2026-04-21: `deep-research-preview-04-2026` for speed and streaming, and `deep-research-max-preview-04-2026` for maximum comprehensiveness at higher cost.
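A minimal sketch of selecting between the two published model IDs. Only the model IDs come from the documentation above; the request body shape is an assumption modeled on the standard Gemini `generateContent` surface, not a documented Deep Research payload:

```python
# Pick a Deep Research model ID by priority, then build a generateContent-style
# request body. Only the two model IDs are taken from the published variant list;
# the rest of the shape is an illustrative assumption.
DEEP_RESEARCH_MODELS = {
    "speed": "deep-research-preview-04-2026",
    "comprehensive": "deep-research-max-preview-04-2026",
}

def build_research_request(query: str, priority: str = "speed") -> dict:
    """Return a request body targeting the chosen Deep Research variant."""
    if priority not in DEEP_RESEARCH_MODELS:
        raise ValueError(f"priority must be one of {sorted(DEEP_RESEARCH_MODELS)}")
    return {
        "model": DEEP_RESEARCH_MODELS[priority],
        "contents": [{"role": "user", "parts": [{"text": query}]}],
    }
```

The point of the indirection is cost control: the Max variant trades latency and price for comprehensiveness, so the priority flag should be an explicit caller decision rather than a default.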

### Tier Availability

- **Free tier:** 5 reports per month.
- **Google AI Plus:** more access (exact number not disclosed).
- **Google AI Pro:** 5x more Audio Overviews than Free, implying a higher Deep Research quota.
- **Google AI Ultra:** highest limits, plus the visual exploration output that lower paid tiers do not get.
- **API:** paid tier with model-specific pricing.

### Documented Limitations

Source quality varies. Deep Research surfaces blogs alongside peer-reviewed sources, marketing pages alongside primary government documents. The synthesis layer cites accessed URLs but does not independently verify whether the claims at those URLs are accurate. The user-side verification load is real: the report contains citations that the user must validate against the original sources before relying on the conclusions for any high-stakes decision.

The hard limits: maximum sources browsed is “up to hundreds” per Google’s official language with no specific cap published. The API file size limit is 100 MB (increased from 20 MB on 2026-01-08). The Free tier cap of 5 reports per month is the firmest published constraint.





Gems

## Custom AI personas with the four-field construction model.

Gems are customizable Gemini chat instances built through the Gem Builder. The construction model defines four fields: Persona (the role the Gem plays), Task (what the Gem should do), Context (how the Gem performs the task), and Format (how the output should be presented). Up to 10 reference files can be attached to each Gem and used across all interactions.
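The four-field construction model maps naturally onto a small spec object. A sketch assuming nothing beyond the four fields and the documented 10-file cap; the assembled instruction text is illustrative, not the Gem Builder's actual serialization:

```python
from dataclasses import dataclass, field

MAX_REFERENCE_FILES = 10  # documented per-Gem attachment cap

@dataclass
class GemSpec:
    persona: str   # the role the Gem plays
    task: str      # what the Gem should do
    context: str   # how the Gem performs the task
    format: str    # how the output should be presented
    reference_files: list = field(default_factory=list)

    def validate(self) -> None:
        if len(self.reference_files) > MAX_REFERENCE_FILES:
            raise ValueError(f"Gems accept at most {MAX_REFERENCE_FILES} reference files")

    def instructions(self) -> str:
        """Flatten the four fields into one instruction block (illustrative)."""
        return "\n".join([
            f"Persona: {self.persona}",
            f"Task: {self.task}",
            f"Context: {self.context}",
            f"Format: {self.format}",
        ])
```

Treating the four fields as a typed spec makes the 10-file limit a validation error at build time instead of a silent truncation at use time.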

Gems persist across sessions and retain their configured instructions. A user can create a Gem for “weekly Python code reviewer” with attached coding standards documents, a Gem for “meal planner with my dietary restrictions” with attached preferences, and a Gem for “writing coach in my style” with attached samples. Each Gem operates in its own conversation namespace.

Google also provides pre-built Gems in the Gems Manager. The pre-built set covers common use cases (writing coach, code helper, brainstorm partner). The functional comparison: Gems are Google’s equivalent of GPT Custom GPTs, with comparable construction patterns and a 10-file reference attachment limit.

### Tier Availability and Workspace Integration

Available on Free tier with limits. Full Gem creation is confirmed for paid tiers, though specific per-day or per-month creation limits are not publicly enumerated. Gems can be integrated into Google Workspace apps including Gmail, Docs, and Drive, surfacing inside those apps as configured assistants rather than only inside the Gemini chat interface.

The hard limit worth noting: the 10-file reference attachment cap means workflows that depend on a larger reference corpus cannot use Gems alone. For corpus sizes above 10 files, NotebookLM is the firmer fit since it accepts larger source sets and grounds responses in the source material rather than parametric knowledge.





Canvas

## Side-by-side workspace with the targeted-edit pattern.

Canvas opens a split-panel interface inside the Gemini app. The chat sits on the left, and the document, code, slides, or app prototype sits on the right. Users can type directly in the Canvas panel or issue edit instructions through the prompt box. Changes auto-save. The panel supports documents, code, web apps, slides, and code prototypes.

The targeted-edit pattern is the differentiator. Users can select a section of text or code in the Canvas panel and prompt Gemini to revise that specific section. The model reads the selection plus the surrounding context and proposes an edit without regenerating the entire document. The pattern is comparable in function to Claude’s Artifacts feature.
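Mechanically, the targeted-edit pattern reduces to span replacement over the document buffer: only the selection is rewritten, everything else is carried through verbatim. A minimal sketch of the mechanic (the function name and the selection representation are illustrative, not Canvas internals):

```python
def targeted_edit(document: str, selection: tuple, revised: str) -> str:
    """Replace only the selected [start, end) span, leaving the rest untouched."""
    start, end = selection
    if not (0 <= start <= end <= len(document)):
        raise ValueError("selection out of range")
    return document[:start] + revised + document[end:]
```

In Canvas the `revised` text is produced by the model from the selection plus surrounding context; the splice itself is what prevents unrelated sections from drifting between edits.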

Canvas output formats supported include Audio Overview (the document becomes a two-host audio summary), quiz, infographic, flashcards, and web app. The format conversion runs through the format selector at the top of the Canvas panel.

### Tier Availability

Basic Canvas (documents, code) is available to Free users. Visual and interactive report output from Deep Research into Canvas is Ultra-only as confirmed by independent third-party reporting from late 2025. Workspace Enterprise Business edition has a Canvas feature toggle in the enterprise admin interface, allowing organization-level Canvas enablement for business users.

The hard limits: Pro tier subscription marketing references up to 1,500 pages of file uploads and up to 30,000 lines of code. App and web-app generation in Canvas relies on the underlying Gemini model’s context limits rather than separately enumerated Canvas-specific caps.





Audio Overviews and NotebookLM

## Two-host audio synthesis integrated into the consumer app.

Audio Overviews convert source documents, slides, and Deep Research reports into podcast-style discussions between two AI hosts. The two-host dialogue pattern was pioneered by NotebookLM, the standalone notebook-first product, and integrated into the Gemini consumer app on 2025-03-17.

In the Gemini app, Audio Overview generation is tied to the Deep Research model selection: a Deep Research report can be converted to Audio Overview format from within the result view. The audio runs in the background, allowing concurrent work in the chat interface during generation. In NotebookLM, Audio Overview generation runs per notebook through the Studio panel, with one audio overview per notebook.

NotebookLM Plus is the paid NotebookLM tier with higher source counts per notebook, longer audio output, and customization controls. NotebookLM Enterprise is the Workspace tier with API access via the `notebooks.audioOverviews.create` method, integrated into Workspace identity and access controls.

### Tier Availability

- **Free:** NotebookLM access included with platform limits.
- **Google AI Plus:** more Audio Overviews and notebooks.
- **Google AI Pro:** 5x more Audio Overviews than Free plus expanded notebook limits.
- **Google AI Ultra:** highest limits and best model capabilities.

The hard limit: one audio overview per notebook through the API. Specific notebook count limits and source-per-notebook caps are not publicly enumerated for consumer tiers in available documentation.





Workspace Integration

## Gmail, Docs, Sheets, Slides, Meet. The integration depth is the moat.

Gemini in Workspace surfaces as a side panel or inline assistant inside Google Workspace applications. The integration depth differs across applications.

#### Gmail

Drafting full replies from short bullet points, summarizing long threads, suggesting calendar invites from email content, Smart Compose extension.

#### Docs

Writing assistance, paragraph rewriting, tone adjustment, format restructuring, section generation from prompts.

#### Sheets

Formula generation from natural language descriptions, data analysis suggestions, chart recommendations.

#### Meet

Meeting note generation, action item extraction, post-meeting summary delivery.

#### Slides and Vids

Slide generation from outlines, slide rewriting from feedback, image generation through Imagen integration. Vids: AI video creation from prompts and assets.

### Tier Availability

Free tier: Gemini in Gmail only as a basic side panel feature, plus Gemini app chat access. The deep Workspace integration across all five applications requires either Google AI Plus (Gmail, Vids, and more), Google AI Pro (Gmail, Docs, Vids, and more), or Google AI Ultra (highest limits across all apps). The Workspace Business plans bundle the integration with graduated feature access by plan tier.

The integration depth is structurally hard to replicate elsewhere. For organizations already standardized on Google Workspace, the in-app integration creates real switching cost relative to a stand-alone external chat interface. The relevant procurement question is rarely “Gemini API cost vs ChatGPT API cost.” It is whether the Workspace integration depth offsets the calibration deficit per the Suprmind Multi-Model Divergence Index, April 2026 Edition.





Imagen 4 – Image Generation

## Three quality tiers in the dedicated API. Nano Banana for native in-chat generation.

Imagen 4 is the dedicated text-to-image API model family with three speed and quality variants: Fast, Standard, and Ultra. Imagen 4 Standard and Ultra reached general availability on 2025-08-14, with Imagen 4 Fast on the same date.

The native image generation variant in the Gemini model itself is separate. Nano Banana (Gemini 2.5 Flash Image) reached general availability on 2025-10-02, allowing image generation and editing in the same model context as text. Nano Banana 2 (Gemini 3.1 Flash Image Preview) launched 2026-02-26. Nano Banana Pro is in preview as of the research date, positioned as state-of-the-art for highly contextual native image creation.

The architectural distinction matters for workflow design. The Imagen 4 family is the dedicated image-only API with per-image pricing. The Nano Banana family is image generation integrated inside the conversational Gemini model, allowing iterative image editing within a chat context. For workflows where the image is the deliverable, Imagen 4 is the firmer path. For workflows where the image accompanies a longer conversational task, Nano Banana fits the integrated context better.

### API Pricing (Imagen 4)

| Variant | Per-Image Cost | Use Case |
| --- | --- | --- |
| Imagen 4 Fast | $0.02 | High-volume exploration |
| Imagen 4 Standard | $0.04 | Default production tier |
| Imagen 4 Ultra | $0.06 | Highest-quality output |
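Those per-image rates translate directly into batch budgets. A small helper using the published prices from the table:

```python
# Published Imagen 4 per-image API rates (USD) from the pricing table above.
IMAGEN4_PER_IMAGE = {"fast": 0.02, "standard": 0.04, "ultra": 0.06}

def imagen_batch_cost(variant: str, n_images: int) -> float:
    """Pure inference cost in USD for a batch at one quality tier."""
    if variant not in IMAGEN4_PER_IMAGE:
        raise ValueError(f"unknown variant: {variant}")
    return round(IMAGEN4_PER_IMAGE[variant] * n_images, 2)
```

For example, 500 exploratory generations on Fast cost $10.00, while the same batch on Ultra would run $30.00, which is why exploration on Fast and final renders on Standard or Ultra is the common split.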

The text rendering quality on Imagen 4 was a specific improvement focus. Independent reporting at the launch period flagged better text rendering and overall image quality up to 2K resolution as the headline change versus prior generations.

#### The 2024 Image Generation Controversy

Worth flagging because it shaped Gemini’s brand reputation. In February 2024, Google paused human image generation after users demonstrated that Gemini was producing historically inaccurate images that predominantly featured people of color regardless of historical context. The examples included Black Founding Fathers and Nazi soldiers of non-European descent. Google SVP Prabhakar Raghavan acknowledged the company “missed the mark.” The feature was paused, recalibrated, and resumed. The controversy remains the most prominent public failure associated with the Gemini brand and is referenced in regulatory filings and academic literature on AI safety calibration.





Veo 3.1 – Video Generation

## Up to 4K with native audio synthesis. Reference images, frame control, portrait orientation.

Veo 3.1 is Google’s current video generation model, available in the Gemini app (consumer) through the Flow filmmaking platform and via the API. The Veo line launched in May 2024 in preview, with Veo 2 reaching GA on 2025-04-09, Veo 3 on 2025-09-09 (the first model to generate synchronized audio natively), and Veo 3.1 in preview from 2025-10-15. Veo 3.1 Lite launched 2026-03-31 as the lower-tier variant.

Veo 3.1 generates video from text prompts or image inputs at up to 4K resolution. The model supports portrait orientation, video extension (extending an existing clip), reference image inputs (up to 3), and first/last frame specification (precise control over the opening and closing shots). The audio synthesis runs natively alongside video, producing dialogue, sound effects, and ambient noise synchronized with the visual track.

### Tier Availability and API Pricing

- **Veo 3.1 (full):** Ultra subscribers (consumer).
- **Veo 3.1 Lite:** AI Plus and Pro tiers (limited access).
- **Free tier:** limited access to Veo 3.1 via Flow.

### API Pricing per Second of Generated Video

| Variant | 720p | 1080p | 4K |
| --- | --- | --- | --- |
| Standard with audio | $0.40 | $0.40 | $0.60 |
| Fast with audio | $0.10 | $0.12 | $0.30 |
| Lite with audio | $0.05 | $0.08 | n/a |
The per-second pricing structure means a 30-second 1080p Veo 3.1 Standard clip costs $12 in pure inference. The Lite variant at 720p is $1.50 for the same duration, the cheapest path for low-resolution exploratory generation.
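The arithmetic above generalizes to a small rate lookup built from the pricing table:

```python
# Per-second Veo 3.1 API rates (USD) from the pricing table above.
# Lite has no published 4K rate, so that key is simply absent.
VEO_RATES = {
    ("standard", "720p"): 0.40, ("standard", "1080p"): 0.40, ("standard", "4k"): 0.60,
    ("fast", "720p"): 0.10, ("fast", "1080p"): 0.12, ("fast", "4k"): 0.30,
    ("lite", "720p"): 0.05, ("lite", "1080p"): 0.08,
}

def veo_clip_cost(variant: str, resolution: str, seconds: int) -> float:
    """Pure inference cost in USD for one generated clip."""
    key = (variant.lower(), resolution.lower())
    if key not in VEO_RATES:
        raise ValueError(f"no published rate for {variant} at {resolution}")
    return round(VEO_RATES[key] * seconds, 2)
```

The lookup reproduces the worked figures in the text: a 30-second 1080p Standard clip is $12.00, and the same duration on Lite at 720p is $1.50.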





Gemini Live and Project Astra

## Real-time voice with low-latency interruption and snapshot-based camera input.

Gemini Live is the real-time voice conversation mode in the Gemini app. The mode supports back-and-forth spoken interaction with low latency, interruption handling (the user can talk over Gemini and the response adapts), context retention across the voice session, and integration with the phone’s camera for visual context during conversation.

Project Astra is the underlying research initiative. It explores breakthrough capabilities for real-time multimodal AI assistance, including spatial processing, screen sharing, and tool use across Google apps. Project Astra is not a standalone shipping product. Its capabilities are progressively incorporated into the Gemini app and the Live mode.

The camera integration runs as snapshot-based capture rather than continuous video stream at the consumer rollout. The user points the phone camera, and Gemini analyzes a snapshot or short sequence. The screen sharing capability allows Gemini to observe what is on the user’s device screen and provide contextual responses. Tool use and Google app integration (Search, Gmail, Calendar, Maps) layer the agentic capability on top of the conversational surface.

### API Model and Pricing

The current Live API model is `gemini-3.1-flash-live-preview` (launched 2026-03-26).

- **Text input:** $0.75 per million tokens.
- **Audio input:** $3.00 per million tokens.
- **Image and video input:** $0.002 per minute.
- **Text output:** $4.50 per million tokens.
- **Audio output:** $12.00 per million tokens.

The audio output rate is the highest per-token rate in the Gemini API, reflecting the inference cost of voice synthesis at conversational latency. For workflows where high-volume audio output is the deliverable, the per-million-token output rate is the cost driver.
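A session-cost sketch using the published Live API rates; the token and minute counts are inputs you would pull from usage metadata, and the helper itself is illustrative rather than an official billing formula:

```python
# Published gemini-3.1-flash-live-preview rates (USD) from the list above.
LIVE_RATES = {
    "text_in_per_m": 0.75,
    "audio_in_per_m": 3.00,
    "text_out_per_m": 4.50,
    "audio_out_per_m": 12.00,
    "video_in_per_min": 0.002,
}

def live_session_cost(text_in_tokens=0, audio_in_tokens=0,
                      text_out_tokens=0, audio_out_tokens=0,
                      video_minutes=0.0) -> float:
    """Estimate one Live session's cost from its usage counts."""
    m = 1_000_000
    return round(
        text_in_tokens / m * LIVE_RATES["text_in_per_m"]
        + audio_in_tokens / m * LIVE_RATES["audio_in_per_m"]
        + text_out_tokens / m * LIVE_RATES["text_out_per_m"]
        + audio_out_tokens / m * LIVE_RATES["audio_out_per_m"]
        + video_minutes * LIVE_RATES["video_in_per_min"],
        4,
    )
```

Plugging in a million tokens each way of audio makes the cost asymmetry concrete: $3.00 in, $12.00 out, so the output side dominates any voice-heavy workload.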

### Tier Availability

Gemini Live basic: Free tier and above. Project Astra camera and screen-sharing capabilities: originally required paid tier, with broader rollout to Android 10+ devices through 2025. Agentic agent mode (Gemini Agent in Ultra tier): US-only, English-only.





Computer Use and Jules

## Agentic browser control. Asynchronous coding agent.

Computer Use is the model capability that allows Gemini to “see” a digital screen and perform UI actions like clicking, typing, and navigating. It is exposed through the API as a specialized model and as a tool callable from Gemini 3 Pro and Gemini 3 Flash.

The Gemini 2.5 Computer Use Preview launched 2025-10-07. Computer Use was added as a tool to Gemini 3 Pro and Gemini 3 Flash on 2026-01-29. The model receives screen content as input and emits UI actions as output. Workflows can chain perception (read screen) with action (click, type, navigate) to automate browser tasks that previously required manual operation.
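The perception-action chaining can be sketched as a loop. The three callables here (`capture_screen`, `model_propose_action`, `apply_action`) are hypothetical stand-ins for the screen-capture, model-call, and action-execution steps, not Computer Use API names:

```python
def run_agent_loop(goal, capture_screen, model_propose_action, apply_action,
                   max_steps=20):
    """Chain perception (read screen) with action (click/type/navigate).

    Each iteration: capture the current screen, ask the model for the next
    UI action toward the goal, and apply it, until the model signals done.
    """
    for _ in range(max_steps):
        screen = capture_screen()
        action = model_propose_action(goal, screen)
        if action["type"] == "done":
            return action.get("result")
        apply_action(action)
    raise TimeoutError("agent did not finish within max_steps")
```

The `max_steps` ceiling is the important design choice: an agentic browser loop without a step budget can burn output tokens indefinitely on a page it cannot parse.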

Jules is the asynchronous coding agent referenced in the May 2026 subscription page. Jules operates on code repositories and runs in the background, comparable in positioning to coding agent products from other vendors. Jules availability is currently in Beta with English-only and 18+ requirements, plus a capacity caveat that means access is not always guaranteed.

Google Antigravity, referenced in the subscription page, is the agentic development platform separate from core Gemini.

### API Pricing (Computer Use)

`gemini-2.5-computer-use-preview-10-2025`: $1.25 per million input tokens (for inputs ≤200,000 tokens), $10.00 per million output tokens.

### Tier Availability

Computer Use API: paid tier. Jules: Pro tier higher limits, Ultra tier highest limits (Beta with the English-only and 18+ caveats). Gemini Agent mode: US-only, English-only, Ultra tier exclusive.





Tier-to-Model Transparency and Citation Mechanics

## Two cross-feature behaviors that shape every workflow on the platform.

Two cross-feature behaviors warrant separate coverage because they affect every feature in the network.

### Tier-to-Model Transparency

The Gemini app’s model selector shows model names (3.1 Pro, 3 Flash, etc.) in a dropdown when users manually switch. The default model delivered per tier is described in subscription marketing language only: Free gets 3 Flash plus varying access to 3.1 Pro, AI Plus gets enhanced access to 3.1 Pro, AI Pro gets higher access to 3.1 Pro, AI Ultra gets the highest limits. No UI element in the default chat surface displays the exact model ID or version being used for any given query.

The tier-to-model mapping is documented in subscription marketing but not surfaced at inference time. This is a documented user pain point in the developer community (GitHub VS Code issue 283194, 2025-04-21). Developers using the API must specify model IDs explicitly to lock model identity, since the `gemini-pro-latest` and `gemini-flash-latest` aliases were updated in January 2026 to point to Gemini 3 generation models, and Google’s documentation states aliases are periodically hot-swapped with two-week email notice. Single-source confirmation of which model a specific UI query hits is not available to the end user.
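A defensive convention that follows from the alias hot-swapping: reject floating aliases in deployment config and accept only dated IDs. The alias names come from Google's documentation; the guard itself is an illustrative pattern, not an official check:

```python
# Aliases float; dated model IDs do not. Fail loudly if a deployment config
# reintroduces an alias that Google may re-point with two weeks' notice.
FLOATING_ALIASES = {"gemini-pro-latest", "gemini-flash-latest"}

def assert_pinned(model_id: str) -> str:
    """Return the ID unchanged if it is pinned; raise if it is a floating alias."""
    if model_id in FLOATING_ALIASES or model_id.endswith("-latest"):
        raise ValueError(f"{model_id} is a floating alias; pin a dated model ID")
    return model_id
```

Running every configured model ID through a check like this at startup turns a silent cross-generation swap into a deploy-time failure.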

### Citation Mechanics

In the Gemini app, citations appear when Google Search grounding is active. Citations link to web sources. The system does not currently distinguish between claims sourced from the model’s parametric knowledge versus claims grounded via real-time web search in standard consumer output. Users seeing a Gemini response that includes both grounded and parametric content cannot tell which claims have a source backing and which do not without manually checking the citation list against each claim.

In Deep Research, citations are more explicit. Reports include numbered source citations with links to the web pages browsed during the research session. Each numbered citation maps to a specific section of the synthesis. This is the citation pattern most likely to support audit-quality research workflows.

In the API, the Grounding with Google Search tool returns grounding metadata with source URLs. The File Search API (launched 2025-11-06) returns `media_id` and `page_numbers` for visual citations against uploaded documents. Per Suprmind’s AI Hallucination Rates and Benchmarks reference (May 2026 update), Gemini 3 Pro produced erroneous citations on 76% of the Columbia Journalism Review citation hallucination test, a significantly worse rate than Perplexity Sonar Pro at 37% (the best of any model tested). For citation-grounded research workflows where attribution accuracy matters, pair Gemini for breadth with Perplexity for citation grounding.





Document Handling

## Formats, file size limits, and parser fidelity gaps.

Gemini handles document upload and analysis through both the consumer app and the API. The supported format set covers most everyday workflows.

### Supported Formats

-**Text and code:**plain text, Markdown, code files (Python, JavaScript, others), CSV, JSON.
-**Document formats:**PDF (supported as of 2024-08-09), DOCX.
-**Image formats:**PNG, JPEG, WebP.
-**Audio formats:**various standard audio inputs.
-**Video formats:**MP4, MOV, WebM. Gemini 3 generation supports native video understanding with per-minute pricing on the Live API.

The video understanding capability is unique within the Gemini family at the consumer tier. The 1M token context window enables analysis of approximately one hour of video at standard resolution.

### File Size Limits

-**Chat UI:**subscription marketing references up to 1,500 pages of file uploads in Pro tier.
-**API:**100 MB per file (increased from 20 MB on 2026-01-08).
-**Code repository upload:**up to 30,000 lines mentioned in subscription marketing.
-**Cloud Storage bucket URLs:**also supported as of 2026-01-08.

The 100 MB API limit is meaningfully higher than several competitor APIs and supports workflows that require larger document ingestion. Combined with the 1M context window, the practical ceiling for long-document workflows is the published MRCR v2 accuracy curve rather than the file size cap. Plan workflows to keep retrieval and reasoning inside 128k tokens where accuracy is high.
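One way to operationalize the 128k guidance is a pre-flight budget check before submitting a long-document request. The 4-characters-per-token heuristic is a rough assumption for English text, not an API guarantee, and the function name is illustrative:

```python
# Keep retrieval + prompt + reserved output inside the high-accuracy band
# described above. Token estimation uses a rough 4-chars-per-token heuristic.
ACCURACY_BUDGET_TOKENS = 128_000

def within_budget(prompt_chars: int, retrieved_chars: int,
                  reserved_output_tokens: int = 8_000) -> bool:
    """True if the estimated request fits inside the 128k accuracy budget."""
    est_tokens = (prompt_chars + retrieved_chars) // 4 + reserved_output_tokens
    return est_tokens <= ACCURACY_BUDGET_TOKENS
```

When the check fails, the workflow should chunk or re-rank retrieved material rather than lean on the full 1M window, since the file size cap is rarely the binding constraint.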

### Parser Fidelity

PDF parsing is confirmed for both chat UI and API. The multimodal embedding model `gemini-embedding-2` (GA 2026-04-22) added PDF as a native input type, allowing PDF content to be embedded for retrieval without intermediate text extraction. What is not formally documented in available sources: DOCX table extraction fidelity, embedded image extraction from documents, footnote handling, and OCR behavior on scanned PDFs. If your workflow depends on these specifics, test empirically rather than relying on documentation.





Feature Availability Matrix

## Every feature, every tier, at a glance.

Tier availability for several features is not enumerated in official Google docs as of May 2026. Treat tier-specific limits as Volatile and verify at gemini.google.com/subscriptions before relying on the cap for production planning.

| Feature | Free | AI Plus | AI Pro | AI Ultra | API |
| --- | --- | --- | --- | --- | --- |
| Deep Research | 5/month | More | 5x Free | Highest + visual | Yes (preview) |
| Deep Research Max | No | Limited | Limited | Yes | Yes |
| Gems | Limited | Yes | Full | Full | Custom |
| Canvas | Basic | Basic | Full | Full + visual | n/a |
| Audio Overviews | Limited | More | 5x Free | Highest | NotebookLM API |
| NotebookLM | Yes | More | More | Highest | Workspace API |
| Workspace integration | Gmail only | Gmail, Vids | All apps | Highest | Bundle |
| Imagen 4 | Limited | Nano Banana Pro | Nano Banana Pro | Full + highest | Per-image |
| Veo 3.1 | Via Flow | Lite | Lite | Full | Per-second |
| Gemini Live | Basic | Yes | Yes | Highest | Live API |
| Project Astra | Limited | Camera/screen | Camera/screen | Full agentic | n/a |
| Computer Use | No | No | Limited | Agent (US) | Yes (paid) |
| Jules (coding) | No | No | Higher | Highest (Beta) | Beta |





FAQ

## Gemini Features: Frequently Asked Questions

### What is Gemini Deep Research?



Deep Research is an agentic feature in Gemini that autonomously browses up to hundreds of websites, plus a user’s Gmail, Drive, and Chat if permitted, then synthesizes findings into a multi-page cited report. Mechanically, it runs an iterative search-read-synthesize loop powered by Gemini 3.1 Pro. Deep Research Max (launched 2026-04-20) adds MCP support and native visualizations for long-horizon professional research tasks.

### What are Gems in Gemini?



Gems are customizable AI personas within the Gemini consumer application. Users configure a Gem with a name, behavioral instructions, a specific role, and up to 10 reference files. Gems persist across sessions and retain their configured instructions. They are comparable in function to GPT Custom GPTs on the ChatGPT platform. Gem creation is available starting from the Free tier with full creation on paid tiers.

### How does Gemini Canvas work?



Canvas is a side-by-side workspace within Gemini where the model generates and iteratively edits formatted documents, code, or structured outputs in a separate panel from the chat interface. The user can request revisions targeting specific sections without regenerating the full document. Canvas is comparable in function to Claude’s Artifacts feature. Available on Free tier (basic) with full visual output on Pro and Ultra.

### What is Gemini Live?



Gemini Live is a real-time voice conversation mode in the Gemini app that enables back-and-forth spoken interaction with low latency. It allows interruption, context retention across the voice session, and integration with the phone’s camera (visual context during conversation). It is available on Android and iOS. Project Astra is the research initiative underlying Live’s multimodal real-time capabilities.

### Can Gemini analyze videos?



Yes. Gemini 3.1 Pro and the Gemini 2.5+ generation support native video understanding. The model processes video frames and audio tracks as input and can answer questions about video content, summarize footage, and identify elements within clips. The 1M token context window enables analysis of approximately one hour of video at standard resolution.

### Does Gemini generate images?



Yes. Gemini’s image generation capability uses the Imagen 4 family of models (Fast, Standard, Ultra) and the native Nano Banana variant integrated into the Gemini model itself. The API offers pay-per-image pricing: Fast at $0.02, Standard at $0.04, Ultra at $0.06. Consumer app image generation is available on Free tier (limited) and expanded on Pro and Ultra tiers.

### Does Gemini generate videos?



Yes. Veo 3.1 is the current video generation model, available through the Flow filmmaking platform in the Gemini app and via API. Veo 3.1 generates video at up to 4K with native audio synthesis. Tier availability: Ultra subscribers get full Veo 3.1, Plus and Pro tiers get Veo 3.1 Lite, Free tier gets limited access via Flow. API per-second pricing ranges from $0.05 (Lite 720p) to $0.60 (Standard 4K).

### What is Project Astra?



Project Astra is Google DeepMind’s research prototype for a universal AI assistant with real-time multimodal understanding. It demonstrated real-time camera-to-speech understanding at Google I/O 2024 and serves as the research foundation for Gemini Live’s real-time capabilities. Project Astra is not a separate shipping product. Its capabilities are progressively incorporated into the Gemini app.

### Can Gemini control my computer or browser?



Yes, through the Computer Use capability. Gemini can “see” a digital screen and perform UI actions like clicking, typing, and navigating to automate browser tasks. Available through the API (paid tier) as a specialized model and as a tool on Gemini 3 Pro and Gemini 3 Flash. Gemini Agent mode for full agentic browsing is currently US-only and English-only on the Ultra tier.

### How accurate are Gemini’s citations in Deep Research?



Per Suprmind’s AI Hallucination Rates and Benchmarks reference (May 2026 update), Gemini 3 Pro produced erroneous citations on 76% of the Columbia Journalism Review citation hallucination test. Citations are generated and link to real sources, but the claimed information often does not match the source content. Lower is better on this test: Gemini’s 76% beats Grok-3 (94%) but trails Perplexity Sonar Pro (37%, the best of any model tested). For citation-grounded research where attribution accuracy is the audit point, pair Gemini for breadth with Perplexity for citation validation.





## Gemini’s features are deep. Suprmind orchestrates five model families.

Use Gemini for multimodal breadth and Workspace integration. Pair with Claude for calibration, Perplexity for citation accuracy, GPT for math reasoning, and Grok for contrarian signal. All in one shared conversation, with cross-model fact-checking before any answer reaches your decision.

 [Start Your Free Trial](/signup/spark)

 [See How Suprmind Works](/hub?page_id=2571)


7-day free trial. All five frontier models. No credit card required.





Disagreement is the feature.


---

*Source: [https://suprmind.ai/hub/gemini/features/](https://suprmind.ai/hub/gemini/features/)*
*Generated by FAII AI Tracker v3.3.0*