Google develops Gemini models and integrates AI across Search, Workspace, and Cloud, with leadership in multimodal AI and research via DeepMind.
Region
United States
Updated
May 14, 2026
Product coverage
Products from this provider
Model coverage
Models from this provider
Gemini 1.0
Gemini 1.0 Pro
Gemini 1.0 Pro is a multimodal AI model from Google, capable of understanding text, images, audio, and video and generating text. It is designed for complex reasoning tasks, code generation, and multimodal understanding.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
8.5
Gemini 1.0 Ultra
Gemini 1.0 Ultra is Google's most capable multimodal AI model, designed for highly complex tasks. It demonstrates strong performance in reasoning, coding, and understanding across text, images, audio, and video. The model features a large context window for processing extensive information.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.1
Gemini 1.5
Gemini 1.5 Flash (May '24)
A lightweight, fast, and cost-effective multimodal model from the Gemini 1.5 family. It is optimized for high-volume, low-latency tasks while maintaining strong performance in coding, reasoning, and long-context understanding.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.5
Gemini 1.5 Flash (Sep '24)
Gemini 1.5 Flash is a fast and cost-effective multimodal model optimized for high-volume, low-latency tasks. It features a long context window, making it suitable for processing large amounts of information quickly.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
13.8
Gemini 1.5 Flash-8B
A fast, lightweight multimodal model optimized for low latency and cost-effectiveness, while maintaining support for long-context and multimodal tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
11.1
Gemini 1.5 Pro (May '24)
Gemini 1.5 Pro is a multimodal AI model from Google with a large context window of up to 1 million tokens. It excels at reasoning and coding while maintaining fast response times.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12
Gemini 1.5 Pro (Sep '24)
A highly capable multimodal model from Google, known for its exceptionally long context window (up to 2 million tokens) and strong performance in reasoning, coding, and understanding complex information across text, images, and audio.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
16
Gemini 2.0
Gemini 2.0 Flash (Feb '25)
Gemini 2.0 Flash is a lightweight, high-performance model in the Gemini 2.0 family, optimized for fast response times and low cost. It supports multimodal inputs (text, image, video) and is designed for efficient, scalable applications.
Input / 1M tokens
$0.15
Artificial Analysis Intelligence Index
18.5
Gemini 2.0 Flash (experimental)
An experimental, fast-response multimodal model from Google's Gemini 2.0 series. It is optimized for low-latency tasks and supports a wide range of input modalities.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
16.8
Gemini 2.0 Flash Thinking Experimental (Dec '24)
An experimental version of Google's Gemini 2.0 Flash model, enhanced with a 'thinking' mode for deeper reasoning. It combines the speed and efficiency of the Flash series with improved capabilities for complex, multi-step problem-solving.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.3
Gemini 2.0 Flash Thinking Experimental (Jan '25)
An experimental version of Gemini 2.0 Flash optimized for enhanced reasoning and thinking capabilities. It is designed to provide faster responses while maintaining strong performance in complex problem-solving tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
19.6
Gemini 2.0 Flash-Lite (Feb '25)
A fast and lightweight model from the Gemini 2.0 family, optimized for low-latency and high-throughput applications while maintaining multimodal capabilities.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
14.7
Gemini 2.0 Flash-Lite (Preview)
Gemini 2.0 Flash-Lite is a lightweight, fast, and cost-effective model designed for high-throughput tasks. It maintains strong multimodal capabilities while prioritizing speed and low latency, making it suitable for applications requiring rapid responses.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
14.5
Gemini 2.0 Pro Experimental (Feb '25)
Gemini 2.0 Pro Experimental was, at release, Google's most capable multimodal AI model, featuring advanced reasoning, code generation, and long-context understanding. It is designed for complex tasks and high-performance applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
18.1
Gemini 2.5
Gemini 2.5 Flash (Non-reasoning)
Gemini 2.5 Flash is a lightweight, high-speed model from Google's Gemini 2.5 series, optimized for low latency and cost efficiency. This non-reasoning variant is designed for rapid response tasks, offering multimodal capabilities and a large context window without the overhead of extended chain-of-thought processing.
Input / 1M tokens
$0.30
Output tokens/s
191.1
First-token seconds
0.52s
Artificial Analysis Intelligence Index
20.6
Gemini 2.5 Flash (Reasoning)
Gemini 2.5 Flash (Reasoning) is a lightweight, high-speed model from Google's Gemini 2.5 family, optimized for fast inference and strong reasoning capabilities. It is designed to deliver quick and accurate responses for complex tasks while maintaining efficiency.
Input / 1M tokens
$0.30
Output tokens/s
209.34
First-token seconds
11.28s
Artificial Analysis Intelligence Index
27
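The metrics above show the trade-off between the two Gemini 2.5 Flash variants: the reasoning variant scores higher on the intelligence index but takes far longer to produce its first token. A rough end-to-end latency estimate can be sketched from the listed figures (illustrative arithmetic only; real latency varies with load, prompt size, and thinking budget):

```python
# Estimate end-to-end response time from the figures listed above:
# time to first token, plus output tokens divided by streaming rate.

def total_latency_s(first_token_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Approximate wall-clock seconds to receive a full response."""
    return first_token_s + output_tokens / tokens_per_s

# For a hypothetical 500-token response:
non_reasoning = total_latency_s(0.52, 191.1, 500)    # Gemini 2.5 Flash (Non-reasoning)
reasoning = total_latency_s(11.28, 209.34, 500)      # Gemini 2.5 Flash (Reasoning)

print(f"non-reasoning: ~{non_reasoning:.1f}s, reasoning: ~{reasoning:.1f}s")
```

Under these assumptions the non-reasoning variant finishes in roughly 3 seconds versus roughly 14 for the reasoning variant, so the faster variant is usually preferable unless the task needs the extra reasoning quality.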
Gemini 2.5 Flash Preview (Non-reasoning)
Gemini 2.5 Flash is a fast, multimodal model optimized for low latency and cost-efficiency. This non-reasoning variant is designed for rapid response generation across text, image, and code tasks, making it suitable for high-throughput applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
17.8
Gemini 2.5 Flash Preview (Reasoning)
A fast and efficient reasoning model from Google's Gemini family, optimized for tasks requiring deep thought and complex problem-solving while maintaining low latency and cost.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
24.3
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning)
A fast and cost-effective multimodal model from the Gemini 2.5 Flash family, optimized for high-throughput tasks without extended reasoning. It excels at rapid responses across text, image, and audio inputs.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
25.7
Gemini 2.5 Flash Preview (Sep '25) (Reasoning)
A fast, multimodal reasoning model from Google's Gemini 2.5 family, optimized for quick responses and complex problem-solving.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
31.1
Gemini 2.5 Flash-Lite (Non-reasoning)
Gemini 2.5 Flash-Lite is a lightweight, high-speed variant of the Gemini 2.5 Flash model, optimized for low-latency and cost-effective inference. It is designed for straightforward tasks and does not include advanced reasoning or thinking capabilities. As part of the Gemini family, it supports multimodal inputs.
Input / 1M tokens
$0.10
Output tokens/s
213.79
First-token seconds
1.36s
Artificial Analysis Intelligence Index
12.7
Gemini 2.5 Flash-Lite (Reasoning)
A lightweight, cost-effective model from the Gemini family optimized for fast reasoning tasks. It maintains the multimodal capabilities of the Gemini series while prioritizing speed and efficiency.
Input / 1M tokens
$0.10
Output tokens/s
223.94
First-token seconds
19.07s
Artificial Analysis Intelligence Index
17.6
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning)
A lightweight, preview version of the Gemini 2.5 Flash model optimized for speed and cost-efficiency. This non-reasoning variant is designed for fast, low-latency tasks and does not emphasize complex reasoning or chain-of-thought capabilities.
Input / 1M tokens
$0.10
Artificial Analysis Intelligence Index
19.4
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)
A lightweight, fast, and cost-effective preview model from the Gemini 2.5 family, optimized for reasoning tasks. It balances speed and capability, making it suitable for applications requiring quick, logical responses.
Input / 1M tokens
$0.10
Artificial Analysis Intelligence Index
21.6
Gemini 2.5 Pro
Gemini 2.5 Pro is the flagship multimodal AI model of Google's Gemini 2.5 generation, featuring strong reasoning and coding capabilities, support for ultra-long context processing, and optimization for complex tasks.
Input / 1M tokens
$1.25
Output tokens/s
126.5
First-token seconds
17.12s
Artificial Analysis Intelligence Index
34.6
Gemini 2.5 Pro Preview (Mar '25)
An early preview of Gemini 2.5 Pro featuring enhanced reasoning, coding, and long-context capabilities, with notable improvements in complex problem-solving and instruction following over the 2.0 generation.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
30.3
Gemini 2.5 Pro Preview (May '25)
Gemini 2.5 Pro is a highly capable multimodal model from Google, designed for complex reasoning and coding tasks. It features an extended context window and excels at processing and generating content across text, images, and other modalities.
Input / 1M tokens
$1.25
Artificial Analysis Intelligence Index
29.5
Gemini 3
Gemini 3 Deep Think
Gemini 3 Deep Think is a model from Google's Gemini series, optimized for deep reasoning and complex problem-solving tasks. It leverages advanced thinking processes to handle intricate queries and is part of the multimodal Gemini family.
Input / 1M tokens
$0.00
Gemini 3 Flash Preview (Non-reasoning)
Gemini 3 Flash is a fast, multimodal model optimized for low-latency responses and cost-efficiency. This preview version is designed for general-purpose tasks, including coding and long-context understanding, but is not focused on complex reasoning or chain-of-thought tasks.
Input / 1M tokens
$0.50
Output tokens/s
203.01
First-token seconds
0.88s
Artificial Analysis Intelligence Index
35
Gemini 3 Flash Preview (Reasoning)
A preview version of Google's Gemini 3 Flash model optimized for reasoning tasks. It combines the speed of the Flash variant with enhanced logical and analytical capabilities. As part of the Gemini family, it supports multimodal inputs.
Input / 1M tokens
$0.50
Output tokens/s
201.4
First-token seconds
5.68s
Artificial Analysis Intelligence Index
46.4
Gemini 3 Pro Preview (high)
Gemini 3 Pro Preview (high) is a high-performance multimodal model from Google, designed for complex reasoning, code generation, and tasks requiring a long context window. It represents an advanced iteration in the Gemini series, optimized for high-quality outputs.
Input / 1M tokens
$2.00
Output tokens/s
121.99
First-token seconds
61.87s
Artificial Analysis Intelligence Index
48.4
Gemini 3 Pro Preview (low)
A low-latency, cost-optimized variant of the Gemini 3 Pro Preview model. It is designed for fast response times and efficient operation while retaining core multimodal and reasoning capabilities.
Input / 1M tokens
$2.00
Artificial Analysis Intelligence Index
41.3
Gemini 3.1
Gemini 3.1 Flash-Lite Preview
A lightweight, preview version of the Gemini 3.1 Flash model, optimized for low-latency and cost-sensitive applications. It is designed to provide fast responses while maintaining core multimodal capabilities.
Input / 1M tokens
$0.25
Output tokens/s
283.76
First-token seconds
5.09s
Artificial Analysis Intelligence Index
33.5
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is a high-performance, multimodal AI model from Google's Gemini series, designed for advanced reasoning and coding tasks. As a preview release, it offers early access to the latest capabilities in the Gemini 3.1 family.
Input / 1M tokens
$2.00
Output tokens/s
127.78
First-token seconds
23.16s
Artificial Analysis Intelligence Index
57.2
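Given the $2.00 per 1M input tokens rate listed above, prompt cost scales linearly with prompt size. A minimal sketch of the arithmetic (input side only; output tokens are billed at a separate rate not shown here):

```python
# Input-cost arithmetic from the listed rate for Gemini 3.1 Pro Preview.
PRICE_PER_M_INPUT = 2.00  # USD per 1,000,000 input tokens (from the listing)

def input_cost_usd(prompt_tokens: int) -> float:
    """Cost of the prompt alone, at the listed per-million-token rate."""
    return prompt_tokens / 1_000_000 * PRICE_PER_M_INPUT

cost = input_cost_usd(100_000)  # a hypothetical 100k-token prompt
print(f"${cost:.2f}")           # $0.20
```

At this rate a long-context workload (say, 1M input tokens per request) costs $2.00 per request before output charges, which is worth weighing against the cheaper Flash-tier models listed above.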
Gemma 3
Gemma 3 12B Instruct
Gemma 3 12B Instruct is a 12-billion parameter, open-weight model from Google designed for efficient on-device and edge deployment. It features enhanced reasoning and instruction-following capabilities, along with native multimodal support for processing both text and images. The model offers a strong balance of performance, speed, and accessibility for developers.
Input / 1M tokens
$0.09
Output tokens/s
24.14
First-token seconds
1.86s
Artificial Analysis Intelligence Index
8.8
Gemma 3 1B Instruct
Gemma 3 1B Instruct is a lightweight, instruction-tuned language model from Google's Gemma 3 family. Designed for efficiency and speed, it is optimized for on-device and edge deployment scenarios. This model provides a strong balance of performance and low resource consumption for basic conversational and instruction-following tasks.
Input / 1M tokens
$0.00
Output tokens/s
57.46
First-token seconds
0.68s
Artificial Analysis Intelligence Index
5.6
Gemma 3 270M
Gemma 3 270M is a lightweight, small-parameter language model from Google's Gemma family. Designed for efficiency and speed, it is optimized for deployment on resource-constrained devices like mobile phones or edge hardware, offering low-latency and cost-effective inference.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
7.7
Gemma 3 27B Instruct
Gemma 3 27B Instruct is a 27-billion parameter open-weight instruction-tuned model from Google, built on the Gemma 3 architecture. It is designed for strong reasoning and instruction-following capabilities, suitable for a wide range of text generation tasks. The model supports a long context window, making it effective for processing lengthy documents.
Input / 1M tokens
$0.11
Output tokens/s
23.92
First-token seconds
0.76s
Artificial Analysis Intelligence Index
10.3
Gemma 3 4B Instruct
Gemma 3 4B Instruct is a lightweight, efficient instruction-tuned model from Google's Gemma family. It is designed for fast inference and deployment on resource-constrained environments like edge devices or local hardware, while maintaining strong reasoning capabilities for its size.
Input / 1M tokens
$0.04
Output tokens/s
25.66
First-token seconds
1.19s
Artificial Analysis Intelligence Index
6.3
Gemma 3n
Gemma 3n E2B Instruct
Gemma 3n E2B is a lightweight, efficient instruction-tuned model from Google's Gemma family, designed for high performance on edge devices and resource-constrained environments. It balances speed and capability, making it suitable for fast, cost-effective on-device AI applications.
Input / 1M tokens
$0.00
Output tokens/s
58.95
First-token seconds
0.31s
Artificial Analysis Intelligence Index
4.8
Gemma 3n E4B Instruct
Gemma 3n E4B Instruct is a lightweight, efficient instruction-tuned model from Google's Gemma family, optimized for on-device and edge deployment. It delivers strong reasoning and instruction-following capabilities within a compact 4B parameter footprint, making it suitable for fast and cost-effective local inference.
Input / 1M tokens
$0.02
Output tokens/s
36.13
First-token seconds
0.78s
Artificial Analysis Intelligence Index
6.4
Gemma 3n E4B Instruct Preview (May '25)
Gemma 3n E4B Instruct Preview is a 4-billion parameter, instruction-tuned model from Google's Gemma family, optimized for efficient on-device and edge deployment. This preview version focuses on strong instruction following and reasoning capabilities within a compact size.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.1
Gemma 4
Gemma 4 26B A4B (Non-reasoning)
Gemma 4 26B A4B (Non-reasoning) is a 26-billion parameter open model from Google, optimized for general-purpose tasks without specialized reasoning enhancements. It is part of the Gemma family, built on Gemini technology, and designed for efficient performance across a wide range of applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
27.1
Gemma 4 26B A4B (Reasoning)
Gemma 4 26B A4B is a 26-billion parameter reasoning model from Google's Gemma series. It is designed for complex reasoning tasks and supports multimodal inputs, making it suitable for advanced analysis and problem-solving.
Input / 1M tokens
$0.13
Artificial Analysis Intelligence Index
31.2
Gemma 4 31B (Non-reasoning)
Gemma 4 31B (Non-reasoning) is a 31-billion parameter open model from Google, optimized for general-purpose tasks and fast inference. It is part of the Gemma family, derived from Gemini technology, and is designed for efficient performance in coding, instruction following, and multimodal understanding without the overhead of extended reasoning chains.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
32.3
Gemma 4 31B (Reasoning)
Gemma 4 31B (Reasoning) is a reasoning-optimized variant within Google's Gemma model family. It is designed to excel at tasks requiring deep, step-by-step thinking and logical deduction, such as complex problem-solving and analysis.
Input / 1M tokens
$0.00
Output tokens/s
35.1
First-token seconds
0.96s
Artificial Analysis Intelligence Index
39.2
Gemma 4 E2B (Non-reasoning)
Gemma 4 E2B is a lightweight, efficient model from Google's Gemma family, optimized for fast inference on edge devices or in-browser environments. As a non-reasoning variant, it prioritizes speed and low resource consumption for straightforward tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.1
Gemma 4 E2B (Reasoning)
Gemma 4 E2B (Reasoning) is a lightweight, open-weight model from Google optimized for efficient reasoning tasks. It is designed to deliver strong performance on logic, math, and code-related problems while maintaining a small footprint suitable for edge or resource-constrained deployments.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
15.2
Gemma 4 E4B (Non-reasoning)
Gemma 4 E4B (Non-reasoning) is a lightweight, efficient 4B parameter model from Google's Gemma series, optimized for fast inference and low-cost deployment. It is designed for general-purpose tasks where speed and efficiency are prioritized over complex reasoning chains.
Input / 1M tokens
$0.30
Output tokens/s
92.96
First-token seconds
0.44s
Artificial Analysis Intelligence Index
14.8
Gemma 4 E4B (Reasoning)
Gemma 4 E4B (Reasoning) is a lightweight, 4-billion parameter model from Google's Gemma series, optimized for strong reasoning and instruction-following capabilities. It is designed for efficient deployment while maintaining performance on complex tasks, and supports multimodal inputs.
Input / 1M tokens
$0.30
Output tokens/s
89.96
First-token seconds
0.95s
Artificial Analysis Intelligence Index
18.8
PaLM 2
PaLM 2
PaLM 2 is a large language model developed by Google with significantly improved multilingual, math, and coding capabilities compared to its predecessor. It also demonstrates stronger reasoning and faster response times.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
8.6