Google develops Gemini models and integrates AI across Search, Workspace, and Cloud, with leadership in multimodal AI and research via DeepMind.
Region
United States
Updated
May 14, 2026
Product coverage
Products from this provider
Model coverage
Models from this provider
Gemini 1.0
Gemini 1.0 Pro
Gemini 1.0 Pro is a multimodal AI model from Google, capable of understanding text, images, audio, and video and generating text. It is designed for complex reasoning tasks, code generation, and multimodal understanding.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
8.5
Gemini 1.0 Ultra
Gemini 1.0 Ultra is Google's most capable multimodal AI model, designed for highly complex tasks. It demonstrates strong performance in reasoning, coding, and understanding across text, images, audio, and video. The model features a large context window for processing extensive information.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.1
Gemini 1.5
Gemini 1.5 Flash (May '24)
A lightweight, fast, and cost-effective multimodal model from the Gemini 1.5 family. It is optimized for high-volume, low-latency tasks while maintaining strong performance in coding, reasoning, and long-context understanding.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.5
Gemini 1.5 Flash (Sep '24)
Gemini 1.5 Flash is a fast and cost-effective multimodal model optimized for high-volume, low-latency tasks. It features a long context window, making it suitable for processing large amounts of information quickly.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
13.8
Gemini 1.5 Flash-8B
A fast, lightweight multimodal model optimized for low latency and cost-effectiveness, while maintaining support for long-context and multimodal tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
11.1
Gemini 1.5 Pro (May '24)
Gemini 1.5 Pro is a multimodal AI model from Google with a large context window of up to 1 million tokens. It excels at reasoning and coding while maintaining fast response times.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12
Gemini 1.5 Pro (Sep '24)
A highly capable multimodal model from Google, known for its exceptionally long context window (up to 2 million tokens) and strong performance in reasoning, coding, and understanding complex information across text, images, and audio.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
16
Gemini 2.0
Gemini 2.0 Flash (Feb '25)
Gemini 2.0 Flash is a lightweight, high-performance model in the Gemini 2.0 family, optimized for fast response times and low cost. It supports multimodal inputs (text, image, video) and is designed for efficient, scalable applications.
Input / 1M tokens
$0.15
Artificial Analysis Intelligence Index
18.5
Gemini 2.0 Flash (experimental)
An experimental, fast-response multimodal model from Google's Gemini 2.0 series. It is optimized for low-latency tasks and supports a wide range of input modalities.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
16.8
Gemini 2.0 Flash Thinking Experimental (Dec '24)
An experimental version of Google's Gemini 2.0 Flash model, enhanced with a 'thinking' mode for deeper reasoning. It combines the speed and efficiency of the Flash series with improved capabilities for complex, multi-step problem-solving.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.3
Gemini 2.0 Flash Thinking Experimental (Jan '25)
An experimental version of Gemini 2.0 Flash optimized for enhanced reasoning and thinking capabilities. It is designed to provide faster responses while maintaining strong performance in complex problem-solving tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
19.6
Gemini 2.0 Flash-Lite (Feb '25)
A fast and lightweight model from the Gemini 2.0 family, optimized for low-latency and high-throughput applications while maintaining multimodal capabilities.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
14.7
Gemini 2.0 Flash-Lite (Preview)
Gemini 2.0 Flash-Lite is a lightweight, fast, and cost-effective model designed for high-throughput tasks. It maintains strong multimodal capabilities while prioritizing speed and low latency, making it suitable for applications requiring rapid responses.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
14.5
Gemini 2.0 Pro Experimental (Feb '25)
Gemini 2.0 Pro Experimental was, at release, Google's most capable multimodal AI model, featuring advanced reasoning, code generation, and long-context understanding. It is designed for complex tasks and high-performance applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
18.1
Gemini 2.5
Gemini 2.5 Flash (Non-reasoning)
Gemini 2.5 Flash is a lightweight, high-speed model from Google's Gemini 2.5 series, optimized for low latency and cost efficiency. This non-reasoning variant is designed for rapid response tasks, offering multimodal capabilities and a large context window without the overhead of extended chain-of-thought processing.
Input / 1M tokens
$0.30
Output tokens/s
191.1
First-token seconds
0.52s
Artificial Analysis Intelligence Index
20.6
Gemini 2.5 Flash (Reasoning)
Gemini 2.5 Flash (Reasoning) is a lightweight, high-speed model from Google's Gemini 2.5 family, optimized for fast inference and strong reasoning capabilities. It is designed to deliver quick and accurate responses for complex tasks while maintaining efficiency.
Input / 1M tokens
$0.30
Output tokens/s
209.34
First-token seconds
11.28s
Artificial Analysis Intelligence Index
27
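The metrics above show the trade-off between the two Gemini 2.5 Flash variants: the reasoning variant scores higher on the intelligence index but takes far longer to produce its first token. A rough end-to-end latency estimate can be sketched from the listed figures (illustrative arithmetic only; real latency varies with load, prompt size, and thinking budget):

```python
# Estimate end-to-end response time from the figures listed above:
# time to first token, plus output tokens divided by streaming rate.

def total_latency_s(first_token_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Approximate wall-clock seconds to receive a full response."""
    return first_token_s + output_tokens / tokens_per_s

# For a hypothetical 500-token response:
non_reasoning = total_latency_s(0.52, 191.1, 500)    # Gemini 2.5 Flash (Non-reasoning)
reasoning = total_latency_s(11.28, 209.34, 500)      # Gemini 2.5 Flash (Reasoning)

print(f"non-reasoning: ~{non_reasoning:.1f}s, reasoning: ~{reasoning:.1f}s")
```

Under these assumptions the non-reasoning variant finishes in roughly 3 seconds versus roughly 14 for the reasoning variant, so the faster variant is usually preferable unless the task needs the extra reasoning quality.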
Gemini 2.5 Flash Preview (Non-reasoning)
Gemini 2.5 Flash is a fast, multimodal model optimized for low latency and cost-efficiency. This non-reasoning variant is designed for rapid response generation across text, image, and code tasks, making it suitable for high-throughput applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
17.8
Gemini 2.5 Flash Preview (Reasoning)
A fast and efficient reasoning model from Google's Gemini family, optimized for tasks requiring deep thought and complex problem-solving while maintaining low latency and cost.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
24.3
Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning)
A fast and cost-effective multimodal model from the Gemini 2.5 Flash family, optimized for high-throughput tasks without extended reasoning. It excels at rapid responses across text, image, and audio inputs.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
25.7
Gemini 2.5 Flash Preview (Sep '25) (Reasoning)
A fast, multimodal reasoning model from Google's Gemini 2.5 family, optimized for quick responses and complex problem-solving.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
31.1
Gemini 2.5 Flash-Lite (Non-reasoning)
Gemini 2.5 Flash-Lite is a lightweight, high-speed variant of the Gemini 2.5 Flash model, optimized for low-latency and cost-effective inference. It is designed for straightforward tasks and does not include advanced reasoning or thinking capabilities. As part of the Gemini family, it supports multimodal inputs.
Input / 1M tokens
$0.10
Output tokens/s
213.79
First-token seconds
1.36s
Artificial Analysis Intelligence Index
12.7
Gemini 2.5 Flash-Lite (Reasoning)
A lightweight, cost-effective model from the Gemini family optimized for fast reasoning tasks. It maintains the multimodal capabilities of the Gemini series while prioritizing speed and efficiency.
Input / 1M tokens
$0.10
Output tokens/s
223.94
First-token seconds
19.07s
Artificial Analysis Intelligence Index
17.6
Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning)
A lightweight, preview version of the Gemini 2.5 Flash model optimized for speed and cost-efficiency. This non-reasoning variant is designed for fast, low-latency tasks and does not emphasize complex reasoning or chain-of-thought capabilities.
Input / 1M tokens
$0.10
Artificial Analysis Intelligence Index
19.4
Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning)
A lightweight, fast, and cost-effective preview model from the Gemini 2.5 family, optimized for reasoning tasks. It balances speed and capability, making it suitable for applications requiring quick, logical responses.
Input / 1M tokens
$0.10
Artificial Analysis Intelligence Index
21.6
Gemini 2.5 Pro
Gemini 2.5 Pro is the flagship multimodal AI model of Google's Gemini 2.5 generation, featuring strong reasoning and coding capabilities, support for ultra-long context processing, and optimization for complex tasks.
Input / 1M tokens
$1.25
Output tokens/s
126.5
First-token seconds
17.12s
Artificial Analysis Intelligence Index
34.6
Gemini 2.5 Pro Preview (Mar '25)
An early preview of Gemini 2.5 Pro featuring enhanced reasoning, coding, and long-context capabilities, with notable improvements in complex problem-solving and instruction following over the 2.0 generation.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
30.3
Gemini 2.5 Pro Preview (May '25)
Gemini 2.5 Pro is a highly capable multimodal model from Google, designed for complex reasoning and coding tasks. It features an extended context window and excels at processing and generating content across text, images, and other modalities.
Input / 1M tokens
$1.25
Artificial Analysis Intelligence Index
29.5
Gemini 3
Gemini 3 Deep Think
Gemini 3 Deep Think is a model from Google's Gemini series, optimized for deep reasoning and complex problem-solving tasks. It leverages advanced thinking processes to handle intricate queries and is part of the multimodal Gemini family.
Input / 1M tokens
$0.00
Gemini 3 Flash Preview (Non-reasoning)
Gemini 3 Flash is a fast, multimodal model optimized for low-latency responses and cost-efficiency. This preview version is designed for general-purpose tasks, including coding and long-context understanding, but is not focused on complex reasoning or chain-of-thought tasks.
Input / 1M tokens
$0.50
Output tokens/s
203.01
First-token seconds
0.88s
Artificial Analysis Intelligence Index
35
Gemini 3 Flash Preview (Reasoning)
A preview version of Google's Gemini 3 Flash model optimized for reasoning tasks. It combines the speed of the Flash variant with enhanced logical and analytical capabilities. As part of the Gemini family, it supports multimodal inputs.
Input / 1M tokens
$0.50
Output tokens/s
201.4
First-token seconds
5.68s
Artificial Analysis Intelligence Index
46.4
Gemini 3 Pro Preview (high)
Gemini 3 Pro Preview (high) is a high-performance multimodal model from Google, designed for complex reasoning, code generation, and tasks requiring a long context window. It represents an advanced iteration in the Gemini series, optimized for high-quality outputs.
Input / 1M tokens
$2.00
Output tokens/s
121.99
First-token seconds
61.87s
Artificial Analysis Intelligence Index
48.4
Gemini 3 Pro Preview (low)
A low-latency, cost-optimized variant of the Gemini 3 Pro Preview model. It is designed for fast response times and efficient operation while retaining core multimodal and reasoning capabilities.
Input / 1M tokens
$2.00
Artificial Analysis Intelligence Index
41.3
Gemini 3.1
Gemini 3.1 Flash-Lite Preview
A lightweight, preview version of the Gemini 3.1 Flash model, optimized for low-latency and cost-sensitive applications. It is designed to provide fast responses while maintaining core multimodal capabilities.
Input / 1M tokens
$0.25
Output tokens/s
283.76
First-token seconds
5.09s
Artificial Analysis Intelligence Index
33.5
Gemini 3.1 Pro Preview
Gemini 3.1 Pro Preview is a high-performance, multimodal AI model from Google's Gemini series, designed for advanced reasoning and coding tasks. As a preview release, it offers early access to the latest capabilities in the Gemini 3.1 family.
Input / 1M tokens
$2.00
Output tokens/s
127.78
First-token seconds
23.16s
Artificial Analysis Intelligence Index
57.2
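Given the $2.00 per 1M input tokens rate listed above, prompt cost scales linearly with prompt size. A minimal sketch of the arithmetic (input side only; output tokens are billed at a separate rate not shown here):

```python
# Input-cost arithmetic from the listed rate for Gemini 3.1 Pro Preview.
PRICE_PER_M_INPUT = 2.00  # USD per 1,000,000 input tokens (from the listing)

def input_cost_usd(prompt_tokens: int) -> float:
    """Cost of the prompt alone, at the listed per-million-token rate."""
    return prompt_tokens / 1_000_000 * PRICE_PER_M_INPUT

cost = input_cost_usd(100_000)  # a hypothetical 100k-token prompt
print(f"${cost:.2f}")           # $0.20
```

At this rate a long-context workload (say, 1M input tokens per request) costs $2.00 per request before output charges, which is worth weighing against the cheaper Flash-tier models listed above.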
Gemma 3
Gemma 3 12B Instruct
Gemma 3 12B Instruct is a 12-billion parameter, open-weight model from Google designed for efficient on-device and edge deployment. It features enhanced reasoning and instruction-following capabilities, along with native multimodal support for processing both text and images. The model offers a strong balance of performance, speed, and accessibility for developers.
Input / 1M tokens
$0.09
Output tokens/s
24.14
First-token seconds
1.86s
Artificial Analysis Intelligence Index
8.8
Gemma 3 1B Instruct
Gemma 3 1B Instruct is a lightweight, instruction-tuned language model from Google's Gemma 3 family. Designed for efficiency and speed, it is optimized for on-device and edge deployment scenarios. This model provides a strong balance of performance and low resource consumption for basic conversational and instruction-following tasks.
Input / 1M tokens
$0.00
Output tokens/s
57.46
First-token seconds
0.68s
Artificial Analysis Intelligence Index
5.6
Gemma 3 270M
Gemma 3 270M is a lightweight, small-parameter language model from Google's Gemma family. Designed for efficiency and speed, it is optimized for deployment on resource-constrained devices like mobile phones or edge hardware, offering low-latency and cost-effective inference.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
7.7
Gemma 3 27B Instruct
Gemma 3 27B Instruct is a 27-billion parameter open-weight instruction-tuned model from Google, built on the Gemma 3 architecture. It is designed for strong reasoning and instruction-following capabilities, suitable for a wide range of text generation tasks. The model supports a long context window, making it effective for processing lengthy documents.
Input / 1M tokens
$0.11
Output tokens/s
23.92
First-token seconds
0.76s
Artificial Analysis Intelligence Index
10.3
Gemma 3 4B Instruct
Gemma 3 4B Instruct is a lightweight, efficient instruction-tuned model from Google's Gemma family. It is designed for fast inference and deployment on resource-constrained environments like edge devices or local hardware, while maintaining strong reasoning capabilities for its size.
Input / 1M tokens
$0.04
Output tokens/s
25.66
First-token seconds
1.19s
Artificial Analysis Intelligence Index
6.3
Gemma 3n
Gemma 3n E2B Instruct
Gemma 3n E2B is a lightweight, efficient instruction-tuned model from Google's Gemma family, designed for high performance on edge devices and resource-constrained environments. It balances speed and capability, making it suitable for fast, cost-effective on-device AI applications.
Input / 1M tokens
$0.00
Output tokens/s
58.95
First-token seconds
0.31s
Artificial Analysis Intelligence Index
4.8
Gemma 3n E4B Instruct
Gemma 3n E4B Instruct is a lightweight, efficient instruction-tuned model from Google's Gemma family, optimized for on-device and edge deployment. It delivers strong reasoning and instruction-following capabilities within a compact 4B parameter footprint, making it suitable for fast and cost-effective local inference.
Input / 1M tokens
$0.02
Output tokens/s
36.13
First-token seconds
0.78s
Artificial Analysis Intelligence Index
6.4
Gemma 3n E4B Instruct Preview (May '25)
Gemma 3n E4B Instruct Preview is a 4-billion parameter, instruction-tuned model from Google's Gemma family, optimized for efficient on-device and edge deployment. This preview version focuses on strong instruction following and reasoning capabilities within a compact size.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
10.1
Gemma 4
Gemma 4 26B A4B (Non-reasoning)
Gemma 4 26B A4B (Non-reasoning) is a 26-billion parameter open model from Google, optimized for general-purpose tasks without specialized reasoning enhancements. It is part of the Gemma family, built on Gemini technology, and designed for efficient performance across a wide range of applications.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
27.1
Gemma 4 26B A4B (Reasoning)
Gemma 4 26B A4B is a 26-billion parameter reasoning model from Google's Gemma series. It is designed for complex reasoning tasks and supports multimodal inputs, making it suitable for advanced analysis and problem-solving.
Input / 1M tokens
$0.13
Artificial Analysis Intelligence Index
31.2
Gemma 4 31B (Non-reasoning)
Gemma 4 31B (Non-reasoning) is a 31-billion parameter open model from Google, optimized for general-purpose tasks and fast inference. It is part of the Gemma family, derived from Gemini technology, and is designed for efficient performance in coding, instruction following, and multimodal understanding without the overhead of extended reasoning chains.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
32.3
Gemma 4 31B (Reasoning)
Gemma 4 31B (Reasoning) is a reasoning-optimized variant within Google's Gemma model family. It is designed to excel at tasks requiring deep, step-by-step thinking and logical deduction, such as complex problem-solving and analysis.
Input / 1M tokens
$0.00
Output tokens/s
35.1
First-token seconds
0.96s
Artificial Analysis Intelligence Index
39.2
Gemma 4 E2B (Non-reasoning)
Gemma 4 E2B is a lightweight, efficient model from Google's Gemma family, optimized for fast inference on edge devices or in-browser environments. As a non-reasoning variant, it prioritizes speed and low resource consumption for straightforward tasks.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
12.1
Gemma 4 E2B (Reasoning)
Gemma 4 E2B (Reasoning) is a lightweight, open-weight model from Google optimized for efficient reasoning tasks. It is designed to deliver strong performance on logic, math, and code-related problems while maintaining a small footprint suitable for edge or resource-constrained deployments.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
15.2
Gemma 4 E4B (Non-reasoning)
Gemma 4 E4B (Non-reasoning) is a lightweight, efficient 4B parameter model from Google's Gemma series, optimized for fast inference and low-cost deployment. It is designed for general-purpose tasks where speed and efficiency are prioritized over complex reasoning chains.
Input / 1M tokens
$0.30
Output tokens/s
92.96
First-token seconds
0.44s
Artificial Analysis Intelligence Index
14.8
Gemma 4 E4B (Reasoning)
Gemma 4 E4B (Reasoning) is a lightweight, 4-billion parameter model from Google's Gemma series, optimized for strong reasoning and instruction-following capabilities. It is designed for efficient deployment while maintaining performance on complex tasks, and supports multimodal inputs.
Input / 1M tokens
$0.30
Output tokens/s
89.96
First-token seconds
0.95s
Artificial Analysis Intelligence Index
18.8
PaLM 2
PaLM 2
PaLM 2 is a large language model developed by Google with significantly improved multilingual, math, and coding capabilities compared to its predecessor. It also demonstrates stronger reasoning and faster response times.
Input / 1M tokens
$0.00
Artificial Analysis Intelligence Index
8.6