Gemini 2.5 Flash (Non-reasoning)

Google · United States

Gemini 2.5 Flash is a lightweight, high-speed model from Google's Gemini 2.5 series, optimized for low latency and cost efficiency. This non-reasoning variant is designed for rapid response tasks, offering multimodal capabilities and a large context window without the overhead of extended chain-of-thought processing.

Fast · Cheap · Multimodal · Long context

Input / 1M tokens: $0.30

Output / 1M tokens: $2.50

Output tokens/s: 195.1

Time to first token: 0.55 s
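The listed prices and speed figures can be combined into a rough per-request estimate. The sketch below assumes cost scales linearly with token counts and that response time is approximately the first-token delay plus output tokens divided by throughput. The function names are hypothetical, and the constants are a snapshot of the figures on this page rather than guaranteed values.

```python
# Snapshot of the figures listed on this page; real prices and speeds may vary.
INPUT_USD_PER_M = 0.30       # USD per 1M input tokens
OUTPUT_USD_PER_M = 2.50      # USD per 1M output tokens
OUTPUT_TOKENS_PER_S = 195.1  # measured output throughput
FIRST_TOKEN_S = 0.55         # measured time to first token, in seconds


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Approximate request cost, assuming purely linear per-token pricing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def estimate_latency_s(output_tokens: int) -> float:
    """Approximate response time: first-token delay plus streaming time."""
    return FIRST_TOKEN_S + output_tokens / OUTPUT_TOKENS_PER_S


if __name__ == "__main__":
    cost = estimate_cost_usd(input_tokens=10_000, output_tokens=1_000)
    latency = estimate_latency_s(output_tokens=1_000)
    print(f"cost: ${cost:.4f}, latency: {latency:.1f} s")
```

Under these assumptions, a request with 10,000 input tokens and 1,000 output tokens comes to roughly $0.0055 and about 5.7 seconds. The estimate ignores caching discounts, plan quotas, and network overhead, none of which are specified here.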

Benchmark history

Evaluations

All scores measured May 29, 2026.

TAU2: 0.15
Terminal-Bench Hard: 0.12
LCR: 0.46
IFBench: 0.39
AIME 25: 0.6
AIME: 0.5
MATH 500: 0.93
SciCode: 0.29
LiveCodeBench: 0.5
HLE: 0.05
GPQA: 0.68
MMLU Pro: 0.81
Artificial Analysis Math Index: 60.3
Artificial Analysis Coding Index: 17.8
Artificial Analysis Intelligence Index: 20.6

Plan availability

Products and plans that support this model

Apertis Coding Plan

Apertis Coding Plan is a subscription-based AI coding service providing unified access to 30+ AI models (GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, and more) through a single API key. Designed for developers using coding agents like Claude Code, Cursor, Cline, and OpenCode, it offers predictable monthly pricing, free prompt caching, auto-failover, and quota-based billing across OpenAI, Anthropic, Google, and other providers.
