StepFun

StepFun is a Chinese AI company that develops multimodal AI models and platforms. Its product suite includes large language models (such as Step-3.5-Flash), multimodal reasoning models (such as Step-R1-V-mini), and specialized tools for image creation, image editing, and knowledge-base Q&A. The company also offers an API platform and an 'Agent Studio' for building AI agents.

Products: 2
Models: 7
Available: 0
Benchmarks: 9

Region: China

Updated: May 14, 2026

Product coverage

Products from this provider: 2

Model coverage

Models from this provider: 7

Step 3.5 Flash

Step 3.5 Flash

Step 3.5 Flash is a fast-response language model optimized for Chinese language understanding and generation. It is designed for quick inference and efficient performance in conversational and text-based tasks.

Fast, Reasoning, Coding

Input / 1M tokens: $0.10
Output tokens/s: 153.02
First-token seconds: 0.88s
Artificial Analysis Intelligence Index: 37.8

Step 3.5 Flash

Step 3.5 Flash 2603

Step 3.5 Flash is a fast and efficient language model from StepFun, optimized for low-latency responses. It is part of the Flash series, designed to balance speed with strong reasoning capabilities for general-purpose tasks.

Fast, Reasoning

Input / 1M tokens: $0.00
Output tokens/s: 155.65
First-token seconds: 0.83s
Artificial Analysis Intelligence Index: 38.5

Step 3 VL 10B

Step3 VL 10B

Step3 VL 10B is a multimodal vision-language model developed by StepFun. With 10 billion parameters, it is designed to understand and process both visual and textual information for various tasks.

Multimodal

Input / 1M tokens: $0.00
Artificial Analysis Intelligence Index: 15.5

StepAudio 2.5

StepAudio 2.5 ASR

Automatic speech recognition model for streaming and near-realtime transcription.

StepAudio 2.5

StepAudio 2.5 TTS

Text-to-speech model with zero-shot voice cloning and natural-language control.

Step Image Edit 2

step-image-edit-2

Lightweight generative editing model for text-to-image and image editing with fast response.

Multimodal

Step Router V1

step-router-v1

Intelligent routing model for automatic switching between deepseek-v4-pro and step-3.5-flash based on task complexity.
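
To make "switching based on task complexity" concrete, here is a minimal client-side sketch of the general idea. It is not StepFun's implementation: the listing does not describe how step-router-v1 scores complexity, so the scoring heuristic, the keyword list, the threshold, and the function names below are all assumptions for illustration. Only the two model IDs come from the description above.

```python
# Hypothetical sketch: the real step-router-v1 logic is not documented here.
FAST_MODEL = "step-3.5-flash"      # model ID taken from the listing
STRONG_MODEL = "deepseek-v4-pro"   # model ID taken from the listing

# Assumed signals of a complex task, chosen only for illustration.
COMPLEX_HINTS = ("prove", "refactor", "step by step", "debug", "derive")


def estimate_complexity(prompt: str) -> float:
    """Return a rough score in [0, 1] from prompt length and keyword hints."""
    # Longer prompts score higher, saturating at 200 words.
    length_score = min(len(prompt.split()) / 200, 1.0)
    text = prompt.lower()
    # Any hint keyword (substring match) marks the prompt as complex.
    hint_score = 1.0 if any(hint in text for hint in COMPLEX_HINTS) else 0.0
    return max(length_score, hint_score)


def choose_model(prompt: str, threshold: float = 0.5) -> str:
    """Pick the stronger model only when the estimated complexity is high."""
    if estimate_complexity(prompt) >= threshold:
        return STRONG_MODEL
    return FAST_MODEL
```

With these assumptions, a short factual question stays on the fast model, while a prompt that is long or contains one of the hint keywords is sent to the stronger model. A hosted router would presumably use a learned classifier instead of keyword matching.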
