IBM

IBM focuses on enterprise AI with Watsonx, offering foundation models, governance tools, and industry-specific AI solutions.

Region

United States

Updated

May 29, 2026

Product coverage

Products from this provider

No products have been linked to this provider yet.

Model coverage

Models from this provider

Granite 3.3

Granite 3.3 8B (Non-reasoning)

An 8B-parameter, instruction-tuned model from IBM's Granite 3.3 series, designed for general-purpose tasks and edge deployment. This non-reasoning variant focuses on efficient instruction following and text generation, without extended chain-of-thought output.

Coding, Fast, Cheap, Long context

Input / 1M tokens

$0.03

Output tokens/s

404.94

First-token seconds

20.43s

Artificial Analysis Intelligence Index

Granite 4.0

Granite 4.0 1B

IBM Granite 4.0 1B is a lightweight, 1-billion-parameter model from the Granite series, optimized for fast inference and low-cost deployment. It is designed for edge computing and resource-constrained environments, where it handles straightforward tasks efficiently.

Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

7.3

Granite 4.0

Granite 4.0 350M

A lightweight, 350-million-parameter model from IBM's Granite 4.0 series, designed for efficient inference and deployment in resource-constrained environments.

Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

6.1

Granite 4.0 H

Granite 4.0 H 1B

A compact, 1-billion-parameter model from IBM's Granite 4.0 series, optimized for efficiency and speed. It is designed for deployment in resource-constrained environments or as a lightweight component in larger systems.

Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

Granite 4.0 H

Granite 4.0 H 350M

IBM Granite 4.0 H 350M is a compact AI model from the Granite family, featuring 350 million parameters for efficient inference. It is designed for fast and cost-effective deployment in enterprise scenarios, emphasizing speed and low operational costs.

Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

5.4

Granite 4.0 H

Granite 4.0 H Small

Granite 4.0 H Small is a compact, efficient model from IBM's Granite family, designed for enterprise applications. It offers a balance of strong reasoning capabilities and fast response times, making it suitable for tasks requiring quick and reliable inference.

Reasoning, Fast, Cheap

Input / 1M tokens

$0.06

Output tokens/s

383.89

First-token seconds

8.71s

Artificial Analysis Intelligence Index

10.8

Granite 4.0

Granite 4.0 Micro

Granite 4.0 Micro is a lightweight, efficient model from IBM's Granite family, optimized for fast inference and deployment in resource-constrained environments. It maintains the enterprise-grade reliability and safety standards of the Granite series while offering a smaller footprint for cost-effective applications.

Fast, Cheap, Reasoning, Coding

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

7.7

Granite 4.1

Granite 4.1 30B

IBM Granite 4.1 30B is an enterprise-grade, open-source foundation model designed for high performance and reliability. It excels in complex reasoning, code generation, and multilingual understanding, making it suitable for business applications requiring accuracy and security.

Coding, Reasoning

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

14.7

Granite 4.1

Granite 4.1 3B

Granite 4.1 3B is a 3-billion-parameter model in IBM's Granite 4.1 family of enterprise-ready, open foundation models designed for coding, reasoning, and multilingual tasks. Models in this family support tool-based instructions and are optimized for robust AI workflows in enterprise settings.

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

8.5

Granite 4.1

Granite 4.1 8B

IBM Granite 4.1 8B is an enterprise-grade, compact language model designed for high efficiency and reliability. It is optimized for tasks requiring strong reasoning and coding capabilities while maintaining a low cost and fast inference speed, making it suitable for deployment in resource-constrained or latency-sensitive environments.

Coding, Reasoning, Fast, Cheap

Input / 1M tokens

$0.05

Output tokens/s

115.17

First-token seconds

0.4s

Artificial Analysis Intelligence Index

12.4
