United States

Meta

Meta develops the Llama models and open AI tooling, with a strong focus on open-weight LLMs, multimodal AI, and large-scale infrastructure.

Region

United States

Updated

May 29, 2026

Product coverage

Products from this provider

No products have been linked to this provider yet.

Model coverage

Models from this provider

Llama 2 Chat

Llama 2 Chat 13B

Llama 2 Chat 13B is a 13-billion parameter chat model from Meta's Llama 2 series, fine-tuned with instruction data and Reinforcement Learning from Human Feedback (RLHF). It offers a balance between conversational capability, reasoning performance, and resource efficiency, making it suitable for a wide range of dialogue-based applications.

Reasoning, Fast, Cheap

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

8.4

Llama 2 Chat

Llama 2 Chat 70B

Llama 2 Chat 70B is Meta's open-source, chat-optimized large language model with 70 billion parameters. It is designed for multi-turn dialogue and excels at complex reasoning and instruction-following tasks.

Coding, Reasoning

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

8.4

Llama 2 Chat

Llama 2 Chat 7B

Llama 2 Chat 7B is an open-source conversational model from Meta, fine-tuned with instruction data and RLHF for interactive applications. As the smallest model in the Llama 2 Chat series, at 7 billion parameters, it offers fast inference and low deployment costs while retaining solid general capabilities.

Fast, Cheap, Long context

Input / 1M tokens

$0.05

Output tokens/s

110.35

First-token seconds

17.35s

Artificial Analysis Intelligence Index

9.7

Llama 3 Instruct

Llama 3 Instruct 70B

Llama 3 Instruct 70B is an open-source, instruction-tuned model with 70 billion parameters from Meta's Llama 3 family. It supports an 8K token context window and demonstrates strong performance in reasoning and coding tasks.

Reasoning, Long context, Coding

Input / 1M tokens

$0.65

Output tokens/s

45.35

First-token seconds

0.72s

Artificial Analysis Intelligence Index

8.9

Llama 3 Instruct

Llama 3 Instruct 8B

Llama 3 Instruct 8B is a lightweight, instruction-tuned model from Meta's Llama 3 family. It is optimized for fast response times and low-cost deployment, making it suitable for edge devices or applications requiring rapid inference while maintaining strong performance on general tasks.

Coding, Reasoning, Fast, Cheap

Input / 1M tokens

$0.045

Output tokens/s

81.45

First-token seconds

0.48s

Artificial Analysis Intelligence Index

6.4

Llama 3.1 Instruct

Llama 3.1 Instruct 405B

Llama 3.1 Instruct 405B is Meta's largest and most capable open-source language model, optimized for instruction following and dialogue. It features strong reasoning abilities, supports a 128K token context window, and excels at complex tasks including coding and multilingual understanding.

Coding, Reasoning, Long context

Input / 1M tokens

$2.75

Output tokens/s

58.06

First-token seconds

0.67s

Artificial Analysis Intelligence Index

17.4

Llama 3.1 Instruct

Llama 3.1 Instruct 70B

Llama 3.1 Instruct 70B is a 70-billion parameter, instruction-tuned large language model from Meta. It features a 128K token context window and is optimized for high-performance, open-source deployment across a wide range of tasks.

Reasoning, Long context, Cheap, Coding

Input / 1M tokens

$0.56

Output tokens/s

35.27

First-token seconds

0.55s

Artificial Analysis Intelligence Index

12.5

Llama 3.1 Instruct

Llama 3.1 Instruct 8B

Llama 3.1 Instruct 8B is a lightweight, open-source large language model from Meta with a 128K token context window. It is instruction-tuned for dialogue and instruction-following tasks.

Fast, Cheap, Long context

Input / 1M tokens

$0.10

Output tokens/s

213.86

First-token seconds

0.48s

Artificial Analysis Intelligence Index

11.8

Llama 3.2 Instruct

Llama 3.2 Instruct 11B (Vision)

Llama 3.2 Instruct 11B (Vision) is an 11-billion parameter multimodal instruct model from Meta's Llama 3.2 series that accepts both image and text inputs. It balances performance with efficiency, offering faster inference than the larger 90B vision model, and is suitable for conversational and instruction-following scenarios that require image understanding.

Multimodal, Reasoning, Fast, Cheap

Input / 1M tokens

$0.245

Output tokens/s

86.8

First-token seconds

0.47s

Artificial Analysis Intelligence Index

8.7

Llama 3.2 Instruct

Llama 3.2 Instruct 1B

Llama 3.2 Instruct 1B is a lightweight, 1-billion parameter instruction-tuned model from Meta's Llama 3.2 series, optimized for fast inference and deployment on edge devices or in resource-constrained environments. It is text-only; the same family also includes larger multimodal models.

Fast, Cheap, Reasoning

Input / 1M tokens

$0.05

Output tokens/s

88.82

First-token seconds

0.57s

Artificial Analysis Intelligence Index

6.3

Llama 3.2 Instruct

Llama 3.2 Instruct 3B

Llama 3.2 Instruct 3B is a lightweight, instruction-tuned model from Meta's Llama 3.2 family, optimized for fast and efficient on-device or edge deployment. It offers low-cost inference with strong multilingual and conversational capabilities for its size.

Fast, Cheap, Multimodal

Input / 1M tokens

$0.15

Output tokens/s

52.13

First-token seconds

0.64s

Artificial Analysis Intelligence Index

9.7

Llama 3.2 Instruct

Llama 3.2 Instruct 90B (Vision)

Llama 3.2 Instruct 90B (Vision) is the largest multimodal instruction-tuned model in Meta's Llama 3.2 series, with 90 billion parameters and support for both image and text inputs. It excels at visual understanding and complex reasoning, making it suitable for applications that must process images and text together.

Multimodal, Reasoning

Input / 1M tokens

$1.38

Output tokens/s

46.35

First-token seconds

0.58s

Artificial Analysis Intelligence Index

11.9

Llama 3.3 Instruct

Llama 3.3 Instruct 70B

Llama 3.3 Instruct 70B is a large, open-source instruction-tuned model from Meta, optimized for complex reasoning, code generation, and following detailed instructions. It features a 128K token context window and delivers strong performance on various benchmarks, positioning it as a powerful and versatile general-purpose assistant.

Reasoning, Coding, Long context, Multimodal

Input / 1M tokens

$0.585

Output tokens/s

87.65

First-token seconds

0.63s

Artificial Analysis Intelligence Index

14.5

Llama 4

Llama 4 Maverick

Llama 4 Maverick is a high-performance, open-weight model from Meta's Llama 4 family, designed for advanced reasoning and coding tasks. It is natively multimodal, accepting both text and image input, and supports a large context window, continuing the series' focus on powerful, accessible AI.

Coding, Reasoning, Fast, Cheap, Multimodal

Input / 1M tokens

$0.35

Output tokens/s

109.54

First-token seconds

0.65s

Artificial Analysis Intelligence Index

18.4

Llama 4

Llama 4 Scout

Llama 4 Scout is a fast and efficient open-weight large language model from Meta's Llama 4 family. It is optimized for rapid response times and strong reasoning capabilities, making it suitable for a wide range of general-purpose and coding tasks.

Coding, Reasoning, Fast

Input / 1M tokens

$0.17

Output tokens/s

108.23

First-token seconds

0.59s

Artificial Analysis Intelligence Index

13.5

Llama 65B

Llama 65B is the largest model in Meta's original Llama release. It is a base (non-chat) model with strong text generation and understanding capabilities, and it was originally released for research use.

Reasoning

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

7.4

Muse Spark

Muse Spark is a creative generation model from Meta, designed for rapid ideation and content creation. It excels at producing diverse, high-quality outputs with low latency.

Multimodal, Fast

Input / 1M tokens

$0.00

Artificial Analysis Intelligence Index

52.2
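
The per-model figures above (input price per 1M tokens, output tokens/s, first-token seconds) can be combined into a rough per-request estimate. The sketch below is illustrative only: the function name and the sample token counts are hypothetical, output-token pricing is not listed on this page and is therefore left out, and real latency and cost depend on the hosting provider and load.

```python
def estimate_request(input_tokens, output_tokens,
                     input_price_per_million,
                     output_tokens_per_second,
                     first_token_seconds):
    """Return (input_cost_usd, total_latency_seconds) for one request.

    Hypothetical helper: it uses only the three metrics shown on each
    model card. Output-token cost is NOT included because this page
    lists input pricing only.
    """
    # Input cost scales linearly with the listed price per 1M tokens.
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    # Latency = wait for the first token + time to stream the rest.
    latency = first_token_seconds + output_tokens / output_tokens_per_second
    return input_cost, latency


# Figures taken from the Llama 3.1 Instruct 8B card above;
# the token counts are made-up sample values.
cost, latency = estimate_request(
    input_tokens=20_000,
    output_tokens=500,
    input_price_per_million=0.10,
    output_tokens_per_second=213.86,
    first_token_seconds=0.48,
)
print(f"input cost: ${cost:.4f}, estimated latency: {latency:.2f}s")
```

With these sample values the input cost works out to about $0.002 and the estimated latency to roughly 2.8 seconds, most of it streaming time rather than time to first token.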
