fireworks/models/deepseek-v3-0324

Common Name: Deepseek V3 03-24

Fireworks
Released on Oct 16, 2025 12:00 AM
Supported
Tool Invocation

A strong Mixture-of-Experts (MoE) language model from DeepSeek with 671B total parameters, of which 37B are activated for each token. Updated checkpoint.

Specifications

Context: 160K
Input: text
Output: text

Pricing

Input: $0.99/MTokens
Output: $0.99/MTokens
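
As a rough illustration of how the listed prices translate into the cost of a single request, here is a minimal sketch. The helper name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch assumes that "MTokens" means one million tokens and that input and output are billed linearly at the listed rates.

```python
from decimal import Decimal

# Listed prices in USD per million tokens (taken from the Pricing section).
INPUT_PRICE_PER_MTOK = Decimal("0.99")
OUTPUT_PRICE_PER_MTOK = Decimal("0.99")

# Assumption: "MTokens" means 1,000,000 tokens.
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: linear cost estimate for one request, in USD."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 120,000-token prompt with a 2,000-token completion.
    print(estimate_cost_usd(120_000, 2_000))
```

Under these assumptions, a 120,000-token prompt with a 2,000-token completion comes to about $0.12. Actual billing may differ (for example through rounding or minimum charges), so treat the result as an estimate only.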

Similar Models

Input: $0.66/MTokens
Output: $2.75/MTokens
Context: 256K

Kimi K2 0905 is an updated version of Kimi K2, a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. It has improved coding abilities and agentic tool use, and a longer (262K) context window.

Input: $0.66/MTokens
Output: $3.30/MTokens
Context: 262K

Kimi K2.5 is Moonshot AI's flagship agentic model and a new SOTA open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution into one model. Kimi K2.5 is a mixture-of-experts (MoE) language model with 1 trillion total parameters and a 262K context window.

Input: $0.62/MTokens
Output: $1.85/MTokens
Context: 160K

DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases. The 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended by 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.

Input: $1.49/MTokens
Output: $5.94/MTokens
Context: 160K

The 05/28 updated checkpoint of DeepSeek R1. Its overall performance is now approaching that of leading models, such as O3 and Gemini 2.5 Pro. Compared to the previous version, the upgraded model shows significant improvements in handling complex reasoning tasks. This version also offers a reduced hallucination rate, enhanced support for function calling, and a better experience for vibe coding.