option
Home
List of Al models
DeepSeek-V2-Chat-0628

DeepSeek-V2-Chat-0628

Add comparison
Add comparison
Model parameter quantity
236B
Model parameter quantity
Affiliated organization
DeepSeek
Affiliated organization
Open Source
License Type
Release time
May 6, 2024
Release time

Model Introduction
DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.
Swipe left and right to view more
Language comprehension ability Language comprehension ability
Language comprehension ability
Often makes semantic misjudgments, leading to obvious logical disconnects in responses.
4.6
Knowledge coverage scope Knowledge coverage scope
Knowledge coverage scope
Possesses core knowledge of mainstream disciplines, but has limited coverage of cutting-edge interdisciplinary fields.
7.8
Reasoning ability Reasoning ability
Reasoning ability
Unable to maintain coherent reasoning chains, often causing inverted causality or miscalculations.
4.7
Related model
DeepSeek-V3.2 The latest version of Deepseek V3 series models.
DeepSeek-V3.2-Exp The latest experimental version of Deepseek V3 series models.
DeepSeek-R1-0528 The latest version of Deepseek R1.
DeepSeek-V3-0324 DeepSeek-V3 outperforms other open-source models such as Qwen2.5-72B and Llama-3.1-405B in multiple evaluations and matches the performance of top-tier closed-source models like GPT-4 and Claude-3.5-Sonnet.
DeepSeek-R1-0528 The latest version of Deepseek R1.
Relevant documents
Xiaohongshu Restructures: Conan Named President, Creates AI Primary Department Dots and Overseas Division Rednote On April 30, Xiaohongshu sent an internal memo to all employees announcing the launch of a new organizational restructuring. The core of this change involves fully integrating three business lines—community, e-commerce, and commercialization—along wi
Tencent's Xiaolongxia Surges Beyond Expectations, Team Expands Capacity 10x, Apologizes and Compensates Tencent has officially launched WorkBuddy, an all-scenario AI intelligent agent, marking a new phase in the large model application layer race with high integration and a low deployment threshold.The product drew immediate industry attention on its l
Suno Lead Investor: Deleting Posts Won't Plug Copyright Lawsuit Hole The much-anticipated AI music generation platform Suno is facing a tough copyright battle, and a candid remark from its lead investor may have handed the opposing side exactly the evidence they were hoping for. C.C. Gong, a partner at Menlo Ventures
Claude Opus 4.7 Launches with Reliability Valued Over Intelligence Anthropic has maintained an aggressive pace this year, rolling out new features almost every other day. The much-anticipated Claude Opus 4.7 has just been officially released, and interestingly, Anthropic was upfront in the announcement: "This is not
Haier Launches World's Lightest AI Sports Exoskeleton Robot, Weighing Just 1.75 kg Haier Group has introduced the world's lightest AI-powered exoskeleton robot for sports — the Haier Exoskeleton Robot W3. This launch sets a new industry record for lightness, marking a major breakthrough in lightweight design and intelligent human m
Model comparison
Start the comparison
OR