DeepSeek-V3
Model parameter quantity: 671B
Affiliated organization: DeepSeek
License Type: Open Source
Release time: December 26, 2024
Model Introduction
DeepSeek-V3 has achieved higher evaluation scores than other open-source models such as Qwen2.5-72B and Llama-3.1-405B, and its performance is on par with the world's top closed-source models like GPT-4o and Claude-3.5-Sonnet.
Capability scores
Language comprehension ability (score 6.8): Often makes semantic misjudgments, leading to obvious logical disconnects in responses.
Knowledge coverage scope (score 8.8): Possesses core knowledge of mainstream disciplines, but has limited coverage of cutting-edge interdisciplinary fields.
Reasoning ability (score 6.7): Unable to maintain coherent reasoning chains, often causing inverted causality or miscalculations.
Related model
DeepSeek-V2-Chat-0628: DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance while saving 42.5% of training costs, reducing the KV cache by 93.3%, and boosting the maximum generation throughput to 5.76 times (see the routing sketch after this list).
DeepSeek-V2.5: DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.
DeepSeek-V3-0324: DeepSeek-V3 outperforms other open-source models such as Qwen2.5-72B and Llama-3.1-405B in multiple evaluations and matches the performance of top-tier closed-source models like GPT-4 and Claude-3.5-Sonnet.
DeepSeek-V2-Lite-Chat: DeepSeek-V2-Lite is a lite version of DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model presented by DeepSeek.
DeepSeek-V2-Chat: DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance while saving 42.5% of training costs, reducing the KV cache by 93.3%, and boosting the maximum generation throughput to 5.76 times.
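The DeepSeek-V2 entries above hinge on sparse expert activation: only 21B of the model's 236B total parameters are used for each token. The Python sketch below (using PyTorch) illustrates the general top-k routing idea behind that property; the layer sizes, expert count, and top-k value are illustrative assumptions and do not reflect DeepSeek-V2's actual architecture or routing scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMoE(nn.Module):
    """Toy top-k Mixture-of-Experts layer (illustrative sizes, not DeepSeek-V2's)."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (n_tokens, d_model)
        gate_logits = self.router(x)                             # (n_tokens, n_experts)
        weights, chosen = gate_logits.topk(self.top_k, dim=-1)   # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)                     # normalize the kept routing scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                      # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out


layer = TopKMoE()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```

Because each token passes through only top_k of the n_experts expert networks, the parameters activated per token are a small fraction of the layer's total, which is the relationship the 236B-total / 21B-activated figures describe.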
Relevant documents
DeepSeek-V3 Unveiled: How Hardware-Aware AI Design Slashes Costs and Boosts Performance DeepSeek-V3: A Cost-Efficient Leap in AI Development. The AI industry is at a crossroads. While large language models (LLMs) grow more powerful, their computational demands have skyrocketed, making cutting-edge AI development prohibitively expensive for most organizations. DeepSeek-V3 challenges this…
AI-Driven Travel: Plan Your Perfect Getaway with Ease Crafting a vacation can feel daunting, with endless searches and reviews turning excitement into stress. AI-powered travel planning changes that, making the process smooth and enjoyable. This article…
AI-Powered NoteGPT Transforms YouTube Learning Experience In today’s fast-moving world, effective learning is essential. NoteGPT is a dynamic Chrome extension that revolutionizes how you engage with YouTube content. By harnessing AI, it offers concise summar…
Community Union and Google Partner to Boost AI Skills for UK Workers Editor’s Note: Google has teamed up with Community Union in the UK to demonstrate how AI skills can enhance the capabilities of both office and operational workers. This pioneering program is part of…
Magi-1 Unveils Revolutionary Open-Source AI Video Generation Technology The realm of AI-powered video creation is advancing rapidly, and Magi-1 marks a transformative milestone. This innovative open-source model offers unmatched precision in controlling timing, motion, an…