DeepSeek-R1
Model parameter quantity: 671B
Affiliated organization: DeepSeek
License Type: Open Source
Release time: January 20, 2025

Model Introduction
DeepSeek-R1 is a model trained through large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step. Its performance on mathematics, coding, and reasoning tasks is comparable to that of OpenAI-o1.
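For readers who want to query DeepSeek-R1 programmatically, the minimal sketch below assumes DeepSeek's OpenAI-compatible API surface and the deepseek-reasoner model identifier; both are assumptions to verify against the official DeepSeek documentation before use.

```python
from openai import OpenAI

# Assumed endpoint and model name for DeepSeek-R1; check the official
# DeepSeek API docs, as these may change.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",  # DeepSeek-R1
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```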
Language comprehension ability (score: 7.8)
Capable of understanding complex contexts and generating logically coherent sentences, though tone control is occasionally imprecise.

Knowledge coverage scope (score: 8.9)
Possesses core knowledge of mainstream disciplines, but coverage of cutting-edge interdisciplinary fields is limited.

Reasoning ability (score: 9.1)
Capable of building multi-level logical frameworks, achieving over 99% accuracy on complex mathematical modeling.
Related models
DeepSeek-V3-0324: DeepSeek-V3 outperforms other open-source models such as Qwen2.5-72B and Llama-3.1-405B in multiple evaluations and matches the performance of top-tier closed-source models like GPT-4 and Claude-3.5-Sonnet.
DeepSeek-R1-0528: The latest version of DeepSeek-R1.
DeepSeek-V2-Chat-0628: DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference (a toy illustration of MoE routing follows this list). It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance while saving 42.5% of training costs, reducing the KV cache by 93.3%, and boosting maximum generation throughput to 5.76 times.
DeepSeek-V2.5: DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.
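The phrase "21B activated for each token" refers to MoE routing: a learned gate picks a few experts per token, so only a fraction of the total parameters runs in any forward pass. The PyTorch sketch below illustrates generic top-k routing; the layer sizes, expert count, and k are illustrative placeholders, not DeepSeek-V2's actual configuration.

```python
import torch
import torch.nn as nn


class TopKMoE(nn.Module):
    """Toy top-k mixture-of-experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model=64, d_ff=128, num_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.gate = nn.Linear(d_model, num_experts)  # router: token -> expert scores
        self.k = k

    def forward(self, x):  # x: (num_tokens, d_model)
        weights, idx = self.gate(x).topk(self.k, dim=-1)  # top-k experts per token
        weights = weights.softmax(dim=-1)                 # mixing weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                  # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


tokens = torch.randn(10, 64)
layer = TopKMoE()
print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token
```

Because each token touches only k experts, compute per token scales with the active parameters rather than the total parameter count, which is how a 236B-parameter model can run with roughly 21B parameters per token.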
Relevant documents
US Senate Drops AI Moratorium from Budget Bill Amid Controversy
Why AI Fell Short in 2025 Texas Floods: Critical Disaster Response Lessons
Last Chance to Score Discounted Tickets for TechCrunch Sessions: AI Event Tomorrow
AI-Powered Newsletter Automation Guide: Streamline Your Workflow with Ease
Hawaiian Beach Escapades: New Bonds and Surprising Turns