Model Introduction
DeepSeek-R1 makes extensive use of reinforcement learning during post-training, which substantially improves the model's reasoning ability while requiring only a very small amount of annotated data. On mathematics, coding, and natural language reasoning tasks, its performance is on par with the official release of OpenAI's o1.