Daya Guo

Researcher, DeepSeek
Birth year: unknown
Nationality: Chinese

Key Milestones

2023: Joined DeepSeek

Began research on code-focused AI models at DeepSeek

2023: Release of DeepSeek-Coder

Co-authored DeepSeek-Coder, which outperformed existing open-source code LLMs

2024: Advanced Code Model Research

Contributed to DeepSeek-Coder-V2, improving its coding capabilities

AI Products

DeepSeek-V3 outperforms other open-source models such as Qwen2.5-72B and Llama-3.1-405B in multiple evaluations and matches the performance of top-tier closed-source models like GPT-4o and Claude-3.5-Sonnet.

Spark X1, the reasoning model released by iFlytek, leads domestic models on mathematical tasks and benchmarks its performance on general tasks such as reasoning, text generation, and language understanding against OpenAI's o1 and DeepSeek R1.

DeepSeek-V2 is a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token. Compared with DeepSeek 67B, DeepSeek-V2 achieves stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times.
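
The 236B-total / 21B-activated split comes from Mixture-of-Experts routing: a router selects a few experts per token, so only a small fraction of the parameters runs at each step. Below is a minimal top-k routing sketch in Python. It is illustrative only, not DeepSeek-V2's actual architecture (which also uses fine-grained and shared experts plus Multi-head Latent Attention), and all sizes and names (TopKMoELayer, etc.) are made up.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class TopKMoELayer:
    """Toy Mixture-of-Experts feed-forward layer (illustrative, not DeepSeek's code).

    Only the top-k experts chosen by the router run for each token, which is
    why a model can hold a large total parameter count while activating only
    a small fraction per token.
    """

    def __init__(self, d_model=16, d_ff=32, n_experts=8, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        self.router = rng.normal(size=(d_model, n_experts)) * 0.02
        # Each expert is a small two-layer MLP.
        self.w_in = rng.normal(size=(n_experts, d_model, d_ff)) * 0.02
        self.w_out = rng.normal(size=(n_experts, d_ff, d_model)) * 0.02

    def __call__(self, tokens):
        scores = softmax(tokens @ self.router)           # (n_tokens, n_experts)
        topk = np.argsort(scores, axis=-1)[:, -self.k:]  # k experts per token
        out = np.zeros_like(tokens)
        for t, tok in enumerate(tokens):
            for e in topk[t]:
                h = np.maximum(tok @ self.w_in[e], 0.0)  # expert MLP (ReLU)
                out[t] += scores[t, e] * (h @ self.w_out[e])
        return out

layer = TopKMoELayer()
x = np.random.default_rng(1).normal(size=(4, 16))
print(layer(x).shape)  # (4, 16): each token used only 2 of the 8 experts
```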

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.

DeepSeek-V2-Lite is a lite version of DeepSeek-V2, the strong Mixture-of-Experts (MoE) language model presented by DeepSeek.

DeepSeek-R1 is a model trained through large-scale Reinforcement Learning (RL) without Supervised Fine-Tuning (SFT) as an initial step. The RL-centric post-training substantially improves reasoning with only a minimal amount of annotated data, and its performance on mathematics, coding, and reasoning tasks is comparable to OpenAI's o1.
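
The R1 report attributes these gains to Group Relative Policy Optimization (GRPO), which drops the learned value model and instead normalizes rule-based rewards within a group of sampled answers per prompt. Here is a minimal sketch of just that advantage computation; the function name and the toy reward values are illustrative, not DeepSeek's code.

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantages: score each sampled answer relative to its group.

    rewards: (n_prompts, group_size) rule-based scores, e.g. 1.0 if the final
    answer can be verified as correct, else 0.0. No learned value model is
    needed, which is one reason this works with little annotated data.
    """
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True) + 1e-6
    return (rewards - mean) / std

# Two prompts, four sampled answers each; 1.0 marks a verified-correct answer.
rewards = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 0.0]])
print(group_relative_advantages(rewards).round(2))
# Correct answers get positive advantage, incorrect ones negative; the policy
# gradient then upweights the tokens of the positively scored answers.
```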

Personal Profile

Contributed to the development of DeepSeek-Coder, focusing on code language models with strong performance on programming tasks.
