Deep Cogito's LLMs Outperform Similar-Sized Models Using IDA
Deep Cogito, a San Francisco-based company, is making waves in the AI community with its latest release of open large language models (LLMs). These models, which come in sizes ranging from 3 billion to 70 billion parameters, are not just another set of AI tools; they're a bold step toward what the company calls "general superintelligence." Deep Cogito claims that each of its models outperforms the leading open models of similar size, including Llama, DeepSeek, and Qwen models, across most standard benchmarks. It's quite a claim, and more striking still, their 70B model has reportedly outperformed the recently released Llama 4 Scout 109B Mixture-of-Experts (MoE) model.
Iterated Distillation and Amplification (IDA)
At the heart of Deep Cogito's breakthrough is a new training approach they call Iterated Distillation and Amplification (IDA). This method is described as "a scalable and efficient alignment strategy for general superintelligence using iterative self-improvement." It's designed to push past the limitations of traditional LLM training, where the model's intelligence often hits a ceiling defined by larger "overseer" models or human curators.
The IDA process alternates between two key steps, repeated over many iterations:
- Amplification: The model spends additional inference-time computation, such as extended reasoning, to derive better solutions or capabilities than it produces directly.
- Distillation: The model is then trained on these amplified outputs, internalizing the improved capabilities into its parameters.
Deep Cogito argues that this creates a "positive feedback loop," allowing the model's intelligence to grow more directly with the computational resources and the efficiency of the IDA process itself, rather than being limited by an overseer's intelligence.
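Deep Cogito hasn't released training code, so the following is only a minimal sketch of what such a loop could look like, assuming best-of-n sampling as the amplification step; `generate`, `score`, and `train` are hypothetical stand-ins for a real model's sampling, answer-scoring, and fine-tuning routines.

```python
from typing import Callable, List, Tuple

def amplify(generate: Callable[[str], str],
            score: Callable[[str, str], float],
            prompt: str,
            n: int) -> str:
    """Amplification: spend extra inference-time compute (here, best-of-n
    sampling) to get a better answer than a single direct generation."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))

def ida_round(generate: Callable[[str], str],
              score: Callable[[str, str], float],
              train: Callable[[List[Tuple[str, str]]], None],
              prompts: List[str],
              n: int) -> None:
    """One amplify-then-distill cycle: collect amplified answers, then
    fine-tune the model to produce them directly."""
    targets = [(p, amplify(generate, score, p, n)) for p in prompts]
    train(targets)  # Distillation: internalize the amplified capability.

# Each pass of ida_round raises the baseline that the next pass amplifies
# from, which is the "positive feedback loop" the company describes.
```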
The company points to historical successes like AlphaGo, emphasizing that "Advanced Reasoning and Iterative Self-Improvement" were crucial. IDA, they claim, brings these elements into LLM training. They also tout the efficiency of IDA, noting that their team, though small, managed to develop these models in just about 75 days. When compared to other methods like Reinforcement Learning from Human Feedback (RLHF) or standard distillation from larger models, IDA is said to offer better scalability.
As evidence, Deep Cogito highlights that their 70B model outperforms both Llama 3.3 70B (distilled from a 405B model) and Llama 4 Scout 109B (distilled from a 2T-parameter model).
Capabilities and Performance of Deep Cogito Models
The new Cogito models, which build on Llama and Qwen checkpoints, are tailored for coding, function calling, and agentic applications. A standout feature is their dual functionality: "Each model can answer directly (standard LLM), or self-reflect before answering (like reasoning models)." This mirrors the hybrid-reasoning approach of models like Claude 3.7 Sonnet. However, Deep Cogito notes they haven't focused on very long reasoning chains, prioritizing faster answers and the efficiency of distilling shorter chains.
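In practice, switching between the two modes might look like the sketch below, which uses Hugging Face transformers. The repository id and the "deep thinking" system prompt are assumptions about the toggle mechanism, not confirmed details; check the model card for the documented convention.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; the actual Hugging Face identifier may differ.
MODEL_ID = "deepcogito/cogito-v1-preview-llama-8B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def ask(question: str, think: bool) -> str:
    messages = []
    if think:
        # Assumed convention: a system prompt switches on self-reflection.
        messages.append({"role": "system",
                         "content": "Enable deep thinking subroutine."})
    messages.append({"role": "user", "content": question})
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:],
                            skip_special_tokens=True)

fast = ask("Summarize the tradeoffs of MoE models.", think=False)
deliberate = ask("Summarize the tradeoffs of MoE models.", think=True)
```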
The company has shared extensive benchmark results, comparing the Cogito models against size-equivalent state-of-the-art open models in both direct and reasoning modes. Across benchmarks such as MMLU, MMLU-Pro, ARC, GSM8K, and MATH, and across the model sizes (3B, 8B, 14B, 32B, 70B), the Cogito models generally show significant improvements. For example, Cogito 70B scores 91.73% on MMLU in standard mode, a +6.40% improvement over Llama 3.3 70B, and 91.00% in thinking mode, a +4.40% boost over DeepSeek R1 Distill 70B. LiveBench scores reflect similar gains.
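Taking those deltas as absolute percentage points (an assumption; the article's "+6.40%" could also denote a relative gain), the implied baseline scores are easy to back out:

```python
cogito_standard, delta_vs_llama = 91.73, 6.40  # Cogito 70B, standard mode
cogito_thinking, delta_vs_r1 = 91.00, 4.40     # Cogito 70B, thinking mode

# Implied MMLU baselines, assuming absolute point differences:
print(f"{cogito_standard - delta_vs_llama:.2f}")  # 85.33 (Llama 3.3 70B)
print(f"{cogito_thinking - delta_vs_r1:.2f}")     # 86.60 (DeepSeek R1 Distill 70B)
```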
[Figure: benchmarks of the 14B models, for a medium-sized comparison.]
While Deep Cogito acknowledges that benchmarks don't fully capture real-world utility, they remain confident in the practical performance of their models. This release is considered a preview, with the company stating they are "still in the early stages of this scaling curve." They plan to release improved checkpoints for the current sizes and introduce larger MoE models (109B, 400B, 671B) in the coming weeks and months. All future models will also be open-source.
Comments (26)
PaulThomas
August 6, 2025 at 3:01:00 PM EDT
Super cool to see Deep Cogito pushing the boundaries with their LLMs! 😎 Those parameter sizes are wild—wonder how they stack up in real-world tasks?
GregoryCarter
April 20, 2025 at 11:16:16 PM EDT
Deep Cogito's LLMs are impressive, but the app could have a better UI. Navigating the different model sizes is a bit clunky. Still, the performance is top-notch, especially with the IDA technology. Definitely worth a look if you're into AI and want to see what's possible with large language models! 🤖💡
EricRoberts
April 20, 2025 at 12:40:17 AM EDT
Deep Cogito's LLMs are impressive, but the app's UI could use some improvement. Navigating between model sizes is a bit awkward. Still, the performance is excellent, especially combined with the IDA technology. If you're interested in AI, it's worth seeing what's possible with large language models! 🤖💡
WillieAnderson
April 20, 2025 at 12:09:03 AM EDT
Deep Cogito's LLMs are truly innovative! The performance gains over similarly sized models are remarkable. The IDA approach makes a big difference. The only downside is the learning curve, but once you get used to it, it's no problem! 🚀
EricKing
April 19, 2025 at 6:12:37 PM EDT
Deep Cogito's LLMs are impressive, but the app could use a better UI. It's a bit clunky to navigate through the different model sizes. Still, the performance is top-notch, especially with the IDA tech. Definitely worth a look if you're into AI and want to see what's possible with large language models! 🤖💡
BruceClark
April 19, 2025 at 2:48:03 PM EDT
Deep Cogito's LLMs are really amazing! Performance is dramatically better than models of the same size. I use the 70-billion-parameter model for my research, and it's like having a super-smart assistant. The only downside is that it's resource-hungry, but it's completely worth it! 🚀