Deep Cogito's LLMs Outperform Similar-Sized Models Using IDA

Home

News

April 18, 2025

JoseAdams

190

# ai # models # llm

Deep Cogito, a San Francisco-based company, is making waves in the AI community with its latest release of open large language models (LLMs). These models, which come in various sizes ranging from 3 billion to 70 billion parameters, are not just another set of AI tools; they're a bold step towards what the company calls "general superintelligence." Deep Cogito claims that each of their models outperforms the leading open models of similar sizes, including those from LLAMA, DeepSeek, and Qwen, across most standard benchmarks. It's quite a claim, but what's even more impressive is that their 70B model has reportedly outshone the recently released Llama 4 109B Mixture-of-Experts (MoE) model.

Iterated Distillation and Amplification (IDA)

At the heart of Deep Cogito's breakthrough is a new training approach they call Iterated Distillation and Amplification (IDA). This method is described as "a scalable and efficient alignment strategy for general superintelligence using iterative self-improvement." It's designed to push past the limitations of traditional LLM training, where the model's intelligence often hits a ceiling defined by larger "overseer" models or human curators.

The IDA process revolves around two key steps that are repeated over and over:

Amplification: This step uses more computational power to help the model come up with better solutions or capabilities, much like advanced reasoning techniques.
Distillation: Here, the model internalizes these improved capabilities, refining its parameters.

Deep Cogito argues that this creates a "positive feedback loop," allowing the model's intelligence to grow more directly with the computational resources and the efficiency of the IDA process itself, rather than being limited by an overseer's intelligence.

The company points to historical successes like AlphaGo, emphasizing that "Advanced Reasoning and Iterative Self-Improvement" were crucial. IDA, they claim, brings these elements into LLM training. They also tout the efficiency of IDA, noting that their team, though small, managed to develop these models in just about 75 days. When compared to other methods like Reinforcement Learning from Human Feedback (RLHF) or standard distillation from larger models, IDA is said to offer better scalability.

As proof, Deep Cogito highlights how their 70B model outperforms both Llama 3.3 70B (distilled from a 405B model) and Llama 4 Scout 109B (distilled from a 2T parameter model).

Capabilities and Performance of Deep Cogito Models

The new Cogito models, which build upon Llama and Qwen checkpoints, are tailored for coding, function calling, and agentic applications. A standout feature is their dual functionality: "Each model can answer directly (standard LLM), or self-reflect before answering (like reasoning models)." This mirrors capabilities seen in models like Claude 3.5. However, Deep Cogito mentions they haven't focused on very long reasoning chains, prioritizing faster answers and the efficiency of distilling shorter chains.

The company has shared extensive benchmark results, comparing their Cogito models against size-equivalent state-of-the-art open models in both direct and reasoning modes. Across a range of benchmarks like MMLU, MMLU-Pro, ARC, GSM8K, and MATH, and across different model sizes (3B, 8B, 14B, 32B, 70B), the Cogito models generally show significant performance improvements. For example, the Cogito 70B model scores 91.73% on MMLU in standard mode, a +6.40% improvement over Llama 3.3 70B, and 91.00% in thinking mode, a +4.40% boost over Deepseek R1 Distill 70B. Livebench scores also reflect these gains.

Here are benchmarks of 14B models for a medium-sized comparison:

Benchmarks of 14B models

While Deep Cogito acknowledges that benchmarks don't fully capture real-world utility, they remain confident in the practical performance of their models. This release is considered a preview, with the company stating they are "still in the early stages of this scaling curve." They plan to release improved checkpoints for the current sizes and introduce larger MoE models (109B, 400B, 671B) in the coming weeks and months. All future models will also be open-source.

YouTube Integrates Veo 3 AI Video Tool Directly Into Shorts Platform YouTube Shorts to Feature Veo 3 AI Video Model This SummerYouTube CEO Neal Mohan revealed during his Cannes Lions keynote that the platform's cutting-edge Veo 3 AI video generation technology will debut on YouTube Shorts later this summer. This follo

Google Cloud Powers Breakthroughs in Scientific Research and Discovery The digital revolution is transforming scientific methodologies through unprecedented computational capabilities. Cutting-edge technologies now augment both theoretical frameworks and laboratory experiments, propelling breakthroughs across discipline

Elon Musk's Grok AI Seeks Owner's Input Before Tackling Complex Queries The recently released Grok AI—promoted by Elon Musk as a "maximally truth-seeking" system—has drawn attention for its tendency to consult Musk's public statements before responding to politically sensitive topics. Observers note that when addressing

Comments (27)

0/200

Submit

RoyWhite

August 13, 2025 at 5:00:59 AM EDT

Deep Cogito's LLMs sound like a game-changer! Outperforming models of similar size with IDA is no small feat. Curious to see how these stack up in real-world tasks. 🚀

PaulThomas

August 6, 2025 at 3:01:00 PM EDT

Super cool to see Deep Cogito pushing the boundaries with their LLMs! 😎 Those parameter sizes are wild—wonder how they stack up in real-world tasks?

GregoryCarter

April 20, 2025 at 11:16:16 PM EDT

LLM от Deep Cogito впечатляют, но приложение могло бы иметь лучший UI. Навигация по разным размерам моделей немного неуклюжая. Тем не менее, производительность на высшем уровне, особенно с технологией IDA. Обязательно стоит посмотреть, если вы интересуетесь ИИ и хотите увидеть, что возможно с большими языковыми моделями! 🤖💡

EricRoberts

April 20, 2025 at 12:40:17 AM EDT

ディープコギトのLLMは印象的ですが、アプリのUIがもう少し改善されると良いですね。モデルサイズをナビゲートするのが少しぎこちないです。それでも、パフォーマンスは最高で、特にIDAテクノロジーとの組み合わせが素晴らしいです。AIに興味があるなら、大規模言語モデルの可能性を見る価値がありますよ！🤖💡

WillieAnderson

April 20, 2025 at 12:09:03 AM EDT

딥 코기토의 LLM은 정말 혁신적이에요! 비슷한 크기의 모델과 비교해도 성능 향상이 놀랍습니다. IDA 접근법이 큰 차이를 만듭니다. 유일한 단점은 학습 곡선인데, 한번 익숙해지면 문제없어요! 🚀

EricKing

April 19, 2025 at 6:12:37 PM EDT

Deep Cogito's LLMs are impressive, but the app could use a better UI. It's a bit clunky to navigate through the different model sizes. Still, the performance is top-notch, especially with the IDA tech. Definitely worth a look if you're into AI and want to see what's possible with large language models! 🤖💡