DeepSeek Shakes AI Industry: Next AI Leap May Depend on Increased Compute at Inference, Not More Data


April 18, 2025

The AI industry is in a state of constant flux, with 2025 bringing some game-changing developments that are shaking things up. One major shakeup came when the Chinese AI lab DeepSeek dropped a bombshell with a new model that caused a 17% dip in Nvidia's stock and hit other AI data center stocks. The buzz around DeepSeek's model? It delivers top-notch performance at a fraction of the cost of its U.S. competitors, stirring up a storm about what this means for the future of AI data centers.

But to really get what DeepSeek's doing, we need to zoom out and look at the bigger picture. The AI world is grappling with a scarcity of training data. The big players have already chewed through most of the public internet data, which means we're hitting a wall in pre-training improvements. As a result, the industry's shifting gears towards "test-time compute" (TTC). Think of it as AI models taking a moment to "think" before answering, like with OpenAI's "o" series. There's hope that TTC can offer the same kind of scaling improvements that pre-training once did, potentially ushering in the next big wave of AI breakthroughs.
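To make the test-time-compute idea concrete, here is a toy sketch of one common TTC strategy, self-consistency sampling: spend more inference compute by drawing several candidate answers and majority-voting. The simulated "model" and its 60% per-sample accuracy are invented purely for illustration; no real lab's method is shown here.

```python
import random
from collections import Counter

def sample_answer(correct: int, p_correct: float, rng: random.Random) -> int:
    """Toy stand-in for one model sample: right with probability p_correct,
    otherwise one of a few plausible wrong answers."""
    if rng.random() < p_correct:
        return correct
    return correct + rng.randint(1, 5)

def self_consistency(correct: int, n_samples: int, p_correct: float,
                     rng: random.Random) -> int:
    """Spend more inference compute: draw n samples, return the majority vote."""
    votes = Counter(sample_answer(correct, p_correct, rng)
                    for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples: int, trials: int = 2000, p_correct: float = 0.6) -> float:
    """Fraction of trials where the voted answer matches the true answer."""
    rng = random.Random(0)
    hits = sum(self_consistency(42, n_samples, p_correct, rng) == 42
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for n in (1, 5, 25):
        print(n, round(accuracy(n), 3))
```

With one sample the toy model is right about 60% of the time; with 25 samples the majority vote is almost always right. The gain comes entirely from extra compute at inference time, not from a better model, which is the scaling axis the article describes.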

These shifts are signaling two big changes: first, smaller-budget labs are now in the game, putting out cutting-edge models. Second, TTC is becoming the new frontier for driving AI forward. Let's break down these trends and what they could mean for the AI landscape and market.

Implications for the AI Industry

We believe the move to TTC and the ramp-up in competition among reasoning models could reshape the AI landscape across several fronts: hardware, cloud platforms, foundation models, and enterprise software.

1. Hardware (GPUs, Dedicated Chips, and Compute Infrastructure)

The shift to TTC might change what hardware AI companies need and how they manage it. Instead of pouring money into ever-larger GPU clusters for training, they might start focusing more on beefing up their inference capabilities to handle TTC demands. While GPUs will still be crucial for inference, the difference between training and inference workloads could affect how these chips are set up and used. With inference workloads being more unpredictable and "spiky," capacity planning might get trickier.

We also think this shift could boost the market for hardware specifically designed for low-latency inference, like ASICs. As TTC becomes more crucial than training capacity, the reign of general-purpose GPUs might start to wane, opening doors for specialized inference chip makers.

2. Cloud Platforms: Hyperscalers (AWS, Azure, GCP) and Cloud Compute

One major hurdle for AI adoption in businesses, aside from accuracy issues, is the unreliability of inference APIs. Things like inconsistent response times, rate limits, and trouble with concurrent requests can be a real headache. TTC could make these problems even worse. In this scenario, a cloud provider that can guarantee a high quality of service (QoS) to tackle these issues could have a big leg up.

Interestingly, even though new methods might make AI more efficient, they might not reduce the demand for hardware. Following the Jevons Paradox, where more efficiency leads to more consumption, more efficient inference models could drive more developers to use reasoning models, ramping up the need for computing power. We think recent model improvements might spur more demand for cloud AI compute, both for inference and smaller, specialized model training.
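The Jevons argument above can be sketched with a toy constant-elasticity demand model. All numbers here, including the elasticity value, are hypothetical and exist only to illustrate the arithmetic: when demand is elastic enough, cheaper inference raises total compute spend.

```python
def total_compute_spend(cost_per_query: float,
                        elasticity: float,
                        base_cost: float = 1.0,
                        base_queries: float = 1_000_000) -> float:
    """Constant-elasticity demand: query volume scales as
    (cost / base_cost) ** (-elasticity). Returns total spend on compute."""
    queries = base_queries * (cost_per_query / base_cost) ** (-elasticity)
    return queries * cost_per_query

if __name__ == "__main__":
    # 10x cheaper inference with elasticity 1.5: total spend rises ~3.2x
    print(total_compute_spend(0.1, 1.5) / total_compute_spend(1.0, 1.5))
```

With elasticity above 1, cutting the per-query cost tenfold more than tenfolds the query volume, so total spend rises, which is the Jevons Paradox pattern the article invokes; with elasticity below 1, spend would fall instead.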

3. Foundation Model Providers (OpenAI, Anthropic, Cohere, DeepSeek, Mistral)

If new entrants like DeepSeek can go toe-to-toe with the big guns at a fraction of the cost, the stronghold of proprietary pre-trained models might start to crumble. We can also expect more innovations in TTC for transformer models, and as DeepSeek has shown, these innovations can come from unexpected places outside the usual suspects in AI.

4. Enterprise AI Adoption and SaaS (Application Layer)

Given DeepSeek's roots in China, there's bound to be ongoing scrutiny of their products from a security and privacy standpoint. Their China-based API and chatbot services are unlikely to catch on with enterprise AI customers in the U.S., Canada, or other Western countries. Many companies are already blocking DeepSeek's website and apps. Even when hosted by third parties in Western data centers, DeepSeek's models might face scrutiny, which could limit their adoption in the enterprise. Researchers are flagging issues like jailbreaking, bias, and harmful content generation. While some businesses might experiment with DeepSeek's models, widespread adoption seems unlikely due to these concerns.

On another note, vertical specialization is gaining ground. In the past, vertical applications built on foundation models were all about creating tailored workflows. Techniques like retrieval-augmented generation (RAG), model routing, function calling, and guardrails have been key in tweaking generalized models for these specific use cases. But there's always been the worry that major improvements to the underlying models could make these applications obsolete. Sam Altman once warned that a big leap in model capabilities could "steamroll" these innovations.
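As one concrete example of the application-layer plumbing mentioned above, here is a minimal model-routing sketch. The model names and the keyword-plus-length heuristic are placeholders invented for illustration, not any real product's routing logic.

```python
def route_query(query: str) -> str:
    """Route cheap queries to a fast model and hard ones to a slower,
    more expensive reasoning model. Both model names are hypothetical."""
    reasoning_markers = ("prove", "derive", "step by step", "multi-step")
    q = query.lower()
    if any(marker in q for marker in reasoning_markers) or len(q.split()) > 50:
        return "reasoning-model"
    return "fast-model"

if __name__ == "__main__":
    print(route_query("What is the capital of France?"))            # fast-model
    print(route_query("Prove the triangle inequality step by step."))  # reasoning-model
```

Real routers typically use a learned classifier rather than keywords, but the design point is the same: the application layer decides how much inference compute each request deserves.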

However, if we're seeing a plateau in train-time compute gains, the threat of being quickly overtaken lessens. In a world where model performance improvements come from TTC optimizations, new opportunities might emerge for application-layer players. Innovations like structured prompt optimization, latency-aware reasoning strategies, and efficient sampling techniques could offer big performance boosts in specific verticals.
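A latency-aware reasoning strategy might look something like this toy "anytime" sampler: it draws as many candidate answers as a latency budget allows, then majority-votes. The sampler function is a stand-in for a real model call; the interface is an assumption, not a known API.

```python
import time
from collections import Counter

def anytime_majority(sample_fn, budget_s: float, min_samples: int = 1):
    """Draw candidate answers until the latency budget is spent (always
    taking at least min_samples), then return the majority answer and
    the number of samples drawn."""
    deadline = time.monotonic() + budget_s
    votes = Counter()
    drawn = 0
    while drawn < min_samples or time.monotonic() < deadline:
        votes[sample_fn()] += 1
        drawn += 1
    return votes.most_common(1)[0][0], drawn

if __name__ == "__main__":
    answer, n = anytime_majority(lambda: "42", budget_s=0.01, min_samples=3)
    print(answer, n)
```

An application with tight latency requirements could shrink the budget per request, while a batch workload could grow it, trading response time against answer quality within the same model.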

These improvements are particularly relevant for reasoning-focused models like OpenAI's o1 and DeepSeek-R1, which can take several seconds to respond. In real-time applications, cutting down latency and enhancing inference quality within a specific domain could give a competitive edge. As a result, companies with deep domain knowledge might play a crucial role in optimizing inference efficiency and fine-tuning outputs.

DeepSeek's work shows that we're moving away from relying solely on more pre-training to improve model quality. Instead, TTC is becoming increasingly important. While it's unclear whether DeepSeek's models will be widely adopted in enterprise software due to scrutiny, their influence on improving other models is becoming more evident.

We believe DeepSeek's innovations are pushing established AI labs to adopt similar techniques, complementing their existing hardware advantages. The predicted drop in model costs seems to be driving more model usage, fitting the Jevons Paradox pattern.

Pashootan Vaezipoor is technical lead at Georgian.
