Researchers Develop Open-Source Rival to OpenAI's o1 'Reasoning' Model for Under $50
April 21, 2025
Joseph Walker

Last Friday, a groundbreaking research paper from AI experts at Stanford and the University of Washington hit the scene, revealing that they managed to develop an AI "reasoning" model, dubbed s1, for under $50 in cloud compute credits. This revelation is shaking up the AI world, as s1 holds its own against top-tier models like OpenAI's o1 and DeepSeek's R1 when it comes to tackling math and coding challenges.
The s1 model, along with all the juicy details of its training data and code, is now up for grabs on GitHub. The team kicked things off with a run-of-the-mill base model and then put it through the wringer with a technique called distillation. This process involves squeezing out the "reasoning" juice from another AI model by training on its responses. In this case, s1 got its smarts from Google's Gemini 2.0 Flash Thinking Experimental model. It's a similar tactic to what Berkeley researchers used to whip up their own AI reasoning model for around $450 just last month.
For some, the idea that a small team of researchers can still make waves in the AI field without a massive budget is thrilling. But s1's emergence also sparks some serious questions about the future of AI model development. If a model that rivals those built with millions can be replicated on a shoestring budget, what's to stop everyone from doing the same?
Not surprisingly, the big players in AI aren't thrilled. OpenAI, for instance, has pointed fingers at DeepSeek, accusing it of harvesting data from OpenAI's API to fuel model distillation. Meanwhile, the s1 team was focused on finding the most straightforward way to achieve solid reasoning performance and something called "test-time scaling," where an AI model gets more time to think before answering. These are the same innovations that OpenAI's o1 model brought to the table, which others like DeepSeek have tried to mimic with their own methods.
The s1 paper suggests that you can distill reasoning models with a relatively small dataset using a technique known as supervised fine-tuning (SFT). This involves training the AI model to copy specific behaviors from a dataset, and it's cheaper than the large-scale reinforcement learning that DeepSeek used for their R1 model, which competes with OpenAI's o1.
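To make the SFT-distillation idea concrete, here's a minimal sketch of how a teacher model's output might be packed into a supervised training example. The field names, prompt wording, and `<think>` tags below are illustrative assumptions, not the s1 repo's actual data format:

```python
# Sketch: turning a teacher model's outputs into SFT examples.
# The student is trained to reproduce the teacher's reasoning trace
# and final answer, token for token, via ordinary supervised learning.

def build_sft_example(question: str, reasoning: str, answer: str) -> dict:
    """Pack one (question, teacher reasoning, teacher answer) triple
    into a single prompt/target pair for supervised fine-tuning."""
    prompt = f"Question: {question}\nThink step by step.\n"
    # The target includes the teacher's chain of thought followed by
    # its final answer; the student learns to imitate both.
    target = f"<think>{reasoning}</think>\nAnswer: {answer}"
    return {"prompt": prompt, "target": target}

example = build_sft_example(
    question="What is 12 * 13?",
    reasoning="12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.",
    answer="156",
)
```

A thousand such pairs is a tiny dataset by modern standards, which is why the fine-tuning run stays so cheap compared with large-scale reinforcement learning.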
Google makes Gemini 2.0 Flash Thinking Experimental available for free through its Google AI Studio platform, though with daily limits. But there's a catch—Google's terms don't allow reverse-engineering its models to create competing services. We're waiting to hear back from Google on this.
The s1 model itself started life as a modest, off-the-shelf AI model from Alibaba's Qwen lab, which anyone can download for free. To train s1, the researchers put together a dataset of just 1,000 carefully chosen questions, along with answers and the "thinking" process behind each one, courtesy of Google's Gemini 2.0. The whole training process took less than 30 minutes on 16 Nvidia H100 GPUs. According to Niklas Muennighoff, a Stanford researcher involved in the project, you could pull this off today for about $20 in compute costs.
The researchers also pulled a clever move to make s1 double-check its work and extend its "thinking" time—they simply told it to "wait." Adding this word during s1's reasoning process helped it come up with slightly more accurate answers, according to the paper.
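That "wait" trick can be sketched as a decode-time loop: whenever the model tries to close its thinking phase, the stop marker is swapped for "Wait" and generation continues, up to a fixed budget. The stub generator and `</think>` marker below are illustrative assumptions, not the paper's actual implementation:

```python
# Sketch of "budget forcing": suppress the end-of-thinking marker a fixed
# number of times by appending "Wait," which nudges the model to keep
# reasoning and often to re-check its earlier steps.

END_OF_THINKING = "</think>"

def generate_with_wait(model_step, prompt: str, max_waits: int = 2) -> str:
    """model_step(text) returns the model's next chunk of output.
    Each time a chunk would end the thinking phase, replace the marker
    with 'Wait,' until the wait budget is exhausted."""
    text = prompt
    waits_used = 0
    while True:
        chunk = model_step(text)
        if END_OF_THINKING in chunk and waits_used < max_waits:
            # Force more thinking instead of letting the model stop.
            chunk = chunk.replace(END_OF_THINKING, " Wait,")
            waits_used += 1
            text += chunk
            continue
        text += chunk
        if END_OF_THINKING in chunk:
            return text

# Stand-in "model" that tries to stop thinking on every step.
def stub_model(_text: str) -> str:
    return " some reasoning" + END_OF_THINKING

out = generate_with_wait(stub_model, "Q: 2 + 2?")
```

With a budget of 2, the stub gets interrupted twice before it's finally allowed to close its reasoning, which is the whole point: more thinking time at inference, no retraining required.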
Through 2025, tech giants like Meta, Google, and Microsoft are set to pour hundreds of billions of dollars into AI infrastructure, much of which will go toward training the next wave of AI models. While distillation proves to be an effective way to recreate AI capabilities on the cheap, it's not going to lead to the creation of brand-new, groundbreaking AI models anytime soon.