AI Scaling Breakthrough Questioned by Experts

There's been some buzz on social media about researchers discovering a new AI "scaling law," but experts are taking it with a grain of salt. AI scaling laws – informal observations more than hard rules – describe how AI models get better as you throw more data and computing power at them. Up until about a year ago, the big trend was "pre-training": training ever-bigger models on ever-bigger datasets. Pre-training is still a thing, but it has been joined by two more scaling laws: post-training scaling, which covers tweaking a model's behavior after the initial training run, and test-time scaling, which uses more computing power during inference to boost a model's "reasoning" capabilities (think models like DeepSeek's R1).
Recently, researchers from Google and UC Berkeley dropped a paper describing what some folks online are calling a fourth law: "inference-time search." The method has the model generate many candidate answers to a query in parallel and then pick the best one. The researchers claim it can juice up the performance of an older model, like Google's Gemini 1.5 Pro, enough to beat OpenAI's o1-preview "reasoning" model on science and math benchmarks.
Eric Zhao, a Google doctorate fellow and one of the paper's co-authors, shared on X that by just randomly sampling 200 responses and letting the model self-verify, Gemini 1.5 – which he jokingly called an "ancient early 2024 model" – could outdo o1-preview and even get close to o1. He added that, counterintuitively, self-verification naturally becomes easier at scale.
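To make the mechanics concrete, here's a minimal sketch of how inference-time search with self-verification could be wired up. The `generate` and `verify` helpers are hypothetical stand-ins for model calls – this illustrates the general best-of-N idea, not the paper's actual code:

```python
import random

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a model call that samples one candidate answer."""
    return f"candidate answer #{random.randint(0, 9)}"

def verify(prompt: str, candidate: str) -> float:
    """Hypothetical stand-in for a self-verification call scoring a candidate from 0 to 1."""
    return random.random()

def inference_time_search(prompt: str, n_samples: int = 200) -> str:
    # Sample many candidate answers (in practice these calls would run in parallel).
    candidates = [generate(prompt) for _ in range(n_samples)]
    # Have the model grade its own answers, then keep the highest-scoring one.
    scored = [(verify(prompt, c), c) for c in candidates]
    return max(scored)[1]

print(inference_time_search("Which is larger, 9.11 or 9.9?"))
```

The appeal is that all the extra work happens at inference time – the underlying model isn't retrained, it just gets more attempts and a chance to grade itself.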
But not everyone's convinced. Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch that this approach works best when you've got a solid way to judge the answers. Most questions aren't that straightforward, though. He said, "If we can't write code to define what we want, we can't use [inference-time] search. For something like general language interaction, we can't do this... It's generally not a great approach to actually solving most problems."
Zhao responded, saying their paper actually looks at cases where you don't have a clear way to judge the answers, and the model has to figure it out on its own. He argued that the gap between having a clear way to judge and not having one can shrink as you scale up.
Mike Cook, a research fellow at King's College London, backed up Guzdial's view, saying that inference-time search doesn't really make the model's reasoning better. It's more like a workaround for the model's tendency to make confident mistakes. He pointed out that if your model messes up 5% of the time, checking 200 attempts should make those mistakes easier to spot.
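A quick back-of-the-envelope calculation shows why. Assuming, purely for illustration (these simplifications are mine, not Cook's), that attempts are independent and each one is wrong 5% of the time:

```python
# Rough illustration of Cook's point, under simplifying assumptions
# (independent attempts, flat 5% per-attempt error rate) that are
# mine, not his.
p_error = 0.05
n_attempts = 200

p_all_wrong = p_error ** n_attempts            # chance that no attempt is correct
expected_correct = (1 - p_error) * n_attempts  # expected number of correct attempts

print(f"P(all {n_attempts} attempts wrong): {p_all_wrong:.1e}")  # effectively zero
print(f"Expected correct attempts: {expected_correct:.0f}")      # about 190 of 200
```

In other words, a correct answer is almost certainly somewhere in the pile; the hard part is picking it out, which is exactly the verification problem Guzdial describes.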
This news might be a bit of a downer for the AI industry, which is always on the hunt for ways to boost model "reasoning" without breaking the bank. As the paper's authors noted, reasoning models can rack up thousands of dollars in computing costs just to solve one math problem.
Looks like the search for new scaling techniques is far from over.
*Updated 3/20 5:12 a.m. Pacific: Added comments from study co-author Eric Zhao, who takes issue with an assessment by an independent researcher who critiqued the work.*
Comments (35)
DanielThomas
April 23, 2025 at 7:49:41 PM EDT
The AI scaling breakthrough sounds great, but the experts are skeptical. 🤔 I don't know what to believe anymore. Is it just hype? I'll keep watching, but I'm not getting my hopes up. 😴
BenRoberts
April 23, 2025 at 2:12:49 PM EDT
This AI scaling law thing sounds cool, but it's hard to get excited when experts are so skeptical. It's like they're saying, 'Sure, it's interesting, but let's not get carried away.' I guess we'll see if it's the real deal or just another hype train. 🤔
PatrickMartinez
April 21, 2025 at 3:31:56 PM EDT
This AI scaling law story sounds cool, but it's hard to get excited when the experts are so skeptical. It's like they're saying, 'Yes, it's interesting, but let's not get carried away.' We'll see if it's the real thing or just more hype. 🤔
JohnYoung
April 19, 2025 at 8:36:43 PM EDT
The news about this AI scaling law is interesting, but it's hard to get excited when the experts are skeptical. It feels like 'fun, but don't expect too much.' Guess I'll have to wait and see how it actually plays out. 🤔
HaroldMoore
April 17, 2025 at 7:24:24 AM EDT
The AI scaling breakthrough sounds interesting, but the experts are skeptical. 🤔 I don't know what to believe anymore. Maybe it's just hype? I'll keep an eye on it, but I won't hold my breath. 😴
AlbertLee
April 16, 2025 at 11:25:29 AM EDT
The AI scaling breakthrough sounds great, but I'm not buying it yet. It's all hype on social media, while the experts are skeptical. I'll wait for more solid evidence before jumping on the bandwagon. 🤔