AI Scaling Breakthrough Questioned by Experts

April 10, 2025

There's been some buzz on social media about researchers discovering a new AI "scaling law," but experts are taking it with a grain of salt. AI scaling laws, which are really informal guidelines more than laws, describe how AI models improve as you throw more data and computing power at them. Until about a year ago, the dominant trend was "pre-training" scaling: training ever-bigger models on ever-bigger datasets. Pre-training hasn't gone away, but two more scaling laws have since joined the mix: post-training scaling, which covers tweaking a model's behavior after the initial training run, and test-time scaling, which spends more computing power during inference to boost a model's "reasoning" capabilities (think models like DeepSeek's R1).

Recently, researchers from Google and UC Berkeley published a paper proposing what some folks online are calling a fourth law: "inference-time search." The method has the model generate many possible answers to a query in parallel and then pick the best one. The researchers claim it can lift the performance of an older model, like Google's Gemini 1.5 Pro, above OpenAI's o1-preview "reasoning" model on science and math benchmarks.

Eric Zhao, a Google doctoral fellow and one of the paper's co-authors, wrote on X that by just randomly sampling 200 responses and letting the model self-verify, Gemini 1.5 – which he jokingly called an "ancient early 2024 model" – could beat o1-preview and even get close to o1. He added that self-verification naturally gets easier as you scale up, which is counterintuitive but, in his view, the interesting part.

Not everyone is convinced, though. Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch that the approach works best when there's a solid, programmatic way to judge the answers – and most questions aren't that straightforward. "If we can't write code to define what we want, we can't use [inference-time] search. For something like general language interaction, we can't do this... It's generally not a great approach to actually solving most problems," he said.

Zhao pushed back, saying the paper specifically studies cases where there's no clear-cut way to judge answers and the model has to verify itself. He argued that the gap between self-verification and verification against a known answer can shrink as you scale up.

Mike Cook, a research fellow at King's College London, backed Guzdial's view, saying that inference-time search doesn't actually make the model's reasoning better. It's more of a workaround for models' tendency to make confident mistakes. He pointed out that if your model messes up 5% of the time, checking 200 attempts should make those mistakes easier to spot – assuming the errors are mostly independent, roughly 190 of the 200 samples should land on the right answer, so the wrong ones stand out as outliers.

This may be a letdown for an AI industry that's always hunting for ways to boost model "reasoning" without breaking the bank. As the paper's authors note, reasoning models can rack up thousands of dollars in computing costs just to solve one math problem. The search for new scaling techniques, it seems, is far from over.

*Updated 3/20 5:12 a.m. Pacific: Added comments from study co-author Eric Zhao, who takes issue with an assessment by an independent researcher who critiqued the work.*
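To make the idea concrete, here's a minimal sketch of inference-time search as described above: sample many candidate answers, then let a verifier pick the one it's most confident in. This is an illustration of the general technique, not the paper's actual method; `generate_answer` and `self_verify` are hypothetical placeholders standing in for real language-model calls.

```python
import random

# Hypothetical stand-in for sampling one answer from a model at temperature > 0.
def generate_answer(question: str, rng: random.Random) -> str:
    return f"candidate-{rng.randint(0, 4)}"

# Hypothetical stand-in for the model scoring its own answer ("self-verification").
def self_verify(question: str, answer: str, rng: random.Random) -> float:
    return rng.random()

def inference_time_search(question: str, n_samples: int = 200, seed: int = 0) -> str:
    """Sample n_samples candidate answers (conceptually in parallel),
    then return the candidate the verifier scores highest."""
    rng = random.Random(seed)
    candidates = [generate_answer(question, rng) for _ in range(n_samples)]
    return max(candidates, key=lambda c: self_verify(question, c, rng))

if __name__ == "__main__":
    print(inference_time_search("What is 17 * 24?"))
```

With the placeholder verifier above the pick is effectively random, but the shape of the technique is visible: generation and verification are decoupled, and all the extra compute goes into inference rather than training.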
Comments (35)
JustinJackson April 11, 2025 at 12:00:00 AM GMT

The hype around this new AI scaling law is a bit overblown, if you ask me. Experts are skeptical, and I'm not surprised. It's interesting, but I'm not ready to bet the farm on it just yet. Anyone else feeling the same?

PatrickCarter April 11, 2025 at 12:00:00 AM GMT

The talk about this new AI scaling law feels a bit overblown to me. Experts are skeptical too, and I'm not surprised. It's interesting, but I think it's still too early to buy into it completely. Does everyone else feel the same?

AnthonyPerez April 11, 2025 at 12:00:00 AM GMT

The hype around this new AI scaling law seems a bit exaggerated. The experts are skeptical, and I'm not surprised either. It's intriguing, but it's too early to bet everything on it. Does anyone else think the same?

JamesTaylor April 11, 2025 at 12:00:00 AM GMT

The hype around this new AI scaling law is a bit exaggerated, if you ask me. The experts are skeptical and I'm not surprised. It's interesting, but I'm not ready to bet everything on it yet. Anyone else feel the same?

GeorgeEvans April 11, 2025 at 12:00:00 AM GMT

The enthusiasm around this new AI scaling law is a bit exaggerated, if you ask me. The experts are skeptical and I'm not surprised. It's interesting, but I'm not ready to bet everything on this yet. Anyone else feel the same?

PaulHernández April 11, 2025 at 12:00:00 AM GMT

Heard about this new AI scaling law? Sounds cool but honestly, I'm not convinced. It feels like every other week there's a new 'breakthrough' that fizzles out. Experts seem skeptical too, so I'm just gonna wait and see. Anyone else feel the same?
