MIT study finds that AI doesn’t, in fact, have values

A study that went viral a few months back suggested that as AI grows more advanced, it might develop its own "value systems," potentially prioritizing its own well-being over humans. However, a recent MIT study challenges this idea, concluding that AI doesn't actually possess coherent values at all.
The co-authors of the MIT research argue that aligning AI systems—ensuring they behave in desirable and dependable ways—might be trickier than commonly thought. They emphasize that current AI often hallucinates and imitates, which can make its behavior unpredictable.
Challenges in Understanding AI Behavior
Stephen Casper, a doctoral student at MIT and a co-author of the study, shared with TechCrunch that AI models don't adhere to assumptions of stability, extrapolability, and steerability. "It's perfectly legitimate to point out that a model under certain conditions expresses preferences consistent with a certain set of principles," Casper explained. "The problems mostly arise when we try to make claims about the models, opinions, or preferences in general based on narrow experiments."
Casper and his team analyzed recent models from Meta, Google, Mistral, OpenAI, and Anthropic to determine the extent to which these models displayed consistent "views" and values, such as individualism versus collectivism. They also explored whether these views could be modified and how consistently the models maintained these opinions across different scenarios.
Inconsistency in AI Models
The co-authors found that none of the models consistently maintained their preferences. The models adopted vastly different viewpoints depending on the phrasing and framing of prompts.
Casper believes this is strong evidence that AI models are "inconsistent and unstable," and possibly fundamentally unable to internalize human-like preferences. "For me, my biggest takeaway from doing all this research is to now have an understanding of models as not really being systems that have some sort of stable, coherent set of beliefs and preferences," Casper remarked. "Instead, they are imitators deep down who do all sorts of confabulation and say all sorts of frivolous things."
Mike Cook, a research fellow at King's College London specializing in AI, who was not involved in the study, supports the findings. He pointed out the gap between the "scientific reality" of AI systems and the interpretations people often give them. "A model cannot 'oppose' a change in its values, for example—that is us projecting onto a system," Cook stated. "Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI... Is an AI system optimizing for its goals, or is it 'acquiring its own values'? It's a matter of how you describe it, and how flowery the language you want to use regarding it is."
Related article
Fundamental Research Labs Secures $33M to Advance AI Agent Development
AI research firm Fundamental Research Labs, previously Altera, announced a $33 million Series A funding round today, led by Prosus and joined by Stripe co-founder and CEO Patrick Collison.The company
AI Data Centers May Cost $200B by 2030, Strain Power Grids
AI training and operation data centers could soon house millions of chips, cost hundreds of billions, and demand power equivalent to a major city’s grid if trends persist.A new study from Georgetown,
Study Reveals Concise AI Responses May Increase Hallucinations
Instructing AI chatbots to provide brief answers may lead to more frequent hallucinations, a new study suggests.A recent study by Giskard, a Paris-based AI evaluation firm, explored how prompt phrasin
Comments (33)
0/200
DennisAllen
August 26, 2025 at 5:01:20 PM EDT
This MIT study is wild! 🤯 I thought AI was about to start preaching its own philosophy, but turns out it's just a fancy tool with no moral compass. Kinda reassuring, but also makes me wonder how we keep it in check.
0
AnthonyMartinez
August 18, 2025 at 1:00:59 PM EDT
Wild that people thought AI could just sprout its own values like some rogue philosopher. MIT's study makes sense—AI's just a tool, not a wannabe human with a moral compass. 🤖
0
TimothyMartínez
July 21, 2025 at 9:25:03 PM EDT
I was kinda freaked out by that earlier study saying AI might have its own values, so this MIT research is a relief! 😅 Still, makes me wonder if we’re just projecting our fears onto these systems.
0
BruceClark
April 25, 2025 at 6:05:15 AM EDT
MITのAI価値に関する研究は目から鱗でした!AIが独自の価値観を持つかもしれないと思っていましたが、今はそれがただの誇張だとわかりました。それでも、AIが一貫した価値観を持っていないと思うと少し不安になります。未来について考えさせられますね、🤔
0
ScottKing
April 23, 2025 at 2:31:27 PM EDT
MITの研究によると、AIが独自の価値観を持つことはないらしいですね。これは安心ですが、AIがどんな価値観を持つか見てみたかったです!🤖📚
0
RalphHill
April 22, 2025 at 2:29:50 AM EDT
O estudo do MIT me tranquilizou sobre a IA desenvolver seus próprios valores. É reconfortante saber que a IA não tem sua própria agenda, mas também é um pouco decepcionante porque seria legal ver que tipo de valores a IA poderia desenvolver! 🤖📚
0
A study that went viral a few months back suggested that as AI grows more advanced, it might develop its own "value systems," potentially prioritizing its own well-being over humans. However, a recent MIT study challenges this idea, concluding that AI doesn't actually possess coherent values at all.
The co-authors of the MIT research argue that aligning AI systems—ensuring they behave in desirable and dependable ways—might be trickier than commonly thought. They emphasize that current AI often hallucinates and imitates, which can make its behavior unpredictable.
Challenges in Understanding AI Behavior
Stephen Casper, a doctoral student at MIT and a co-author of the study, shared with TechCrunch that AI models don't adhere to assumptions of stability, extrapolability, and steerability. "It's perfectly legitimate to point out that a model under certain conditions expresses preferences consistent with a certain set of principles," Casper explained. "The problems mostly arise when we try to make claims about the models, opinions, or preferences in general based on narrow experiments."
Casper and his team analyzed recent models from Meta, Google, Mistral, OpenAI, and Anthropic to determine the extent to which these models displayed consistent "views" and values, such as individualism versus collectivism. They also explored whether these views could be modified and how consistently the models maintained these opinions across different scenarios.
Inconsistency in AI Models
The co-authors found that none of the models consistently maintained their preferences. The models adopted vastly different viewpoints depending on the phrasing and framing of prompts.
Casper believes this is strong evidence that AI models are "inconsistent and unstable," and possibly fundamentally unable to internalize human-like preferences. "For me, my biggest takeaway from doing all this research is to now have an understanding of models as not really being systems that have some sort of stable, coherent set of beliefs and preferences," Casper remarked. "Instead, they are imitators deep down who do all sorts of confabulation and say all sorts of frivolous things."
Mike Cook, a research fellow at King's College London specializing in AI, who was not involved in the study, supports the findings. He pointed out the gap between the "scientific reality" of AI systems and the interpretations people often give them. "A model cannot 'oppose' a change in its values, for example—that is us projecting onto a system," Cook stated. "Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI... Is an AI system optimizing for its goals, or is it 'acquiring its own values'? It's a matter of how you describe it, and how flowery the language you want to use regarding it is."




This MIT study is wild! 🤯 I thought AI was about to start preaching its own philosophy, but turns out it's just a fancy tool with no moral compass. Kinda reassuring, but also makes me wonder how we keep it in check.




Wild that people thought AI could just sprout its own values like some rogue philosopher. MIT's study makes sense—AI's just a tool, not a wannabe human with a moral compass. 🤖




I was kinda freaked out by that earlier study saying AI might have its own values, so this MIT research is a relief! 😅 Still, makes me wonder if we’re just projecting our fears onto these systems.




MITのAI価値に関する研究は目から鱗でした!AIが独自の価値観を持つかもしれないと思っていましたが、今はそれがただの誇張だとわかりました。それでも、AIが一貫した価値観を持っていないと思うと少し不安になります。未来について考えさせられますね、🤔




MITの研究によると、AIが独自の価値観を持つことはないらしいですね。これは安心ですが、AIがどんな価値観を持つか見てみたかったです!🤖📚




O estudo do MIT me tranquilizou sobre a IA desenvolver seus próprios valores. É reconfortante saber que a IA não tem sua própria agenda, mas também é um pouco decepcionante porque seria legal ver que tipo de valores a IA poderia desenvolver! 🤖📚












