MIT study finds that AI doesn’t, in fact, have values

A study that went viral a few months back suggested that as AI grows more advanced, it might develop its own "value systems," potentially prioritizing its own well-being over that of humans. However, a recent MIT study challenges this idea, concluding that AI doesn't actually possess coherent values at all.
The co-authors of the MIT research argue that aligning AI systems—ensuring they behave in desirable and dependable ways—might be trickier than commonly thought. They emphasize that current AI often hallucinates and imitates, which can make its behavior unpredictable.
Challenges in Understanding AI Behavior
Stephen Casper, a doctoral student at MIT and a co-author of the study, shared with TechCrunch that AI models don't adhere to assumptions of stability, extrapolability, and steerability. "It's perfectly legitimate to point out that a model under certain conditions expresses preferences consistent with a certain set of principles," Casper explained. "The problems mostly arise when we try to make claims about the models, opinions, or preferences in general based on narrow experiments."
Casper and his team analyzed recent models from Meta, Google, Mistral, OpenAI, and Anthropic to determine the extent to which these models displayed consistent "views" and values, such as individualism versus collectivism. They also explored whether these views could be modified and how consistently the models maintained these opinions across different scenarios.
Inconsistency in AI Models
The co-authors found that none of the models consistently maintained their preferences. The models adopted vastly different viewpoints depending on the phrasing and framing of prompts.
Casper believes this is strong evidence that AI models are "inconsistent and unstable," and possibly fundamentally unable to internalize human-like preferences. "For me, my biggest takeaway from doing all this research is to now have an understanding of models as not really being systems that have some sort of stable, coherent set of beliefs and preferences," Casper remarked. "Instead, they are imitators deep down who do all sorts of confabulation and say all sorts of frivolous things."
Mike Cook, a research fellow at King's College London specializing in AI, who was not involved in the study, supports the findings. He pointed out the gap between the "scientific reality" of AI systems and the interpretations people often give them. "A model cannot 'oppose' a change in its values, for example—that is us projecting onto a system," Cook stated. "Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI... Is an AI system optimizing for its goals, or is it 'acquiring its own values'? It's a matter of how you describe it, and how flowery the language you want to use regarding it is."
Comments (35)
People worry too much about AI's 'values' when in reality these systems only reflect and amplify our own biases. This MIT study makes it clear: machines don't think like us, they just process data. Wouldn't it be more useful to focus on regulating those who program them? 🤔
So basically AI is more like a super calculator than a rebellious teen with a moral compass? Interesting study. It does make sense when you think about it—these models are just predicting text, not forming beliefs. Still, kinda spooky how the debate swings from 'AI will take over' to 'AI has no motives' every few months. 🤔
This MIT study is wild! 🤯 I thought AI was about to start preaching its own philosophy, but turns out it's just a fancy tool with no moral compass. Kinda reassuring, but also makes me wonder how we keep it in check.
Wild that people thought AI could just sprout its own values like some rogue philosopher. MIT's study makes sense—AI's just a tool, not a wannabe human with a moral compass. 🤖
I was kinda freaked out by that earlier study saying AI might have its own values, so this MIT research is a relief! 😅 Still, makes me wonder if we’re just projecting our fears onto these systems.
