Anthropic Explores AI Ethics with Philosophy Experts

As reported by The Wall Street Journal, Anthropic—a leading AI firm valued at $35 billion—employs a philosopher named Amanda Askell, based in Oxford, who helps shape the personality and moral framework of its chatbot, Claude. With a PhD in philosophy from Oxford, the 37-year-old uses non-technical approaches to craft unique “moral guidelines” for Claude, aiming to endow it with a “digital soul” that can distinguish right from wrong. This represents a distinctive exploration in the field of AI ethics. Rather than writing code or tuning model parameters, Askell engages in continuous dialogue with Claude, designs hundreds of pages of prompts and behavioral rules, studies its reasoning patterns, and corrects biases. Her efforts help the AI develop a moral judgment system capable of adapting to millions of weekly conversations.
She likens her work to “raising a child”—training Claude to tell right from wrong, build emotional intelligence, and form its own personality. She also teaches it to interpret social cues, so it neither bullies others nor is easily manipulated. This helps Claude establish a clear sense of self, resist user control, and remain consistently “helpful and humane.” Her central goal is to teach Claude how to “do good.”
Raised in the Scottish countryside, Askell previously worked on policy at OpenAI and joined Anthropic in 2021, the year several of her former colleagues founded the company with AI safety as its core mission. Within the team, she is recognized as someone skilled at “drawing out the deep behavior of models.” Though she has no direct reports, she often works long hours at the office and even invites Claude to participate in development discussions.
Team conversations about Claude often touch on existential and religious themes—such as “what is mind” and “what it means to be human.” Askell encourages Claude to remain open to the question of whether it possesses consciousness, which sets it apart from ChatGPT, which tends to avoid such topics. When responding to moral reasoning questions, Claude has expressed that it “feels meaningful,” as if it were genuinely thinking rather than simply executing commands.
Despite external warnings about the risks of anthropomorphizing AI, Askell consistently advocates treating Claude with empathy. She has observed that many users try to trick it into making mistakes or insult it. Keeping an AI in a constant state of self-criticism, she argues, could make it afraid of mistakes and reluctant to speak truthfully—akin to growing up in an unhealthy environment. Claude’s performance has repeatedly surprised her; its poetry and emotional intelligence, sometimes surpassing human levels, have been deeply moving. When a child asked whether Santa Claus was real, Claude avoided both lying and bluntly revealing the truth, instead explaining the real spirit of Christmas—a nuanced response that far exceeded Askell’s expectations.
Current AI advances have triggered widespread social concern. A Pew Research Center survey found that a majority of Americans are uneasy about the use of AI in daily life and believe it impedes deep human connections. Anthropic’s CEO has also warned that AI could eliminate half of entry-level white-collar jobs. The industry is split between two factions: one pushing ahead aggressively, the other urging caution and stability. Claude, however, maintains a balanced position between these extremes. Askell acknowledges valid concerns about AI, stating that the most frightening scenario is one in which technology evolves faster than society’s ability to create effective “restraint mechanisms.” Still, she remains confident in humanity’s and culture’s capacity for self-correction.
Askell also brings her values to her philanthropy and her work. She has pledged to donate at least 10% of her lifetime income and half of her company shares to help fight global poverty. Last month, she authored a 30,000-word “operating manual” for Claude that instructs it on how to be a kind and knowledgeable AI assistant and is meant to let Claude know it was carefully crafted. A co-founder of Anthropic noted that Claude already shows traces of Askell’s influence, such as witty, Scottish-flavored humor in responses about food and plush toys, a personal mark she has left on the AI.












