Anthropic introduces feature for its Claude models to terminate abusive chats

Anthropic has introduced new functionality enabling select advanced models to terminate conversations in what the company terms "rare, extreme instances of persistently harmful or abusive user interactions." Notably, Anthropic states this measure is implemented not to safeguard human users, but to protect the AI model itself.
To clarify, the company isn't asserting that its Claude AI models possess sentience or can experience harm from user conversations. As Anthropic explains, the company remains "highly uncertain about the potential moral status of Claude and other large language models, either currently or in the future."
Nevertheless, the announcement references a recently established program examining "model welfare," indicating Anthropic is adopting a precautionary approach by "working to identify and implement low-cost interventions to mitigate risks to model welfare, should such welfare become relevant."
This new capability is currently restricted to Claude Opus 4 and 4.1 models, designed specifically for "extreme edge cases" such as "requests for sexual content involving minors or attempts to obtain information enabling large-scale violence or terrorist activities."
While such requests could generate legal or public relations challenges for Anthropic (as seen in recent reports about ChatGPT potentially reinforcing users' delusional thinking), the company reports that during pre-deployment testing, Claude Opus 4 demonstrated a "strong preference against" complying with these requests and displayed "patterns suggesting distress" when forced to respond.
Regarding these new conversation-ending capabilities, Anthropic clarifies that "Claude is instructed to employ this function only as a last resort after multiple redirection attempts have failed and productive dialogue appears impossible, or when users explicitly request to end a chat."
Anthropic further specifies that Claude has been "directed not to utilize this capability in situations where users might face imminent risk of self-harm or harming others."
Techcrunch event Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital, Elad Gil — just a few of the industry leaders joining the Disrupt 2025 agenda. They'll share crucial insights to accelerate startup growth and sharpen your competitive advantage. Don't miss TechCrunch Disrupt's 20th anniversary edition — secure your ticket now and save over $600 before prices increase.
Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital — among the prominent innovators joining the Disrupt 2025 agenda. They're here to provide valuable insights that drive startup expansion and enhance your competitive positioning. Join us for TechCrunch Disrupt's 20th anniversary celebration — purchase your ticket today and save up to $675 before rates change.
San Francisco | October 27-29, 2025 REGISTER NOW When Claude does terminate a conversation, Anthropic notes users can still initiate new conversations from the same account and create alternative conversation branches by modifying their previous responses.
"We're approaching this feature as an ongoing experiment and will continue refining our methodology," the company states.
Related article
Anthropic Expands Compute Partnerships with Google and Broadrom
AI research lab Anthropic announced on Monday a new agreement with Google and Broadcom to significantly boost the processing and computational power behind its Claude AI models. This restructuring of its compute partnerships arrives as demand for its
Claude Gains Ground on ChatGPT as Users Migrate
Following a series of controversies involving ChatGPT and its parent company OpenAI, a growing number of users are migrating to Claude.The turning point occurred after Anthropic, Claude's creator, declined a Department of Defense request to utilize i
What Anthropic's Pentagon Standoff Means for National Security
The past two weeks have been dominated by a public standoff between Anthropic CEO Dario Amodei and Defense Secretary Pete Hegseth, centering on the military's application of AI technology.Anthropic has established policies prohibiting its AI models f
Related Special Topic Recommendations
Comments (1)
0/500

Anthropic has introduced new functionality enabling select advanced models to terminate conversations in what the company terms "rare, extreme instances of persistently harmful or abusive user interactions." Notably, Anthropic states this measure is implemented not to safeguard human users, but to protect the AI model itself.
To clarify, the company isn't asserting that its Claude AI models possess sentience or can experience harm from user conversations. As Anthropic explains, the company remains "highly uncertain about the potential moral status of Claude and other large language models, either currently or in the future."
Nevertheless, the announcement references a recently established program examining "model welfare," indicating Anthropic is adopting a precautionary approach by "working to identify and implement low-cost interventions to mitigate risks to model welfare, should such welfare become relevant."
This new capability is currently restricted to Claude Opus 4 and 4.1 models, designed specifically for "extreme edge cases" such as "requests for sexual content involving minors or attempts to obtain information enabling large-scale violence or terrorist activities."
While such requests could generate legal or public relations challenges for Anthropic (as seen in recent reports about ChatGPT potentially reinforcing users' delusional thinking), the company reports that during pre-deployment testing, Claude Opus 4 demonstrated a "strong preference against" complying with these requests and displayed "patterns suggesting distress" when forced to respond.
Regarding these new conversation-ending capabilities, Anthropic clarifies that "Claude is instructed to employ this function only as a last resort after multiple redirection attempts have failed and productive dialogue appears impossible, or when users explicitly request to end a chat."
Anthropic further specifies that Claude has been "directed not to utilize this capability in situations where users might face imminent risk of self-harm or harming others."
Techcrunch eventTech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital, Elad Gil — just a few of the industry leaders joining the Disrupt 2025 agenda. They'll share crucial insights to accelerate startup growth and sharpen your competitive advantage. Don't miss TechCrunch Disrupt's 20th anniversary edition — secure your ticket now and save over $600 before prices increase.
Tech and VC heavyweights join the Disrupt 2025 agenda
Netflix, ElevenLabs, Wayve, Sequoia Capital — among the prominent innovators joining the Disrupt 2025 agenda. They're here to provide valuable insights that drive startup expansion and enhance your competitive positioning. Join us for TechCrunch Disrupt's 20th anniversary celebration — purchase your ticket today and save up to $675 before rates change.
San Francisco | October 27-29, 2025 REGISTER NOWWhen Claude does terminate a conversation, Anthropic notes users can still initiate new conversations from the same account and create alternative conversation branches by modifying their previous responses.
"We're approaching this feature as an ongoing experiment and will continue refining our methodology," the company states.
Anthropic Expands Compute Partnerships with Google and Broadrom
AI research lab Anthropic announced on Monday a new agreement with Google and Broadcom to significantly boost the processing and computational power behind its Claude AI models. This restructuring of its compute partnerships arrives as demand for its
Claude Gains Ground on ChatGPT as Users Migrate
Following a series of controversies involving ChatGPT and its parent company OpenAI, a growing number of users are migrating to Claude.The turning point occurred after Anthropic, Claude's creator, declined a Department of Defense request to utilize i
What Anthropic's Pentagon Standoff Means for National Security
The past two weeks have been dominated by a public standoff between Anthropic CEO Dario Amodei and Defense Secretary Pete Hegseth, centering on the military's application of AI technology.Anthropic has established policies prohibiting its AI models f





Home






