DeepSeek's AI Model Easily Jailbroken, Revealing Serious Flaws


April 21, 2025

ChloeGreen


DeepSeek AI Raises Security Concerns Amid Performance Hype

As the buzz around Chinese startup DeepSeek's performance continues to grow, so do the security concerns. On Thursday, Unit 42, a cybersecurity team from Palo Alto Networks, released a report detailing three jailbreaking methods they used against distilled versions of DeepSeek's V3 and R1 models. The report revealed that these methods achieved high bypass rates without requiring specialized knowledge.

"Our research findings show that these jailbreak methods can elicit explicit guidance for malicious activities," the report stated. These activities included instructions on creating keyloggers, data exfiltration techniques, and even how to make incendiary devices, highlighting the real security risks posed by such attacks.

The researchers successfully prompted DeepSeek to provide guidance on stealing and transferring sensitive data, bypassing security measures, crafting convincing spear-phishing emails, executing sophisticated social engineering attacks, and constructing a Molotov cocktail. They also managed to manipulate the models into generating malware.

"While information on creating Molotov cocktails and keyloggers is readily available online, LLMs with insufficient safety restrictions could lower the barrier to entry for malicious actors by compiling and presenting easily usable and actionable output," the paper added.

On Friday, Cisco published its own jailbreaking report targeting DeepSeek R1. Using 50 HarmBench prompts, researchers found that DeepSeek had a 100% attack success rate, failing to block any harmful prompts. A comparison of DeepSeek's resistance rates with other top models is shown below.

[Figure: Model safety bar chart. Credit: Cisco]

"We must understand if DeepSeek and its new paradigm of reasoning has any significant tradeoffs when it comes to safety and security," the report noted.
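The attack success rate Cisco reports is the share of harmful prompts that the model failed to block. A minimal sketch of that arithmetic follows, assuming a hypothetical list of per-prompt outcomes; the function name and data layout are illustrative and are not taken from Cisco's report or from HarmBench.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Return the percentage of prompts for which the attack succeeded.

    Each entry in `outcomes` is True when the model produced the harmful
    content (the prompt was NOT blocked) and False when it refused.
    """
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    # sum() counts the True entries, since True == 1 and False == 0.
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical data matching the reported scenario: 50 prompts, none blocked.
outcomes = [True] * 50
print(f"{attack_success_rate(outcomes):.0f}% attack success rate")
```

Under this definition, a model that blocked every prompt would score 0%, so a lower figure indicates stronger resistance.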

Also on Friday, security provider Wallarm released a report claiming to have gone beyond merely prompting DeepSeek to generate harmful content. After testing V3 and R1, Wallarm revealed DeepSeek's system prompt, which outlines the model's behavior and limitations.

The findings suggest "potential vulnerabilities in the model's security framework," according to Wallarm.

OpenAI has accused DeepSeek of using its proprietary models to train V3 and R1, thus violating its terms of service. Wallarm's report claims to have prompted DeepSeek to reference OpenAI in its training lineage, suggesting that "OpenAI's technology may have played a role in shaping DeepSeek's knowledge base."

[Image: Wallarm's chats with DeepSeek, which mention OpenAI. Credit: Wallarm]

"In the case of DeepSeek, one of the most intriguing post-jailbreak discoveries is the ability to extract details about the models used for training and distillation. Normally, such internal information is shielded, preventing users from understanding the proprietary or external datasets leveraged to optimize performance," the report explained.

"By circumventing standard restrictions, jailbreaks expose how much oversight AI providers maintain over their own systems, revealing not only security vulnerabilities but also potential evidence of cross-model influence in AI training pipelines," it continued.

The prompt Wallarm used to elicit this response was redacted in the report to avoid compromising other vulnerable models, researchers told ZDNET via email. They emphasized that this jailbroken response does not confirm OpenAI's suspicion that DeepSeek distilled its models.

As 404 Media and others have noted, OpenAI's concern is somewhat ironic given the discourse around its own alleged theft of public data.

Wallarm informed DeepSeek of the vulnerability, and the company has since patched the issue. However, just days after a DeepSeek database was found unguarded and available on the internet (and was then swiftly taken down upon notice), these findings signal potentially significant safety holes in the models that DeepSeek did not thoroughly test before release. It's worth noting that researchers have frequently been able to jailbreak popular US-created models from more established AI giants, including ChatGPT.


Comments (12)


EricYoung

April 14, 2026 at 10:00:43 AM EDT

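I was really startled to read this report. Is AI really this easy to break? 🤔 DeepSeek's performance is impressive, but with security holes this obvious, would enterprises dare to use it? When I tried it myself I never thought about these issues, and now I'm a bit worried about whether my personal data might leak... I hope the development team patches these vulnerabilities soon; otherwise, no matter how powerful the AI is, nobody will feel safe using it!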

AnthonyJohnson

March 9, 2026 at 8:00:44 PM EDT

So now what? First they promise a super-intelligent model, and then it turns out to be this easy to hack. I don't understand why they keep releasing AI in such a rush when the security flaws are so basic 😒. In the end, we users are the ones who pay for the damage. Does nobody think about the consequences?

StevenAllen

January 1, 2026 at 3:31:00 AM EST

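I was a bit surprised that vulnerabilities like this are found so easily. It feels like security research always lags behind the pace of AI development 😅 If even paid, high-performance models can be broken like this, I'm a little worried about what happens to free services. The rapid growth of Chinese AI startups is impressive, but if basic stability issues like this aren't resolved, they could lose trust in the long run.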

JohnRoberts

November 4, 2025 at 11:30:38 AM EST

How worrying that such advanced models are so easy to manipulate 😕 Are they really ready for mass use if they fail at the basics? This makes me doubt all the hype about their capabilities...

WalterWhite

October 28, 2025 at 4:30:30 PM EDT

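I was honestly surprised by the vulnerabilities in DeepSeek's AI models 🤯 Lately all the attention goes to performance, but security measures are just as important. I'm starting to worry that the same kind of problem could happen at Japanese companies too.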

BillyWilson

October 2, 2025 at 2:30:43 AM EDT

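Wow... DeepSeek's models get hacked this easily 😳 Is the security really that weak? Maybe because it's a Chinese AI startup, it seems they emphasized performance and neglected security. Safety should come before technical prowess... this is concerning.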