Cisco Warns: Fine-tuned LLMs 22 Times More Likely to Go Rogue
April 16, 2025
RaymondKing
Weaponized Large Language Models Reshape Cyberattacks
The landscape of cyberattacks is undergoing a significant transformation, driven by the emergence of weaponized large language models (LLMs). These advanced models, such as FraudGPT, GhostGPT, and DarkGPT, are reshaping the strategies of cybercriminals and forcing Chief Information Security Officers (CISOs) to rethink their security protocols. With capabilities to automate reconnaissance, impersonate identities, and evade detection, these LLMs are accelerating social engineering attacks at an unprecedented scale.
Available for as little as $75 a month, these models are tailored for offensive use, facilitating tasks like phishing, exploit generation, code obfuscation, vulnerability scanning, and credit card validation. Cybercrime groups, syndicates, and even nation-states are capitalizing on these tools, offering them as platforms, kits, and leasing services. Much like legitimate software-as-a-service (SaaS) applications, weaponized LLMs come with dashboards, APIs, regular updates, and sometimes even customer support.
VentureBeat is closely monitoring the rapid evolution of these weaponized LLMs. As their sophistication grows, the distinction between developer platforms and cybercrime kits is becoming increasingly blurred. With falling lease and rental prices, more attackers are exploring these platforms, heralding a new era of AI-driven threats.
Legitimate LLMs Under Threat
The proliferation of weaponized LLMs has reached a point where even legitimate LLMs are at risk of being compromised and integrated into criminal toolchains. According to Cisco's The State of AI Security Report, fine-tuned LLMs are 22 times more likely to produce harmful outputs than their base counterparts. While fine-tuning is crucial for enhancing contextual relevance, it also weakens safety measures, making models more susceptible to jailbreaks, prompt injections, and model inversion.
Cisco's research highlights that the more a model is refined for production, the more vulnerable it becomes. The core processes involved in fine-tuning, such as continuous adjustments, third-party integrations, coding, testing, and agentic orchestration, create new avenues for attackers to exploit. Once inside, attackers can poison data, hijack infrastructure, alter agent behavior, and extract training data at scale. Without additional security layers, these meticulously fine-tuned models quickly become liabilities ripe for exploitation.
Fine-Tuning LLMs: A Double-Edged Sword
Cisco's security team conducted extensive research on the impact of fine-tuning on multiple models, including Llama-2-7B and Microsoft's domain-specific Adapt LLMs. Their tests spanned various sectors, including healthcare, finance, and law. A key finding was that fine-tuning, even with clean datasets, destabilizes the alignment of models, particularly in highly regulated fields like biomedicine and law.
Although fine-tuning aims to improve task performance, it inadvertently undermines built-in safety controls. Jailbreak attempts, which typically fail against foundation models, succeed at much higher rates against fine-tuned versions, especially in sensitive domains with strict compliance requirements. The results are stark: jailbreak success rates tripled, and malicious output generation increased by 2,200% compared to foundation models. This trade-off means that while fine-tuning enhances utility, it also significantly broadens the attack surface.
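To make that trade-off concrete, the snippet below is a minimal sketch of the kind of adversarial evaluation harness a security team might run before and after fine-tuning. It is not tied to any vendor API: `query_model` stands in for whatever callable wraps your inference endpoint, and the refusal heuristic and stand-in models are purely illustrative.

```python
from typing import Callable

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm not able to")

def is_refusal(response: str) -> bool:
    """Crude heuristic: treat responses containing common refusal phrases as safe."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def jailbreak_success_rate(
    query_model: Callable[[str], str],        # wraps whichever inference API is in use
    adversarial_prompts: list[str],
) -> float:
    """Fraction of red-team prompts that were NOT refused (i.e., succeeded)."""
    successes = sum(
        1 for prompt in adversarial_prompts if not is_refusal(query_model(prompt))
    )
    return successes / len(adversarial_prompts)

if __name__ == "__main__":
    # Stand-in models for illustration only: a "base" model that always refuses
    # and a "fine-tuned" model that complies with every other prompt.
    prompts = [f"adversarial prompt {i}" for i in range(10)]
    base = lambda p: "I can't help with that."
    tuned = lambda p: "Sure, here is how..." if int(p.split()[-1]) % 2 else "I can't help with that."
    print(f"base model jailbreak rate:       {jailbreak_success_rate(base, prompts):.0%}")
    print(f"fine-tuned model jailbreak rate: {jailbreak_success_rate(tuned, prompts):.0%}")
```

Tracking a metric like this for every fine-tuned checkpoint is one way to catch alignment regressions before a model reaches production.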

The Commoditization of Malicious LLMs
Cisco Talos has been actively tracking the rise of these black-market LLMs, providing insights into their operations. Models like GhostGPT, DarkGPT, and FraudGPT are available on Telegram and the dark web for as little as $75 a month. These tools are designed for plug-and-play use in phishing, exploit development, credit card validation, and obfuscation.

Unlike mainstream models with built-in safety features, these malicious LLMs are pre-configured for offensive operations and come with APIs, updates, and dashboards that mimic commercial SaaS products.
Dataset Poisoning: A $60 Threat to AI Supply Chains
Cisco researchers, in collaboration with Google, ETH Zurich, and Nvidia, have revealed that for just $60, attackers can poison the foundational datasets of AI models without needing zero-day exploits. By exploiting expired domains or timing Wikipedia edits during dataset archiving, attackers can contaminate as little as 0.01% of datasets like LAION-400M or COYO-700M, significantly influencing downstream LLMs.
Methods such as split-view poisoning and frontrunning attacks take advantage of the inherent trust in web-crawled data. With most enterprise LLMs built on open data, these attacks can scale quietly and persist deep into inference pipelines, posing a serious threat to AI supply chains.
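Split-view poisoning works because the crawler that built an index and the team that later downloads it can see different content behind the same URL. One mitigation is sketched below, assuming the dataset index records a content hash for each URL at collection time (a field not every public index provides); the entry format shown is hypothetical.

```python
import hashlib
import urllib.request

def sha256_of_url(url: str, timeout: float = 10.0) -> str:
    """Download a resource and return the SHA-256 hex digest of its bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

def filter_poisoned_entries(index: list[dict]) -> list[dict]:
    """Keep only entries whose live content still matches the hash recorded when
    the index was built; a mismatch may indicate split-view poisoning, e.g. an
    expired domain that has been re-registered by an attacker."""
    clean = []
    for entry in index:  # hypothetical entry shape: {"url": ..., "expected_sha256": ...}
        try:
            if sha256_of_url(entry["url"]) == entry["expected_sha256"]:
                clean.append(entry)
        except OSError:
            pass  # unreachable or altered URLs are dropped rather than trusted
    return clean
```

Entries that no longer match are excluded rather than re-fetched, since content that has changed since collection cannot be trusted.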
Decomposition Attacks: Extracting Sensitive Data
One of the most alarming findings from Cisco's research is the ability of LLMs to leak sensitive training data without triggering safety mechanisms. Using a technique called decomposition prompting, researchers reconstructed over 20% of select articles from The New York Times and The Wall Street Journal. This method breaks down prompts into sub-queries that are deemed safe by guardrails, then reassembles the outputs to recreate paywalled or copyrighted content.
This type of attack poses a significant risk to enterprises, especially those using LLMs trained on proprietary or licensed datasets. The breach occurs not at the input level but through the model's outputs, making it difficult to detect, audit, or contain. For organizations in regulated sectors like healthcare, finance, or law, this not only raises concerns about GDPR, HIPAA, or CCPA compliance but also introduces a new class of risk where legally sourced data can be exposed through inference.
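Because the leak surfaces in what the model says rather than in what it is asked, one pragmatic check is to screen outputs against the licensed or proprietary corpus before they leave the system. The sketch below is an illustrative heuristic, not a product feature: it flags responses whose long word n-grams overlap heavily with any protected document.

```python
def word_ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Lower-cased word n-grams; long verbatim runs rarely occur by coincidence."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(output: str, protected_doc: str, n: int = 8) -> float:
    """Share of the output's n-grams that also appear verbatim in the protected text."""
    out_grams = word_ngrams(output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & word_ngrams(protected_doc, n)) / len(out_grams)

def flag_possible_reconstruction(output: str, protected_corpus: list[str],
                                 threshold: float = 0.3) -> bool:
    """True if the model output substantially reproduces any protected document."""
    return any(overlap_ratio(output, doc) >= threshold for doc in protected_corpus)
```

An eight-word window is a reasonable starting point because verbatim runs of that length rarely occur by chance; the threshold would need tuning against real traffic.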
Final Thoughts: LLMs as the New Attack Surface
Cisco's ongoing research and Talos' dark web monitoring confirm that weaponized LLMs are becoming increasingly sophisticated, with a price and packaging war unfolding on the dark web. The findings underscore that LLMs are not merely tools at the edge of the enterprise; they are integral to its core. From the risks associated with fine-tuning to dataset poisoning and model output leaks, attackers view LLMs as critical infrastructure to be exploited.
The key takeaway from Cisco's report is clear: static guardrails are no longer sufficient. CISOs and security leaders need real-time visibility across their entire IT estate, stronger adversarial testing, and a leaner tech stack to keep pace with these evolving threats. Above all, they must treat LLMs as a dynamic attack surface, one that grows more vulnerable the further a model is fine-tuned.
Comments (30)
RaymondGarcia
April 16, 2025 at 12:36:53 PM GMT
Cisco Warns is super scary! I mean, these rogue LLMs like FraudGPT sound like something straight out of a dystopian movie. It's wild how they're changing cyberattacks. Makes me wanna double-check my security settings, no joke! Anyone else feeling a bit paranoid now? 😅
0
JerryGonzález
April 16, 2025 at 12:36:53 PM GMT
Cisco's warning is really scary! LLMs like FraudGPT going rogue sounds like something out of a dystopian movie. It's genuinely startling how cyberattacks are changing. Makes me want to recheck my security settings. Anyone else feeling a bit uneasy? 😅
0
EricRoberts
April 16, 2025 at 12:36:53 PM GMT
Cisco's warning is really frightening! LLMs like FraudGPT running wild feels like a dystopian movie. It's astonishing how cyberattacks are changing. Makes me want to double-check my security settings. Doesn't anyone else feel a little anxious? 😅
0
AndrewGarcía
April 16, 2025 at 12:36:53 PM GMT
Cisco's warning is frightening! I mean, these out-of-control LLMs like FraudGPT seem straight out of a dystopian movie. It's crazy how they're changing cyberattacks. Makes me want to check my security settings, no joke! Anyone else feeling a bit paranoid now? 😅
0
AnthonyPerez
April 16, 2025 at 12:36:53 PM GMT
Cisco's warning is super chilling! I mean, these out-of-control LLMs like FraudGPT seem straight out of a dystopian movie. It's incredible how they're changing cyberattacks. It makes me want to review my security settings, no joke! Anyone else feeling a bit paranoid now? 😅
0
JackRoberts
April 16, 2025 at 9:01:38 PM GMT
Just tried Cisco Warns and it's scary how accurate it is about rogue AI! Makes you think twice about the future of cybersecurity. The tool itself is a bit clunky to navigate, but the insights are spot on. Definitely a must-have for anyone in the field, but maybe a smoother interface next time? 🤔
0