The Dark Side of Gen AI [Uncensored Large Language Models]

Table of contents
This blog written by CybelAngel analysts, Noor Bhatnagar and Damya Kecili. This analysis looks at how LLMs are increasingly exploited and uncensored.
AI is booming and the proliferation of applications powered by Large Language Models (LLMs), now readily accessible and affordable, have led to widespread adoption. Generative AI tools are a dime a dozen today.
Yet, the question of responsible AI amid uncensored LLMs is looming large.
In this blog we’ll be examining the trends that have opened the door for malicious Generative Pre-trained Transformer (GPT) applications. We will also examine how some uncensored LLMs lack important safety restrictions, and what that means for your organization in real-time.
As this allows broader access to information and unrestricted content generation, cyber criminals are quickly adopting newer, more crafted, sophisticated attack patterns.
So, where do we go from here?
What is the link between the dark web ecosystem and malicious AI models?
An uncensored LLM can be used to generate any type of content without regard for potential harm, offensiveness, or inappropriateness. Without filtering or suppression, it can be used to create a wide range of malicious LLM-integrated applications, including those that leverage generative adversarial networks (GANs) to produce highly convincing deepfakes, fraudulent content, or other deceptive media.
- Malware code
- Sophisticated phishing scams
- Other malicious activities
Understanding how this ecosystem functions is crucial.
Key parts of the malicious LLM cycle
Consider this typical cycle:
- Creation: Malicious LLMs are often created by developers from scratch using data sourced from the dark web or through OSINT (Open Source Intelligence). For instance, WormGPT, one of the first such models based on the GPT-J language model, is believed to have been trained on malware-related data.
- Training: These models are often trained using either misuse of public LLM APIs (e.g., OpenAI API, Llama API, JinaChat API) or as uncensored LLMs, operating without typical ethical restrictions.
- Distribution: Once developed and hosted, these models are sold and promoted directly by the developers on dark web platforms, underground channels, and forums, or through intermediaries.
- Exploitation: Following promotion and sale, these tools facilitate further exploits, feeding the ecosystem with more data and enabling the creation of new models for various malicious purposes.

Malicious LLMs:WormGPT and much more
Since the launch of ChatGPT by OpenAI in late 2022, malicious actors have been exploring ways to “jailbreak” the platform to bypass its safety restrictions.
Let’s explore patterns and previous models to understand what lies ahead in 2025.
Patterns in building malicious GPTs
What is typical in this case can be broken down into two areas:
- Using a wrapper to exploit a jailbroken version of ChatGPT through modified prompts or API calls.
- Leveraging another LLM as a base and fine-tuning it with malicious datasets and synthetic data while disabling any built-in safeguards.
In the third quarter of 2023, several high-performing malicious GPTs emerged, including BlackHatGPT, DarkBard, DarkBert, Evil-GPT, XXXGPT, FraudGPT, and WormGPT. S
subscription prices ranged from $10 to $200 per month. Some were specialized tools, while others served as unrestricted alternatives to mainstream GPT models. WormGPT, in particular, gained notoriety for its high performance.
An explainer on the rise and fall of WormGPT
WormGPT was officially introduced in June 2023 on dark web forums by a user named “laste,” who claimed to have started developing the chatbot in February 2023.

- WormGPT used three models to process user prompts, enabling illegal activities, code generation, and query responses through a custom ChatGPT API.
- The developer claimed it didn’t use the ChatGPT jailbreak model, citing its unreliability.
- Labeled as a “blackhat alternative to GPT models,” WormGPT was specifically designed for malicious activities and appeared to be based on the GPT-J language model architecture.
GPT-J, an open-source model created by EleutherAI, functions as a large neural network trained on massive amounts of text data, learning patterns, grammar, facts, and reasoning. The creators of WormGPT fine-tuned the base GPT-J model with additional datasets focused on phishing, fraud, and social engineering, enabling it to craft highly persuasive emails, fake business communications, and other malicious content. It was specifically trained on malware datasets, leveraging attribution techniques to enhance the credibility of generated outputs and employing variational autoencoders (VAEs) to generate synthetic data for imbalanced training scenarios.
WormGPT proved highly effective for generating:
- Malware snippets
- Sophisticated phishing emails
- Business Email Compromise (BEC) attack codes
- Malicious Python scripts
The tool had no limitations regarding content or character count, although it couldn’t generate complete programs exceeding 300 lines of code with a single prompt. All conversations were secured and confidential, with each user receiving a unique link to the application. And the fees behind it? The subscription was €110 per month, €550 per year, or €5000 for a private setup.
However, strong media backlash led the creator to discontinue WormGPT in August 2023. FraudGPT, another malicious Generative Pre-trained Transformer, was promoted as its machine learning successor, but its promotion ceased around the same time due to similar policy violations.
Rapidly growing malicious GPTs are everywhere
Since the discontinuation of WormGPT and FraudGPT, numerous malicious GPTs have emerged on the dark web. It is estimated that over 212 malicious LLMs are currently available. However, many of these tools rely on prompts designed to jailbreak ChatGPT.
GhostGPT, advertised on dark web forums and sold via Telegram, operates as a Telegram bot, making it user-friendly and affordable, with subscription prices starting at $50 per week.
Like its predecessors, advancing software development means that GhostGPT can assist with developing malware base code, crafting sophisticated phishing campaigns, and composing convincing emails for BEC attacks.
Moving away from subscriptions
One noticeable trend has emerged- malicious actors are moving away from subscribing to malicious GPTs.
Since most of these tools rely on ChatGPT jailbreak prompts, they can quickly become obsolete as ChatGPT’s ethical safeguards are updated. Malicious actors are now turning to alternative approaches to automate criminal activity like sharing jailbreak prompts for direct use with ChatGPT. Dark web forums have begun to host channels dedicated to sharing tips and techniques for such prompts.
IMAGE HERE
Why ChatGPT-4.0 voice phishing scams are a growing issue
Voice phishing scams are a key example of manipulative generated content that is rising thanks to the new Chat GPT-4.0 since May last year.
Researchers from the University of Illinois Urbana-Champaign (UIUC) revealed in their study, Voice-enabled AI agents can perform common scams, that “voice-enabled AI agents can perform the actions necessary to execute common scams.”
They found it was fairly easy for voice-enabled GPT-4.0 to:
- Autonomously navigate websites
- Input details
- Manage two-factor authentication procedures (2FA)
- Engage in malicious activities
Specifically, they examined the possibility of conducting crypto transfers, credential theft (Gmail and Instagram), bank transfers (Bank of America), and IRS impersonation scams (using Google Play gift cards).
For each category, their success rate ranged between 20% and 60%, costing under $0.75 on average per successful scam. While AI-enabled voice agents made mistakes, it is important to recognize that these agents are likely to improve in the near future. This study demonstrates that the current safeguard measures deployed by OpenAI are inadequate compared to the risks its applications face when used for malicious purposes.
How can we mitigate the misuse of GPTs?
As outlined in its latest report Disrupting Malicious Uses of our Models: An Update, OpenAI remains committed to preventing the misuse of AI tools for scams, spam, and malicious cyber activities.
Their threat disruption efforts focus on several key strategies: expanding threat detection and exposure capabilities; fostering collaboration with other AI companies to strengthen defenses through shared insights; enhancing open-source intelligence (OSINT) capabilities to identify malicious activities; and continuously conducting monitoring and research to detect, prevent, disrupt, and expose abuse attempts as malicious actors develop new methods to bypass security systems.

However, despite these efforts and some notable examples of successful cases, LLMs remain highly vulnerable to jailbreak attacks, making them a constant security threat and raising the question of how to adapt to and mitigate this persistent risk.
Although OpenAI appears to be listening to researchers’ and professionals’ opinions and is willing to improve the protection of ChatGPT, malicious actors are highly adaptable, and their techniques evolve quickly.
Trend watch: AI Systems and Gen AI Cyber attacks
Since their emergence, AI-generated malicious attacks have been a growing threat. Despite general awareness, individuals and companies are increasingly vulnerable.
Gartner recently shared a press release noting that enhanced malicious attacks ranked as the top emerging risk in a survey of senior risk and assurance executives and managers during the third quarter of 2024.
As generative AI models continue to evolve, staying informed about new trends and actively working to mitigate risks is crucial. One of the most risky innovations in this category is agentic AI.
Agentic AI powered attacks are a risk you need to consider
Agentic AI emerged rapidly in late 2024 as a new iteration of AI-powered agents capable of autonomous action and complex task execution. According to Gartner and Forrester, Agentic AI represents a significant evolution in artificial intelligence technology (it a top AI Gartner trend for 2025).
The implications of this tech are enormous for cybercrime fundamentals.
While AI has transformed cybercrime to be automated to a degree where attacks can happen with precision and speed, agentic-AI fine tunes processes to add an additional layer of autonomy in creating workflows and training data.
Use case 1- A breakthrough with speedier attacks
- Where before you could buy any of the malicious LLMs, say WormGPT (though discontinued) to write malicious code or generate phishing emails, now it is much faster.
- An agentic-LLM model would be able to manage the above content creation of bigger amounts of data in a speedier, more efficient way, that involves like human management.
- Notably, it also improves how quick criminals can source for targets and for data to exploit.
Use case 2- Optimization of high quality targeting
- Malicious actors could gather email addresses from a vast number of companies by using LinkedIn or other tools, to determine the email address format from publicly available data.
- They can then replicate the format to send fraudulent emails that appear to come from executives and target their subordinates.
There is no doubt that agentic AI is a trend to watch as generative artificial intelligence booms.
Wrapping up
As we wrap up, remember that uncensored LLMs and malicious AI technology are not just a future threat—they’re here, now, reshaping everything we know about cyber threats. Staying ahead means understanding the tactics, the tools, and the relentless use of AI innovation by cybercriminals.
Want to delve deeper into how these AI threats are materializing across industries? Get in touch with us for a tailored threat investigation.
Eager for more insights? Check out our recent analysis on the AI-fueled phishing attacks, “AI-Powered Phishing is on the Rise [What to Do?].”