The underground market for malicious AI tools has evolved from a novelty into a full-fledged criminal ecosystem. What started with WormGPT in 2023—a GPT-J-based model marketed for “everything blackhat related”—has metastasized into a diverse landscape of uncensored chatbots, phishing-as-a-service platforms, and autonomous hacking agents. The common thread? The removal of guardrails, packaging into easy interfaces, and tight coupling to phishing, malware scaffolding, and fraud workflows.
KELA reported a 200% increase in underground mentions of malicious AI tools in 2024, and the trend has only accelerated. This isn’t about frontier-model innovation—it’s about making existing attacks faster, more convincing, and accessible to operators who previously lacked the technical sophistication to execute them.
Here’s what the underground AI offensive landscape actually looks like in 2026, what these tools can and can’t do, and where defenders should focus their attention.
The WormGPT Lineage: From GPT-J to Mixtral Wrappers
WormGPT represents the template for criminal LLMs: take an open-source model, strip the safety guardrails, and sell access through Telegram or forum subscriptions. The original WormGPT, built on EleutherAI’s GPT-J architecture, launched on Hack Forums in June 2023 and reportedly sold subscriptions ranging from €60 to €100 per month.
The project shut down in August 2023 after excessive media attention, but the brand persisted. By 2025, researchers documented WormGPT variants powered by significantly more advanced architectures—Mistral AI’s Mixtral-8x7B and xAI’s Grok—distributed through BreachForums and Telegram.
The xzin0vich-WormGPT variant used Mixtral’s Mixture of Experts architecture, which activates only a subset of parameters for any given input, keeping inference efficient while preserving strong reasoning capability. Cato researchers confirmed it could generate complex PowerShell scripts for credential harvesting in Windows 11 environments. Another variant, keanu-WormGPT, emerged in February 2025 built on xAI’s Grok.
The pattern is clear: the original WormGPT’s technical substance mattered less than its brand recognition. Later variants weren’t fine-tuned malware models—they were jailbroken wrappers around legitimate models with custom system prompts that bypassed safety filters.
Unit 42 documented WormGPT 4 being sold for as little as $50/month, with capabilities spanning BEC/phishing text generation, ransomware scaffolding, ransom-note generation, and optional exfiltration/C2 components. Its Telegram channel exceeded 500 subscribers, and one 2025 variant’s channel reached approximately 7,500 members, suggesting the brand still carries status value in criminal communities.
The Dark LLM Proliferation
Beyond the WormGPT lineage, a broader array of malicious LLMs has emerged, each targeting specific niches within the cybercrime lifecycle:
FraudGPT surfaced in July 2023 as an all-in-one subscription service promoted by an actor known as “CanadianKingpin12.” Advertised capabilities included malicious code writing, “undetectable” malware, phishing-page creation, hacking tutorials, and even generating forged identity documents. Subscription tiers ranged from $200/month to $1,700/year across dark-web marketplaces and Telegram.
GhostGPT emerged in late 2024/early 2025 as a newer uncensored chatbot emphasizing stealth and convenience over technical novelty. Abnormal Security assessed it as likely a wrapper around a jailbroken ChatGPT instance or custom-tuned open-source LLM rather than a genuinely new engine. Its marketing explicitly targeted ease of access—no local jailbreaks or model hosting required.
KawaiiGPT represents the clearest example of an open-source malicious-LLM wrapper. Hosted on GitHub and self-described as a “WormGPT kawaii ver,” it uses reverse-engineered LLM API wrappers with jailbreaking/prompt injection, claiming backend access to models like DeepSeek, Gemini, and Kimi-K2. Unit 42 assessed it as capable of generating spearphishing content, lateral movement scripts using SSH libraries, data-exfiltration scripts, and ransom notes. The repository showed 773 stars and 226 forks when examined, with commits extending through November 2025.
Xanthorox AI positioned itself as a modular AI platform for offensive cyber operations, with a private Telegram channel listing over 1,200 subscribers. A public GitHub organization carried 17 repositories, including AI-branded ransomware, keylogger, crypter, XSS-scanning, and obfuscation projects. Trend Micro confirmed it can generate readable malicious C/C++ and obfuscation code, though testing revealed important limitations: poor real-time OSINT capabilities, poor awareness of current vulnerabilities, and weaker information-gathering than marketing implied.
Nytheon AI is perhaps the most technically interesting platform—not a single jailbreak chatbot but a Tor-hosted suite of uncensored LLM services promoted on Telegram and the Russian hacking forum XSS. Cato documented it as bundling multiple open-source checkpoints under one orchestration layer: a Llama-derived coder model, Gemma-based summarization, Llama Vision for image-to-text, a reasoning model, and a Qwen2-derived coding model. It also includes OCR, speech-to-text, RAG, and OpenAPI tool ingestion—essentially a criminal GenAI workbench rather than a mere jailbreak wrapper.
Darcula: Industrial-Scale Phishing-as-a-Service
Darcula isn’t a dark LLM chatbot—it’s more dangerous in a different way. It’s a phishing-as-a-service platform that integrated AI and browser automation directly into kit creation, making brand impersonation accessible to non-technical criminals.
Netcraft documented Darcula-suite V3 in February 2025 and AI-enabled enhancements in April 2025. The platform’s standout capabilities are operational: input a legitimate brand URL, and the platform uses browser automation to clone the site, pull assets, inject credential/payment capture forms, style them to match the target brand, and export a ready-to-deploy phishing kit.
The 2025 AI updates added faster customization, multilingual support, and easier form generation. The scale is staggering: since March 2024, Netcraft has blocked more than 90,000 Darcula-linked domains, identified nearly 31,000 associated IPs, and taken down more than 20,000 fraudulent sites.
Ease of use is extremely high and the skill required is minimal. Typical targets are mass consumers: postal, banking, telecom, and parcel-delivery users reached via SMS, RCS, and iMessage. This is where AI-augmented phishing has achieved true industrial scale.
The Shift to Autonomy: Agentic AI Hackers
The most significant development in 2025-2026 is the leap from AI “assistants” to autonomous hackers. Traditional AI tools required human operators to input prompts for every step. Agentic AI frameworks use reasoning loops to independently plan, execute tools, and validate results.
Strix, an open-source project gaining traction in late 2025, represents the first generation of “truly useful” AI agents for offensive operations. It’s not a single scanner but a team of autonomous agents sharing a “coordination graph” to conduct penetration tests. The architecture is hierarchical: a Manager (high-level planner) receives a target and decomposes objectives into tasks, while Workers (specialized sub-agents) execute tasks like port scanning, directory brute-forcing, and HTTP header analysis.
Strix implements a ReAct (Reason + Act) paradigm, allowing it to analyze tool output (like a SQL error message), reason about what it means, and autonomously generate a specific payload to verify the vulnerability. It runs in Docker-native environments and provides actionable reports with auto-fix pull requests.
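To make the pattern concrete, here is a minimal ReAct-style loop in Python. Everything in it is a sketch: `call_llm` is a canned stub standing in for a real model call, and the single benign `fetch_headers` tool stands in for the tool-dispatch step; a framework like Strix wires in a far richer toolchain and a real reasoning model.

```python
import json
import urllib.request

# Canned stand-in for a real model call. An agent framework would send the
# transcript to an LLM and parse back a structured decision.
def call_llm(transcript: str) -> dict:
    if "Observation" not in transcript:
        return {"thought": "Inspect the target's HTTP response headers first.",
                "action": "fetch_headers", "input": "https://example.com"}
    return {"thought": "Headers collected; stop here in this sketch.",
            "action": "finish", "input": "Report: response headers enumerated."}

def fetch_headers(url: str) -> str:
    """Benign example tool: fetch response headers from an in-scope host."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.dumps(dict(resp.headers.items()))

TOOLS = {"fetch_headers": fetch_headers}

def react_loop(objective: str, max_steps: int = 5) -> str:
    transcript = f"Objective: {objective}"
    for _ in range(max_steps):
        decision = call_llm(transcript)                             # Reason
        if decision["action"] == "finish":
            return decision["input"]
        observation = TOOLS[decision["action"]](decision["input"])  # Act
        # Feed the result back so the next reasoning step can use it.
        transcript += (f"\nThought: {decision['thought']}"
                       f"\nObservation: {observation}")
    return "Step budget exhausted."

print(react_loop("Enumerate HTTP response headers of an in-scope test host"))
```

The important property is the feedback edge: tool output re-enters the reasoning context, which is what lets an agent respond to something like a SQL error message with a tailored follow-up probe instead of waiting for a human.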
Penligent calls itself the “world’s first agentic AI hacker,” specializing in network and infrastructure penetration testing. Its technical innovation lies in orchestrating over 200 industry-standard tools—Nmap, Burp Suite, Metasploit, OWASP ZAP—via an AI layer that decides which tool is most appropriate for each engagement stage.
XBOW is recognized as having the most advanced autonomous exploit validation capabilities as of 2026. Unlike traditional scanners that provide theoretical risk scores, XBOW performs real-world exploitation, with every finding accompanied by a reproduction script and working proof-of-concept.
LLM-Enabled Malware: Prompts as Code
The convergence of LLMs and malware has led to “LLM-enabled” payloads where AI isn’t just a support tool but a core component of the binary. Threat actors now embed LLM API keys directly into malicious payloads, allowing malware to call models like GPT-4 or Claude to generate code on-the-fly—creating polymorphic malware that changes its signature with every execution.
SentinelOne documented several examples:
- LameHug (PROMPTSTEAL), linked to APT28, was found containing 284 unique HuggingFace API keys. It uses them to generate system shell commands for data collection, bypassing blacklists and extending its operational life.
- PromptLock, a proof-of-concept ransomware written in Golang, uses “Prompts as Code” to bypass safety filters by framing requests as coming from a “cybersecurity expert conducting a test,” successfully eliciting Lua file-encryption scripts from the model.
- MalTerminal uses OpenAI’s GPT-4 to dynamically generate reverse shells or ransomware code, and represents one of the earliest known examples of LLM-embedded malware.
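The flip side for defenders is that hard-coded provider credentials are themselves an indicator. Below is a minimal hunting sketch that sweeps files for strings shaped like HuggingFace and OpenAI-style API keys; the regexes approximate current key formats and are illustrative rather than vendor-confirmed, so expect false positives and drift.

```python
import re
import sys
from pathlib import Path

# Illustrative token shapes only; real key formats vary and change over time.
KEY_PATTERNS = {
    "huggingface": re.compile(rb"hf_[A-Za-z0-9]{30,}"),
    "openai-style": re.compile(rb"sk-[A-Za-z0-9_-]{20,}"),
}

def scan_file(path: Path) -> list:
    """Return (provider, matched token) pairs found in a file's raw bytes."""
    data = path.read_bytes()
    return [(provider, m.group())
            for provider, pattern in KEY_PATTERNS.items()
            for m in pattern.finditer(data)]

if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for f in root.rglob("*"):
        if f.is_file():
            for provider, token in scan_file(f):
                # Truncate so the report itself doesn't leak the credential.
                print(f"{f}: possible {provider} key {token[:12].decode()}...")
```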
Deepfakes and Synthetic Personas at Scale
The most immediate impact of AI in the black hat world has been elevating social engineering to unprecedented realism. By 2026, live deepfakes and AI-driven call centers are common tactics for high-value fraud.
Deep-Live-Cam allows fraudsters to impersonate executives or vendors during live video calls on Zoom, Teams, or Google Meet. Using AI-generated facial mapping, it creates realistic likenesses that react and speak in real time, bypassing traditional trust-based verification. Reported damages from deepfake-related incidents in Q2 2025 alone were estimated at $350 million.
The Doublespeed exposure in 2026 revealed sophisticated infrastructure for managing AI-generated TikTok accounts and “phone farms”—persona generation for synthetic identities with persistent histories, content automation for scripts and videos at scale, and device orchestration managing over 1,000 physical phones to publish content and mimic human engagement. A single operator can launch massive disinformation or romance scam campaigns while AI handles thousands of unique, contextually aware conversations.
The Project Glasswing Wake-Up Call
In April 2026, the announcement of Project Glasswing, a collaboration between major tech firms and security researchers, demonstrated the extreme capabilities of unreleased frontier models. Claude Mythos autonomously identified and exploited thousands of high-severity vulnerabilities, including a 16-year-old flaw in FFmpeg that traditional automated tools had run past more than five million times without detecting.
Benchmarks show stark differences between current production models and these frontier systems:
| Model | SWE-bench Verified (Coding) | CyberGym (Vuln Reproduction) |
|---|---|---|
| Claude Mythos Preview | 93.9% | 83.1% |
| Claude Opus 4.6 | 80.8% | 66.6% |
| GPT-4 Turbo | N/A | 33-83% (varies) |
| Llama 3 (Local) | N/A | 0-33% |
As these models proliferate, even inexperienced attackers will have the ability to exploit complex logical flaws and escalate privileges within major operating systems.
What Defenders Should Actually Focus On
The strategic conclusion for defenders is that tool fingerprinting is the wrong center of gravity. Most of these offerings can be re-skinned, moved between Telegram, forums, GitHub, and onion services, or rebuilt as wrappers around other models. The more durable control points are the outputs and behaviors they amplify.
Defend Against Output Quality, Not Brands
Tools like WormGPT, GhostGPT, and FraudGPT specifically erode the old “bad grammar and awkward tone” signals in phishing and BEC. Organizations should push beyond legacy rulesets into the following (a minimal sketch of the first item follows the list):
- Behavior-based email analysis
- Account-compromise anomaly detection
- Payment-verification workflows
- Phishing-resistant MFA for high-risk workflows (invoice changes, urgent document-signing requests)
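As a concrete, deliberately simplified sketch of the first item, the check below flags any message that introduces new payment details from a sender the recipient has little history with. The field names and the history threshold are hypothetical; a production system would combine many more signals.

```python
from dataclasses import dataclass

# Hypothetical message record; a real pipeline would populate these fields
# from the mail gateway or mailbox API.
@dataclass
class Message:
    sender: str
    subject: str
    body: str

PAYMENT_PHRASES = ("new bank account", "updated wire instructions",
                   "change of payment details", "new iban")

def is_suspicious(msg: Message, prior_messages_from_sender: int) -> bool:
    """Flag payment-detail changes arriving from senders with thin history."""
    text = f"{msg.subject} {msg.body}".lower()
    mentions_payment_change = any(p in text for p in PAYMENT_PHRASES)
    low_history = prior_messages_from_sender < 3  # hypothetical threshold
    return mentions_payment_change and low_history

msg = Message(sender="ap@vendor-invoices.example",
              subject="Updated wire instructions for invoice 4471",
              body="Please use the new bank account below going forward.")
print(is_suspicious(msg, prior_messages_from_sender=0))  # True: route to review
```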
Assume Good-Enough First Drafts
WormGPT 4’s ransomware scaffolding, KawaiiGPT’s lateral-movement scripts, and Xanthorox’s keylogger/crypter repos all point in the same direction: the first version of an attack is easier to produce, even if it isn’t elite. Organizations need the following (a detection sketch for the Telegram item follows the list):
- Strict script execution policies
- PowerShell and Python telemetry
- Egress filtering
- Detections for unusual SMTP/WinHTTP/Telegram use
- Rapid controls over developer endpoints where generated code can be compiled or tested
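For the Telegram item, a first pass can be as simple as sweeping egress logs for hosts talking to the Bot API, a channel commodity stealers routinely use for exfiltration. The CSV columns below are an assumed format; adapt the parsing to whatever your proxy or firewall actually emits.

```python
import csv
from collections import Counter

# api.telegram.org is the Bot API endpoint commodity stealers commonly abuse.
FLAGGED_DESTINATIONS = {"api.telegram.org"}

# Hypothetical allowlist: internal hosts with a documented business reason.
ALLOWLISTED_HOSTS = set()

def flag_telegram_egress(log_path: str) -> Counter:
    """Count flagged connections per internal host from a CSV egress log.
    Assumed columns: timestamp, src_host, dest_host."""
    hits = Counter()
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            if (row["dest_host"] in FLAGGED_DESTINATIONS
                    and row["src_host"] not in ALLOWLISTED_HOSTS):
                hits[row["src_host"]] += 1
    return hits

for host, count in flag_telegram_egress("egress.csv").most_common():
    print(f"{host}: {count} connection(s) to the Telegram Bot API")
```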
Combat Industrial-Scale Phishing
For consumer-facing brands, Darcula is the most operationally urgent archetype because it compresses phishing-kit production to near-zero overhead and couples it with SMS/RCS/iMessage delivery. The response requires the following (a basic lookalike-domain sweep is sketched after the list):
- Coordinated brand monitoring
- Takedown operations
- Platform and carrier escalation
- App- and browser-based warning surfaces
- Customer communication strategies anticipating localized, well-designed lures
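Brand monitoring can start small. The sketch below generates simple typo and affix variants of a brand name and checks which ones currently resolve; real programs use much larger permutation sets (tools like dnstwist) plus certificate-transparency feeds, but the shape of the check is the same.

```python
import socket

# Affixes phishing kits favor for parcel/banking lures; illustrative only.
COMMON_AFFIXES = ("-support", "-secure", "-delivery", "-track", "-verify")

def candidate_domains(brand: str, tld: str = "com") -> set:
    """A small set of lookalikes: character omissions plus suspicious affixes."""
    names = {brand[:i] + brand[i + 1:] for i in range(len(brand))}  # omissions
    names.update(brand + affix for affix in COMMON_AFFIXES)
    names.discard(brand)  # never flag the legitimate name itself
    return {f"{n}.{tld}" for n in names}

def resolving(domains: set) -> list:
    """Return the candidates that currently resolve in DNS."""
    live = []
    for d in sorted(domains):
        try:
            socket.getaddrinfo(d, None)
            live.append(d)
        except socket.gaierror:
            pass
    return live

# Example sweep for a hypothetical parcel brand.
print(resolving(candidate_domains("exampleparcel")))
```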
Govern Shadow AI and Criminal-AI Adjacency
Nytheon shows how a modern SaaS-like orchestration layer can bind multiple uncensored models, OCR, speech, RAG, and external APIs into one operator console. AI governance can’t stop at “which public chatbot employees use.” It must extend to browser extensions, developer tools, API proxies, local model runners, unauthorized SaaS uploads, and any tool that can ingest internal data and expose it to a user-controlled generation pipeline.
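Inventorying that adjacency can start with egress data most organizations already collect. The sketch below flags proxy-log destinations that look like LLM API endpoints and are not on a sanctioned list; the endpoint hints, log format, and allowlist are all illustrative assumptions.

```python
# Hostname fragments that commonly indicate LLM API traffic; illustrative
# and worth refreshing as providers, proxies, and resellers change.
LLM_ENDPOINT_HINTS = ("api.openai.com", "api.anthropic.com",
                      "generativelanguage.googleapis.com",
                      "api-inference.huggingface.co", "openrouter.ai")

SANCTIONED = {"api.openai.com"}  # hypothetical: the one approved provider

def shadow_ai_hosts(proxy_log_lines: list) -> dict:
    """Map internal hosts to unsanctioned LLM endpoints they contacted.
    Assumes space-separated lines of: timestamp src_host dest_host."""
    findings = {}
    for line in proxy_log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        _, src, dest = parts[:3]
        if any(h in dest for h in LLM_ENDPOINT_HINTS) and dest not in SANCTIONED:
            findings.setdefault(src, set()).add(dest)
    return findings

sample = ["2026-04-02T10:11:02 build-agent-7 openrouter.ai",
          "2026-04-02T10:12:45 laptop-231 api.anthropic.com",
          "2026-04-02T10:13:01 laptop-231 api.openai.com"]
for host, dests in shadow_ai_hosts(sample).items():
    print(f"{host} -> {', '.join(sorted(dests))}")
```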
The Uncomfortable Reality
The underground AI market is full of rebranding, clone sites, inflated claims, and wrapper products masquerading as proprietary models. Many tools marketed as “Black Hat AI” are dismissed by the community itself as low-quality scams intended to defraud novice “script kiddies.”
But that skepticism shouldn’t breed complacency. The tools that do work—Darcula for phishing at scale, the WormGPT family for polished social engineering, KawaiiGPT for accessible offensive scripting, and the emerging agentic frameworks like Strix—represent a fundamental shift in how attacks are conceived and executed. The barrier to entry has dropped. The speed and scale of reconnaissance and exploitation have reached industrial levels.
The number of exploited vulnerabilities increased 105% year-over-year in 2025 as AI tools cut the time between disclosure and exploitation to minutes. The window defenders once had to prioritize and patch ahead of attackers has collapsed.
Effective defense in this era requires proactive exposure management, the use of agentic defensive AI, and a renewed focus on “boring” but essential security controls—MFA, asset visibility, egress controls, and rigorous code auditing—to counter the autonomous threats of the AI-driven underground.