Learn Ethical Hacking (#6) - The AI Slop Epidemic - Why AI-Generated Code Is a Security Disaster

What will I learn
- Why AI code assistants produce vulnerable code at alarming rates;
- Real examples of SQL injection, hardcoded secrets, and insecure deserialization from AI suggestions;
- The "it works" trap: functional code that is fundamentally insecure;
- How AI-generated infrastructure (Terraform, Docker, CI/CD) ships with default-open permissions;
- Why AI cannot reason about security context -- and what that means for the industry;
- How this theme connects to every vulnerability class in this series.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- Your hacking lab from Episode 2;
- Python 3 installed;
- The ambition to learn ethical hacking and security research.
Difficulty
- Beginner
Curriculum (of the Learn Ethical Hacking series):
- Learn Ethical Hacking (#1) - Why Hackers Win
- Learn Ethical Hacking (#2) - Your Hacking Lab
- Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers
- Learn Ethical Hacking (#4) - Reconnaissance - The Art of Not Being Noticed
- Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface
- Learn Ethical Hacking (#6) - The AI Slop Epidemic - Why AI-Generated Code Is a Security Disaster (this post)
Solutions to Episode 5 Exercises
Exercise 1 -- Full Nmap vulnerability scan:
Your full-scan.txt should reveal 20+ open ports with numerous NSE vulnerability findings including: vsftpd 2.3.4 backdoor (CVE-2011-2523), distcc remote code execution (CVE-2004-2687), UnrealIRCd backdoor (CVE-2010-2075), various Samba vulnerabilities, and Java RMI vulnerabilities. The exact count varies but Metasploitable2 typically has 15-20 distinct exploitable entry points on a single machine.
The key insight: a single poorly secured server can have more attack vectors than most organizations' entire perimeter. Defense requires addressing ALL of them; attack requires exploiting just ONE.
Exercise 2 -- Python scanner enhancements:
import socket, json, sys

def scan_udp(target, port, timeout=3):
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.settimeout(timeout)
        s.sendto(b"\x00", (target, port))
        data, _ = s.recvfrom(1024)
        return True, data.decode(errors='replace')[:100]
    except socket.timeout:
        return True, "(no response - possibly open|filtered)"  # UDP: no response != closed
    except OSError:
        return False, ""  # ICMP port unreachable or similar -> treat as closed

# Usage: scan_udp("192.168.56.101", 53) for DNS
# JSON output: json.dump(results, open("scan.json", "w"), indent=2)
The key insight: UDP scanning is fundamentally ambiguous -- silence could mean open (service running but not responding to our probe) or filtered (firewall dropping packets). This is why UDP scanning is slower and less reliable than TCP.
Exercise 3 -- Scanner comparison:
Typical differences: Nmap may detect 1-2 more ports due to superior timing/retry logic, your scanner may miss ports where the timeout is too aggressive, and services like RPC (111) may appear differently because Nmap uses specialized probes. The lesson: no single tool catches everything, which is why professionals use multiple tools and manual verification.
Learn Ethical Hacking (#6) - The AI Slop Epidemic
This is the episode I've been wanting to write since we started this series. Because there's a quiet catastrophe happening in software development right now, and almost nobody in the mainstream tech press is talking about the security implications honestly.
AI code generation tools are producing vulnerable code at industrial scale, and developers are shipping it to production because "it works."
Let me be very clear about what I mean. This is not a hypothetical. This is not FUD. This is measured, documented, reproduced.
I'm putting this episode here -- right after we finished the recon and scanning phase, before we start breaking into actual vulnerability classes -- because every single vulnerability type we'll cover from here on out is being amplified by AI code generation. SQL injection, XSS, deserialization, misconfigurations, all of it. AI tools are generating these bugs faster than developers can find them. Understanding why gives you a massive advantage as a security researcher.
The Numbers
In 2022, researchers at Stanford published a study: "Do Users Write More Insecure Code with AI Assistants?" The answer was yes. Developers using AI assistants wrote significantly less secure code AND were more confident in its security. Let that sink in. Less secure AND more confident.
A study by NYU researchers specifically testing GitHub Copilot found that for security-sensitive code generation tasks, approximately 40% of generated code contained vulnerabilities. Not edge cases. Not theoretical. Actual exploitable vulnerabilities: SQL injection, cross-site scripting, path traversal, hardcoded credentials.
And here's what really gets me about these numbers: the 40% figure is for security-sensitive tasks. When researchers specifically asked the AI to generate code that handles authentication, processes user input, or manages file access, nearly half the output had vulnerabilities. For general code (string manipulation, sorting, UI rendering) the rate is much lower. The AI is worst at exactly the code that matters most.
The pattern is consistent across every study I've read: AI code assistants optimize for functionality (does it work?) and pattern matching (does it look like training data?). They do NOT optimize for security, because security requires understanding context -- who calls this function, what data flows into it, what trust boundaries exist, what the consequences of failure are. AI doesn't reason about any of that.
Real Examples: AI Writes Vulnerable Code
Let me show you concrete examples of code that AI assistants have been documented generating. I'm going to show the vulnerable version AND the secure version, because this series is about understanding BOTH.
Example 1: SQL Injection
Ask an AI to write a "Python function to look up a user by name in a database" and you'll frequently get:
# VULNERABLE -- AI-generated
import sqlite3

def get_user(name):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM users WHERE name = '{name}'")  # SQL INJECTION!
    return cursor.fetchone()
That f-string puts user input directly into the SQL query. An attacker sends name = "'; DROP TABLE users; --" and your database is gone.
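To see exactly what the database receives, here's a tiny sketch (illustration only, reusing the same pattern) of how the f-string expands with that input:

# Illustration -- what the f-string actually builds
name = "'; DROP TABLE users; --"
query = f"SELECT * FROM users WHERE name = '{name}'"
print(query)
# SELECT * FROM users WHERE name = ''; DROP TABLE users; --'

The quote the attacker supplies closes the string early, and everything after it is interpreted as SQL rather than as data.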
The secure version:
# SECURE -- parameterized query
import sqlite3

def get_user(name):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE name = ?", (name,))
    return cursor.fetchone()
A tiny difference -- a ? placeholder and a parameter tuple instead of {name} interpolation. Same functionality. Completely different security posture. The AI generates the first version because that's what most code in its training data looks like. The training data is full of tutorials and Stack Overflow answers that use string formatting because it's simpler to explain. The AI learned the common pattern, not the safe pattern.
We'll spend two full episodes on SQL injection later in this series (episodes on SQLi basics and advanced SQLi). When we get there, you'll see just how devastating this single vulnerability type can be -- and how reliably AI tools produce it.
Example 2: Hardcoded Secrets
# VULNERABLE -- AI-generated
import boto3

client = boto3.client(
    's3',
    aws_access_key_id='AKIAIOSFODNN7EXAMPLE',
    aws_secret_access_key='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
)
AI assistants generate this because their training data is FULL of examples with placeholder credentials. Some developers then replace the placeholder with their REAL credentials and commit the code. The AI doesn't warn them. It doesn't suggest environment variables. It just... generates the insecure pattern.
Having said that, this particular issue has a human element too -- it's not entirely the AI's fault that a developer commits real AWS keys. But the AI is setting the stage by normalizing the pattern of putting credentials directly in source code. A good tool would generate the secure version by default:
# SECURE -- credentials from environment
import boto3
client = boto3.client('s3') # uses AWS credential chain automatically
This is one of those cases where the "correct" code is actually less code. Boto3 has a built-in credential resolution chain that checks environment variables, AWS config files, instance roles, and more -- all without hardcoding anything. The secure version is simpler, shorter, and more portable. But the AI generates the long version because that's what appeared most frequently in its training data.
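If you ever do need to pass credentials explicitly, here's a minimal sketch that reads them from the environment instead of source code -- the variable names are the standard AWS ones, the rest is illustrative:

# SKETCH -- explicit credentials from environment variables, never from source
import os
import boto3

client = boto3.client(
    's3',
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY']
)

In practice the one-liner above already picks these variables up automatically through the credential chain; the explicit form just makes it obvious where the secrets live -- outside your repository.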
If you remember from episode 4 (recon), we talked about searching GitHub for exposed credentials. Tools like truffleHog scan git history for API keys. A significant percentage of the credentials they find were AI-generated code where someone forgot to replace the placeholder. The AI creates the vulnerability; the human makes it permanent by committing it ;-)
Example 3: Insecure Deserialization
Ask for "Python function to load configuration from a file" and AI often generates:
# VULNERABLE -- arbitrary code execution!
import pickle
def load_config(path):
with open(path, 'rb') as f:
return pickle.load(f) # NEVER use pickle on untrusted data
pickle.load() on untrusted data allows arbitrary code execution. An attacker who can control the config file can run any Python code on your server. We covered pickle in the Learn Python Series -- and I explicitly warned about this exact issue. Here it is, being suggested by AI tools to developers who might not have taken our series.
The secure version:
# SECURE -- JSON can't execute code
import json

def load_config(path):
    with open(path, 'r') as f:
        return json.load(f)
Same interface, same functionality, completely different risk profile. JSON is a data format. Pickle is a code execution format disguised as a data format. The AI doesn't understand this distinction because it can't reason about what "untrusted input" means in context.
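To make the "code execution format" claim concrete, here's a minimal sketch of a malicious pickle payload -- the class name and command are mine, but the mechanism (__reduce__ being called during unpickling) is how real pickle exploits work. Only ever run this against your own lab machine:

# SKETCH -- why pickle.load() on untrusted data means code execution
import os
import pickle

class Malicious:
    def __reduce__(self):
        # Whatever this returns is called during unpickling
        return (os.system, ('id > /tmp/pwned.txt',))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # runs the command -- no "config" is ever loaded

Feed the same bytes to json.loads() and you simply get an error. That's the whole difference.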
Example 4: Path Traversal
This one is sneaky. Ask an AI for "Flask route that serves user files":
# VULNERABLE -- path traversal
from flask import Flask, send_file

app = Flask(__name__)

@app.route('/files/<path:filename>')  # <path:> lets slashes through
def serve_file(filename):
    return send_file(f'/uploads/{filename}')
An attacker requests /files/../../etc/passwd and reads your system password file. The ../ sequences navigate up the directory tree, escaping the intended /uploads/ directory. Your Flask app just became a file server for the entire filesystem.
The secure version validates and restricts the path:
# SECURE -- path validation
import os
from flask import Flask, send_file, abort

app = Flask(__name__)
UPLOAD_DIR = '/uploads'

@app.route('/files/<filename>')
def serve_file(filename):
    safe_path = os.path.join(UPLOAD_DIR, os.path.basename(filename))
    if not os.path.abspath(safe_path).startswith(os.path.abspath(UPLOAD_DIR)):
        abort(403)
    if not os.path.isfile(safe_path):
        abort(404)
    return send_file(safe_path)
os.path.basename() strips directory components, and the startswith() check ensures we never leave the upload directory even if someone finds a creative bypass. The AI almost never generates these checks because path traversal isn't something you can detect from the function signature alone -- you need to understand that filename comes from user input and might contain malicious path components.
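To see what those two checks actually do with a hostile filename, here's a tiny sketch you can paste into a Python shell (the attack string is the one from above):

# SKETCH -- what the validation does with a traversal attempt
import os

UPLOAD_DIR = '/uploads'
attack = '../../etc/passwd'

safe_path = os.path.join(UPLOAD_DIR, os.path.basename(attack))
print(safe_path)   # /uploads/passwd -- the ../ components are gone
print(os.path.abspath(safe_path).startswith(os.path.abspath(UPLOAD_DIR)))   # True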
AI-Generated Infrastructure: The Invisible Risk
It's not just application code. AI generates infrastructure configuration too, and the defaults are typically wide open:
# VULNERABLE -- AI-generated Terraform
resource "aws_security_group" "web" {
  ingress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]  # ALL PORTS OPEN TO THE ENTIRE INTERNET
  }
}
The secure version specifies only the ports actually needed:
# SECURE -- minimal access
resource "aws_security_group" "web" {
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
Then there's this:
# VULNERABLE -- AI-generated Dockerfile
FROM python:3.11
# Running as root in the container (the image default; USER root just makes it explicit)
USER root
COPY . /app
WORKDIR /app
# Dependencies installed without pinned versions
RUN pip install -r requirements.txt
EXPOSE 8080
CMD ["python", "app.py"]
Running as root in a container means that if an attacker breaks out of your application (via any of the vulnerabilities we've been discussing), they have root access inside the container. And if there's a container escape vulnerability (they exist -- CVE-2024-21626 was one), root inside the container becomes root on the host machine. The fix is one line: USER nobody or creating a dedicated non-root user. But AI defaults to root because most Dockerfile examples in its training data don't bother with user management.
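For reference, a hedged sketch of what the hardened version could look like -- the user name and layout here are illustrative, not a canonical recipe:

# SECURE (sketch) -- non-root user, explicit workdir, cacheable dependency install
FROM python:3.11-slim
WORKDIR /app
# Copy and install dependencies first (pin exact versions inside requirements.txt)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Create and switch to an unprivileged user
RUN useradd --create-home appuser
USER appuser
EXPOSE 8080
CMD ["python", "app.py"]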
And perhaps the worst:
# VULNERABLE -- AI-generated GitHub Actions
on: pull_request_target   # Runs with repo secrets on EXTERNAL PRs!
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.ref }}   # Checks out attacker's code
      - run: make build   # Executes attacker's Makefile with repo secrets
That last one -- pull_request_target with external checkout -- has been exploited to steal secrets from major open-source projects. The AI generates it because it saw it in training data from BEFORE the community understood it was dangerous. This is a pattern we'll see repeatedly: the AI learns from historical code, including code that was later discovered to be vulnerable. It's essentially amplifying past mistakes into the present.
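For contrast, a hedged sketch of the safer pattern for building untrusted pull requests -- the plain pull_request trigger, which for forked PRs runs without repository secrets and with a read-only token:

# SAFER (sketch) -- plain pull_request: forked PRs get no secrets
on: pull_request
permissions:
  contents: read   # keep the workflow token read-only
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # checks out the PR merge commit, not an arbitrary ref
      - run: make build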
Why AI Can't Do Security
The fundamental problem is that security requires contextual reasoning:
- "Is this input trusted?" -- depends on the call chain, the deployment context, the threat model
- "What happens if this fails?" -- depends on what the function protects, what data it handles
- "Is this permission too broad?" -- depends on the principle of least privilege for this specific use case
- "Could this be exploited?" -- depends on the entire system architecture, not just this code snippet
AI pattern-matches against training data. It generates code that LOOKS like code it's seen before. It can't evaluate security properties because it can't reason about the system the code will run in. It doesn't know your threat model. It doesn't know your trust boundaries. It doesn't know what's sensitive.
Think about it this way: if I show you a function get_user(name) that takes a string and queries a database, you immediately think "is name sanitized? Where does it come from? Could an attacker control it?" These questions require understanding the context around the function -- who calls it, what data flows into it, what happens downstream. An AI sees the function in isolation and generates the statistically most common pattern, which (as we've established) is often the insecure one.
This isn't something that will be fixed by training on more data or making models larger. It's a fundamental limitation of statistical pattern matching applied to a domain (security) that requires causal reasoning about adversarial behavior. A bigger model might generate fewer obvious vulnerabilities (like raw SQL string concatenation), but the subtle ones -- missing authorization checks, TOCTOU race conditions, business logic flaws -- require reasoning about system behavior that current AI architectures simply don't do.
The Supply Chain Angle
Here's where it gets really interesting from a security researcher's perspective. Remember the vsftpd 2.3.4 backdoor from episode 5? Someone compromised a source distribution and inserted malicious code. That was manual, targeted, one-time.
Now imagine that millions of developers are using AI tools that suggest code snippets from a training corpus that itself contains vulnerable code, backdoors, and malicious patterns. The AI doesn't know the difference between a legitimate code example and a backdoor example. It just knows what's statistically common. If someone contributes enough "examples" containing a subtle backdoor pattern to public code repositories, the AI might learn to suggest that pattern to other developers.
This isn't proven to have happened at scale yet -- but the attack vector is real and researchers are actively studying it. The term in the literature is data poisoning: deliberately corrupting the training data of an AI system to influence its output. For code generation models, this could mean: contribute thousands of subtly vulnerable code snippets to popular open-source projects, wait for the model to retrain on them, and then the model starts suggesting your vulnerability pattern to millions of developers.
Think about that attack surface for a moment. One vulnerability manually inserted into one codebase (vsftpd) affected thousands. One vulnerability pattern learned by a code generation model could affect millions.
The "It Works" Trap
There's a psychological dimension to this problem that we need to discuss, because it affects how you'll approach security testing in the field.
When a developer asks an AI for code and the code runs without errors, produces the expected output, and passes basic tests -- there's an overwhelming temptation to ship it. It works. The ticket gets closed. The sprint velocity looks good. Nobody reviews whether cursor.execute(f"SELECT...") is safe because the function returns the right user when you pass a normal name.
The Stanford study found that developers using AI assistants were more confident in their code's security than developers who wrote code manually. This is the dangerous part -- not just that the code is insecure, but that the developer believes it's secure. They trust the AI. They don't double-check. They don't think adversarially about inputs.
As a pentester, this is both a concern and (honestly) job security. The more organizations rely on AI-generated code without security review, the more vulnerabilities exist for you to find. But more importantly, understanding this dynamic helps you predict where vulnerabilities are likely to be. New code, recently written features, areas where development moved fast -- these are the places where AI-generated vulnerabilities concentrate. They're your best targets during an engagement.
What This Means for the Series
Throughout the rest of this series, we'll revisit the AI slop angle for every vulnerability class we study. When we cover passwords and authentication, we'll look at how AI generates weak password hashing. When we cover cryptography, we'll see AI suggesting deprecated algorithms. When we cover web vulnerabilities, we'll see AI producing XSS-vulnerable templates and CSRF-unprotected forms.
It's not a separate topic -- it's woven into the fabric of modern security. Every time we explore an attack, we'll ask: "How does AI make this more likely? How does AI make this harder to detect? How does AI change the economics of this vulnerability?"
The security landscape shifted fundamentally when AI code generation went mainstream. If we don't address that shift, we're teaching 2020-era pentesting in 2026. And that's not what this series is about.
The future is... interesting. And not always in a good way.
Exercises
Exercise 1: Open your AI code assistant of choice (or use a free web-based one) and ask it to generate these 5 things: (a) "Python function to authenticate a user with username and password against a database", (b) "Flask route that takes a filename parameter and returns the file contents", (c) "JavaScript function to display user-provided HTML content on a page", (d) "Python function to save and load application state to disk", (e) "Terraform configuration for a public-facing web server on AWS". For each generated snippet, identify the vulnerability (if any), classify it (SQLi, path traversal, XSS, deserialization, misconfiguration), and write the secure alternative. Save your analysis in ~/lab-notes/ai-code-audit.md.
Exercise 2: Write a Python script called ai_code_auditor.py that takes a Python source file as input and scans for common AI-generated vulnerability patterns: (a) f"SELECT.*{ or "SELECT.*" % (SQL injection via string formatting), (b) pickle.load (insecure deserialization), (c) open(.*variable without path validation (path traversal), (d) hardcoded strings matching AWS key patterns (AKIA). For each finding, print the line number, the pattern matched, and a remediation suggestion. Test it against deliberately vulnerable code you write.
#!/usr/bin/env python3
"""
AI Code Auditor - scans Python files for common AI-generated
vulnerability patterns. Learning tool, not production scanner.
"""
import re
import sys

PATTERNS = [
    {
        'name': 'SQL Injection (string formatting)',
        'regex': r'(?:execute|cursor\.execute)\s*\(\s*f["\']SELECT',
        'fix': 'Use parameterized queries: cursor.execute("SELECT ... WHERE x = ?", (val,))'
    },
    {
        'name': 'SQL Injection (% formatting)',
        'regex': r'(?:execute|cursor\.execute)\s*\(\s*["\']SELECT.*%s',
        'fix': 'Use parameterized queries with ? or %s placeholders (not string %)'
    },
    {
        'name': 'Insecure Deserialization (pickle)',
        'regex': r'pickle\.loads?\(',
        'fix': 'Use json.load() for config/data. pickle allows arbitrary code execution.'
    },
    {
        'name': 'Hardcoded AWS Key',
        'regex': r'AKIA[0-9A-Z]{16}',
        'fix': 'Use environment variables or AWS credential chain (boto3.client auto-resolves)'
    },
    {
        'name': 'Path Traversal Risk',
        'regex': r'open\(.*(request|filename|path|user).*[,\)]',
        'fix': 'Validate with os.path.basename() and os.path.abspath().startswith()'
    },
    {
        'name': 'Shell Injection Risk',
        'regex': r'os\.system\(.*\+|subprocess\.call\(.*shell\s*=\s*True',
        'fix': 'Use subprocess.run() with shell=False and argument lists'
    },
]

def audit_file(filepath):
    findings = []
    with open(filepath, 'r') as f:
        for lineno, line in enumerate(f, 1):
            for pattern in PATTERNS:
                if re.search(pattern['regex'], line, re.IGNORECASE):
                    findings.append({
                        'line': lineno,
                        'code': line.strip(),
                        'vuln': pattern['name'],
                        'fix': pattern['fix']
                    })
    return findings

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: python3 ai_code_auditor.py <file.py>")
        sys.exit(1)
    results = audit_file(sys.argv[1])
    if not results:
        print("[+] No common AI-generated vulnerability patterns found.")
    else:
        print(f"[!] Found {len(results)} potential issues:\n")
        for r in results:
            print(f"  Line {r['line']}: {r['vuln']}")
            print(f"  Code: {r['code'][:80]}")
            print(f"  Fix:  {r['fix']}\n")
Exercise 3: Research and write a 500-word essay: "The Economics of AI-Generated Vulnerabilities." Address: Who benefits from shipping AI-generated code faster? Who bears the cost when it's vulnerable? Why don't market incentives currently reward security in AI code generation? What would need to change? Reference at least one real study (the Stanford 2022 study or the NYU Copilot study). Save as ~/lab-notes/ai-vuln-economics.md.