Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface

What will I learn

  • Nmap mastery: SYN scans, version detection, OS fingerprinting, NSE scripts;
  • Service enumeration: identifying exactly what's running and which version;
  • Banner grabbing: extracting service identity information;
  • Vulnerability scanning with Nessus/OpenVAS: what they find and what they miss;
  • Writing your own port scanner in Python;
  • The noise problem: how active scanning gets you caught.

Requirements

  • A working modern computer running macOS, Windows or Ubuntu;
  • Your hacking lab from Episode 2;
  • Python 3 with socket module (standard library);
  • The ambition to learn ethical hacking and security research.

Difficulty

  • Beginner

Solutions to Episode 4 Exercises

Exercise 1 -- OSINT collector comparison:

Results vary, but typical patterns:
- github.com: ~500+ subdomains from CT logs (massive infrastructure),
  TXT records show Google verification, SPF includes many IP ranges
- cnn.com: ~200+ subdomains, heavy CDN usage visible in A records
  (Akamai/Fastly), multiple MX records (enterprise email)
- nasa.gov: WHOIS heavily restricted (.gov registration), relatively
  few CT subdomains visible, strict DMARC policy (p=reject)

The key insight: organization size correlates with subdomain count (attack surface), government domains tend to have stricter security posture, and CT logs reveal infrastructure that organizations might consider "internal."

Exercise 2 -- Google dorking a university:

Example dorks and findings (fictional university):
1. site:university.edu filetype:pdf "confidential"
   -> Found: budget documents, meeting minutes marked confidential
2. site:university.edu inurl:admin
   -> Found: /admin/login.php, /wp-admin, /phpmyadmin
3. site:university.edu intitle:"index of" "parent directory"
   -> Found: open directory listings with student project files
4. site:university.edu ext:sql
   -> Found: database export files (hopefully test data)
5. site:university.edu "password" filetype:log
   -> Found: application log files with authentication attempts

The key insight: universities are treasure troves because they have huge, decentralized web presence managed by different departments, students, and researchers -- many with minimal security awareness.

Exercise 3 -- Email security checker:

def check_email_security(domain):
    import subprocess
    result = subprocess.run(
        ['dig', '+short', domain, 'TXT'],
        capture_output=True, text=True, timeout=10
    )
    spf = [l for l in result.stdout.split('\n') if 'v=spf1' in l]

    result = subprocess.run(
        ['dig', '+short', f'_dmarc.{domain}', 'TXT'],
        capture_output=True, text=True, timeout=10
    )
    dmarc = result.stdout.strip()

    print(f"  SPF: {'FOUND' if spf else 'MISSING (spoofing trivial)'}")
    print(f"  DMARC: {'FOUND' if dmarc else 'MISSING (no spoofing policy)'}")
    if dmarc:
        if 'p=reject' in dmarc:
            print("    Policy: REJECT -- spoofed emails dropped")
        elif 'p=quarantine' in dmarc:
            print("    Policy: QUARANTINE -- spoofed to spam folder")
        elif 'p=none' in dmarc:
            print("    Policy: NONE -- spoofed emails DELIVERED (no protection)")

# Test:
# github.com: SPF + DMARC p=reject (strong)
# many small companies: SPF only, no DMARC (spoofable)

The key insight: p=none in DMARC means the organization monitors but doesn't block spoofed email. An attacker can send emails appearing to come from anyone@that-domain and they'll land in inboxes. Shockingly common.
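One caveat about the substring checks in that solution: `'p=reject' in dmarc` also matches the subdomain tag `sp=reject`, so a record with `p=none; sp=reject` would be reported as REJECT. Here is a minimal sketch of a tag-based parser that avoids this. The function names `parse_dmarc` and `dmarc_policy` and the sample record are my own illustration, not part of the exercise.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a {tag: value} dict."""
    # dig +short wraps TXT data in double quotes, so strip them first
    record = record.strip().strip('"')
    tags = {}
    for part in record.split(';'):
        if '=' in part:
            key, value = part.split('=', 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_policy(record):
    """Return the p= policy in lowercase, or None if this is not a DMARC record."""
    tags = parse_dmarc(record)
    if tags.get('v', '').upper() != 'DMARC1':
        return None
    policy = tags.get('p')
    return policy.lower() if policy else None


if __name__ == '__main__':
    # Hypothetical record: domain policy is none, subdomain policy is reject
    sample = '"v=DMARC1; p=none; sp=reject; rua=mailto:reports@example.com"'
    print(dmarc_policy(sample))
```

Because the record is split on `;` and each tag is looked up by its exact name, `sp=` can no longer be mistaken for `p=`.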


Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface

Passive recon told us what exists. Now we poke it to find out what's running and what's vulnerable. This is where things get loud -- the target can see us coming. But we know enough from our recon (episode 4) to be targeted rather than random. We're not spraying packets at the entire internet hoping something sticks -- we have a shortlist of IPs, subdomains, and services that our OSINT work identified. Now we verify.

Active scanning is the dividing line between observation and action. Everything after this point requires authorization in a real engagement. No exceptions. Remember episode 1 -- the CFAA does not care about your intentions, only whether you had permission.

In our lab? We authorized ourselves by building the targets. Let's begin.

Nmap: The Swiss Army Knife

Nmap (Network Mapper) is the most important tool in a pentester's arsenal. Full stop. If you learn only one security tool this year, make it Nmap. I've been using it since my early sysadmin days and it has never once failed to be relevant -- from quick sanity checks on a client's server to full-scale pentest engagements. Everything starts with Nmap.

At its core, Nmap sends packets to target ports and analyzes the responses to determine which ports are open, what services are running, and what versions they're at. But it goes much further than that -- scripting engine, OS detection, firewall evasion, output formatting, timing controls... it's an entire ecosystem packed into one command-line tool.

Here we go -- let's start with the basics against our Metasploitable2 target:

# Basic TCP connect scan (-sT: full TCP handshake)
nmap -sT 192.168.56.101

# SYN scan (-sS: half-open, stealthier, requires root)
sudo nmap -sS 192.168.56.101

# Version detection (-sV: probe open ports for service/version info)
sudo nmap -sV 192.168.56.101

# OS fingerprinting (-O: analyze TCP/IP stack behavior)
sudo nmap -O 192.168.56.101

# The "I want everything" scan (-A already implies -sV, -O, default scripts and traceroute)
sudo nmap -sS -sV -O -A --script=default 192.168.56.101

# Specific port range (all 65535 ports -- takes a while)
sudo nmap -sS -p 1-65535 192.168.56.101

Let me break down what each scan type actually does at the packet level, because understanding the mechanism is what separates a professional from someone copy-pasting commands:

TCP Connect Scan (-sT): Your machine completes a full TCP three-way handshake (SYN -> SYN-ACK -> ACK) with each port. If the handshake completes, the port is open. This is the most reliable scan but also the loudest -- every connection gets logged by the target. If you remember the TCP handshake from episode 3, this is exactly that -- just repeated across thousands of ports.

SYN Scan (-sS): Also called a "half-open" scan. You send a SYN packet. If you get SYN-ACK back, the port is open -- but instead of completing the handshake with ACK, you send RST (reset). The connection never fully opens, so many older logging systems don't record it. This is Nmap's default scan type when it runs with root privileges (without them it falls back to -sT) and the pentester's bread and butter. It requires root/sudo because crafting raw SYN packets needs elevated privileges.

UDP Scan (-sU): UDP is connectionless -- there's no handshake. Nmap sends a UDP packet and waits. No response = port might be open (or filtered). ICMP "port unreachable" = closed. UDP scanning is slow because you're waiting for timeouts on every port, but it finds services that TCP scanning completely misses (DNS on 53/UDP, SNMP on 161/UDP, DHCP on 67-68/UDP). Never skip UDP scanning in a real engagement -- some of the juiciest vulnerabilities live on UDP services.
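To make the decision logic of these scan types concrete, here is a small sketch that maps "what came back" to a port state. It sends no packets; the response labels ('syn-ack', 'rst', and so on) are hypothetical names I made up for illustration, and the state names follow the ones Nmap prints.

```python
def classify_tcp_syn(response):
    """Map the reply to a single SYN probe onto a port state.

    `response` is an illustrative label: 'syn-ack', 'rst',
    'icmp-unreachable', or None when the probe timed out.
    """
    if response == 'syn-ack':
        return 'open'        # target is willing to complete the handshake
    if response == 'rst':
        return 'closed'      # host is up, nothing listening on that port
    return 'filtered'        # silence or an ICMP error: a firewall ate the probe


def classify_udp(response):
    """Map the reply to a single UDP probe onto a port state.

    `response` is an illustrative label: 'udp-reply',
    'icmp-port-unreachable', 'icmp-other-unreachable', or None.
    """
    if response == 'udp-reply':
        return 'open'
    if response == 'icmp-port-unreachable':
        return 'closed'
    if response is None:
        return 'open|filtered'   # no answer: cannot tell the two apart
    return 'filtered'


if __name__ == '__main__':
    for reply in ('syn-ack', 'rst', None):
        print(f"SYN probe, reply={reply!r}: {classify_tcp_syn(reply)}")
    for reply in ('udp-reply', 'icmp-port-unreachable', None):
        print(f"UDP probe, reply={reply!r}: {classify_udp(reply)}")
```

Note how the UDP side has an ambiguous `open|filtered` outcome that the TCP side does not, which is exactly why UDP scans need timeouts and retries.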

OS Fingerprinting (-O): This is clever. Nmap sends a series of crafted TCP/IP packets and analyzes the responses -- things like TCP window size, initial TTL, IP ID sequence patterns, and how the target responds to malformed packets. Different operating systems implement the TCP/IP stack slightly differently, and Nmap has a database of thousands of OS fingerprints. It'll tell you not just "Linux" but often "Linux 2.6.9 - 2.6.33" with a confidence percentage.
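Nmap's real fingerprint matching combines many such signals, but a single one already says something. The sketch below uses only the TTL: it assumes the common initial values of 64 (Linux and most Unix-like systems), 128 (Windows) and 255 (many network devices), which is a widely used rule of thumb and not Nmap's actual algorithm. The function names are my own.

```python
# Rule-of-thumb initial TTLs. A heuristic, not Nmap's fingerprint database.
COMMON_INITIAL_TTLS = {
    64: 'Linux/Unix-like',
    128: 'Windows',
    255: 'network device or older Unix',
}


def guess_initial_ttl(observed_ttl):
    """Round the observed TTL up to the nearest common initial value."""
    for initial in sorted(COMMON_INITIAL_TTLS):
        if observed_ttl <= initial:
            return initial
    return None


def guess_os_family(observed_ttl):
    """Return (family_guess, estimated_hops) for a TTL seen in a reply."""
    initial = guess_initial_ttl(observed_ttl)
    if initial is None:
        return 'unknown', None
    # Each router on the path decrements the TTL by one
    return COMMON_INITIAL_TTLS[initial], initial - observed_ttl


if __name__ == '__main__':
    for ttl in (64, 117, 250):
        family, hops = guess_os_family(ttl)
        print(f"TTL {ttl}: probably {family}, about {hops} hop(s) away")
```

One signal like this is easy to spoof or misread, which is why Nmap reports a confidence percentage instead of a flat answer.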

Version Detection Deep Dive

The -sV flag doesn't just check if a port is open -- it actively probes the service to determine exactly what's running:

sudo nmap -sV --version-intensity 9 192.168.56.101 -p 21,22,80,3306

Output:

PORT     STATE SERVICE VERSION
21/tcp   open  ftp     vsftpd 2.3.4
22/tcp   open  ssh     OpenSSH 4.7p1 Debian 8ubuntu1 (protocol 2.0)
80/tcp   open  http    Apache httpd 2.2.8 ((Ubuntu) DAV/2)
3306/tcp open  mysql   MySQL 5.0.51a-3ubuntu5

That version string -- vsftpd 2.3.4 -- is incredibly specific. And incredibly dangerous (for the target). A quick search reveals CVE-2011-2523: a backdoor was deliberately inserted into the vsftpd 2.3.4 source code distribution. If you connect to the FTP service and send a username containing a smiley face (:)), a root shell opens on port 6200.

Let that sink in for a moment. Someone compromised the source code of a widely used FTP server and added a backdoor triggered by a smiley face. This is a real supply chain attack from 2011 -- years before SolarWinds made supply chain attacks front-page news.

That's what version detection gives you: not just "FTP is open" but "this exact version has a known backdoor." The difference between interesting and game over. We found this same version in episode 2's exercises when we first scanned Metasploitable2 -- now you understand why that scan result matters.
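The step from "version string" to "known issue" is easy to automate. Here is a rough sketch that parses service lines like the ones in the output above and looks them up in a tiny table. The `KNOWN_BAD` table is purely illustrative and contains only the vsftpd entry discussed in this episode; a real tool would query a proper vulnerability database.

```python
import re

# Matches lines such as "21/tcp   open  ftp     vsftpd 2.3.4"
SERVICE_LINE = re.compile(r'^(\d+)/(tcp|udp)\s+(\S+)\s+(\S+)\s*(.*)$')

# Illustrative lookup table with a single entry taken from this episode
KNOWN_BAD = {'vsftpd 2.3.4': 'CVE-2011-2523 (backdoored release)'}

SAMPLE = """\
PORT     STATE SERVICE VERSION
21/tcp   open  ftp     vsftpd 2.3.4
22/tcp   open  ssh     OpenSSH 4.7p1 Debian 8ubuntu1 (protocol 2.0)
80/tcp   open  http    Apache httpd 2.2.8 ((Ubuntu) DAV/2)
3306/tcp open  mysql   MySQL 5.0.51a-3ubuntu5
"""


def parse_services(nmap_output):
    """Turn the service table of an nmap -sV run into a list of dicts."""
    services = []
    for line in nmap_output.splitlines():
        match = SERVICE_LINE.match(line.strip())
        if match:
            port, proto, state, name, version = match.groups()
            services.append({'port': int(port), 'proto': proto, 'state': state,
                             'service': name, 'version': version.strip()})
    return services


def flag_known_bad(services):
    """Return (port, note) for every service whose product + version is listed."""
    findings = []
    for svc in services:
        # Compare on the first two tokens, e.g. "vsftpd 2.3.4"
        key = ' '.join(svc['version'].split()[:2])
        if key in KNOWN_BAD:
            findings.append((svc['port'], KNOWN_BAD[key]))
    return findings


if __name__ == '__main__':
    for port, note in flag_known_bad(parse_services(SAMPLE)):
        print(f"port {port}: {note}")
```

For anything beyond a quick experiment, Nmap's XML output (-oX) is a sturdier thing to parse than the human-readable table.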

Nmap Scripting Engine (NSE)

Nmap includes a scripting engine with hundreds of pre-built Lua scripts for vulnerability detection, brute forcing, service discovery, and more. The NSE is what transforms Nmap from a port scanner into a full vulnerability assessment tool:

# Run default scripts (safe, informational)
sudo nmap --script=default 192.168.56.101

# Check for specific vulnerabilities
sudo nmap --script=vuln 192.168.56.101

# HTTP enumeration (directories, methods, server info)
sudo nmap --script=http-enum 192.168.56.101 -p 80

# SMB vulnerability check (EternalBlue, etc.) -- quote the wildcard so the shell doesn't expand it
sudo nmap --script="smb-vuln*" 192.168.56.101 -p 445

# FTP anonymous login check
sudo nmap --script=ftp-anon 192.168.56.101 -p 21

# How many scripts are available?
ls /usr/share/nmap/scripts/ | wc -l
# > 600+ scripts!

The --script=vuln category is particularly powerful -- it actively checks for known vulnerabilities and outputs CVE numbers with descriptions:

sudo nmap --script=vuln 192.168.56.101

On Metasploitable2, this will flag things like: "vsftpd 2.3.4 backdoor detected", "Samba vulnerable to CVE-XXXX", "Anonymous FTP login allowed", "HTTP methods like TRACE enabled", "Distcc CVE-2004-2687 allows remote code execution." Each finding is a potential entry point.

You can also write your own NSE scripts (they're Lua, pretty straightforward if you've done the Learn Python Series and understand the concepts -- Lua syntax is actually simpler than Python in many ways). But for now, the 600+ built-in scripts will keep us busy for a long time.

Having said that, don't treat NSE output as gospel. Scripts have false positives, especially the vuln category. A script might flag "possible vulnerability" based on a version string match without actually confirming the vulnerability is exploitable. Always verify findings manually before including them in a report. The tool gives you leads; you do the confirmation.

Banner Grabbing the Manual Way

Before Nmap, pentesters grabbed banners by hand -- connecting to a port with netcat or telnet and reading what the service sent back. We did a simplified version of this in episode 3. It's still useful because sometimes you want to interact with a service directly rather than through Nmap's abstraction layer:

# FTP banner
nc 192.168.56.101 21

# SSH banner
nc 192.168.56.101 22

# SMTP banner
nc 192.168.56.101 25

# HTTP banner (need to send a request)
echo -e "HEAD / HTTP/1.0\r\n\r\n" | nc 192.168.56.101 80

Each of these will spit back a banner identifying the service and (usually) the version. FTP servers are especially chatty -- they'll tell you the software name, version, and sometimes the operating system, all before you even authenticate.

The important thing to internalize is why services do this. It's a relic from a friendlier era of computing, when knowing what software you were connecting to was considered helpful. The FTP RFC says servers SHOULD send a greeting. SSH sends its version string as part of the protocol handshake. SMTP announces itself to comply with email delivery standards. These aren't bugs -- they're features. Features that happen to be incredibly useful to attackers.

Modern security hardening includes stripping version information from banners. An Apache server can be configured to only say Server: Apache instead of Server: Apache/2.4.52 (Ubuntu). OpenSSH is less flexible: Debian-based builds can drop the distribution suffix (DebianBanner no), but the software version itself stays in the protocol banner. Either way, the default configuration for almost every service in existence is to tell you everything. And defaults are what most people run.
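The same manual banner grab can be done from Python with nothing but the socket module. The sketch below is a minimal version under a few assumptions: `grab_banner` and `parse_ssh_banner` are names I picked, the timeout is arbitrary, and the SSH parsing follows the "SSH-protoversion-softwareversion comments" layout from RFC 4253.

```python
import socket


def grab_banner(host, port, timeout=3.0, probe=None):
    """Connect, optionally send a probe, and return the first reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        if probe:
            s.sendall(probe)      # e.g. an HTTP HEAD request for port 80
        try:
            data = s.recv(1024)
        except socket.timeout:
            data = b''            # service stayed silent
    return data.decode(errors='replace').strip()


def parse_ssh_banner(banner):
    """Split an SSH identification string into its parts, or return None."""
    if not banner.startswith('SSH-'):
        return None
    ident, _, comments = banner.strip().partition(' ')
    parts = ident.split('-', 2)
    if len(parts) != 3:
        return None
    _, protocol, software = parts
    return {'protocol': protocol, 'software': software, 'comments': comments}


if __name__ == '__main__':
    # Lab target from this series; only run this against machines you own
    banner = grab_banner('192.168.56.101', 22)
    print(banner)
    print(parse_ssh_banner(banner))
```

For HTTP you would pass `probe=b"HEAD / HTTP/1.0\r\n\r\n"`, since a web server says nothing until it receives a request.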

Writing Your Own Port Scanner

As with everything in this series: understanding the tool by building it yourself. Here's a concurrent port scanner in Python that brings together concepts from the Learn Python Series (sockets, threading, queues):

#!/usr/bin/env python3
"""
Concurrent TCP port scanner.
Builds on what we covered in the Learn Python Series (sockets, threading).
"""
import socket
import sys
import threading
from queue import Queue, Empty
from datetime import datetime

def scan_port(target, port, results):
    """Attempt TCP connection to a single port."""
    try:
        # The with-block closes the socket even if something below raises
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(1)
            if s.connect_ex((target, port)) != 0:
                return  # closed or filtered
            # Port is open -- try banner grab
            try:
                if port == 80:
                    # HTTP servers stay silent until they receive a request
                    s.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
                banner = s.recv(1024).decode(errors='replace').strip()
            except OSError:
                banner = ""  # open, but no banner within the timeout
            results.append((port, banner[:100]))
    except OSError:
        pass  # e.g. unresolvable hostname or network unreachable

def scan(target, port_range=(1, 1024), threads=100):
    """Scan a target with concurrent threads."""
    print(f"[*] Scanning {target} ports {port_range[0]}-{port_range[1]}")
    print(f"[*] Started at {datetime.now().strftime('%H:%M:%S')}")

    results = []
    thread_list = []
    queue = Queue()

    for port in range(port_range[0], port_range[1] + 1):
        queue.put(port)

    def worker():
        while True:
            try:
                # get_nowait() avoids blocking forever when another thread
                # grabs the last port between an empty() check and get()
                port = queue.get_nowait()
            except Empty:
                return
            try:
                scan_port(target, port, results)
            finally:
                queue.task_done()  # always signal, or queue.join() hangs

    for _ in range(min(threads, port_range[1] - port_range[0] + 1)):
        t = threading.Thread(target=worker, daemon=True)
        t.start()
        thread_list.append(t)

    queue.join()

    results.sort(key=lambda x: x[0])
    print(f"\n[+] {len(results)} open ports found:\n")
    for port, banner in results:
        banner_str = f" -- {banner[:60]}" if banner else ""
        print(f"    {port:5d}/tcp  open{banner_str}")

    print(f"\n[*] Finished at {datetime.now().strftime('%H:%M:%S')}")
    return results

if __name__ == '__main__':
    target = sys.argv[1] if len(sys.argv) > 1 else "192.168.56.101"
    ports = (1, 1024)
    if len(sys.argv) > 2:
        # Accept a range like "1-1024" as well as a single port like "80"
        first, _, last = sys.argv[2].partition('-')
        ports = (int(first), int(last or first))
    scan(target, ports)

Save as ~/pentest-tools/portscan.py and run it:

python3 ~/pentest-tools/portscan.py 192.168.56.101 1-1024

Compare your results with Nmap's. They should match (mostly -- Nmap is smarter about certain edge cases like filtered ports and rate limiting). But you built this. You understand what's happening at the socket level. That's the difference between a script kiddie who runs tools and a professional who understands the underlying mechanics.

A few things worth noting about our scanner vs Nmap:

  1. Thread safety: We're appending to a shared list from multiple threads. In CPython this is safe because of the GIL (Global Interpreter Lock -- remember that from the Learn Python Series?), but in a production tool you'd use a threading.Lock to be explicit about it.

  2. No SYN scanning: Our scanner uses full TCP connect (connect_ex), which is the equivalent of Nmap's -sT. To do SYN scanning you'd need raw sockets, which require root; a library such as scapy makes crafting the packets much easier. We'll build a raw socket scanner later in the series when we cover packet crafting.

  3. Timing: Our scanner is fast but dumb -- it fires all threads at once. Nmap has sophisticated timing algorithms that adapt to network conditions, back off when they detect packet loss, and throttle to avoid tripping IDS sensors. That's the difference between a teaching tool and a battle-tested project that has been refined since 1997 ;-)
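To illustrate the thread-safety point from the first note, here is one way the shared results list could be wrapped in an explicit lock. `ResultCollector` and `demo` are hypothetical names for this sketch; the scanner above does not use them.

```python
import threading


class ResultCollector:
    """Collects (port, banner) tuples from many threads behind one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def add(self, port, banner=''):
        # Only one thread at a time may touch the list
        with self._lock:
            self._items.append((port, banner))

    def sorted_results(self):
        with self._lock:
            return sorted(self._items)


def demo(num_threads=8, per_thread=500):
    """Hammer the collector from several threads and return what it holds."""
    collector = ResultCollector()

    def worker(offset):
        for i in range(per_thread):
            collector.add(offset * per_thread + i)

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collector.sorted_results()


if __name__ == '__main__':
    print(len(demo()), "results collected without losing any")
```

In the scanner you would pass a `ResultCollector` instead of the plain list and call `results.add(port, banner[:100])`. The lock costs almost nothing here because each thread spends nearly all of its time waiting on the network.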

The Noise Problem

Here's something that beginners often overlook: every scan we've done generates network traffic that the target can see and log. A corporate IDS (Intrusion Detection System) or SIEM (Security Information and Event Management) watching the network would notice:

  • Hundreds of connection attempts from a single IP in seconds (port scan pattern)
  • Connection attempts to services that should be internal only
  • Unusual TCP flags (SYN without completing handshake -- classic SYN scan signature)
  • Service probes that don't match normal client behavior (Nmap's version detection sends weird payloads)
  • Sequential port access patterns (port 1, 2, 3, 4... nobody does that normally; Nmap randomizes port order by default, but naive scanners like our Python one don't)
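To get a feel for the first pattern on that list, here is a toy detector of the kind a defender might start with: flag any source IP that touches many distinct ports within a short time window. The window, the threshold and the event format are assumptions made for this sketch; real IDS rules are considerably more involved.

```python
from collections import defaultdict


def detect_port_scans(events, window=5.0, threshold=20):
    """Return the source IPs that hit >= threshold distinct ports within
    any `window`-second span.

    `events` is an iterable of (timestamp_seconds, src_ip, dst_port).
    """
    by_source = defaultdict(list)
    for ts, src, port in sorted(events):
        by_source[src].append((ts, port))

    flagged = set()
    for src, hits in by_source.items():
        start = 0
        for end in range(len(hits)):
            # Slide the left edge until the span fits inside the window
            while hits[end][0] - hits[start][0] > window:
                start += 1
            distinct_ports = {port for _, port in hits[start:end + 1]}
            if len(distinct_ports) >= threshold:
                flagged.add(src)
                break
    return flagged


if __name__ == '__main__':
    # Hypothetical traffic: one fast scanner, one normal HTTPS client
    fast_scan = [(i * 0.01, '10.0.0.5', 1 + i) for i in range(100)]
    browsing = [(float(i), '10.0.0.9', 443) for i in range(30)]
    print(detect_port_scans(fast_scan + browsing))
```

A scanner that spaces its probes further apart than the window never accumulates enough distinct ports to cross the threshold, which is the whole idea behind slow timing.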

In a real engagement, you will be detected eventually. The question is how long it takes and how much information you've gathered before the SOC (Security Operations Center) notices. Strategies to reduce noise:

# Slow scan (sneaky timing -- one probe every 15 seconds)
sudo nmap -sS -T1 192.168.56.101

# Scan only specific ports based on recon (much less noise than full range)
sudo nmap -sS -p 21,22,80,443,3306,8080 192.168.56.101

# Randomize the order of target hosts when scanning a range (Nmap already randomizes port order by default; -r would make it sequential)
sudo nmap -sS --randomize-hosts 192.168.56.0/24

# Fragment packets (evade some IDS signature matching)
sudo nmap -sS -f 192.168.56.101

# Use decoys (mix your real scan with fake source IPs)
sudo nmap -sS -D 192.168.56.50,192.168.56.51,ME 192.168.56.101

The -T flag controls timing templates from -T0 (paranoid, one probe every 5 minutes) to -T5 (insane, fires as fast as possible). -T3 is the default. For lab work, -T4 or -T5 is fine -- speed matters more than stealth when you're practicing. For a real pentest, -T2 or custom timing with --scan-delay and --max-rate gives you fine-grained control.
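What those templates cost in wall-clock time is simple arithmetic. The sketch below assumes a strictly serial scan (one probe, wait, next probe) and uses the per-probe delays from the Nmap documentation for the three slowest templates: 5 minutes for -T0, 15 seconds for -T1 and 0.4 seconds for -T2. It ignores retransmissions and host discovery, so treat the numbers as lower bounds; the function names are mine.

```python
# Per-probe delay in seconds for the serialized timing templates
TEMPLATE_DELAYS = {'-T0': 300.0, '-T1': 15.0, '-T2': 0.4}


def serial_scan_seconds(num_ports, delay_seconds):
    """Lower bound for a scan that waits delay_seconds between probes."""
    return num_ports * delay_seconds


def human(seconds):
    """Format a duration in seconds as 'Hh MMm SSs'."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == '__main__':
    # 1000 ports is what Nmap scans by default; 6 is a recon-based shortlist
    for ports in (1000, 6):
        for template, delay in TEMPLATE_DELAYS.items():
            total = serial_scan_seconds(ports, delay)
            print(f"{ports:5d} ports at {template}: {human(total)}")
```

For a default 1000-port scan that works out to more than 83 hours at -T0 and a bit over 4 hours at -T1, while a six-port shortlist at -T1 finishes in a minute and a half. That is the practical payoff of doing the passive recon first.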

The decoy option (-D) is particularly interesting. It sends scan packets that appear to come from multiple source IPs (the decoys), making it harder for the target to determine which IP is the actual scanner. The ME keyword inserts your real IP among the decoys. In the example above the target sees scans from three different IPs (two decoys plus yours) and has to investigate all of them -- or ignore all of them.

In practice, you balance speed against stealth based on the engagement scope. A noisy full-port scan on day one tells the defenders you're there (and some organizations run "assume breach" exercises where getting detected is fine -- they want to test their response procedures). A slow, targeted approach based on good passive recon gives you time to be thorough before anyone notices. Good recon from episode 4 is what makes this possible -- if you know from CT logs and DNS enumeration that the target runs web servers on ports 80, 443, and 8443, you don't need to scan all 65535 ports.

Vulnerability Scanners

While Nmap + NSE scripts catch many issues, dedicated vulnerability scanners go deeper and wider:

OpenVAS (free/open-source): comprehensive vulnerability scanner maintained by Greenbone. It has a database of 50,000+ vulnerability tests (called NVTs -- Network Vulnerability Tests) that gets updated daily. Runs in your Kali lab. Heavy on resources but thorough.

# Install and start OpenVAS on Kali (if the openvas package is not found, try gvm, the name used by recent releases)
sudo apt install openvas
sudo gvm-setup    # takes a while -- downloads the vulnerability database
sudo gvm-start
# Access web interface at https://localhost:9392

Nessus Essentials (free for 16 IPs): Tenable's scanner, the industry standard for enterprise vulnerability management. Download from tenable.com. The free tier is limited but more than enough for a lab with two target VMs.

Both tools automate what we've been doing manually: scan ports, identify services, check versions against vulnerability databases, test for common misconfigurations, and generate PDF reports you can hand to a client. They find a LOT. But they also produce false positives (sometimes many), miss business logic flaws entirely, and can't think creatively about attack chains. That's your job as the pentester.

The scanner finds the low-hanging fruit. The pentester finds the interesting stuff that scanners can't see -- the three-step chain where a minor information disclosure leads to a credential leak that leads to admin access. No automated tool finds those. Your brain does, using the methodology and instincts built up over episodes like this one.

Service Enumeration: Going Deeper

Once you know what ports are open and what services are running, the next step is service enumeration -- extracting as much information as possible from each service. This goes beyond version detection into asking specific questions:

# FTP: check anonymous access, list files
nmap --script=ftp-anon,ftp-bounce,ftp-syst 192.168.56.101 -p 21

# SMB: enumerate shares, users, OS info
nmap --script=smb-enum-shares,smb-enum-users,smb-os-discovery 192.168.56.101 -p 445

# HTTP: find directories, check methods, enumerate virtual hosts
nmap --script=http-enum,http-methods,http-title 192.168.56.101 -p 80

# MySQL: check for empty password, enumerate databases
nmap --script=mysql-empty-password,mysql-databases 192.168.56.101 -p 3306

# SSH: enumerate supported auth methods and host keys
nmap --script=ssh-auth-methods,ssh-hostkey 192.168.56.101 -p 22

On Metasploitable2, the FTP anonymous check will likely succeed (anonymous login allowed, with readable files). The SMB enumeration will reveal shared directories and possibly user accounts. The MySQL check might find an empty root password. Each of these is a concrete attack vector -- not theoretical, but immediately exploitable.

I want to stress something here: this is where the real skill lives. Port scanning is mechanical -- anybody can run nmap -p-. Service enumeration is where you start thinking about what you've found. "Anonymous FTP is open -- what files can I access? Are there configuration files, credentials, backup databases?" "SMB shares are accessible -- can I read sensitive documents? Can I write to the share and drop a malicious file?" "MySQL has no root password -- can I dump the entire database? Can I read the filesystem through LOAD_FILE()?"

Each finding branches into multiple follow-up questions, and each question might branch further. This is the exploratory, creative part of penetration testing that no automated scanner replaces. It's also, honestly, the fun part ;-)
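The first of those follow-up questions ("anonymous FTP is open, what files can I access?") is easy to script with Python's standard ftplib. This is a minimal sketch, assuming the lab target from this series; `check_anonymous_ftp` is my own name, and the `ftp_factory` parameter exists only so the function can be tried out without a live server.

```python
import ftplib


def check_anonymous_ftp(host, port=21, timeout=5.0, ftp_factory=ftplib.FTP):
    """Try an anonymous login. Returns (allowed, top_level_listing)."""
    ftp = ftp_factory()
    try:
        ftp.connect(host, port, timeout=timeout)
        ftp.login()                  # no arguments means user 'anonymous'
    except ftplib.all_errors:
        ftp.close()
        return False, []
    try:
        listing = ftp.nlst()         # names in the login directory
    except ftplib.all_errors:
        listing = []                 # logged in, but listing was refused or empty
    ftp.close()
    return True, listing


if __name__ == '__main__':
    # Lab target only; never run this against hosts you are not authorized to test
    allowed, files = check_anonymous_ftp('192.168.56.101')
    print("anonymous login allowed" if allowed else "anonymous login refused")
    for name in files:
        print("   ", name)
```

From here the branching starts: every file name in that listing is a new question about what it contains and who put it there.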

What We've Built So Far

After five episodes, you have a solid foundation:

  1. Understanding of the security landscape, the kill chain, and attack methodology (Ep 1)
  2. A lab with attacker and target VMs in an isolated network (Ep 2)
  3. Protocol knowledge -- TCP/IP, DNS, HTTP, TLS at the packet level (Ep 3)
  4. Passive recon skills -- OSINT, Google dorks, CT logs, DNS enumeration, a Python collector script (Ep 4)
  5. Active scanning -- Nmap, version detection, OS fingerprinting, NSE scripts, vulnerability scanning, and your own Python port scanner (this episode)

That's the reconnaissance phase of the kill chain -- complete. You know how to find targets (passive recon) and how to map their attack surface (active scanning). From here, we start moving into exploitation territory -- actually using what we've found. But before we exploit anything, we need to talk about a few more foundational topics that will make the exploitation phase much more effective. The security landscape has shifted fundamentally in the last two years, and if we don't address that shift, we're teaching 2020-era pentesting in 2026.

Exercises

Exercise 1: Perform a comprehensive Nmap scan of Metasploitable2: sudo nmap -sS -sV -O -A --script=vuln -p- 192.168.56.101 -oN ~/lab-notes/full-scan.txt. This scans all 65535 ports with version detection, OS fingerprinting, and vulnerability scripts, saving output to a file. Read the ENTIRE output (it'll be long). List every finding that NSE scripts flagged as a vulnerability. For each one, search for the CVE number and write a one-line description of the exploit. How many distinct entry points does this single machine have?

Exercise 2: Modify the Python port scanner from this episode to add: (a) a -sU mode that uses UDP sockets instead of TCP (socket.SOCK_DGRAM), (b) a --banner flag that performs banner grabbing on open TCP ports (send a generic probe and print what comes back), and (c) output to both terminal AND a JSON file (scan_results.json). Test the UDP mode against Metasploitable2 ports 53, 111, 137, 161. Note: UDP scanning is trickier because no response doesn't necessarily mean the port is closed -- it could be filtered, or the service might only respond to properly formatted requests.

Exercise 3: Run both your Python scanner AND Nmap against Metasploitable2 on ports 1-1024. Compare the results side by side. Are there any ports where they disagree? If so, investigate WHY -- is it a timeout issue? A filtered port? A service that responds differently to different probe types? Document the differences and what each tool got right or wrong. Write your findings in ~/lab-notes/scanner-comparison.md.


Onward!

@scipio


