Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers

What will I learn

  • How TCP/IP works from the attacker's perspective -- what's visible and what leaks;
  • DNS as a goldmine for reconnaissance;
  • HTTP in raw detail: headers, cookies, sessions, and what browsers hide from you;
  • What TLS/SSL actually protects (and what it doesn't);
  • How to use Wireshark to see invisible network traffic.

Requirements

  • A working modern computer running macOS, Windows or Ubuntu;
  • Your hacking lab from Episode 2 (Kali + Metasploitable2 + DVWA);
  • The ambition to learn ethical hacking and security research.

Difficulty

  • Beginner

Solutions to Episode 2 Exercises

Exercise 1 -- Lab setup verification:

Your screenshots should show four things: Kali with a 192.168.56.x IP on the host-only adapter, Metasploitable2 with its own 192.168.56.x IP, successful pings between them (0% packet loss), and a failed ping to google.com ("Network is unreachable" or "Temporary failure in name resolution"). If google.com responds, your isolation is broken -- go back to episode 2 and fix the network adapter settings before continuing.

The key insight: network isolation is your legal and ethical safety net. Without it, tools like Nmap can end up scanning real infrastructure -- and scanning systems without permission can be treated as unauthorized access.

Exercise 2 -- Metasploitable2 initial scan:

$ nmap -sV 192.168.56.101

PORT     STATE SERVICE     VERSION
21/tcp   open  ftp         vsftpd 2.3.4
22/tcp   open  ssh         OpenSSH 4.7p1 Debian 8ubuntu1
23/tcp   open  telnet      Linux telnetd
25/tcp   open  smtp        Postfix smtpd
53/tcp   open  domain      ISC BIND 9.4.2
80/tcp   open  http        Apache httpd 2.2.8
111/tcp  open  rpcbind     2 (RPC #100000)
139/tcp  open  netbios-ssn Samba smbd 3.X
445/tcp  open  netbios-ssn Samba smbd 3.X
512/tcp  open  exec        netkit-rsh rexecd
513/tcp  open  login
514/tcp  open  shell
1099/tcp open  java-rmi    GNU Classpath grmiregistry
1524/tcp open  bindshell   Metasploitable root shell
2049/tcp open  nfs         2-4 (RPC #100003)
2121/tcp open  ftp         ProFTPD 1.3.1
3306/tcp open  mysql       MySQL 5.0.51a-3ubuntu5
5432/tcp open  postgresql  PostgreSQL DB 8.3.0 - 8.3.7
5900/tcp open  vnc         VNC (protocol 3.3)
6000/tcp open  X11         (access denied)
6667/tcp open  irc         UnrealIRCd
8009/tcp open  ajp13       Apache Jserv (Protocol v1.3)
8180/tcp open  http        Apache Tomcat/Coyote JSP engine 1.1

That's roughly 23 open ports. Three CVE examples:

  • vsftpd 2.3.4: CVE-2011-2523 -- backdoor command execution (a username containing :) opens a root shell on port 6200)
  • UnrealIRCd: CVE-2010-2075 -- backdoor in source distribution allowing remote command execution
  • Apache Tomcat 5.5: CVE-2009-2693, CVE-2009-2901 -- multiple directory traversal and information disclosure vulnerabilities

The key insight: one machine, 23+ attack vectors. Every open port is a conversation, and old software versions are invitations.

Exercise 3 -- DVWA command injection:

# Input: 127.0.0.1; whoami
# Output shows ping results AND: www-data

This works because the application passes user input directly to a system shell command like: ping -c 3 [user_input]. The semicolon (;) is a shell command separator -- it tells the OS "run this command, then run the next one." So the actual command becomes: ping -c 3 127.0.0.1; whoami -- which pings localhost AND runs whoami, revealing that the web server runs as www-data. This is OS command injection, one of the most dangerous vulnerability classes because it gives direct shell access.

The key insight: any time user input reaches a system shell without sanitization, the attacker controls the server. This is why parameterized commands and input validation exist.


Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers

When you type a URL into your browser and press Enter, about 47 different things happen before the page shows up. Your browser hides all of it from you because it's trying to be helpful. But as a security researcher, you need to see ALL of it -- because attackers do.

In episode 1 we talked about the fundamental asymmetry between attack and defense. In episode 2, we built our lab environment. Now it's time to understand the actual plumbing -- the protocols that make the internet work. At each layer, we'll ask the same question: what can an attacker see here? What can they exploit?

Having said that, this is probably the most "textbook" episode in the series. Stick with it. Everything we do from here on out (reconnaissance, scanning, web attacks, network exploitation) depends on understanding these protocols at a level that goes beyond what most developers ever bother with.

The TCP/IP Stack: What's Actually Happening

Every piece of internet communication follows the TCP/IP model -- four layers stacked on top of each other:

+---------------------------+
| APPLICATION LAYER         |  HTTP, DNS, SMTP, FTP, SSH
| (what applications see)   |  "GET /index.html HTTP/1.1"
+---------------------------+
| TRANSPORT LAYER           |  TCP, UDP
| (reliable delivery)       |  Port numbers, sequence numbers
+---------------------------+
| INTERNET LAYER            |  IP
| (addressing & routing)    |  Source IP, Destination IP
+---------------------------+
| LINK LAYER                |  Ethernet, Wi-Fi
| (physical connection)     |  MAC addresses, frames
+---------------------------+

If you've taken a networking course (or read about it), you probably know this as the "OSI model with fewer layers." The OSI model has seven layers, TCP/IP has four. For practical security work, TCP/IP is what matters -- OSI is for passing certification exams ;-)

From an attacker's perspective, each layer leaks information:

Link Layer -- Your MAC address identifies your network card (uniquely in principle, though it can be spoofed or randomized). On a local network, an attacker can see every device's MAC address. MAC addresses also reveal the manufacturer (first 3 bytes = the OUI, or Organizationally Unique Identifier). Seeing 00:50:56:xx:xx:xx? That's VMware. 08:00:27:xx:xx:xx? VirtualBox. Already useful intelligence -- you now know the target is running virtual machines, which tells you something about their infrastructure.

Internet Layer -- Your IP address reveals your approximate location, your ISP, and whether you're behind a VPN or proxy. Source and destination IPs are visible to anyone on the network path. This is why VPNs exist -- they replace your real IP with the VPN server's IP. What most people miss though: the VPN provider still sees your real IP. You're just moving your trust from your ISP to the VPN company. (We'll cover this tradeoff in more detail when we get to operational security.)

Transport Layer -- Port numbers reveal what services are running. Port 22 = SSH, 80 = HTTP, 443 = HTTPS, 3306 = MySQL. You saw this first-hand in the episode 2 exercises when you ran nmap -sV against Metasploitable2 -- every open port is a door, and many services announce themselves with version banners when you connect. TCP sequence numbers, if predictable, enable session hijacking (though modern OS implementations have made this much harder than it was in the 1990s).

Application Layer -- This is where the real treasure is. HTTP headers, DNS queries, email contents, authentication tokens -- unless encrypted, all of this is visible to anyone who can intercept the traffic.
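To make the Link Layer point concrete, here is a minimal sketch of an OUI lookup. The table only holds the two prefixes mentioned above, and the names KNOWN_OUIS and vendor_from_mac are made up for this illustration; a real tool would load the full IEEE registry.

```python
# Tiny OUI table holding only the two prefixes mentioned in the text.
# A real lookup would load the full IEEE registry instead.
KNOWN_OUIS = {
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
}

def vendor_from_mac(mac: str) -> str:
    """Return the vendor for a MAC address, based on its first three bytes."""
    # Normalize separators and case so 08-00-27-AB-CD-EF also matches.
    normalized = mac.strip().lower().replace("-", ":")
    oui = normalized[:8]  # "xx:xx:xx" is 8 characters
    return KNOWN_OUIS.get(oui, "unknown")

print(vendor_from_mac("08:00:27:12:34:56"))
print(vendor_from_mac("00-50-56-AA-BB-CC"))
print(vendor_from_mac("de:ad:be:ef:00:01"))
```

Run it against the MAC addresses you see in your own lab and both VMs should come back as virtual machine vendors.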

DNS: The Internet's Phone Book (and Goldmine)

The Domain Name System translates human-readable names (hive.blog) into IP addresses (135.181.37.31). It's one of the oldest and most critical internet protocols, and from a security perspective, it's incredibly leaky.

Let's see it in action from your Kali VM:

# Basic lookup
dig hive.blog

# Ask for ALL record types (many servers now refuse or minimize ANY responses)
dig hive.blog ANY

# Find mail servers
dig hive.blog MX

# Find name servers
dig hive.blog NS

# Reverse lookup (IP to name)
dig -x 135.181.37.31

# Trace the full resolution path
dig +trace hive.blog

Each of these reveals something useful to an attacker:

  • A records: where the servers actually are (IP addresses, datacenter locations, hosting providers)
  • MX records: what handles their email (Google Workspace? Self-hosted? Microsoft Exchange?)
  • NS records: who manages their DNS (if you compromise the DNS provider, you own everything)
  • TXT records: often contain SPF, DKIM, and DMARC configs (email security posture), and sometimes verification tokens for third-party services -- I've seen TXT records that leaked which SaaS products a company uses
  • CNAME records: can reveal internal naming conventions and potentially abandoned services (subdomain takeover vectors)
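As a small illustration of how much a single TXT record can give away, the sketch below pulls the include: domains out of an SPF record. The record is a made-up example (not real data for any domain), and spf_includes is a hypothetical helper name. It only handles plain include: terms, which is enough to show the idea.

```python
def spf_includes(txt_record: str) -> list[str]:
    """Return the domains referenced by include: mechanisms in an SPF record."""
    if not txt_record.lower().startswith("v=spf1"):
        return []  # not an SPF record at all
    return [
        term.split(":", 1)[1]
        for term in txt_record.split()
        if term.lower().startswith("include:")
    ]

# Hypothetical record for illustration only, not real data for any domain.
sample = "v=spf1 include:_spf.google.com include:sendgrid.net ~all"
print(spf_includes(sample))
```

Each included domain typically points at a mail provider or SaaS product that is allowed to send email on the target's behalf.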

And here's the kicker: DNS queries are unencrypted by default. Your ISP, your network admin, anyone on your local network -- they can see every domain you visit. Not the full URL (that's inside HTTPS), but the domain names. DNS over HTTPS (DoH) and DNS over TLS (DoT) exist to fix this, but adoption is still incomplete.

Think about that for a second. You could be browsing over HTTPS, using a password manager, two-factor authentication on everything -- and your ISP can still log every domain you look up, because the DNS requests travel in plaintext the entire time. That's the kind of thing that keeps security people up at night.
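If you want to see why classic DNS is so readable on the wire, the following sketch builds a minimal A-record query by hand, following the RFC 1035 message layout. The function name build_dns_query and the query ID are arbitrary choices for this example, and nothing is sent over the network.

```python
import struct

def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (wire format, RFC 1035)."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN (internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

packet = build_dns_query("hive.blog")
print(packet)
```

Look at the printed bytes: the labels hive and blog sit there as plain ASCII. No decoding skills required, which is exactly the problem.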

There's another subtlety worth mentioning: DNS caching. Your operating system, your router, and your ISP all cache DNS responses to avoid redundant lookups. This is great for performance -- but it also means that an attacker who can poison a DNS cache (insert a fake response) can redirect traffic for everyone using that cache. You type bank.com and end up on the attacker's server, complete with a convincing fake login page. The browser shows bank.com in the address bar. The TLS certificate check might catch it (if the attacker doesn't have a valid cert for the domain) -- but not everyone reads certificate warnings carefully. And some DNS poisoning attacks happen at the ISP level, where the scale of impact is enormous.

Zone transfers are the holy grail of DNS reconnaissance. Misconfigured DNS servers allow anyone to request a complete copy of all DNS records for a domain:

# Attempt zone transfer (usually fails on properly configured servers)
dig @ns1.target.com target.com AXFR

If this works, you just got every subdomain, every IP mapping, every internal hostname. It's like the target handing you a complete map of their infrastructure. Shockingly, this still works against quite a few organizations -- even in 2026.

HTTP: What Browsers Hide From You

When your browser requests a webpage, it sends an HTTP request and receives an HTTP response. The browser shows you the rendered page. It does NOT show you the headers, cookies, redirects, or authentication tokens flying back and forth. Your browser is essentially a beautification layer on top of a text-based protocol.

Let's see the raw conversation. From Kali, against Metasploitable2's Apache (the target we set up in episode 2):

# Raw HTTP request using netcat (Connection: close makes the server hang up after responding)
printf 'GET / HTTP/1.1\r\nHost: 192.168.56.101\r\nConnection: close\r\n\r\n' | nc 192.168.56.101 80

What you get back:

HTTP/1.1 200 OK
Date: Thu, 16 Apr 2026 06:30:00 GMT
Server: Apache/2.2.8 (Ubuntu) DAV/2
X-Powered-By: PHP/5.2.4-2ubuntu5.10
Content-Type: text/html

<html><head>...</head>...

Look at what leaked: Server: Apache/2.2.8, X-Powered-By: PHP/5.2.4. That's exact version numbers. An attacker immediately searches for known CVEs against those versions. Remember from episode 1 how we discussed the Equifax breach? Same idea -- once you know the exact software version, you search the CVE database and often find documented exploits with ready-made code.
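The same check is easy to automate. Below is a rough sketch that scans a raw response for the two headers discussed above and extracts anything that looks like product/version. The header list and the regular expression are my own simplifications, and version_leaks is a hypothetical helper name.

```python
import re

# Headers taken from the example response above (body omitted).
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/2.2.8 (Ubuntu) DAV/2\r\n"
    "X-Powered-By: PHP/5.2.4-2ubuntu5.10\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

LEAKY_HEADERS = ("server", "x-powered-by")

def version_leaks(raw: str) -> dict[str, list[str]]:
    """Map each leaky header to the product/version tokens it exposes."""
    head = raw.split("\r\n\r\n", 1)[0]
    leaks = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() in LEAKY_HEADERS:
            # Tokens such as Apache/2.2.8 or PHP/5.2.4-2ubuntu5.10
            leaks[name.strip()] = re.findall(r"[A-Za-z][\w.-]*/[\w.-]+", value)
    return leaks

print(version_leaks(raw_response))
```

Every token it returns is a search term for a CVE database.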

Now let's look at cookies:

# Show the status line and any Set-Cookie headers (curl -v writes headers to stderr, hence 2>&1)
curl -v http://192.168.56.101/dvwa/login.php 2>&1 | grep -i "set-cookie\|< HTTP"

You'll see Set-Cookie headers. Three flags to look for:

  • HttpOnly: if missing, JavaScript can read the cookie (XSS = stolen session)
  • Secure: if missing, the cookie transmits over unencrypted HTTP (sniffable on the wire)
  • SameSite: if missing, the cookie may be sent with cross-site requests (CSRF vulnerability), although some browsers now default to SameSite=Lax

These are not theoretical concerns. They're the default configuration for most web applications. And DVWA, being intentionally vulnerable, has none of these protections on Low security. Real-world apps in production often don't either -- which is why web application testing is such a profitable specialization.
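Here is a minimal sketch of that three-flag check in Python. The Set-Cookie values are hypothetical examples, missing_cookie_flags is a made-up helper name, and the parsing is deliberately naive (split on semicolons), so treat it as an illustration and not as a complete cookie parser.

```python
def missing_cookie_flags(set_cookie: str) -> list[str]:
    """Return which of HttpOnly, Secure, SameSite are absent from a Set-Cookie value."""
    # Attributes follow the first name=value pair, separated by semicolons.
    attributes = [part.strip().lower() for part in set_cookie.split(";")[1:]]
    names = {attr.split("=", 1)[0] for attr in attributes}
    return [
        flag for flag in ("HttpOnly", "Secure", "SameSite")
        if flag.lower() not in names
    ]

# Hypothetical header values for illustration.
weak = "PHPSESSID=abc123; path=/"
strong = "session=xyz; Path=/; HttpOnly; Secure; SameSite=Strict"
print(missing_cookie_flags(weak))
print(missing_cookie_flags(strong))
```

Paste in the Set-Cookie values you captured from DVWA and see which flags come back as missing.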

TLS/SSL: What It Protects (and What It Doesn't)

TLS (Transport Layer Security, the successor to SSL) encrypts the connection between client and server. When you see the padlock icon in your browser, TLS is active.

What TLS protects:

  • The HTTP request body (form data, passwords, file uploads)
  • The HTTP response body (page content)
  • HTTP headers (including cookies and auth tokens)
  • The URL path and query string (/secret/page?token=abc)

What TLS does NOT protect:

  • The destination IP address (visible to network observers)
  • The domain name via SNI (Server Name Indication -- sent in plaintext during the TLS handshake so the server knows which certificate to use)
  • DNS queries (unless using DoH/DoT, as we discussed above)
  • The amount of data transferred (traffic analysis)
  • The timing of requests (can reveal user behavior patterns)

This matters enormously. Even with HTTPS everywhere, an observer on your network can see: you connected to secretwhistleblower.org at 3:47 AM, exchanged 2.4 MB of data over 12 minutes, and the connection pattern suggests you uploaded a document. They can't read the document -- but the metadata alone is damaging.
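A quick way to internalize the split is to take a URL apart and sort the pieces. The sketch below does that with urllib.parse, assuming the hostname travels in a plaintext SNI field as described above. It ignores IP addresses, sizes and timing, and observer_view is a hypothetical name for this example.

```python
from urllib.parse import urlsplit

def observer_view(url: str) -> dict[str, list[str]]:
    """Split an HTTPS URL into what a network observer sees and what TLS hides."""
    parts = urlsplit(url)
    visible = [parts.hostname]      # leaked via SNI (and usually via DNS)
    encrypted = [parts.path or "/"]  # the path travels inside the TLS tunnel
    if parts.query:
        encrypted.append(parts.query)  # so does the query string
    return {"visible": visible, "encrypted": encrypted}

print(observer_view("https://secretwhistleblower.org/secret/page?token=abc"))
```

The observer never learns the path or the token, but the hostname alone is often the sensitive part.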

I want to drive this home because there's a common misconception that "HTTPS = safe." It's not. It's encrypted, which is a different thing. Encryption protects content. It doesn't protect metadata. And in security, metadata is often enough. As former NSA and CIA director Michael Hayden put it in 2014: "We kill people based on metadata." That's not hyperbole -- it's operational reality.

There's also certificate pinning and HSTS (HTTP Strict Transport Security) to be aware of. HSTS tells browsers "never connect to this domain over plain HTTP, always use HTTPS." Without HSTS, an attacker performing a man-in-the-middle attack can intercept the initial HTTP connection and prevent the upgrade to HTTPS entirely (this is called an SSL stripping attack, and we'll build one later in the series). HSTS largely prevents this -- but only if the browser has seen the header before, or the domain is on the browser's preload list.
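When you inspect a target's response headers, the HSTS policy is worth reading in detail. The sketch below parses a Strict-Transport-Security value into its three standard directives. The header value is a typical example and not taken from any specific site, and parse_hsts is a hypothetical helper name.

```python
def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(value.strip('"'))  # lifetime in seconds
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy

# Typical example value, for illustration only.
example = "max-age=31536000; includeSubDomains; preload"
print(parse_hsts(example))
```

A short max-age or a missing includeSubDomains tells you where an SSL stripping attempt still has room to work.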

Wireshark: Seeing the Invisible

Wireshark is a network protocol analyzer -- it captures and displays every packet on a network interface. If you're on the same network as the traffic, you can see it all (for unencrypted protocols). It's one of the most important tools in any security professional's arsenal, and it's completely free.

Open Wireshark on your Kali VM:

sudo wireshark &

Select the eth0 (or your host-only) interface and start capturing. Now from another terminal:

# Generate some HTTP traffic to Metasploitable
curl http://192.168.56.101/

In Wireshark, you'll see the full TCP conversation: the three-way handshake (SYN, SYN-ACK, ACK), the HTTP GET request, the HTTP 200 response with the full HTML body, and the connection teardown (a FIN from each side, each answered with an ACK). If you followed the Learn Python Series and remember our socket module episodes, this is the exact same handshake your code was doing under the hood.

Filter for HTTP traffic:

http

Click on any HTTP packet and Wireshark shows you everything: the Ethernet frame, the IP header (source/destination IPs), the TCP header (ports, sequence numbers), and the HTTP layer (method, path, headers, body).

Now try this -- log into DVWA from your browser:

Filter: http.request.method == POST

Find the login POST request. Expand it. You'll see the form data: username=admin&password=password. In plaintext. On the wire. On a real network without HTTPS, anyone running Wireshark on the same network would have just captured your credentials.

This is why HTTPS matters. And this is why security researchers need to understand what's visible at the packet level.
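Once an attacker has the raw POST body, turning it into usable credentials is a one-liner. A minimal sketch, using the same form fields DVWA's login sends (the variable names are mine):

```python
from urllib.parse import parse_qs

# Form body as it appears in the captured HTTP POST.
captured_body = "username=admin&password=password&Login=Login"

# parse_qs returns a list per field; keep the first value of each.
fields = {key: values[0] for key, values in parse_qs(captured_body).items()}
print(fields["username"], fields["password"])
```

No cracking, no brute force. The protocol simply handed it over.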

Let's also look at DNS:

Filter: dns

You'll see every DNS query your VM makes. Destination: the DNS server. Query: the domain name. Response: the IP address. All in plaintext. An attacker on your network (or your ISP, or a government) sees every domain you resolve.

Putting It All Together

Here we go -- from your Kali VM, let's do a quick exercise that combines everything we've covered:

# 1. DNS: what's Metasploitable's reverse DNS?
dig -x 192.168.56.101

# 2. TCP: what ports are open?
nmap -sT 192.168.56.101 -p 1-1000

# 3. HTTP: what server headers leak?
curl -sI http://192.168.56.101 | head -10

# 4. Capture it all in Wireshark
# (have Wireshark running while you do steps 1-3)

With these four steps, you've performed DNS reconnaissance, port scanning, banner grabbing, and packet capture. You know the target's IP, open services, software versions, and you have a full packet-level record of the interaction.
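For step 2, it helps to see how little is behind a TCP connect scan (what nmap -sT does per port, roughly speaking). This is a bare-bones sketch and nowhere near a replacement for Nmap. The function name tcp_port_open and the one-second timeout are my own choices.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a full TCP handshake with host:port succeeds."""
    try:
        # create_connection() completes the three-way handshake; leaving the
        # with-block closes the socket again, like a connect scan does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat as not open.
        return False
```

In the lab, tcp_port_open("192.168.56.101", 80) should return True for Metasploitable2, and if Wireshark is running you'll see the SYN, SYN-ACK, ACK for every port you try.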

That, fundamentally, is what the first phase of every penetration test looks like. And now you understand the protocols well enough to see why these techniques work. It's not magic. It's not even particularly clever. It's just understanding that networks leak information at every layer, and knowing where to look.

We'll formalize this into a proper methodology soon -- moving from ad hoc poking around to systematic information gathering.

Exercises

Exercise 1: Start a Wireshark capture on your Kali VM, then use curl to log into DVWA (curl -X POST http://localhost/dvwa/login.php -d "username=admin&password=password&Login=Login" -c /tmp/cookies.txt -v). If DVWA runs on the Kali VM itself, as the localhost URL assumes, capture on the loopback interface (lo), because traffic to localhost never touches eth0. In Wireshark, find the POST request and examine the form data. Take a screenshot showing the captured credentials in the packet. Then repeat the same exercise but against an HTTPS site (any real site -- curl -v https://example.com). Compare: what can you see in the encrypted version? Write 3-4 sentences explaining the difference.

Exercise 2: Write a Python script (using the socket module -- we covered this in the Learn Python Series!) that connects to a given IP and port, sends a minimal HTTP GET request, and prints the response headers. Run it against Metasploitable2 port 80. Then modify it to connect to port 21 (FTP), port 22 (SSH), and port 25 (SMTP) -- these services send banners automatically on connect. Collect all four banners in a file called ~/lab-notes/banners.txt.

import socket

def grab_banner(ip, port, timeout=3):
    """Connect to ip:port and return the first bytes the service sends."""
    try:
        # The context manager closes the socket even if recv() raises.
        with socket.create_connection((ip, port), timeout=timeout) as s:
            if port == 80:
                # HTTP servers wait for a request before they answer.
                request = f"GET / HTTP/1.1\r\nHost: {ip}\r\nConnection: close\r\n\r\n"
                s.sendall(request.encode())
            # FTP, SSH and SMTP send their banner right after the handshake.
            return s.recv(1024).decode(errors="replace")
    except OSError as e:
        return f"Error: {e}"

# Test against Metasploitable2
target = "192.168.56.101"
for port in [21, 22, 25, 80]:
    print(f"\n=== Port {port} ===")
    print(grab_banner(target, port)[:500])

Exercise 3: Using dig, perform a full DNS reconnaissance of hive.blog. Collect: A records, MX records, NS records, TXT records, and any CNAME records. Write a brief report (in ~/lab-notes/dns-recon-hive-blog.md) summarizing: what email provider they use, what CDN/hosting they use, what DNS provider they use, and what security measures (SPF/DKIM/DMARC) their TXT records reveal. Note: this is passive recon against a public service -- no authorization needed for DNS lookups.


Thanks for reading, and until next time!

@scipio


