Learn Ethical Hacking (#53) - Security Architecture - Designing Systems That Resist Attack

What will I learn
- Security architecture principles -- defense in depth, least privilege, zero trust, and the principle of economy of mechanism;
- Network segmentation -- designing networks that contain breaches instead of enabling lateral movement;
- Zero trust architecture -- the model where nothing is trusted by default, not even internal traffic;
- Identity-centric security -- why identity is the new perimeter in a cloud-first world;
- Secure system design patterns -- input validation boundaries, trust boundaries, and fail-safe defaults;
- Threat modeling -- systematically identifying what can go wrong BEFORE you build the system;
- STRIDE and PASTA -- structured threat modeling methodologies;
- Defense: security as a design constraint, not an afterthought.
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- Understanding of attack techniques from the full series;
- Basic understanding of network and system architecture;
- The ambition to learn ethical hacking and security research.
Difficulty
- Intermediate
Curriculum (of the Learn Ethical Hacking Series):
- Learn Ethical Hacking (#1) - Why Hackers Win
- Learn Ethical Hacking (#2) - Your Hacking Lab
- Learn Ethical Hacking (#3) - How the Internet Actually Works - For Attackers
- Learn Ethical Hacking (#4) - Reconnaissance - The Art of Not Being Noticed
- Learn Ethical Hacking (#5) - Active Scanning - Mapping the Attack Surface
- Learn Ethical Hacking (#6) - The AI Slop Epidemic - Why AI-Generated Code Is a Security Disaster
- Learn Ethical Hacking (#7) - Passwords - Why Humans Are the Weakest Cipher
- Learn Ethical Hacking (#8) - Social Engineering - Hacking the Human
- Learn Ethical Hacking (#9) - Cryptography for Hackers - What Protects Data (and What Doesn't)
- Learn Ethical Hacking (#10) - The Vulnerability Lifecycle - From Discovery to Patch to Exploit
- Learn Ethical Hacking (#11) - HTTP Deep Dive - Request Smuggling and Header Injection
- Learn Ethical Hacking (#12) - SQL Injection - The Bug That Won't Die
- Learn Ethical Hacking (#13) - SQL Injection Advanced - Extracting Entire Databases
- Learn Ethical Hacking (#14) - Cross-Site Scripting (XSS) - Injecting Code Into Browsers
- Learn Ethical Hacking (#15) - XSS Advanced - Bypassing Filters and CSP
- Learn Ethical Hacking (#16) - Cross-Site Request Forgery - Making Users Attack Themselves
- Learn Ethical Hacking (#17) - Authentication Bypass - Getting In Without a Password
- Learn Ethical Hacking (#18) - Server-Side Request Forgery - Making Servers Betray Themselves
- Learn Ethical Hacking (#19) - Insecure Deserialization - Code Execution via Data
- Learn Ethical Hacking (#20) - File Upload Vulnerabilities - When Users Upload Weapons
- Learn Ethical Hacking (#21) - API Security - The New Attack Surface
- Learn Ethical Hacking (#22) - Business Logic Flaws - When the Code Works But the Logic Doesn't
- Learn Ethical Hacking (#23) - Client-Side Attacks - Beyond XSS
- Learn Ethical Hacking (#24) - Content Management Systems - Hacking WordPress and Friends
- Learn Ethical Hacking (#25) - Web Application Firewalls - Bypassing the Guards
- Learn Ethical Hacking (#26) - The Full Web Pentest - Methodology and Reporting
- Learn Ethical Hacking (#27) - Bug Bounty Hunting - Getting Paid to Hack the Web
- Learn Ethical Hacking (#28) - The AI Web Attack Surface - AI Features as Vulnerabilities
- Learn Ethical Hacking (#29) - Network Sniffing - Seeing Everything on the Wire
- Learn Ethical Hacking (#30) - Wireless Network Attacks - Breaking Wi-Fi
- Learn Ethical Hacking (#31) - Privilege Escalation - Linux
- Learn Ethical Hacking (#32) - Privilege Escalation - Windows
- Learn Ethical Hacking (#33) - Active Directory Attacks - The Crown Jewels
- Learn Ethical Hacking (#34) - Pivoting and Lateral Movement - Spreading Through Networks
- Learn Ethical Hacking (#35) - Cloud Security - AWS Attack and Defense
- Learn Ethical Hacking (#36) - Cloud Security - Azure and GCP
- Learn Ethical Hacking (#37) - Container Security - Docker and Kubernetes Attacks
- Learn Ethical Hacking (#38) - Infrastructure as Code - Securing the Automation
- Learn Ethical Hacking (#39) - Email Security - Phishing Infrastructure and Defense
- Learn Ethical Hacking (#40) - DNS Attacks - Exploiting the Internet's Foundation
- Learn Ethical Hacking (#41) - Exploitation Frameworks - Metasploit and Cobalt Strike
- Learn Ethical Hacking (#42) - Custom Exploit Development - Writing Your Own
- Learn Ethical Hacking (#43) - Exploit Development Advanced - Modern Mitigations and Bypasses
- Learn Ethical Hacking (#44) - Reverse Engineering - Understanding Binaries
- Learn Ethical Hacking (#45) - Supply Chain Attacks - Poisoning the Source
- Learn Ethical Hacking (#46) - The Human Factor - Why Security Training Fails
- Learn Ethical Hacking (#47) - Physical Security and OSINT - The Forgotten Attack Vectors
- Learn Ethical Hacking (#48) - Insider Threats - When the Call Is Coming from Inside the House
- Learn Ethical Hacking (#49) - Deepfakes and AI Deception - The New Social Engineering
- Learn Ethical Hacking (#50) - Red Team Operations - Simulating Real Attacks
- Learn Ethical Hacking (#51) - Incident Response - When Things Go Wrong
- Learn Ethical Hacking (#52) - Threat Intelligence - Knowing Your Enemy
- Learn Ethical Hacking (#53) - Security Architecture - Designing Systems That Resist Attack (this post)
Solutions to Episode 52 Exercises
Exercise 1: MISP setup (abbreviated).
# Docker deployment
git clone https://github.com/MISP/misp-docker
cd misp-docker && docker compose up -d
# Imported CIRCL OSINT feed: 14,000+ events with IOCs
# Event structure: attributes (IPs, hashes, domains) grouped
# into events with tags, galaxies, and correlation
# Export formats tested:
# - CSV: 3,200 IP indicators exported
# - Snort: 890 rules generated from network IOCs
# - STIX 2.1: full structured export for TAXII sharing
The event-based organisation in MISP is one of those things that seems obvious once you see it but isn't. Individual IOCs (an IP address, a hash, a domain) are meaningless in isolation. Grouping them into events ("this IP, these 3 hashes, and that domain were all part of the same APT29 campaign targeting energy companies in March 2026") gives you context. And context is what separates threat intelligence from a blocklist. The CIRCL OSINT feed is an excellent starting point because it comes pre-correlated -- you get real campaign events with real attribution, not just a dump of random IP addresses.
Exercise 2: APT28 threat profile (abbreviated).
APT28 / Fancy Bear / Sofacy / STRONTIUM
Attribution: Russian GRU (Unit 26165)
Active since: ~2004
Target sectors: government, military, defense, media, political
Known campaigns:
2016: DNC hack (US election interference)
2016: WADA breach (athlete medical records leaked)
2018: OPCW (chemical weapons watchdog)
2020: COVID-19 vaccine research targeting
Top 10 TTPs (ATT&CK):
T1566.001 Spearphishing Attachment
T1059.001 PowerShell
T1053.005 Scheduled Task
T1071.001 HTTPS C2
T1003.001 LSASS Memory
T1027 Obfuscated Files
T1078 Valid Accounts
T1021.002 SMB/Windows Admin Shares
T1036 Masquerading
T1560.001 Archive via Utility
Tools: X-Agent, Zebrocy, Komplex, LoJax (UEFI rootkit)
APT28 is arguably the most well-documented threat actor in the public record, which makes it a perfect first target for a threat profile exercise. The TTP list reads like a greatest hits compilation of everything we've covered in this series: spearphishing (episode 39), PowerShell execution (episode 31/32), scheduled task persistence (episode 51's persistence checklist), LSASS credential dumping (episode 33), lateral movement via SMB (episode 34). That LoJax UEFI rootkit is worth noting separately -- it's a firmware-level implant that survives complete OS reinstallation and even hard drive replacement. The only way to eradicate it is to reflash the UEFI firmware. That's a level of persistence most IR teams have never dealt with.
Exercise 3: Pyramid of Pain detection rules (abbreviated).
Hash: alert on SHA256 of known Cobalt Strike beacon
Useful for: hours (attacker recompiles, new hash)
IP: alert on connections to 203.0.113.42
Useful for: days (attacker changes VPS)
Domain: alert on DNS queries to c2.evil-domain.com
Useful for: days-weeks (new domain costs $10)
Tool: alert on named pipe \\.\pipe\msagent_* (CS default)
Useful for: weeks-months (requires config change)
TTP: alert on any process accessing lsass.exe memory
Useful for: months-years (fundamental technique)
The progression is the entire point. Hash detection expires in hours. TTP detection lasts for years. But the reason most organizations stay at the bottom of the pyramid is that building TTP-based detection is HARD. A hash match is a string comparison. A TTP detection rule requires understanding what normal behavior looks like on your network and then identifying deviations. The LSASS access rule is a good example -- every Windows system has legitimate processes that touch LSASS memory (credential providers, security agents, the OS itself). Your rule needs to distinguish between those legitimate accesses and an attacker running Mimikatz. That requires tuning, baselining, and ongoing maintenance. It's worth the effort, but it's not free.
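To make the baselining idea concrete, here is a minimal sketch of what "distinguish legitimate accesses from Mimikatz" can look like in code. Everything in it is an assumption for illustration: the event format, the `BASELINE` set, and the process paths (including the EDR agent path) are made up, and a real rule would come from your own telemetry.

```python
"""lsass_baseline.py -- hypothetical sketch of baselined TTP detection"""

# Assumption: processes observed touching lsass.exe during a quiet
# baselining period. A real allowlist comes from your own telemetry.
BASELINE = {
    r"c:\windows\system32\svchost.exe",
    r"c:\windows\system32\wininit.exe",
    r"c:\program files\edr-agent\agent.exe",  # hypothetical EDR path
}

def suspicious_lsass_access(events, baseline=BASELINE):
    """Return events where a non-baselined process accessed lsass.exe."""
    alerts = []
    for event in events:
        if event["target"].lower() != "lsass.exe":
            continue                      # not an LSASS access at all
        if event["source"].lower() not in baseline:
            alerts.append(event)          # unknown process touching LSASS
    return alerts

# Made-up events in a made-up format
events = [
    {"source": r"C:\Windows\System32\svchost.exe", "target": "lsass.exe"},
    {"source": r"C:\Users\bob\Downloads\tool.exe", "target": "lsass.exe"},
    {"source": r"C:\Users\bob\Downloads\tool.exe", "target": "notepad.exe"},
]

for alert in suspicious_lsass_access(events):
    print(f"ALERT: {alert['source']} accessed {alert['target']}")
```

A path-only allowlist is easy to defeat with masquerading (T1036 from the APT28 list), so a production rule would also look at things like the code signer and the requested access rights. The sketch only shows where the tuning effort goes: into the baseline, not into the comparison.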
Episode 52 covered threat intelligence -- the discipline that transforms security from reactive ("something happened, let's respond") to proactive ("we know who's targeting us and what they'll use"). We walked through the three levels of intelligence (strategic for the CISO, operational for the SOC lead, tactical for the analyst), the Pyramid of Pain (why detecting at the TTP level hurts attackers more than blocking hashes), threat intelligence platforms (MISP for IOC operations, OpenCTI for analysis and knowledge graphs), the STIX/TAXII standards for machine-readable intelligence sharing, the six-phase intelligence cycle (direction, collection, processing, analysis, dissemination, feedback), and threat-informed defense -- using intelligence to focus your limited resources on the specific threats that actually target your organization instead of trying to defend against everything simultaneously. The core takeaway: intelligence without action is trivia, and action without intelligence is noise.
Today we go one level deeper.
Everything we've discussed so far -- finding vulnerabilities, exploiting them, responding to incidents, gathering intelligence about who's targeting you -- operates on systems that already exist. Security architecture asks a fundamentally different question: can you design the system so that a breach in one component does NOT cascade into a total compromise?
You can find and fix every vulnerability. You can build the most sophisticated detection rules. You can have a world-class incident response team. And if the underlying architecture is a flat network with no segmentation, shared credentials across tiers, and applications that implicitly trust all internal traffic, you will still lose. One compromised workstation becomes domain admin in 45 minutes (we proved this in episodes 33 and 34). One SSRF gives the attacker access to the cloud metadata service and from there to every S3 bucket in the account (episode 35).
Security architecture is the discipline of making that NOT happen.
Here we go.
The Foundational Principles
Before diving into specific architectures, there are a handful of principles that have survived every technological shift from mainframes to cloud-native microservices. Most of them were formulated by Saltzer and Schroeder in 1975, and they are as valid today as they were half a century ago:
#!/usr/bin/env python3
"""security_principles.py -- the timeless design principles"""
PRINCIPLES = {
'defense_in_depth': {
'statement': 'Multiple independent layers of security',
'why': 'No single control is 100% effective. Each layer '
'catches what the previous layer missed. The attacker '
'must defeat ALL layers, not just one.',
'example': 'Firewall + IDS + EDR + application input validation '
'+ database encryption + backup verification. The '
'attacker bypasses the firewall via VPN creds (ep 17), '
'but the EDR catches the post-exploitation.',
'failure_mode': 'When layers are not independent. If your WAF '
'and your application both rely on the same regex '
'for input validation, bypassing one bypasses both.',
},
'least_privilege': {
'statement': 'Every component gets the MINIMUM permissions needed',
'why': 'Limits the blast radius of a compromise. If the web '
'server only has SELECT on the product table, SQL '
'injection cannot DROP the database.',
'example': 'Web server: read-only DB access. CI/CD pipeline: '
'deploy to staging only, not production. Monitoring '
'agent: read-only access to logs, cannot modify them.',
'failure_mode': 'When convenience wins over security. The "just '
'give it admin access so it stops complaining" '
'approach. A recurring pattern in cloud IAM misconfigs.',
},
'fail_safe_defaults': {
'statement': 'Default to deny. Access is explicitly granted.',
'why': 'A new firewall rule blocks traffic until allowed. A new '
'user has zero permissions until granted. A new API '
'endpoint requires authentication unless explicitly public.',
'example': 'A default-deny Kubernetes NetworkPolicy: all ingress is '
'blocked until a pod explicitly declares what it accepts.',
'failure_mode': 'Default-allow systems. Most cloud platforms start '
'with wide-open security groups (allow all outbound). '
'Azure NSGs allow intra-VNET traffic by default.',
},
'complete_mediation': {
'statement': 'Every access to every resource is checked',
'why': 'Authorization at the front door is not enough. Every '
'API endpoint, every file access, every database query '
'must verify permissions independently.',
'example': 'The IDOR bugs we found in episode 21 -- the API '
'checked authentication (are you logged in?) but not '
'authorization (are you allowed to access THIS record?). '
'Complete mediation would have prevented them all.',
'failure_mode': 'Caching authorization decisions. The token was '
'valid when issued, but the user was fired 3 hours '
'ago. Without re-checking, the revoked user still '
'has access.',
},
'economy_of_mechanism': {
'statement': 'Keep security-critical code as simple as possible',
'why': 'Complex code has more bugs. More bugs means more '
'vulnerabilities. The simplest correct implementation '
'is the most secure.',
'example': 'A 50-line authentication module that uses bcrypt is '
'more secure than a 5,000-line custom crypto framework. '
'The 5,000-line version might be more "feature-rich" '
'but it has 100x more attack surface.',
'failure_mode': 'Overengineering. Adding complexity for "flexibility" '
'that nobody asked for. Every line of code in the '
'security path is a potential bug.',
},
'separation_of_privilege': {
'statement': 'No single credential grants full access',
'why': 'If one key opens every door, stealing one key gives '
'the attacker everything. Separate credentials for '
'separate trust domains.',
'example': 'Dual authorization for wire transfers (episode 49). '
'Separate admin accounts for workstations vs domain '
'controllers (tiered admin model from episode 33).',
'failure_mode': 'Shared service accounts. The same credential that '
'the web app uses for the database is also used by '
'the backup system and the monitoring agent.',
},
}
print("=== Saltzer-Schroeder Security Design Principles ===\n")
for name, data in PRINCIPLES.items():
    label = name.replace('_', ' ').title()
    print(f"[{label}]")
    print(f" Principle: {data['statement']}")
    print(f" Why: {data['why']}")
    print(f" Example: {data['example']}")
    print(f" Failure: {data['failure_mode']}")
    print()
The reason I keep referring back to episodes in this series is that almost every attack we've studied is an architectural failure, not just a code bug. SQL injection (episode 12) is a failure of input validation at trust boundaries. Kerberoasting (episode 33) is a failure of separation of privilege (service accounts with unnecessary rights). SSRF to cloud metadata (episode 35) is a failure of least privilege (the web application had implicit access to AWS IAM). The individual vulnerabilities are symptoms. The architectural failures are the disease.
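Complete mediation is the principle that is easiest to show in code. The sketch below is a toy, not a real framework: `ACTIVE_USERS`, `INVOICES`, `AccessDenied` and `get_invoice` are hypothetical names invented for this example. It checks two things on every single call: is this identity still valid right now, and does it own THIS record.

```python
"""complete_mediation.py -- hypothetical per-request authorization sketch"""

# Assumption: in-memory stand-ins for a user directory and a record store.
ACTIVE_USERS = {"alice", "bob"}
INVOICES = {
    101: {"owner": "alice", "amount": 250},
    102: {"owner": "bob", "amount": 990},
}

class AccessDenied(Exception):
    """Raised whenever a request fails an authorization check."""

def get_invoice(user, invoice_id):
    """Check account status AND record ownership on every call."""
    if user not in ACTIVE_USERS:          # re-checked per request, never cached
        raise AccessDenied(f"{user} is not an active user")
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != user:
        # Same error for "missing" and "not yours": no record enumeration
        raise AccessDenied(f"{user} may not read invoice {invoice_id}")
    return invoice

print(get_invoice("alice", 101))          # owner reads own record

try:
    get_invoice("alice", 102)             # classic IDOR attempt
except AccessDenied as exc:
    print(f"DENIED: {exc}")

ACTIVE_USERS.discard("bob")               # bob is offboarded
try:
    get_invoice("bob", 102)               # old session, access now revoked
except AccessDenied as exc:
    print(f"DENIED: {exc}")
```

The design choice that matters is where the check lives: inside the function that touches the data, not in a login handler three layers up. That is what makes it impossible to reach the record without passing the check.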
Defense in Depth -- Layers That Actually Work
Defense in depth is the oldest security principle and still the most important. But there's a subtle distinction between having multiple layers and having multiple independent layers:
#!/usr/bin/env python3
"""defense_layers.py -- what each layer catches and what it misses"""
SECURITY_LAYERS = [
{
'layer': 'Perimeter',
'components': ['Firewall', 'WAF', 'Email gateway', 'DDoS protection'],
'catches': 'Known-bad traffic, mass scanning, spam, obvious attacks',
'misses': 'Authenticated attackers, 0-days, encrypted C2, '
'credential-based access via VPN',
'bypass_episodes': [17, 25, 39],
'bypass_desc': 'Auth bypass (ep 17) gets past the firewall with '
'valid creds. WAF bypass (ep 25) delivers payloads '
'the WAF cannot parse. Phishing (ep 39) delivers '
'malware through email.',
},
{
'layer': 'Network',
'components': ['IDS/IPS', 'Network segmentation', 'VLAN firewalls',
'Network monitoring'],
'catches': 'Lateral movement between segments, unusual traffic '
'patterns, protocol anomalies, C2 beaconing',
'misses': 'Traffic within the same segment, encrypted C2 '
'channels, DNS tunneling, living-off-the-land techniques',
'bypass_episodes': [29, 34, 40],
'bypass_desc': 'If the network is flat (no segments), there is '
'nothing to catch. Lateral movement (ep 34) stays '
'within the same VLAN. DNS tunneling (ep 40) looks '
'like normal DNS queries.',
},
{
'layer': 'Host',
'components': ['EDR', 'OS hardening', 'Application whitelisting',
'Patch management', 'Local firewall'],
'catches': 'Malware execution, privilege escalation attempts, '
'suspicious process behavior, known exploit patterns',
'misses': 'Fileless malware in memory, LOLBin abuse, '
'zero-day kernel exploits, supply chain compromised '
'legitimate software',
'bypass_episodes': [31, 32, 43, 45],
'bypass_desc': 'Privilege escalation (ep 31/32) can use '
'legitimate binaries. Modern exploit dev (ep 43) '
'bypasses EDR hooks. Supply chain (ep 45) delivers '
'malware through trusted software.',
},
{
'layer': 'Application',
'components': ['Input validation', 'Authentication', 'Authorization',
'Session management', 'CSRF protection'],
'catches': 'SQL injection, XSS, CSRF, auth bypass, IDOR, '
'file upload attacks',
'misses': 'Logic flaws that pass all validation, race conditions, '
'deserialization attacks on trusted input, bugs in '
'the validation logic itself',
'bypass_episodes': [12, 14, 18, 19, 22],
'bypass_desc': 'Every web attack episode (12-25) is an '
'application layer bypass. Business logic flaws '
'(ep 22) pass all technical checks.',
},
{
'layer': 'Data',
'components': ['Encryption at rest', 'Encryption in transit',
'DLP', 'Access controls', 'Backup verification'],
'catches': 'Data theft when other layers fail (encrypted data '
'is useless without keys), unauthorized data movement, '
'accidental exposure',
'misses': 'Authorized users exfiltrating data they legitimately '
'have access to (insider threat, ep 48), key compromise, '
'unencrypted backups',
'bypass_episodes': [48],
'bypass_desc': 'Insider threats (ep 48) have legitimate access. '
'DLP catches mass downloads but not slow trickle. '
'Encrypted data is only safe if keys are safe.',
},
]
print("=== Defense in Depth -- Layer Analysis ===\n")
for layer in SECURITY_LAYERS:
    eps = ', '.join(str(e) for e in layer['bypass_episodes'])
    print(f"--- {layer['layer']} Layer ---")
    print(f" Components: {', '.join(layer['components'])}")
    print(f" Catches: {layer['catches']}")
    print(f" Misses: {layer['misses']}")
    print(f" Bypass episodes: {eps}")
    print(f" Detail: {layer['bypass_desc']}")
    print()
print("No single layer is sufficient. Each layer catches what the")
print("previous layer missed. The attacker must defeat ALL layers.")
The critical insight is the "misses" column. Every layer has predictable failure modes, and a skilled attacker (the kind we've been training to be for 52 episodes) knows exactly what each layer catches and what it doesn't. That is why the layers must be independent. If your perimeter and your application both rely on the same input validation logic, bypassing that logic bypasses two layers simultaneously. Independent layers force the attacker to use different techniques for each one -- and the more techniques they have to chain together, the more opportunities you have to detect them.
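A bit of back-of-the-envelope arithmetic shows why independence matters so much. If each layer catches an attacker with probability p_i and the layers fail independently, the chance that at least one of them fires is 1 minus the product of all (1 - p_i). The numbers below are invented purely for illustration (nobody can measure a "50% catch rate" this neatly), and `p_detect` is a hypothetical helper name.

```python
"""layer_math.py -- illustrative arithmetic, all probabilities are made up"""
from math import prod

def p_detect(layer_probs):
    """Chance that at least one INDEPENDENT layer catches the attacker."""
    return 1 - prod(1 - p for p in layer_probs)

# Assumption: per-layer catch rates chosen purely for illustration.
layers = {"perimeter": 0.5, "network": 0.5, "host": 0.5,
          "application": 0.5, "data": 0.5}

independent = p_detect(layers.values())

# If perimeter and application share the same validation logic, one
# bypass defeats both, so together they count as a single layer.
shared = dict(layers)
del shared["application"]
correlated = p_detect(shared.values())

print(f"5 independent layers: {independent:.3f}")
print(f"4 effective layers:   {correlated:.3f}")
```

With these made-up rates, the attacker's chance of slipping through everything doubles (from 1 in 32 to 1 in 16) the moment two layers share a failure mode. You paid for five layers and got four.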
Network Segmentation -- Containing the Blast Radius
Episode 34 showed how attackers exploit flat networks for lateral movement. One compromised workstation leads to the file server, which leads to the domain controller, which leads to everything. Segmentation is the architectural counter:
Segmented network design:
Internet
    |
  [DMZ] ---------- Web servers, reverse proxies, email relay
    |              (can talk to internet + limited internal)
    |
  [Firewall/Router]
    |
  [User VLAN] ---- Workstations (VLAN 10)
    |              Can access: App VLAN on ports 443/8443
    |              CANNOT access: DB VLAN, Mgmt VLAN
    |
  [App VLAN] ----- Application servers, API backends (VLAN 20)
    |              Can access: DB VLAN on ports 5432/3306
    |              CANNOT access: User VLAN, Mgmt VLAN
    |
  [DB VLAN] ------ Database servers (VLAN 30)
    |              Accept connections: ONLY from App VLAN
    |              CANNOT initiate: any outbound connection
    |
  [Mgmt VLAN] ---- Admin workstations, jump boxes (VLAN 40)
                   ONLY way to access infrastructure management
                   Requires separate admin credentials (tiered admin)
                   MFA enforced on all management access
Rules:
- Users CANNOT reach databases directly
- DMZ CANNOT reach internal network directly
- Databases CANNOT initiate outbound connections
- Management access ONLY from dedicated jump boxes
- ALL inter-VLAN traffic is logged and inspected
This design directly defeats the attack chain from episodes 33-34. In a flat network, the attacker goes from compromised workstation to domain controller in one hop. In a segmented network, the workstation (VLAN 10) cannot reach the management interfaces of the domain controller (VLAN 40) at all; at most it reaches the authentication ports a domain-joined machine needs (Kerberos, LDAP, DNS). To administer anything, the attacker would need to compromise an application server first (VLAN 20), then somehow reach the management VLAN -- which requires separate credentials and MFA.
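The rules above translate almost one-to-one into a default-deny allowlist. Here is a small sketch under stated assumptions: the zone names are shorthand I made up, only the two flows the diagram gives ports for are modelled (DMZ and management flows are left out because the diagram does not list their ports), and `is_allowed` / `reachable_zones` are hypothetical helpers, not a real firewall API.

```python
"""segmentation_rules.py -- the VLAN rules above as a default-deny allowlist"""

# (source zone, destination zone) -> allowed TCP ports. Anything absent
# is denied. Zone names are shorthand invented for this sketch.
ALLOWED_FLOWS = {
    ("user", "app"): {443, 8443},
    ("app", "db"): {5432, 3306},
}

def is_allowed(src, dst, port):
    """Default deny: a flow passes only if it is explicitly listed."""
    return port in ALLOWED_FLOWS.get((src, dst), set())

def reachable_zones(start):
    """Zones an attacker could reach by hopping through allowed flows."""
    seen, frontier = {start}, [start]
    while frontier:
        zone = frontier.pop()
        for (src, dst) in ALLOWED_FLOWS:
            if src == zone and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen - {start}

print(is_allowed("user", "app", 443))    # workstation -> app server
print(is_allowed("user", "db", 5432))    # workstation -> database
print(is_allowed("db", "app", 443))      # database initiating outbound
print(sorted(reachable_zones("user")))   # multi-hop exposure from a workstation
```

The reachability function is the interesting part. It shows what segmentation does and does not buy you: the database is still reachable from a workstation, but only by first compromising an application server, and every extra hop is an inter-VLAN flow that gets logged and inspected.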
Micro-segmentation -- Beyond VLANs
Traditional VLANs segment at the network level. Micro-segmentation segments at the workload level -- every server, container, or pod has its own security policy:
#!/usr/bin/env python3
"""microsegmentation.py -- Kubernetes NetworkPolicy example"""
import json
# Kubernetes NetworkPolicy -- deny all ingress by default,
# then explicitly allow only what each pod needs
network_policies = {
'default_deny': {
'apiVersion': 'networking.k8s.io/v1',
'kind': 'NetworkPolicy',
'metadata': {
'name': 'default-deny-all',
'namespace': 'production',
},
'spec': {
'podSelector': {}, # applies to ALL pods
'policyTypes': ['Ingress', 'Egress'],
# No ingress or egress rules = deny everything
},
},
'allow_frontend_to_api': {
'apiVersion': 'networking.k8s.io/v1',
'kind': 'NetworkPolicy',
'metadata': {
'name': 'allow-frontend-to-api',
'namespace': 'production',
},
'spec': {
'podSelector': {
'matchLabels': {'app': 'api-backend'},
},
'policyTypes': ['Ingress'],
'ingress': [{
'from': [{
'podSelector': {
'matchLabels': {'app': 'web-frontend'},
}
}],
'ports': [{'protocol': 'TCP', 'port': 8080}],
}],
},
},
'allow_api_to_db': {
'apiVersion': 'networking.k8s.io/v1',
'kind': 'NetworkPolicy',
'metadata': {
'name': 'allow-api-to-database',
'namespace': 'production',
},
'spec': {
'podSelector': {
'matchLabels': {'app': 'postgres-db'},
},
'policyTypes': ['Ingress'],
'ingress': [{
'from': [{
'podSelector': {
'matchLabels': {'app': 'api-backend'},
}
}],
'ports': [{'protocol': 'TCP', 'port': 5432}],
}],
},
},
}
for name, policy in network_policies.items():
    label = name.replace('_', ' ').title()
    print(f"=== {label} ===")
    print(json.dumps(policy, indent=2))
    print()
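# Result: even if an attacker compromises the web frontend,
# the ONLY workload that accepts its traffic is the API backend on port 8080.
# They cannot scan the network. They cannot reach the database.
# They cannot reach other namespaces. Blast radius: minimal.
# Note: default-deny-all also blocks Egress, so each allowed flow still
# needs a matching egress rule on the source pod (plus DNS) to function.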
This is the container security concept from episode 37 applied as an architectural pattern. Most Kubernetes deployments ship without NetworkPolicies -- meaning every pod can talk to every other pod on every port. That's a flat network inside your cluster. Adding default-deny-all and then whitelisting specific pod-to-pod communication is the micro-segmentation equivalent of VLAN firewall rules, but at the individual workload level.
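One consequence of listing Egress in a default-deny policy deserves a closer look: a connection only goes through if the source pod is allowed to send it AND the destination pod is allowed to receive it. The sketch below is a deliberately simplified model of that evaluation, not the real Kubernetes implementation. It only handles `podSelector` peers inside one namespace, it assumes every rule spells out both peers and ports, and the `allow-frontend-egress` policy name and all function names are hypothetical.

```python
"""netpol_check.py -- simplified NetworkPolicy evaluation (one namespace)"""

def selects(selector, labels):
    """Empty selector matches every pod; otherwise all matchLabels must match."""
    return all(labels.get(k) == v
               for k, v in selector.get("matchLabels", {}).items())

def direction_ok(policies, pod, peer, port, direction):
    """Evaluate one side of a connection ('Ingress' or 'Egress')."""
    rules_key, peer_key = (("ingress", "from") if direction == "Ingress"
                           else ("egress", "to"))
    applicable = [p["spec"] for p in policies
                  if direction in p["spec"]["policyTypes"]
                  and selects(p["spec"]["podSelector"], pod)]
    if not applicable:
        return True                      # pod not isolated in this direction
    for spec in applicable:              # allow rules are additive
        for rule in spec.get(rules_key, []):
            peers_ok = any(selects(x["podSelector"], peer)
                           for x in rule[peer_key])
            ports_ok = any(x["port"] == port for x in rule["ports"])
            if peers_ok and ports_ok:
                return True
    return False

def can_connect(policies, src, dst, port):
    """Traffic flows only if egress at src AND ingress at dst both allow it."""
    return (direction_ok(policies, src, dst, port, "Egress")
            and direction_ok(policies, dst, src, port, "Ingress"))

def allow(name, direction, pod_app, peer_app, port):
    """Build a one-rule policy dict shaped like a NetworkPolicy manifest."""
    rules_key, peer_key = (("ingress", "from") if direction == "Ingress"
                           else ("egress", "to"))
    peer = {"podSelector": {"matchLabels": {"app": peer_app}}}
    return {"metadata": {"name": name}, "spec": {
        "podSelector": {"matchLabels": {"app": pod_app}},
        "policyTypes": [direction],
        rules_key: [{peer_key: [peer],
                     "ports": [{"protocol": "TCP", "port": port}]}]}}

default_deny = {"metadata": {"name": "default-deny-all"}, "spec": {
    "podSelector": {}, "policyTypes": ["Ingress", "Egress"]}}

frontend = {"app": "web-frontend"}
api = {"app": "api-backend"}
db = {"app": "postgres-db"}

ingress_only = [default_deny,
                allow("allow-frontend-to-api", "Ingress",
                      "api-backend", "web-frontend", 8080)]
with_egress = ingress_only + [
    allow("allow-frontend-egress", "Egress",        # hypothetical policy
          "web-frontend", "api-backend", 8080)]

print(can_connect(ingress_only, frontend, api, 8080))  # egress still denied
print(can_connect(with_egress, frontend, api, 8080))   # both sides allow it
print(can_connect(with_egress, frontend, db, 5432))    # frontend -> database
```

In a real cluster you would also need an egress rule for DNS, otherwise the frontend cannot even resolve the API service name. That is the most common surprise when people switch on default-deny egress for the first time.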
Zero Trust Architecture
Zero trust is the model where nothing is trusted by default -- not the network, not the user, not the device, not even traffic from inside the corporate perimeter:
#!/usr/bin/env python3
"""zero_trust.py -- the model where nothing is trusted"""
ZERO_TRUST_PILLARS = {
'identity': {
'principle': 'Verify every user, every time',
'implementation': [
'Strong MFA (FIDO2/WebAuthn preferred, not SMS)',
'Continuous authentication (not just at login)',
'Risk-based access (location, device, behavior)',
'Session timeout and re-authentication for '
'sensitive operations',
],
'replaces': 'VPN login = trusted for 8 hours. In zero trust, '
'trust is continuous, not binary.',
},
'device': {
'principle': 'Verify every device before granting access',
'implementation': [
'Device health checks (OS patched, EDR running, '
'disk encrypted, compliant with policy)',
'Device certificates (managed vs unmanaged)',
'Conditional access: non-compliant device = '
'read-only access or no access at all',
],
'replaces': 'If you are on the VPN, your device is trusted. '
'In zero trust, a jailbroken phone on the VPN '
'gets no access regardless.',
},
'network': {
'principle': 'Do not trust the network, even internal',
'implementation': [
'Micro-segmentation (as above)',
'Encrypted traffic everywhere (even internal)',
'Software-defined perimeter (SDP)',
'No implicit trust based on IP range',
],
'replaces': 'Inside the 10.0.0.0/8 network = trusted. In zero '
'trust, a packet from 10.0.1.50 is treated the same '
'as a packet from the internet.',
},
'application': {
'principle': 'Authorize every request independently',
'implementation': [
'Per-request authorization (not just per-session)',
'Application-level access proxies',
'Just-in-time access (temporary permissions)',
'Least privilege by default',
],
'replaces': 'VPN gets you network access to the application. '
'In zero trust, you get application access without '
'network access. You connect to the app, not the net.',
},
'data': {
'principle': 'Protect data regardless of location',
'implementation': [
'Classification labels on all data',
'Encryption in transit and at rest',
'DLP policies based on classification',
'Access logging for all data operations',
],
'replaces': 'Data inside the perimeter is safe. In zero trust, '
'data is protected by its own controls regardless '
'of where it lives.',
},
}
print("=== Zero Trust Architecture -- Five Pillars ===\n")
for pillar, data in ZERO_TRUST_PILLARS.items():
    print(f"--- {pillar.upper()} ---")
    print(f" Principle: {data['principle']}")
    print(f" Replaces: {data['replaces']}")
    print(f" Implementation:")
    for item in data['implementation']:
        print(f" - {item}")
    print()
The shift from "castle and moat" to zero trust is fundamentally a shift in what you trust. The old model trusted the network -- if you were inside the firewall (or connected via VPN), you were trusted. The problem (as we proved in episodes 33-34) is that once an attacker gets inside the firewall, they have full network access. Zero trust removes the concept of "inside" entirely. Every request is authenticated, every device is verified, every access is authorized independently. Google's BeyondCorp implementation is the reference architecture -- they eliminated the corporate VPN entirely and replaced it with an identity-aware proxy that grants per-application access based on user identity plus device posture. No VPN, no network access, just direct application access after continuous verification.
Having said that, zero trust is not a product you can buy. It's an architectural philosophy. Vendors will sell you "zero trust solutions" that are just VPNs with a marketing refresh. The real test: does your system still work if the internal network is completely compromised? If the answer is "no, the attacker would have access to everything," you don't have zero trust regardless of what the vendor told you ;-)
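To show what "authorize every request independently" means in practice, here is a minimal sketch of a policy decision function. It is an illustration under assumptions, not how BeyondCorp or any vendor product works: the `Request` fields, the two sensitivity levels, and the `decide` rules are all invented for this example.

```python
"""access_decision.py -- hypothetical per-request zero trust policy check"""
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user_authenticated: bool   # valid identity presented with this request
    mfa_verified: bool         # strong MFA completed recently
    device_compliant: bool     # patched, EDR running, disk encrypted
    source_ip: str             # kept for logging, never used for trust
    sensitivity: str           # "low" or "high" (assumed classification)

def decide(req: Request) -> str:
    """Return 'allow', 'read-only' or 'deny' for a single request."""
    if not req.user_authenticated:
        return "deny"
    if req.sensitivity == "high" and not req.mfa_verified:
        return "deny"
    if not req.device_compliant:
        # Conditional access: degrade or refuse, never silently trust
        return "deny" if req.sensitivity == "high" else "read-only"
    return "allow"

# Same user, same MFA status, same resource. Only the device differs.
internal = Request(True, True, False, "10.0.1.50", "high")
external = Request(True, True, True, "198.51.100.7", "high")

print(decide(internal))   # internal address, non-compliant device
print(decide(external))   # internet address, healthy device
```

Note what is missing from `decide`: the source IP never appears in any condition. The request from inside 10.0.0.0/8 is refused and the one from the internet is allowed, which is exactly the inversion of the castle-and-moat model.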
Threat Modeling -- Finding Problems Before You Build Them
Threat modeling is the process of systematically identifying threats to a system BEFORE it is built. It answers the question: what can go wrong?
#!/usr/bin/env python3
"""threat_modeling.py -- STRIDE applied to a web application"""
STRIDE_THREATS = {
'Spoofing': {
'question': 'Can someone pretend to be a legitimate user or system?',
'examples': [
'Attacker uses stolen credentials to log in as admin',
'Man-in-the-middle presents fake TLS certificate',
'API caller spoofs the X-Forwarded-For header',
],
'mitigations': [
'Strong MFA (FIDO2, not SMS)',
'Certificate pinning for internal services',
'Trust X-Forwarded-For only when set by a known proxy',
],
},
'Tampering': {
'question': 'Can someone modify data in transit or at rest?',
'examples': [
'SQL injection modifies database records (episode 12)',
'Man-in-the-middle alters API responses',
'Attacker modifies log files to cover tracks (episode 51)',
],
'mitigations': [
'Parameterized queries (prevent SQLi)',
'TLS everywhere (prevent MITM)',
'Immutable log storage (prevent log tampering)',
],
},
'Repudiation': {
'question': 'Can someone deny an action they performed?',
'examples': [
'User denies placing an order (no audit trail)',
'Admin denies deleting records (no logging)',
'Attacker clears logs after compromise (episode 51)',
],
'mitigations': [
'Comprehensive audit logging',
'Write-once log storage (append-only)',
'Digital signatures on critical transactions',
],
},
'Information Disclosure': {
'question': 'Can data be exposed to unauthorized parties?',
'examples': [
'SQL injection extracts database contents (episode 13)',
'SSRF reads internal service responses (episode 18)',
'Error messages reveal stack traces and DB schemas',
'Directory listing exposes backup files',
],
'mitigations': [
'Input validation at all trust boundaries',
'Custom error pages (no stack traces)',
'Encryption at rest for sensitive data',
'Network segmentation (limit SSRF reach)',
],
},
'Denial of Service': {
'question': 'Can the system be made unavailable?',
'examples': [
'HTTP flood overwhelms web server',
'Algorithmic complexity attack (ReDoS, hash collision)',
'Resource exhaustion via uncontrolled file upload (ep 20)',
],
'mitigations': [
'Rate limiting at all public endpoints',
'Input size limits',
'CDN/DDoS protection for public-facing services',
'Circuit breakers for internal service calls',
],
},
'Elevation of Privilege': {
'question': 'Can someone gain unauthorized access levels?',
'examples': [
'SQLi escalates from web user to DB admin (episode 12)',
'IDOR lets user A access user B data (episode 21)',
'Kernel exploit gives root from unprivileged user (ep 31)',
'Kerberoasting escalates to domain admin (episode 33)',
],
'mitigations': [
'Least privilege for all service accounts',
'Authorization checks at every access point',
'Patch management for kernel and OS vulnerabilities',
'Tiered administration for AD environments',
],
},
}
print("=== STRIDE Threat Model ===\n")
for category, data in STRIDE_THREATS.items():
    print(f"[{category[0]}] {category}")
    print(f" Question: {data['question']}")
    print(f" Examples:")
    for ex in data['examples']:
        print(f" - {ex}")
    print(f" Mitigations:")
    for mit in data['mitigations']:
        print(f" - {mit}")
    print()
The power of STRIDE is that it gives you a systematic way to think about threats instead of relying on imagination (which is always incomplete) or checklists (which are always outdated). For every component in your data flow diagram, you ask six questions. For every data flow that crosses a trust boundary, you ask six questions. The threats you find drive the security requirements, which drive the architecture.
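To make the "six questions per component, six questions per boundary-crossing flow" routine concrete, here is a minimal sketch that generates the review worksheet. The component names, zone names, and the `Flow` class are hypothetical placeholders for illustration, not part of any STRIDE tooling:

```python
from dataclasses import dataclass

STRIDE_CATEGORIES = [
    "Spoofing", "Tampering", "Repudiation",
    "Information Disclosure", "Denial of Service", "Elevation of Privilege",
]

@dataclass(frozen=True)
class Flow:
    """One arrow in the data flow diagram (hypothetical model)."""
    source: str
    target: str
    source_zone: str
    target_zone: str

    def crosses_boundary(self) -> bool:
        # A trust boundary is crossed when the two ends sit in different zones.
        return self.source_zone != self.target_zone

def stride_worksheet(components, flows):
    """Return every (subject, category) pair the review has to answer."""
    rows = [(c, cat) for c in components for cat in STRIDE_CATEGORIES]
    for flow in flows:
        if flow.crosses_boundary():
            subject = f"{flow.source} -> {flow.target}"
            rows.extend((subject, cat) for cat in STRIDE_CATEGORIES)
    return rows

# Assumed example system: three components, three flows, two of which cross zones.
components = ["browser", "web server", "database"]
flows = [
    Flow("browser", "web server", "internet", "dmz"),
    Flow("web server", "database", "dmz", "internal"),
    Flow("web server", "web server cache", "dmz", "dmz"),
]

worksheet = stride_worksheet(components, flows)
for subject, category in worksheet:
    print(f"{subject}: {category}?")
```

The point of generating the list mechanically is that nothing gets skipped because it "seemed fine". The flow that stays inside the DMZ produces no rows here, which is a simplification: you may still want to review it, just with lower priority.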
Applying Architecture to Real Attacks
This is where everything connects. Let me walk through three attack chains from this series and show how architecture (not patching, not detection, not response -- architecture) prevents them:
#!/usr/bin/env python3
"""architecture_vs_attacks.py -- how design prevents kill chains"""
ATTACK_SCENARIOS = [
{
'name': 'AD Compromise (Episodes 33-34)',
'attack_chain': [
'Compromised workstation (phishing)',
'Network scan finds domain controller (flat network)',
'Kerberoast service accounts (over-privileged SPNs)',
'Crack hash offline -> service account password',
'DCSync with service account (replication rights)',
'Golden Ticket -> permanent domain access',
],
'architecture_that_prevents': [
'Network segmentation: workstations CANNOT reach '
'the DC directly (must go through jump box in VLAN 40)',
'Tiered admin: DA accounts are used ONLY on Tier 0 '
'systems (DCs). Never on workstations. Workstation '
'compromise gives local admin, not DA.',
'gMSA for service accounts: Group Managed Service '
'Accounts have 120-character random passwords rotated '
'automatically. Offline cracking of a roasted ticket '
'is not practical.',
'Least privilege: no service account has DCSync '
'rights (Replicating Directory Changes). Only the '
'DC machine accounts have this permission.',
'Detection: alert on TGS request volume > 10/minute '
'from a single source (Kerberoasting signature).',
],
},
{
'name': 'Cloud Credential Theft (Episode 35)',
'attack_chain': [
'Find SSRF vulnerability in web application',
'SSRF to 169.254.169.254 (metadata endpoint)',
'Retrieve IAM role credentials from metadata',
'Use credentials to list S3 buckets',
'Exfiltrate customer data from S3',
],
'architecture_that_prevents': [
'IMDSv2 enforced: requires session token for metadata '
'access. A typical GET-only SSRF cannot obtain the token '
'because issuing it requires a PUT request with a custom '
'TTL header, and the response hop limit defaults to 1.',
'Least-privilege IAM: the web app role has ONLY the '
'permissions it needs (read from specific DynamoDB table). '
'No S3 access, no IAM enumeration, no EC2 describe.',
'VPC endpoints with policies: S3 access restricted to '
'specific bucket + specific VPC endpoint. Stolen IAM '
'credentials used from outside the VPC are denied.',
'WAF rules: block requests with internal IP ranges '
'in the URL (169.254.0.0/16, 10.0.0.0/8, etc.).',
],
},
{
'name': 'Insider Data Exfiltration (Episode 48)',
'attack_chain': [
'Employee with legitimate access to customer DB',
'Mass query over several days (slow trickle)',
'Export to personal cloud storage (Dropbox, Google Drive)',
'Walk out the door with company data',
],
'architecture_that_prevents': [
'Least privilege: employee has access to ONLY the '
'records needed for their job function, not the entire '
'database. Query results limited by role.',
'UEBA: behavioral analytics baseline detects when '
'query volume exceeds the normal pattern for that employee '
'(episode 48 behavioral baselines).',
'DLP at endpoint: USB blocked, personal cloud storage '
'domains blocked, email attachment scanning for '
'sensitive data patterns.',
'Data watermarking: every export is tagged with the '
'identity of the requesting user. Leaked documents are '
'traceable to the source.',
'Network segmentation: database servers only accessible '
'from application servers, not from workstations. The '
'employee uses the application, not direct DB queries.',
],
},
]
for scenario in ATTACK_SCENARIOS:
    print(f"=== {scenario['name']} ===\n")
    print("Attack chain:")
    for i, step in enumerate(scenario['attack_chain'], 1):
        print(f" {i}. {step}")
    print("\nArchitecture that prevents it:")
    for defense in scenario['architecture_that_prevents']:
        print(f" - {defense}")
    print()
Notice that NONE of these architectural defenses are "patch the vulnerability." Patching is reactive -- you find the bug, you fix the bug, you wait for the next bug. Architecture is proactive -- even if the SSRF vulnerability exists (and it will, because code has bugs), the architecture ensures that exploiting it doesn't give the attacker anything useful. The metadata endpoint requires IMDSv2. The IAM role has no S3 access. The VPC endpoint restricts data movement. Three independent architectural controls, each of which alone would have prevented the exfiltration. That's defense in depth applied as architecture, not as a checklist.
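The "each control alone breaks the chain" claim can be checked with a toy model of the cloud scenario. This is a deliberately simplified sketch: the control names are made-up labels, and the assumption that each control cleanly blocks exactly one step is mine (real controls overlap and can be misconfigured):

```python
from itertools import combinations

# Each attack step is paired with the one control assumed to stop it.
CHAIN = [
    ("SSRF to metadata endpoint returns role credentials", "imdsv2_enforced"),
    ("Use role credentials to list and read S3 buckets", "least_privilege_iam"),
    ("Pull bucket contents from outside the VPC", "vpc_endpoint_policy"),
]
CONTROLS = [control for _, control in CHAIN]

def first_blocked_step(enabled_controls):
    """Return the first step stopped by an enabled control, or None."""
    for step, blocking_control in CHAIN:
        if blocking_control in enabled_controls:
            return step
    return None

def exfiltration_succeeds(enabled_controls):
    # The attacker wins only if no step in the chain is blocked.
    return first_blocked_step(enabled_controls) is None

# Try every combination of controls, from none to all three.
results = {}
for size in range(len(CONTROLS) + 1):
    for subset in combinations(CONTROLS, size):
        results[subset] = exfiltration_succeeds(set(subset))

for subset, succeeded in results.items():
    label = ", ".join(subset) or "no controls"
    blocked = first_blocked_step(set(subset))
    outcome = "DATA EXFILTRATED" if succeeded else f"blocked at: {blocked}"
    print(f"{label:62} {outcome}")
```

Of the eight combinations, only the one with no controls ends in exfiltration. That is the whole argument for layering: an attacker has to defeat every layer, and you only need one of them to hold.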
PASTA -- The Risk-Centric Alternative
STRIDE is great for quick threat assessments during design reviews. For critical systems where business risk justifies deeper analysis, PASTA (Process for Attack Simulation and Threat Analysis) provides a more comprehensive seven-stage methodology:
PASTA seven-stage threat modeling:
Stage 1: Define Business Objectives
"What are we protecting and why?"
- Business criticality of the system
- Revenue impact if compromised
- Regulatory requirements (GDPR, PCI-DSS, HIPAA)
- Acceptable risk thresholds
Stage 2: Define Technical Scope
"What does the system look like?"
- System architecture diagrams
- Technology stack inventory
- Third-party dependencies
- Data flow diagrams with trust boundaries
Stage 3: Application Decomposition
"What are the components and how do they interact?"
- Entry points (APIs, web forms, file uploads)
- Trust levels (anonymous, authenticated, admin)
- Data assets (PII, financial, credentials)
- Privilege boundaries
Stage 4: Threat Analysis
"Who would attack this and why?"
- Relevant threat actors (from episode 52 intelligence)
- Attack motivation (financial, espionage, destruction)
- Attack capability (script kiddie vs nation-state)
- Historical attacks against similar systems
Stage 5: Vulnerability Analysis
"What weaknesses exist in our design?"
- Known vulnerability classes for our tech stack
- Common misconfiguration patterns
- Third-party vulnerability history
- Gap analysis against security controls
Stage 6: Attack Modeling
"How would an attacker chain vulnerabilities?"
- Attack trees (logical paths from entry to objective)
- Kill chain mapping (reconnaissance through exfil)
- Red team scenarios (from episode 50)
- Probability and impact scoring per path
Stage 7: Risk and Impact Analysis
"Which risks do we accept, mitigate, or transfer?"
- Risk = likelihood x impact
- Mitigation cost vs risk cost
- Residual risk after controls
- Risk acceptance documentation (signed by owner)
PASTA vs STRIDE:
STRIDE: quick (hours), developer-focused, per-component
PASTA: thorough (days-weeks), risk-focused, system-wide
Use STRIDE in every sprint. Use PASTA for critical systems.
The key difference is that PASTA incorporates threat intelligence (Stage 4) and business context (Stage 1) into the analysis. STRIDE asks "what COULD go wrong with this component?" PASTA asks "given who is likely to attack us, how they operate, and what our business can tolerate, what is the most important thing to protect?" The answers are often different. STRIDE might tell you that an XSS vulnerability is a threat. PASTA tells you that the XSS vulnerability on the public marketing site is low risk (no sensitive data) but the XSS vulnerability on the internal admin panel is critical (direct path to admin credentials, used by APT28 in documented campaigns against your sector).
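A small sketch of the Stage 7 arithmetic (risk = likelihood x impact) applied to the two XSS findings. The 1-5 scales and the specific scores are assumptions chosen for illustration; a real PASTA exercise derives them from the threat and business analysis in the earlier stages:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def risk(self) -> int:
        # Stage 7: risk = likelihood x impact
        return self.likelihood * self.impact

findings = [
    # Easy to hit, but nothing sensitive behind it.
    Finding("XSS on public marketing site", likelihood=4, impact=1),
    # Harder to reach, but a direct path to admin credentials.
    Finding("XSS on internal admin panel", likelihood=3, impact=5),
]

# Highest risk first: this ordering is what drives mitigation priority.
ranked = sorted(findings, key=lambda f: f.risk, reverse=True)
for finding in ranked:
    print(f"{finding.risk:>2}  {finding.name}")
```

Same vulnerability class, very different priority. STRIDE would list both as threats; the scoring step is what tells you which one to fix first.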
The AI Slop Connection
AI-generated architectures are a growing problem, and I've been seeing them more and more in the wild. AI assistants draw network diagrams with flat networks because flat networks "work." They suggest firewall rules that are too permissive (allow all from 10.0.0.0/8) because permissive rules don't generate errors. They create IAM policies with wildcard permissions ("Action": "*") because wildcards are simpler than enumerating specific permissions. They generate Kubernetes manifests without NetworkPolicies because NetworkPolicies add complexity.
The fundamental issue: AI optimizes for "it compiles" and "it works," not for "it resists attack." A flat network works perfectly for legitimate users. Segmentation adds complexity that the AI tries to remove. Zero trust requires more configuration than VPN-and-trust-everything. The AI consistently suggests the architecturally weak option because it is the simplest option that satisfies the functional requirement.
Common AI-generated security mistakes:
1. IAM Policy:
AI generates: {"Effect": "Allow", "Action": "*", "Resource": "*"}
Should be: {"Effect": "Allow", "Action": ["s3:GetObject"],
"Resource": "arn:aws:s3:::my-bucket/public/*"}
2. Security Group:
AI generates: Ingress: 0.0.0.0/0 port 0-65535 (allow everything)
Should be: Ingress: 10.0.1.0/24 port 443 (specific subnet, HTTPS only)
3. Kubernetes:
AI generates: No NetworkPolicy (default = allow all pod traffic)
Should be: default-deny-all + explicit allow rules per service
4. Database:
AI generates: GRANT ALL PRIVILEGES ON *.* TO 'webapp'@'%'
Should be: GRANT SELECT ON products.* TO 'webapp'@'10.0.2.%'
In every case the AI-generated version "works" -- the app starts,
the queries run, the pods communicate. But the AI version is also
the version that lets an attacker move from one compromised
component to total system compromise in minutes.
Security architecture is a discipline where experienced human judgment cannot be replaced by generated configurations. The AI does not understand threat models. It generates configurations that compile, not configurations that resist adversaries. Every AI-generated infrastructure artifact must be reviewed by someone who has read episodes 1-52 of this series and can answer the question: "if this system is compromised, what else does the attacker get?"
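Part of that review can be scripted as a first pass. The sketch below flags IAM policy statements that allow a bare `*` for `Action` or `Resource`, using only the standard `json` module. The function name and the sample policies are hypothetical, and this is nowhere near a complete policy analyzer (it does not catch `s3:*`, `NotAction`, or conditions):

```python
import json

def find_wildcard_statements(policy_json: str):
    """Return (statement_index, field) for every Allow statement using a bare '*'."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # IAM allows a single statement object instead of a list.
        statements = [statements]

    findings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            values = statement.get(field, [])
            if isinstance(values, str):
                values = [values]
            # Exact match only: "arn:...:bucket/public/*" is not a bare wildcard.
            if "*" in values:
                findings.append((index, field))
    return findings

ai_generated = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
})
scoped = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::my-bucket/public/*",
    }],
})

print("AI-generated:", find_wildcard_statements(ai_generated))
print("Scoped:      ", find_wildcard_statements(scoped))
```

A check like this catches the lazy cases cheaply, so the human reviewer can spend time on the real question: what does the attacker get if this component falls?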
What Comes Next
Security architecture gives you systems that resist attack by design -- not by detection, not by response, but by the fundamental structure of the system itself. But architecture does not exist in a vacuum. Organizations operate under regulatory frameworks, industry standards, and legal obligations that dictate minimum security requirements, audit procedures, and breach notification timelines. Understanding those frameworks -- what they require, where they overlap, and where organizations get caught between conflicting obligations across jurisdictions -- is the next piece of the puzzle. The business side of security is not optional, and ignoring it has consequences that are measured in fines, not in compromised hosts.
Exercises
Exercise 1: Draw a threat model for a simple web application (user authentication, product catalog, shopping cart, payment processing). Use STRIDE: identify at least 2 threats for each category (S, T, R, I, D, E). For each threat, propose a mitigation. Draw the data flow diagram showing trust boundaries between the browser, web server, application server, database, and payment gateway. Save to ~/lab-notes/threat-model-webshop.md.
Exercise 2: Design a segmented network architecture for a small company (50 employees, 10 servers, 1 web application, 1 database). Create at least 4 VLANs with firewall rules between them. Specify: which systems can communicate with which, on which ports, and what is denied by default. Compare your design against a flat network and explain which attacks from this series your segmentation prevents (reference specific episode numbers). Save to ~/lab-notes/segmented-network.md.
Exercise 3: Research BeyondCorp (Google's zero trust implementation). Document: (a) the core principles, (b) how it eliminates the traditional VPN, (c) what Google uses for device trust verification, (d) how access decisions are made for each request, (e) how it handles the "insider on the corporate network" threat that VPNs cannot address. Save to ~/lab-notes/beyondcorp-analysis.md.