Automated Security Header Scanning in CI/CD

This guide is part of the Security Header Auditing & Compliance reference and treats response-header verification as a build gate, not a periodic audit. Headers are configuration, and configuration drifts: a refactored reverse-proxy location block, a framework upgrade that changes middleware ordering, or a CDN rule edit can strip Content-Security-Policy or weaken Strict-Transport-Security without a single test failing. The fix is to assert the exact response headers on every pull request and every deploy, and to fail the build when policy drifts. This page gives runnable assertion scripts for bash/curl, GitHub Actions, GitLab CI, a Playwright/Node response-header test, and the Mozilla Observatory API and CLI in the pipeline. For the manual workflows these gates automate, see auditing headers with curl, openssl, and testssl and header grading with Observatory and securityheaders.com.

Threat Model & Mechanics

Header regressions are silent by construction. Browsers do not surface a missing Content-Security-Policy to your monitoring; they simply enforce nothing. A Strict-Transport-Security header dropped from a location block produces no 500, no log line, and no failed health check — the page renders identically. The regression is invisible until an attacker exercises the gap, which is precisely the window a CI gate closes. The unit of failure here is not a crash but a quietly weakened security posture, and the only reliable detector is an automated assertion that compares live response headers against a declared policy on every change.

The strategic principle is shift-left: catch the regression at the cheapest point in its lifecycle. A header bug caught in a pull request costs a code-review comment; the same bug caught after a production deploy costs an incident. Three gate points form a defense-in-depth ladder:

Pull-request gate — run assertions against a locally booted server or a static-config linter so the diff never merges with a weakened header.
Preview / staging gate — run assertions against the deployed preview URL (Vercel preview, Netlify deploy preview, a Kubernetes review app) so edge and proxy behavior is exercised, not just application code.
Production smoke gate — a non-blocking post-deploy assertion that pages on drift, catching anything the lower gates could not reproduce.

Staging versus production parity

The most common false signal in header scanning is environment skew: staging emits headers via a different proxy, a different CDN tier, or a different framework flag than production. When a gate passes on staging but the header is absent in production, the root cause is that the two environments do not share a header source of truth. Drive every environment’s headers from the same committed configuration (a shared Nginx snippet, a single Helmet config module, one next.config.js headers array) and assert against each environment with the same script. When the assertion is identical and only the target URL changes, a staging pass becomes a real prediction of production behavior. Where parity is impossible (for example, basic-auth on staging that is absent in production), scope the assertion to the headers that are shared and document the exception in the gate; never silence the whole gate.
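
One way to make skew visible is to diff the header names two environments emit. The following is a minimal sketch under stated assumptions: header_names and skewed_headers are hypothetical helpers, and the two dumps are canned strings standing in for the output of curl -sS -D - -o /dev/null against each environment.

```shell
#!/usr/bin/env bash
# skew-check.sh: list header names emitted by only one of two environments.
# Hypothetical helper names; the dumps below are canned so the sketch runs offline.

# header_names <raw-header-dump>: print sorted, lowercased, unique header names.
header_names() {
  printf '%s\n' "$1" \
    | tr -d '\r' \
    | grep -E '^[A-Za-z0-9-]+:' \
    | cut -d: -f1 \
    | tr '[:upper:]' '[:lower:]' \
    | sort -u
}

# skewed_headers <dump-a> <dump-b>: names present in exactly one dump.
# "<" marks a header only in the first dump, ">" only in the second.
skewed_headers() {
  local tmp_a tmp_b
  tmp_a="$(mktemp)"; tmp_b="$(mktemp)"
  header_names "$1" > "$tmp_a"
  header_names "$2" > "$tmp_b"
  comm -23 "$tmp_a" "$tmp_b" | sed 's/^/< /'
  comm -13 "$tmp_a" "$tmp_b" | sed 's/^/> /'
  rm -f "$tmp_a" "$tmp_b"
}

staging="$(printf '%s\n' 'HTTP/1.1 200 OK' \
  "Content-Security-Policy: default-src 'self'" 'X-Frame-Options: DENY')"
production="$(printf '%s\n' 'HTTP/1.1 200 OK' \
  'X-Frame-Options: DENY' 'Server: nginx')"

skewed_headers "$staging" "$production"
```

With these canned dumps the sketch reports content-security-policy as staging-only and server as production-only. It compares names only, so value-level skew still needs the assertion script run against each environment.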

This is also where preview-URL scanning earns its place: a preview deploy exercises the real edge stack on a per-branch hostname, so it catches proxy-level regressions that a locally booted app cannot. The interaction with reporting is complementary, not redundant — a CI gate proves the header is present and well-formed at deploy time, whereas runtime CSP violation reporting and monitoring proves the policy does not break real traffic. Use both: the gate stops a bad policy from shipping, the report tells you whether a shipped policy is too tight.

The scan sits between build and deploy: a missing or weak header exits non-zero and the deploy never runs.

Implementation per tool

Every block below is copy-pasteable. They share one design rule: the assertion is declarative and fails loud. A header that is missing, weak, or malformed must produce a non-zero exit code, a human-readable message naming the header, and no deploy. Pick the tool that matches where your pipeline already lives; the curl script is the portable core that the others wrap.

Bash curl assertion script

This is the canonical gate: a single script with no dependencies beyond curl and bash (it relies on set -o pipefail and local, so a bare POSIX sh is not enough). It checks presence and, for headers where the value matters, asserts that a required substring appears in the value. It accumulates failures so one run reports every problem instead of stopping at the first.

#!/usr/bin/env bash
# check-headers.sh — assert security headers on a live URL.
# Usage: ./check-headers.sh https://staging.example.com
set -euo pipefail

URL="${1:?usage: check-headers.sh <url>}"
fail=0

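# Fetch headers once; follow redirects so we test the final HTTPS response.
# -D - dumps the headers of every hop, so keep only the last block (awk
# paragraph mode); otherwise a header on the 301 could satisfy or fail a check.
headers="$(curl -sSL -D - -o /dev/null --max-time 15 "$URL" \
  | tr -d '\r' | awk 'BEGIN { RS = "" } { last = $0 } END { print last }' \
  | tr '[:upper:]' '[:lower:]')"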

# require <header-name> [must-contain-substring]
require() {
  local name="$1" needle="${2:-}"
  local line
  line="$(printf '%s\n' "$headers" | grep -m1 "^${name}:")" || {
    echo "FAIL  ${name}: missing"; fail=1; return
  }
  if [ -n "$needle" ] && ! printf '%s' "$line" | grep -qi -- "$needle"; then
    echo "FAIL  ${name}: present but missing '${needle}'  (got: ${line#*: })"
    fail=1; return
  fi
  echo "PASS  ${name}"
}

require "strict-transport-security" "max-age=31536000"
require "content-security-policy"   "default-src"
require "x-content-type-options"    "nosniff"
require "x-frame-options"
require "referrer-policy"
require "permissions-policy"

# Headers that must NOT be present (information disclosure).
forbid() {
  local name="$1"
  if printf '%s\n' "$headers" | grep -q "^${name}:"; then
    echo "FAIL  ${name}: should be removed"; fail=1
  else
    echo "PASS  ${name} absent"
  fi
}
forbid "server"
forbid "x-powered-by"

if [ "$fail" -ne 0 ]; then
  echo "Header gate failed for ${URL}" >&2
  exit 1
fi
echo "Header gate passed for ${URL}"

Why these choices matter: -L follows the HTTP→HTTPS redirect so you assert against the final secured response, not the 301; --max-time 15 bounds a hung edge so the job fails fast instead of hanging the pipeline; lowercasing both the header dump and the comparison makes the match case-insensitive, which HTTP header names require. The forbid block enforces the removal of disclosure headers covered in the Server and X-Powered-By header removal guide. Keep the asserted values in sync with your real policy: the script is only as honest as its require lines.
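
A substring match on max-age=31536000 is strict about the exact number: it rejects a stronger two-year policy (max-age=63072000) and accepts any value that merely starts with those digits. Where that matters, a numeric floor is an alternative. The sketch below is illustrative only; hsts_max_age and hsts_at_least are hypothetical helper names, not part of the script above.

```shell
#!/usr/bin/env bash
# hsts_max_age <header-value>: print the numeric max-age, or nothing if absent.
hsts_max_age() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -n 's/.*max-age="\{0,1\}\([0-9][0-9]*\).*/\1/p'
}

# hsts_at_least <header-value> <floor-seconds>: exit 0 when max-age >= floor.
hsts_at_least() {
  local age
  age="$(hsts_max_age "$1")"
  [ -n "$age" ] && [ "$age" -ge "$2" ]
}

# A two-year policy clears a one-year floor; a five-minute policy does not.
hsts_at_least 'max-age=63072000; includeSubDomains' 31536000 \
  && echo "PASS  two-year policy"
hsts_at_least 'max-age=300' 31536000 \
  || echo "FAIL  five-minute policy"
```

To use it inside the gate, pass the value portion of the strict-transport-security line to hsts_at_least and set fail=1 when it returns non-zero.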

GitHub Actions workflow

Wire the script into a workflow that runs on pull requests (against a booted app or preview) and on deploys (against the live URL). The job fails the moment the script exits non-zero, which blocks the merge or the dependent deploy job.

# .github/workflows/header-scan.yml
name: Security header scan
on:
  pull_request:
  workflow_dispatch:
    inputs:
      target_url:
        description: "URL to scan"
        required: true

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

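      - name: Resolve target URL
        id: target
        env:
          # Pass the input through env rather than inlining it into the script.
          INPUT_URL: ${{ github.event.inputs.target_url }}
        run: |
          if [ -n "$INPUT_URL" ]; then
            echo "url=$INPUT_URL" >> "$GITHUB_OUTPUT"
          else
            echo "url=https://staging.example.com" >> "$GITHUB_OUTPUT"
          fi

      - name: Run header gate
        env:
          TARGET_URL: ${{ steps.target.outputs.url }}
        run: |
          chmod +x ./scripts/check-headers.sh
          ./scripts/check-headers.sh "$TARGET_URL"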

To gate production, make the deploy job depend on the scan against the freshly deployed preview:

  deploy:
    needs: scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # each job starts from a clean workspace
      - run: ./scripts/deploy.sh    # only runs if scan passed

The needs: scan edge is the gate. Without it the deploy job runs in parallel with the scan and a header regression ships anyway; with it, a non-zero exit from the script fails the scan job and the deploy job is skipped.

GitLab CI job

The same script slots into a .gitlab-ci.yml stage. GitLab’s stages ordering provides the gate: the deploy stage never starts if the scan stage job exits non-zero.

# .gitlab-ci.yml
stages:
  - build
  - scan
  - deploy

header-scan:
  stage: scan
  image: alpine:3.20
  variables:
    TARGET_URL: "https://staging.example.com"
  before_script:
    # check-headers.sh needs bash; minimal curl images ship only a POSIX sh.
    - apk add --no-cache bash curl
  script:
    - chmod +x ./scripts/check-headers.sh
    - ./scripts/check-headers.sh "$TARGET_URL"
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

deploy-prod:
  stage: deploy
  script:
    - ./scripts/deploy.sh
  needs:
    - header-scan
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'

A small Alpine image with bash and curl installed in before_script keeps the job container light. The script relies on bash features (set -o pipefail, local), which a minimal image such as curlimages/curl does not provide, so the install step is needed. The needs: [header-scan] entry declares the explicit dependency so a failed scan blocks the deploy even in DAG-style pipelines. For dynamic environments, set TARGET_URL from $CI_ENVIRONMENT_URL so the job scans whatever review app the merge request created.

Playwright / Node response-header test

When your stack already runs Playwright or Node tests, asserting headers as a normal test gives you the same gate with familiar reporting and parallelism. Playwright’s request fixture fetches without a browser, so this is fast and runs anywhere Node does.

// tests/headers.spec.ts
import { test, expect } from '@playwright/test';

const TARGET = process.env.TARGET_URL ?? 'https://staging.example.com';

test('security headers are present and correct', async ({ request }) => {
  const res = await request.get(TARGET, { maxRedirects: 5 });
  expect(res.status()).toBeLessThan(400);

  const h = res.headers(); // keys are lowercased by Playwright

  expect(h['strict-transport-security']).toMatch(/max-age=31536000/);
  expect(h['content-security-policy']).toContain('default-src');
  expect(h['x-content-type-options']).toBe('nosniff');
  expect(h['x-frame-options']).toBeDefined();
  expect(h['referrer-policy']).toBeDefined();
  expect(h['permissions-policy']).toBeDefined();

  // Disclosure headers must be stripped.
  expect(h['x-powered-by']).toBeUndefined();
  expect(h['server']).toBeUndefined();
});

Run it in CI with npx playwright test tests/headers.spec.ts. A failing expect exits non-zero and fails the job exactly like the bash gate. Playwright lowercases header keys, so index them in lowercase regardless of how the server cased them on the wire. The maxRedirects: 5 option mirrors curl’s -L: you assert against the final response after the HTTPS upgrade, not the redirect. Use toMatch or toContain for value-strength checks (HSTS max-age, CSP default-src), toBe where the whole value is fixed (nosniff), and toBeDefined/toBeUndefined for pure presence/absence, which keeps the test from breaking on harmless value reordering.

Mozilla Observatory API and CLI

A presence check confirms headers exist; a grade confirms the combination is strong. The Observatory grading workflow runs in CI through the mdn-http-observatory CLI or its HTTP API, letting you fail the build below a minimum grade. Note Observatory requires a publicly reachable URL — it cannot scan localhost — so this gate belongs at the preview or production stage, not the pull-request stage.

CLI form, gating on the returned grade:

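#!/usr/bin/env bash
# observatory-gate.sh: fail if the Observatory score is below a floor.
set -euo pipefail
HOST="${1:?usage: observatory-gate.sh <host>}"
MIN_SCORE="${2:-90}"   # B+ ~= 80, A ~= 90

# Assumption: the CLI is published on npm as @mdn/mdn-http-observatory and
# exposes the mdn-http-observatory-scan binary, which prints JSON to stdout.
# Verify the package and command name against the version you pin.
result="$(npx --yes --package=@mdn/mdn-http-observatory \
  mdn-http-observatory-scan "$HOST")"
# Read all of stdin before parsing so a chunked pipe cannot split the JSON.
score="$(printf '%s' "$result" | node -pe \
  'JSON.parse(require("fs").readFileSync(0, "utf8")).scan.score')"

echo "Observatory score for ${HOST}: ${score} (floor ${MIN_SCORE})"
# Fail closed: a null or non-numeric score makes the test error out, and the
# negation turns that error into a gate failure instead of a silent pass.
if ! [ "$score" -ge "$MIN_SCORE" ]; then
  echo "Observatory grade below floor" >&2
  exit 1
fi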

API form, for environments without Node, calling the public scanner (this version assumes jq is available to read the JSON):

#!/usr/bin/env bash
set -euo pipefail
HOST="${1:?usage: <host>}"
BASE="https://observatory-api.mdn.mozilla.net/api/v2"

# POST triggers a scan (or returns a recent cached one) and responds with the
# result JSON, so the score is read from that single response.
score="$(curl -fsS -X POST "${BASE}/scan?host=${HOST}" | jq -r '.score')"

echo "score=${score}"
# A null or non-numeric score makes the test error out, which also fails the gate.
[ "${score:-0}" -ge 90 ] || { echo "below floor" >&2; exit 1; }

Treat the Observatory gate as a strength ratchet layered on top of the presence gate, not a replacement for it. The presence script is fast, deterministic, and offline-friendly, so it guards every PR; the Observatory grade is slower and network-dependent, so it guards the deploy. Pin a numeric floor rather than a letter grade in the assertion: an integer comparison is a single shell test, whereas ordering letter grades (is B+ above or below A-?) needs a lookup table.
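
Because a failed scan can come back with a null or empty score, the floor comparison is safest when it validates the score first. A minimal sketch, assuming a hypothetical check_score helper and a three-way exit code that is this example's own convention rather than anything Observatory defines:

```shell
#!/usr/bin/env bash
# check_score <score> <floor>
#   returns 0 when score >= floor
#   returns 1 when score is below the floor
#   returns 2 when score is not a non-negative integer (null, empty, error text)
check_score() {
  case "$1" in
    ''|*[!0-9]*)
      echo "no usable score (got: '${1}')" >&2
      return 2
      ;;
  esac
  if [ "$1" -lt "$2" ]; then
    echo "score ${1} is below floor ${2}" >&2
    return 1
  fi
  echo "score ${1} meets floor ${2}"
}

check_score 95 90
check_score null 90 || echo "gate fails closed with exit $?"
```

Separating "below floor" from "no score" lets the pipeline distinguish a real policy regression from a scan that never produced a result.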

Verification & Diagnostic

A gate is only trustworthy if you have seen it both pass and fail. Run the bash script against a correctly configured host and confirm the passing output:

$ ./scripts/check-headers.sh https://staging.example.com
PASS  strict-transport-security
PASS  content-security-policy
PASS  x-content-type-options
PASS  x-frame-options
PASS  referrer-policy
PASS  permissions-policy
PASS  server absent
PASS  x-powered-by absent
Header gate passed for https://staging.example.com
$ echo $?
0

Now deliberately break it — remove the CSP from your staging config and re-run — to confirm the gate fails loud and exits non-zero:

$ ./scripts/check-headers.sh https://staging.example.com
PASS  strict-transport-security
FAIL  content-security-policy: missing
PASS  x-content-type-options
PASS  x-frame-options
PASS  referrer-policy
PASS  permissions-policy
PASS  server absent
PASS  x-powered-by absent
Header gate failed for https://staging.example.com
$ echo $?
1

In GitHub Actions the same failure renders as a red Run header gate step with the FAIL content-security-policy: missing line in the log and the job marked failed, which blocks the deploy job through needs: scan. In GitLab the header-scan job turns red and the deploy-prod job is skipped because its needs dependency failed. A weak-value failure looks like this, which is the case a naive presence-only check would miss:

FAIL  strict-transport-security: present but missing 'max-age=31536000'  (got: max-age=300)

That line is the whole point of asserting values, not just presence: a one-year HSTS policy silently downgraded to five minutes passes a presence check and fails a value check. Validate value-strength assertions for HSTS and Content-Security-Policy the same way before trusting the gate in production.
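
The pass-and-fail check can also be rehearsed offline by feeding canned header dumps to the same matching logic. The sketch below is a simplified stand-in for require() in check-headers.sh, not a copy of it; check_dump, the rule list, and both fixtures are illustrative assumptions.

```shell
#!/usr/bin/env bash
# selftest.sh: exercise the gate logic against canned header dumps, offline.

# check_dump <lowercased-header-dump>: return 0 when every rule matches,
# 1 otherwise. Each rule is "header-name|required-substring".
check_dump() {
  local dump="$1" rc=0 rule name needle line
  for rule in \
    "strict-transport-security|max-age=31536000" \
    "content-security-policy|default-src" \
    "x-content-type-options|nosniff"
  do
    name="${rule%%|*}"; needle="${rule#*|}"
    line="$(printf '%s\n' "$dump" | grep -m1 "^${name}:")" || {
      echo "FAIL  ${name}: missing"; rc=1; continue
    }
    case "$line" in
      *"$needle"*) echo "PASS  ${name}" ;;
      *) echo "FAIL  ${name}: present but missing '${needle}'"; rc=1 ;;
    esac
  done
  return "$rc"
}

good="$(printf '%s\n' \
  'strict-transport-security: max-age=31536000; includesubdomains' \
  "content-security-policy: default-src 'self'" \
  'x-content-type-options: nosniff')"
weak="$(printf '%s\n' \
  'strict-transport-security: max-age=300' \
  "content-security-policy: default-src 'self'" \
  'x-content-type-options: nosniff')"

check_dump "$good" > /dev/null && echo "selftest: good fixture passes"
check_dump "$weak" > /dev/null || echo "selftest: weak fixture is rejected"
```

Running a self-test like this in the pull-request job guards the gate itself: if someone loosens a rule, the weak fixture starts passing and the self-test shows it.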

Troubleshooting & Safe Rollback

Header gates fail in a small, recognizable set of ways. Map the symptom to the fix rather than reflexively loosening the assertion.

Flaky scans (intermittent timeouts) → a cold serverless function or a slow edge cache miss exceeds --max-time. Raise the timeout to 20–30s and add a bounded retry around the curl call (for i in 1 2 3; do curl ... && break; sleep 5; done), but never retry a value failure — only a transport failure. A header that is wrong on the first request is wrong; retrying only masks it.
Scanning preview URLs → preview hostnames are dynamic. Read the deployed URL from the platform’s output (Vercel’s deployment URL, $CI_ENVIRONMENT_URL, the Netlify deploy-preview URL) into TARGET_URL rather than hardcoding. Confirm the preview is fully deployed before scanning — gate the scan on the deploy step’s completion, not its start.
Auth-gated pages → a staging site behind basic-auth returns 401 before your headers are emitted, so every assertion fails. Pass credentials to curl (curl -u "$STAGING_USER:$STAGING_PASS") or a bearer header (-H "Authorization: Bearer $TOKEN") from a CI secret, and assert against a path that returns 200. Never bake credentials into the script; inject them as masked CI variables.
Header present locally but missing in CI → the booted app under test is not behind the same proxy that adds edge headers in production. This is environment skew; scan a preview deploy that includes the real edge, not a bare localhost app, or restrict the local gate to application-emitted headers only.
Observatory gate returns null/zero score → the host is not publicly reachable or the scan did not complete. Check the response for an error field (or, on API versions that report a scan state, wait until it is FINISHED) before reading the score, treat a non-numeric score as a failure rather than a pass, and skip the Observatory gate entirely for internal-only hosts.
Exit codes → standardize on 0 = pass, 1 = policy failure, and let transport errors from curl (-f/-S) surface as their own non-zero codes. Do not collapse “header wrong” and “site unreachable” into the same message — they need different responses (fix the config vs. retry the request).
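
The retry and exit-code advice above can be combined in one small wrapper. This is a sketch under stated assumptions: retry_transport is a hypothetical helper, and using exit code 2 for transport failures is this example's convention, chosen only to keep it distinct from the policy-failure code 1.

```shell
#!/usr/bin/env bash
# retry_transport <attempts> <delay-seconds> <command...>
# Re-run the command only while it fails; return its last exit status.
# Wrap the fetch with this, never the value assertions.
retry_transport() {
  local attempts="$1" delay="$2" i=1 rc=0
  shift 2
  while :; do
    "$@" && return 0
    rc=$?
    [ "$i" -ge "$attempts" ] && return "$rc"
    echo "transport attempt ${i}/${attempts} failed (exit ${rc}); retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
}

# Usage sketch inside a gate script (not executed here):
#   headers="$(retry_transport 3 5 \
#     curl -fsSL -D - -o /dev/null --max-time 30 "$URL")" || exit 2
#   ...run the value assertions once on "$headers"; exit 1 on a policy failure.

# Offline demonstration with a command that always fails.
retry_transport 2 0 false || echo "gave up with exit $?"
```

Only the fetch sits inside the retry loop, so a header that is wrong on a successful response is reported immediately and never retried.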

Safe rollback of the gate itself: a header gate is non-destructive — it blocks a deploy, it does not change production — so the rollback is simply to stop blocking. When you introduce a new, stricter assertion (for example, raising the HSTS floor or adding a CSP value check), land it first in report-only mode: run the script with || true appended, or as a continue-on-error: true step in Actions / allow_failure: true in GitLab, so it logs failures without blocking. Watch a few real deploys, confirm the assertion only fires on genuine regressions, then remove the override to make it blocking. This mirrors the report-only rollout pattern used for CSP itself: observe before you enforce. Never silence a gate that has already shipped as blocking; if it fires, fix the header.
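
Where the CI platform's own override is not convenient, the same report-only switch can live in a thin wrapper. A minimal sketch, assuming a hypothetical run_gate function and a hypothetical GATE_MODE variable; neither is a built-in of GitHub Actions or GitLab CI.

```shell
#!/usr/bin/env bash
# run_gate <command...>: run a gate command, honoring GATE_MODE.
#   GATE_MODE=enforce (default) propagates the command's exit status.
#   GATE_MODE=report  logs the failure but returns 0 so the pipeline continues.
run_gate() {
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ] && [ "${GATE_MODE:-enforce}" = "report" ]; then
    echo "REPORT-ONLY  gate failed with exit ${rc}; not blocking" >&2
    return 0
  fi
  return "$rc"
}

# Usage sketch (not executed here):
#   GATE_MODE=report run_gate ./scripts/check-headers.sh "$TARGET_URL"
# Delete the GATE_MODE=report override once the assertion has proven itself.
```

Keeping the default at enforce means a forgotten variable fails safe: the gate blocks unless report mode is requested explicitly.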

Frequently Asked Questions

Where in the pipeline should the header scan run — PR, preview, or production? All three, with different strictness. The fast presence-only bash script runs on every pull request against a booted app to catch application-layer regressions cheaply. The full script plus the Observatory grade runs against the preview/staging deploy to exercise the real edge stack. A non-blocking smoke assertion runs post-production-deploy as a backstop. Each lower gate makes the higher one rarely fire.

Should a missing header fail the build or just warn? Fail the build, once the assertion is proven. A warning that never blocks is ignored within a sprint, and the regression ships. The only time to warn rather than fail is during the report-only rollout of a new stricter assertion, when you are confirming the check fires only on genuine drift. After that window, make it blocking.

Can I scan localhost or do I need a deployed URL? The curl and Playwright gates scan localhost fine — boot the app and point TARGET_URL at http://localhost:PORT. The Mozilla Observatory gate cannot: it scans from Mozilla’s infrastructure and needs a publicly reachable host, so it belongs at the preview or production stage, never the PR stage.

How do I scan a page that requires authentication? Inject credentials from a CI secret, never the script. Use curl -u user:pass or -H "Authorization: Bearer $TOKEN" with the value sourced from a masked CI variable, and assert against a path that returns 200 so the headers are actually emitted. A 401 short-circuits header emission and produces false failures.
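
A sketch of that pattern follows. The variable names SCAN_TOKEN, SCAN_USER, and SCAN_PASS are hypothetical masked CI variables, and CURL_BIN exists only so the example can be exercised without a network; none of them are defined elsewhere in this guide.

```shell
#!/usr/bin/env bash
# curl_with_auth <curl-args...>: prepend credentials taken from the environment.
# A bearer token wins over basic-auth when both are set.
curl_with_auth() {
  if [ -n "${SCAN_TOKEN:-}" ]; then
    set -- -H "Authorization: Bearer ${SCAN_TOKEN}" "$@"
  elif [ -n "${SCAN_USER:-}" ] && [ -n "${SCAN_PASS:-}" ]; then
    set -- -u "${SCAN_USER}:${SCAN_PASS}" "$@"
  fi
  "${CURL_BIN:-curl}" "$@"
}

# require_success <url>: return 0 on a 2xx, 1 on any other status, 2 when the
# request itself fails. Run it before the header assertions so a 401 is
# reported as an auth problem, not as a list of missing headers.
require_success() {
  local status
  status="$(curl_with_auth -sS -o /dev/null -w '%{http_code}' \
    --max-time 15 "$1")" || return 2
  case "$status" in
    2??) return 0 ;;
    *) echo "${1} returned HTTP ${status}; check credentials and path" >&2
       return 1 ;;
  esac
}
```

Credentials placed on the command line can be visible to other processes on the runner, so on shared runners a curl config file or netrc sourced from the secret may be the safer variant.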

Why assert header values and not just presence? Because the dangerous regressions are downgrades, not deletions. An HSTS max-age quietly cut from one year to five minutes, or a CSP that loses default-src but keeps the header name, both pass a presence check while gutting the protection. Asserting a minimum-strength substring (max-age=31536000, default-src) catches the downgrade that presence alone cannot.
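
For CSP, a substring match on default-src proves the directive name appears somewhere in the value but says nothing about what it allows. One possible tightening, sketched below, parses the policy into directives and inspects the default-src source list. The helper names and the choice to reject * and 'unsafe-inline' are illustrative assumptions; substitute your own policy rules.

```shell
#!/usr/bin/env bash
# csp_directive <policy> <directive>: print the directive's source list, or
# return 1 when the policy has no such directive. Directive names are matched
# case-insensitively; source values are printed unchanged.
csp_directive() {
  printf '%s\n' "$1" | tr ';' '\n' | awk -v want="$2" '
    { name = tolower($1) }
    name == tolower(want) { $1 = ""; sub(/^ +/, ""); print; found = 1; exit }
    END { exit !found }'
}

# csp_default_src_ok <policy>: fail when default-src is missing or permissive.
csp_default_src_ok() {
  local sources
  sources="$(csp_directive "$1" default-src)" || {
    echo "FAIL  default-src missing"; return 1
  }
  case " $sources " in
    *" * "*|*" 'unsafe-inline' "*)
      echo "FAIL  default-src too permissive: ${sources}"; return 1 ;;
  esac
  echo "PASS  default-src: ${sources}"
}

csp_default_src_ok "default-src 'self'; img-src *"
csp_default_src_ok "default-src *" || true
```

The first call passes because the wildcard belongs to img-src, not default-src; the second fails because default-src itself allows every origin, which is the kind of downgrade a plain substring match would let through.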