UNCLASSIFIED - NO CUI

[Enhancement request] Block network egress in CI jobs to domains that aren't on an allowlist

Problem statement

CI jobs that hit unexpected network endpoints can let fragile, hard-to-maintain changes get merged. Examples include "hits NPM too much too often and gets rate limited" and "subtly breaks bbctl releases when attempting to hit a nonexistent URL" (bbctl#39).

Proposed solution

Borrow the "lock down egress to all but a short list of allowed domains" technique from the Big Bang umbrella chart's CI jobs. Those jobs can rely on Istio hardening to lock down egress, but many of our CI jobs don't (and shouldn't) run at the Big Bang cluster level, so we'd want to manage each job's allowlist in the GitLab job YAML definition.

Here's an approach that should make that possible, courtesy of @pjoyce over at CNAP:

  • Use iptables to lock down egress in the CI job containers up front
  • Use an entrypoint.sh inside those containers to unblock, at container startup time, any domains the CI job lists in an ALLOWED_DOMAINS env var
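
The two steps above could be sketched roughly as follows. This is a minimal illustration, not @pjoyce's actual entrypoint.sh: the function names, the overridable IPTABLES variable, the DNS and conntrack rules, and the use of glibc's `getent ahostsv4` for resolution are all assumptions. Only the ALLOWED_DOMAINS variable and the "Allowing domain (ip)" log format come from this issue.

```shell
#!/bin/sh
# Hypothetical entrypoint.sh sketch (assumed names throughout).
set -eu

# Assumption: overridable so the rules can be previewed with IPTABLES=echo.
IPTABLES="${IPTABLES:-iptables}"

# Print the IPv4 addresses a domain currently resolves to, one per line.
# Assumption: glibc getent is available in the image.
resolve_ipv4() {
  getent ahostsv4 "$1" | awk '{print $1}' | sort -u
}

# Drop all outbound traffic except loopback, DNS lookups, and packets
# that belong to connections which were already accepted.
lock_down_egress() {
  "$IPTABLES" -P OUTPUT DROP
  "$IPTABLES" -A OUTPUT -o lo -j ACCEPT
  "$IPTABLES" -A OUTPUT -p udp --dport 53 -j ACCEPT
  "$IPTABLES" -A OUTPUT -p tcp --dport 53 -j ACCEPT
  "$IPTABLES" -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
}

# Accept outbound traffic to every address behind each domain in a
# comma-separated list such as "docker.io,github.com".
allow_domains() {
  for domain in $(printf '%s' "$1" | tr ',' ' '); do
    for ip in $(resolve_ipv4 "$domain"); do
      echo "Allowing $domain ($ip)"
      "$IPTABLES" -A OUTPUT -d "$ip" -j ACCEPT
    done
  done
}

# A real entrypoint would finish with something like:
#   lock_down_egress
#   allow_domains "${ALLOWED_DOMAINS:-}"
#   exec "$@"
```

Because the rules are keyed on IP addresses resolved once at startup, a domain whose addresses rotate during a long job could become unreachable partway through. That trade-off is worth checking before relying on this shape.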

Example files and test log

Both of these files were written by @pjoyce. He is awesome!

Example: build the test image for the next two examples

> docker build -t restricted-network .

Example: Allowed domain

# run the test container with some allowed domains and confirm we can reach one

> docker run --rm -it --privileged \
    -e ALLOWED_DOMAINS="docker.io,github.com" \
    restricted-network bash

Allowing docker.io (100.29.167.200)
Allowing github.com (140.82.112.3)

root@ca55b13d2646:/# curl -I https://github.com

HTTP/2 200
server: GitHub.com
date: Wed, 05 Feb 2025 17:36:40 GMT
content-type: text/html; charset=utf-8
vary: X-PJAX, X-PJAX-Container, Turbo-Visit, Turbo-Fr

Example: No allowed domains

# run the test container with no allowed domains and confirm we can't reach one

> docker run --rm -i -t --privileged restricted-network bash

root@f5e08ada1d2f:/# curl --connect-timeout 2 -I https://github.com

curl: (28) Failed to connect to github.com port 443 after 2001 ms: Timeout was reached
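
The two manual curl checks above could also be scripted so a CI job verifies its own allowlist. The sketch below is only an illustration: the helper names egress_status and expect_egress are hypothetical, and "blocked" here means any curl failure (timeout, DNS error, TLS error), not specifically an iptables drop.

```shell
# Hypothetical helper: report whether a URL answers a HEAD request
# within a 2-second connect timeout.
egress_status() {
  if curl --silent --connect-timeout 2 -I "$1" >/dev/null 2>&1; then
    echo "reachable"
  else
    echo "blocked"
  fi
}

# Hypothetical helper: return non-zero when the observed status for a URL
# differs from the expected one ("reachable" or "blocked").
expect_egress() {
  actual=$(egress_status "$2")
  if [ "$actual" != "$1" ]; then
    echo "FAIL: $2 is $actual, expected $1" >&2
    return 1
  fi
  echo "ok: $2 is $1"
}
```

Inside a container started with ALLOWED_DOMAINS="docker.io,github.com", a job might then run `expect_egress reachable https://github.com` and `expect_egress blocked https://example.com`.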

Things I don't fully know how to check and plan around yet

Edited by Daniel Pritchett