UNCLASSIFIED - NO CUI

Compare revisions

Changes are shown as if the source revision was being merged into the target revision.
Commits on Source (10)
Showing changes with 464 additions and 1207 deletions
......@@ -31,4 +31,4 @@ ignore/*
*.code-workspace
# Local History for Visual Studio Code
.history/
\ No newline at end of file
.history/
......@@ -261,7 +261,7 @@ jaeger:
git:
repo: https://repo1.dso.mil/platform-one/big-bang/apps/core/jaeger.git
path: "./chart"
tag: "2.27.1-bb.4"
tag: "2.29.0-bb.0"
# -- Flux reconciliation overrides specifically for the Jaeger Package
flux:
......@@ -418,7 +418,7 @@ logging:
git:
repo: https://repo1.dso.mil/platform-one/big-bang/apps/core/elasticsearch-kibana.git
path: "./chart"
tag: "0.6.0-bb.2"
tag: "0.7.0-bb.0"
# -- Flux reconciliation overrides specifically for the Logging (EFK) Package
flux:
......@@ -553,7 +553,7 @@ monitoring:
git:
repo: https://repo1.dso.mil/platform-one/big-bang/apps/core/monitoring.git
path: "./chart"
tag: "32.2.1-bb.2"
tag: "33.2.0-bb.0"
# -- Flux reconciliation overrides specifically for the Monitoring Package
flux:
......@@ -847,10 +847,10 @@ addons:
# -- Name of AWS IAM profile to use.
# -- If using an AWS IAM profile, the accessKey and accessSecret values must be left as empty strings eg: ""
iamProfile: ""
redis:
# -- Redis plain text password to connect to the redis server. If empty (""), the gitlab charts will create the gitlab-redis-secret
# with a random password.
# with a random password.
# -- This needs to be set to a non-empty value in order for the Grafana Redis Datasource and Dashboards to be installed.
password: ""
......@@ -1032,7 +1032,7 @@ addons:
git:
repo: https://repo1.dso.mil/platform-one/big-bang/apps/security-tools/anchore-enterprise.git
path: "./chart"
tag: "1.15.0-bb.10"
tag: "1.17.1-bb.0"
# -- Flux reconciliation overrides specifically for the Anchore Package
flux:
......
# Guides
## backups-and-migrations
Guides on handling backups/migrations using Velero for specific Big Bang packages.
## deployment_scenarios
Beginner-friendly how-to guides are intended to be added to these subfolders over time.
......
# Migrating a Nexus Repository using Velero
This guide demonstrates how to perform a migration of Nexus repositories and
artifacts between Kubernetes clusters.
# Table of Contents
1. [Prerequisites/Assumptions](#prerequisitesassumptions)
1. [Preparation](#preparation)
2. [Backing Up Nexus](#backing-up-nexus)
3. [Restoring From Backup](#restoring-from-backup)
3. [Appendix](#appendix)
<a name="prerequisitesassumptions"></a>
# Prerequisites/Assumptions
# Migrating a Nexus Repository Using Velero
This guide demonstrates how to perform a migration of Nexus repositories and artifacts between Kubernetes clusters.
[[_TOC_]]
## Prerequisites/Assumptions
- K8s running in AWS
- Nexus PersistentVolume is using AWS EBS
- Migration is between clusters on the same AWS instance and availability zone (due to known Velero [limitations](https://velero.io/docs/v1.6/locations/#limitations--caveats))
- Migation occurs between K8s clusters with the same version
- Migration occurs between K8s clusters with the same version
- Velero CLI [tool](https://github.com/vmware-tanzu/velero/releases)
- Crane CLI [tool](https://github.com/google/go-containerregistry)
<a name="preparation"></a>
# Preparation
## Preparation
1. Ensure the Velero addon in the Big Bang values file is properly configured; a sample configuration is shown below:
```yaml
addons:
velero:
enabled: true
plugins:
- aws
values:
serviceAccount:
server:
name: velero
configuration:
provider: aws
backupStorageLocation:
bucket: nexus-velero-backup
volumeSnapshotLocation:
provider: aws
config:
region: us-east-1
credentials:
useSecret: true
secretContents:
cloud: |
[default]
aws_access_key_id = <CHANGE ME>
aws_secret_access_key = <CHANGE ME>
```
2. Manually create an S3 bucket that the backup configuration will be stored in (in this case it is named `nexus-velero-backup`), this should match the `configuration.backupStorageLocation.bucket` key above
3. The `credentials.secretContents.cloud` creds should have the necessary permissions to read/write to S3, volumes and volume snapshots
4. As a sanity check, take a look at the Velero logs to make sure the backup location (S3 bucket) is valid, you should see something like:
```yaml
addons:
velero:
enabled: true
plugins:
- aws
values:
serviceAccount:
server:
name: velero
configuration:
provider: aws
backupStorageLocation:
bucket: nexus-velero-backup
volumeSnapshotLocation:
provider: aws
config:
region: us-east-1
credentials:
useSecret: true
secretContents:
cloud: |
[default]
aws_access_key_id = <CHANGE ME>
aws_secret_access_key = <CHANGE ME>
```
1. Manually create an S3 bucket that the backup configuration will be stored in (in this case it is named `nexus-velero-backup`); this should match the `configuration.backupStorageLocation.bucket` key above
1. The `credentials.secretContents.cloud` credentials should have the necessary permissions to read/write to S3, volumes, and volume snapshots
1. As a sanity check, take a look at the Velero logs to make sure the backup location (S3 bucket) is valid; you should see something like:
```plaintext
level=info msg="Backup storage location valid, marking as available" backup-storage-location=default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
```
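As an additional check, the Velero CLI can query the backup storage location directly (an optional step; assumes the CLI is pointed at the same cluster); the location should report a phase of `Available`:
```shell
velero backup-location get
```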
5. Ensure there are images/artifacts in Nexus. An as example we will use the [Doom DOS image](https://earthly.dev/blog/dos-gaming-in-docker/) and a simple nginx image. Running `crane catalog nexus-docker.bigbang.dev` will show all of the artifacts and images in Nexus:
```
1. Ensure there are images/artifacts in Nexus. As an example, we will use the [Doom DOS image](https://earthly.dev/blog/dos-gaming-in-docker/) and a simple nginx image. Running `crane catalog nexus-docker.bigbang.dev` will show all of the artifacts and images in Nexus:
```console
repository/nexus-docker/doom-dos
repository/nexus-docker/nginx
```
<a name="backing-up-nexus"></a>
# Backing Up Nexus
## Backing Up Nexus
In the cluster containing the Nexus repositories to migrate, running the following command will create a backup called `nexus-ns-backup` and will back up all resources in the `nexus-repository-manager` namespace, including the associated PersistentVolume:
`velero backup create nexus-ns-backup --include-namespaces nexus-repository-manager --include-cluster-resources=true`
```shell
velero backup create nexus-ns-backup --include-namespaces nexus-repository-manager --include-cluster-resources=true
```
Specifically, this will back up all Nexus resources to the S3 bucket `configuration.backupStorageLocation.bucket` specified above and will create a volume snapshot of the Nexus EBS volume.
**Double-check** AWS to make sure this is the case by reviewing the contents of the S3 bucket:
`aws s3 ls s3://nexus-velero-backup --recursive --human-readable --summarize`
**Double-check** AWS to make sure this is the case by reviewing the contents of the S3 bucket:
```shell
aws s3 ls s3://nexus-velero-backup --recursive --human-readable --summarize
```
Expected output:
```
```console
backups/nexus-ns-backup/nexus-ns-backup-csi-volumesnapshotcontents.json.gz
backups/nexus-ns-backup/nexus-ns-backup-csi-volumesnapshots.json.gz
backups/nexus-ns-backup/nexus-ns-backup-logs.gz
......@@ -81,51 +87,56 @@ backups/nexus-ns-backup/nexus-ns-backup-volumesnapshots.json.gz
backups/nexus-ns-backup/nexus-ns-backup.tar.gz
backups/nexus-ns-backup/velero-backup.json
```
Also ensure an EBS volume snapshot has been created and the Snapshot status is `Completed`.
![volume-snapshot](images/volume-snapshot.png)
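The snapshot can also be checked from the CLI (a sketch; the region and the `velero.io/backup` tag filter are assumptions based on the configuration above — drop the filter and match on the snapshot tags manually if your snapshots are tagged differently):
```shell
aws ec2 describe-snapshots --region us-east-1 \
  --filters "Name=tag:velero.io/backup,Values=nexus-ns-backup" \
  --query 'Snapshots[].{Id:SnapshotId,State:State,Started:StartTime}'
```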
<a name="restoring-from-backup"></a>
# Restoring From Backup
## Restoring From Backup
1. In the new cluster, ensure that Nexus and Velero are running and healthy
* It is critical to ensure that Nexus has been included in the new cluster's Big Bang deployment, otherwise the restored Nexus configuration will not be managed by the Big Bang Helm chart.
2. If you are using the same `velero.values` from above, Velero should automatically be configured to use the same backup location as before. Verify this with `velero backup get` and you should see output that looks like:
```
- It is critical to ensure that Nexus has been included in the new cluster's Big Bang deployment, otherwise the restored Nexus configuration will not be managed by the Big Bang Helm chart.
1. If you are using the same `velero.values` from above, Velero should automatically be configured to use the same backup location as before. Verify this with `velero backup get` and you should see output that looks like:
```console
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
nexus-ns-backup Completed 0 0 2022-02-08 12:34:46 +0100 CET 29d default <none>
```
3. To perform the migration, Nexus must be shut down. In the Nexus Deployment, bring the `spec.replicas` down to `0`.
4. Ensure that the Nexus PVC and PV are also removed (**you may have to delete these manually!**), and that the corresponding Nexus EBS volume has been deleted.
* If you have to remove the Nexus PV and PVC manually, delete the PVC first, which should cascade to the PV; then, manually delete the underlying EBS volume (if it still exists)
1. To perform the migration, Nexus must be shut down. In the Nexus Deployment, bring the `spec.replicas` down to `0`.
1. Ensure that the Nexus PVC and PV are also removed (**you may have to delete these manually!**), and that the corresponding Nexus EBS volume has been deleted.
- If you have to remove the Nexus PV and PVC manually, delete the PVC first, which should cascade to the PV; then, manually delete the underlying EBS volume (if it still exists)
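A sketch of these two steps using kubectl (resource names are assumptions based on a default Big Bang install; confirm them with `kubectl get deploy,pvc -n nexus-repository-manager`):
```shell
# Scale Nexus down to zero replicas
kubectl scale deployment nexus-repository-manager -n nexus-repository-manager --replicas=0
# Delete the PVC first; this should cascade to the PV (verify the EBS volume separately)
kubectl delete pvc --all -n nexus-repository-manager
```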
5. Now that Nexus is down and the new cluster is configured to use the same backup location as the old one, perform the migration by running:
1. Now that Nexus is down and the new cluster is configured to use the same backup location as the old one, perform the migration by running:
`velero restore create --from-backup nexus-ns-backup`
6. The Nexus PV and PVC should be recreated (verify before continuing!), but the pod will fail to start due to the previous change in the Nexus deployment spec. Change the Nexus deployment `spec.replicas` back to `1`. This will bring up the Nexus pod which should connect to the PVC and PV created during the Velero restore.
7. Once the Nexus pod is running and healthy, log in to Nexus and verify that the repositories have been restored
* The credentials to log in will have been restored from the Nexus backup, so they should match the credentials of the Nexus that was migrated (not the new installation!)
* It is recommended to log in to Nexus and download a sampling of images/artifacts to ensure they are working as expected.
1. The Nexus PV and PVC should be recreated (verify before continuing!), but the pod will fail to start due to the previous change in the Nexus deployment spec. Change the Nexus deployment `spec.replicas` back to `1`. This will bring up the Nexus pod which should connect to the PVC and PV created during the Velero restore.
1. Once the Nexus pod is running and healthy, log in to Nexus and verify that the repositories have been restored
- The credentials to log in will have been restored from the Nexus backup, so they should match the credentials of the Nexus that was migrated (not the new installation!)
- It is recommended to log in to Nexus and download a sampling of images/artifacts to ensure they are working as expected.
For example, log in to Nexus using the migrated credentials:
`docker login -u admin -p admin nexus-docker.bigbang.dev/repository/nexus-docker`
Running `crane catalog nexus-docker.bigbang.dev` should show the same output as before:
```text
```console
repository/nexus-docker/doom-dos
repository/nexus-docker/nginx
```
To ensure the integrity of the migrated image, we will pull and run the `doom-dos` image and defeat evil!
```
```shell
docker pull nexus-docker.bigbang.dev/repository/nexus-docker/doom-dos:latest && \
docker run -p 8000:8000 nexus-docker.bigbang.dev/repository/nexus-docker/doom-dos:latest
```
<img src="images/doom.png" alt="doom" width="750"/>
<a name="appendix"></a>
# Appendix
### Sample Nexus values:
![doom](images/doom.png "doom")
## Appendix
### Sample Nexus values
```yaml
addons:
nexus:
......@@ -137,4 +148,4 @@ addons:
registries:
- host: nexus-docker.bigbang.dev
port: 5000
```
\ No newline at end of file
```
......@@ -163,7 +163,7 @@ Big Bang will automatically create a secret with the TLS key and cert provided f
### Virtual Services
Virtual services use full URL host and path information to route incoming traffic to a Service. Each package in Big Bang manages its own Virtual Services since the paths and ports vary for each package. However, in order to receive traffic at the Virtual Service, it must be connected to a Gateway. In Big Bang we configure this under each package. The followng is an example of this configuration that matches the architecture diagram above.
Virtual services use full URL host and path information to route incoming traffic to a Service. Each package in Big Bang manages its own Virtual Services since the paths and ports vary for each package. However, in order to receive traffic at the Virtual Service, it must be connected to a Gateway. In Big Bang we configure this under each package. The following is an example of this configuration that matches the architecture diagram above.
```yaml
monitoring:
......
# How to monitor Pod Resource using Grafana
# How To Monitor Pod Resource Using Grafana
1. Log in to the Grafana URL with credentials. \
To Get Grafana credentals: \
To Get Grafana credentials: \
Username:
```
```shell
kubectl get secret monitoring-monitoring-grafana -o jsonpath='{.data.admin-user}' | base64 -d
```
Password:
```
```shell
kubectl get secret monitoring-monitoring-grafana -o jsonpath='{.data.admin-password}' | base64 -d
```
Or [review password value within helm chart](https://repo1.dso.mil/platform-one/big-bang/apps/core/monitoring/-/blob/main/chart/values.yaml#L708)
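If an ingress hostname is not yet configured, Grafana can also be reached with a port-forward (a sketch; the service name and `monitoring` namespace are assumptions based on a default Big Bang install, verify with `kubectl get svc -n monitoring`):
```shell
kubectl port-forward svc/monitoring-monitoring-grafana -n monitoring 3000:80
# then browse to http://localhost:3000
```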
2. Once logged in and directed to the home page, click the menu Dashboard and then select Manage. \
1. Once logged in and directed to the home page, click the menu Dashboard and then select Manage. \
![Manage Dashboard Screenshot](docs/guides/prerequisites/grafana-dashboard-manage.jpeg)
3. From the Dashboard select Kubernetes/Compute Resource / Pod . \
1. From the Dashboard, select Kubernetes / Compute Resources / Pod. \
This opens a dashboard for monitoring pod resources: CPU Usage, CPU Throttling, CPU Quota, Memory Usage, Memory Quota, etc. \
![Pod Resource Grafana Screenshot](docs/guides/prerequisites/grafana-dashboard.jpeg)
# Default Storage Class prerequisite
# Default Storage Class Prerequisite
* BigBang assumes the cluster you're deploying to supports [dynamic volume provisioning](https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/).
* A BigBang Cluster should have 1 Storage Class annotated as the default SC.
* For Production Deployments it is recommended to leverage a Storage Class that supports the creation of volumes that support ReadWriteMany [Access Mode](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes), as there are a few BigBang Addons, where an HA application configuration requires a storage class that supports ReadWriteMany.
* For Production Deployments it is recommended to leverage a Storage Class that supports the creation of volumes with ReadWriteMany [Access Mode](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes), as there are a few BigBang Addons where an HA application configuration requires a storage class that supports ReadWriteMany.
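To mark a storage class as the cluster default, the standard Kubernetes annotation can be applied (a sketch; replace `local-path` with the name of your storage class):
```shell
kubectl patch storageclass local-path \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```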
## How Dynamic Volume Provisioning Works in a Nut Shell
## How Dynamic volume provisioning works in a nut shell
* StorageClass + PersistentVolumeClaim = Dynamically Created Persistent Volume
* A PersistentVolumeClaim that does not reference a specific StorageClass will leverage the default StorageClass. (Of which there should only be 1, identified using kubernetes annotations.) Some Helm Charts allow a storage class to be explicitly specified so that multiple storage classes can be used simultaneously.
* A PersistentVolumeClaim that does not reference a specific StorageClass will leverage the default StorageClass (there should be only one, identified using Kubernetes annotations). Some Helm Charts allow a storage class to be explicitly specified so that multiple storage classes can be used simultaneously.
## How To Check What Storage Classes Are Installed on Your Cluster
## How to check what storage classes are installed on your cluster
* `kubectl get storageclass` can be used to see what storage classes are available on a cluster, the default will be marked as such.
* `kubectl get storageclass` can be used to see what storage classes are available on a cluster; the default will be marked as such.
* Note: You can have multiple storage classes, but you should only have 1 default storage class.
```bash
```shell
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
# local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 47h
......@@ -23,6 +25,7 @@ kubectl get storageclass
## AWS Specific Notes
### Example AWS Storage Class Configuration
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
......@@ -40,29 +43,34 @@ reclaimPolicy: Retain
allowVolumeExpansion: true
```
### AWS EBS Volumes:
### AWS EBS Volumes
* AWS EBS Volumes have the following limitations:
* An EBS volume can only be attached to a single Kubernetes Node at a time, thus ReadWriteMany Access Mode isn't supported.
* An EBS PersistentVolume in AZ1 (Availability Zone 1) cannot be mounted by a worker node in AZ2.
### AWS EFS Volumes:
### AWS EFS Volumes
* An AWS EFS Storage Class can be installed according to the [vendors docs](https://github.com/kubernetes-sigs/aws-efs-csi-driver#installation).
* AWS EFS Storage Class supports ReadWriteMany Access Mode.
* AWS EFS Storage Class supports ReadWriteMany Access Mode.
* AWS EFS Persistent Volumes can be mounted by worker nodes in multiple AZs.
* AWS EFS is basically NFS (Network File System) as a Service. NFS cons like latency apply equally to EFS, so it's not a good fit for databases.
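For reference, a minimal EFS StorageClass might look like the following (a sketch based on the EFS CSI driver docs; the filesystem ID is a placeholder you must replace):
```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap              # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0    # placeholder; use your EFS filesystem ID
  directoryPerms: "700"
```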
------------------------------------------------------
## Azure Specific Notes
### Azure Disk Storage Class Notes
* The Kubernetes Docs offer an Example [Azure Disk Storage Class](https://kubernetes.io/docs/concepts/storage/storage-classes/#azure-disk)
* An Azure disk can only be mounted with Access mode type ReadWriteOnce, which makes it available to one node in AKS.
* An Azure disk can only be mounted with Access mode type ReadWriteOnce, which makes it available to one node in AKS.
* An Azure Disk PersistentVolume in AZ1 can be mounted by a worker node in AZ2 (although some additional lag is involved in such transitions).
------------------------------------------------------
## Bare Metal/Cloud Agnostic Storage Class Notes
* The BigBang Product team put together a [Comparison Matrix of a few Cloud Agnostic Storage Class offerings](../../k8s-storage/README.md#kubernetes-storage-options)
* Note: No storage class specific container images exist in IronBank at this time.
* Approved IronBank Images will show up in https://registry1.dso.mil
* https://repo1.dso.mil/dsop can be used to check status of IronBank images.
* Note: No storage class specific container images exist in IronBank at this time.
* Approved IronBank Images will show up in <https://registry1.dso.mil>
* <https://repo1.dso.mil/dsop> can be used to check status of IronBank images.
# Install the flux cli tool
# Install the Flux CLI Tool
```bash
```shell
sudo curl -s https://fluxcd.io/install.sh | sudo bash
```
> Fedora Note: kubectl is a prereq for flux, and flux expects it in `/usr/local/bin/kubectl` symlink it or copy the binary to fix errors.
## Install flux.yaml to the cluster
```bash
> Fedora Note: kubectl is a prerequisite for flux, and flux expects it at `/usr/local/bin/kubectl`; symlink it or copy the binary there to fix errors.
## Install flux.yaml to the Cluster
```shell
export REGISTRY1_USER='REPLACE_ME'
export REGISTRY1_TOKEN='REPLACE_ME'
```
> In production, use robot credentials; single quotes are important due to the `$`
`export REGISTRY1_USER='robot$bigbang-onboarding-imagepull'`
```bash
```shell
kubectl create ns flux-system
kubectl create secret docker-registry private-registry \
--docker-server=registry1.dso.mil \
......@@ -23,25 +25,22 @@ kubectl create secret docker-registry private-registry \
--namespace flux-system
kubectl apply -f https://repo1.dso.mil/platform-one/big-bang/bigbang/-/raw/master/scripts/deploy/flux.yaml
```
> k apply -f flux.yaml, is equivalent to "flux install", but it installs a version of flux that's been tested and gone through IronBank.
> `kubectl apply -f flux.yaml` is equivalent to `flux install`, but it installs a version of flux that's been tested and gone through IronBank.
#### Now you can see new CRD objects types inside of the cluster
```bash
### Now You Can See New CRD Object Types Inside of the Cluster
```shell
kubectl get crds | grep flux
```
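You should see output similar to the following (each line also includes a CREATED AT timestamp; the exact list depends on the flux version shipped with Big Bang):
```console
gitrepositories.source.toolkit.fluxcd.io
helmcharts.source.toolkit.fluxcd.io
helmreleases.helm.toolkit.fluxcd.io
helmrepositories.source.toolkit.fluxcd.io
kustomizations.kustomize.toolkit.fluxcd.io
```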
# Advanced Installation
## Advanced Installation
Clone the Big Bang repo and use the awesome installation [scripts](https://repo1.dso.mil/platform-one/big-bang/bigbang/-/tree/master/scripts) directory
```bash
```shell
git clone https://repo1.dso.mil/platform-one/big-bang/bigbang.git
./bigbang/scripts/install_flux.sh
```
> **NOTE** install_flux.sh requires arguments to run properly, calling it will print out a friendly USAGE mesage with required arguments needed to complete installation.
> **NOTE** install_flux.sh requires arguments to run properly; calling it without them will print a friendly USAGE message with the required arguments needed to complete installation.
# Kubernetes Cluster Preconfiguration:
# Kubernetes Cluster Preconfiguration
## Best Practices
## Best Practices:
* A CNI (Container Network Interface) that supports Network Policies (which are basically firewalls for the Inner Cluster Network.) (Note: k3d, which is recommended for the quickstart demo, defaults to flannel, which does not support network policies.)
* All Kubernetes Nodes and the LB associated with the kube-apiserver should use private IPs.
* In most cases, User Application Facing LBs should have Private IP Addresses and be paired with a defense-in-depth Ingress Protection mechanism like [P1's CNAP](https://p1.dso.mil/#/products/cnap/), a CNAP equivalent (Advanced Edge Firewall), VPN, VDI, port forwarding through a bastion, or air gap deployment.
......@@ -10,13 +10,14 @@
* Consider using a licensed Kubernetes Distribution with a support contract.
* [A default storage class should exist](default_storageclass.md) to support dynamic provisioning of persistent volumes.
## Service of Type Load Balancer
## Service of Type Load Balancer:
BigBang's default configuration assumes the cluster you're deploying to supports dynamic load balancer provisioning. Specifically Istio defaults to creating a Kubernetes Service of type Load Balancer, which usually creates an endpoint exposed outside of the cluster that can direct traffic inside the cluster to the istio ingress gateway.
How a Kubernetes service of type LB works depends on implementation details; there are many ways of getting it to work, and common methods are listed below:
* CSP API Method: (Recommended option for Cloud Deployments)
The Kubernetes Control Plane has a --cloud-provider flag that can be set to aws, azure, etc. If the Kubernetes Master Nodes have that flag set and CSP IAM rights. The control plane will auto provision and configure CSP LBs. (Note: a Vendors Kubernetes Distro automation, may have IaC/CaC defaults that allow this to work turn key, but if you have issues when provisioning LBs, consult with the Vendor's support for the recommended way of configuring automatic LB provisioning.)
The Kubernetes Control Plane has a --cloud-provider flag that can be set to aws, azure, etc. If the Kubernetes Master Nodes have that flag set and CSP IAM rights, the control plane will auto-provision and configure CSP LBs. (Note: a Vendor's Kubernetes Distribution automation may have IaC/CaC defaults that allow this to work turn key, but if you have issues when provisioning LBs, consult the Vendor's support for the recommended way of configuring automatic LB provisioning.)
* External LB Method: (Good for bare metal and 0 IAM rights scenarios)
You can override BigBang's helm values so istio will provision a service of type NodePort instead of type LoadBalancer. Instead of randomly generating from the port range of 30000 - 32768, the NodePorts can be pinned to convention-based port numbers like 30080 & 30443. If you're in a restricted cloud environment or on bare metal, you can ask someone to provision a CSP LB where LB:443 would map to NodePort:30443 (of every worker node), etc.
* No LB, Network Routing Methods: (Good options for bare metal)
......@@ -24,81 +25,87 @@ You can override bigbang's helm values so istio will provision a service of type
* [kubevip](https://kube-vip.io/)
* [kube-router](https://www.kube-router.io)
## BigBang Doesn’t Support PSPs (Pod Security Policies)
## BigBang doesn't support PSPs (Pod Security Policies):
* [PSPs are being removed from Kubernetes and will be gone by version 1.25.x](https://repo1.dso.mil/platform-one/big-bang/bigbang/-/issues/10)
* [Open Policy Agent Gatekeeper can enforce the same security controls as PSPs](https://github.com/open-policy-agent/gatekeeper-library/tree/master/library/pod-security-policy#pod-security-policies), and is a core component of BigBang, which operates as an elevated [validating admission controller](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) to audit and enforce various [constraints](https://github.com/open-policy-agent/frameworks/tree/master/constraint) on all requests sent to the kubernetes api server.
* We recommend users disable PSPs completely given they're being removed, we have a replacement, and PSPs can prevent OPA from deploying (and if OPA is not able to deploy, nothing else gets deployed).
* Different ways of Disabling PSPs:
* Edit the kube-apiserver's flags (methods for doing this varry per distro.)
* ```bash
* Edit the kube-apiserver's flags (methods for doing this vary per distribution.)
* ```shell
kubectl patch psp system-unrestricted-psp -p '{"metadata": {"annotations":{"seccomp.security.alpha.kubernetes.io/allowedProfileNames": "*"}}}'
kubectl patch psp global-unrestricted-psp -p '{"metadata": {"annotations":{"seccomp.security.alpha.kubernetes.io/allowedProfileNames": "*"}}}'
kubectl patch psp global-restricted-psp -p '{"metadata": {"annotations":{"seccomp.security.alpha.kubernetes.io/allowedProfileNames": "*"}}}'
```
## Kubernetes Distribution Specific Notes
* Note: P1 has forks of various [Kubernetes Distribution Vendor Repos](https://repo1.dso.mil/platform-one/distros); there's nothing special about the P1 forks.
* We recommend you leverage the Vendor's upstream docs in addition to any docs found in P1 Repos; in fact, the Vendor's upstream docs are far more likely to be up to date.
### VMWare Tanzu Kubernetes Grid:
### VMWare Tanzu Kubernetes Grid
[Prerequisites section of the VMware Kubernetes Distribution Docs](https://repo1.dso.mil/platform-one/distros/vmware/tkg#prerequisites)
### Cluster API
* Note that there are some OS hardening and VM Image Build automation tools in here, in addition to Cluster API.
* https://repo1.dso.mil/platform-one/distros/clusterapi
* https://repo1.dso.mil/platform-one/distros/cluster-api/gov-image-builder
* <https://repo1.dso.mil/platform-one/distros/clusterapi>
* <https://repo1.dso.mil/platform-one/distros/cluster-api/gov-image-builder>
### OpenShift
OpenShift
1) When deploying BigBang, set the OpenShift flag to true.
```
# inside a values.yaml being passed to the command installing bigbang
openshift: true
```yaml
# inside a values.yaml being passed to the command installing bigbang
openshift: true
```
# OR inline with helm command
helm install bigbang chart --set openshift=true
```
```shell
# OR inline with helm command
helm install bigbang chart --set openshift=true
```
2) Patch the istio-cni daemonset to allow containers to run privileged (AFTER istio-cni daemonset exists).
1) Patch the istio-cni daemonset to allow containers to run privileged (AFTER istio-cni daemonset exists).
Note: applying this setting via modifications to the helm chart was attempted unsuccessfully; online patching succeeded.
```
kubectl get daemonset istio-cni-node -n kube-system -o json | jq '.spec.template.spec.containers[] += {"securityContext":{"privileged":true}}' | kubectl replace -f -
```
3) Modify the OpenShift cluster(s) with the following scripts based on https://istio.io/v1.7/docs/setup/platform-setup/openshift/
```shell
kubectl get daemonset istio-cni-node -n kube-system -o json | jq '.spec.template.spec.containers[] += {"securityContext":{"privileged":true}}' | kubectl replace -f -
```
```
# Istio Openshift configurations Post Install
oc -n istio-system expose svc/public-ingressgateway --port=http2
oc adm policy add-scc-to-user privileged -z istio-cni -n kube-system
oc adm policy add-scc-to-group privileged system:serviceaccounts:logging
oc adm policy add-scc-to-group anyuid system:serviceaccounts:logging
oc adm policy add-scc-to-group privileged system:serviceaccounts:monitoring
oc adm policy add-scc-to-group anyuid system:serviceaccounts:monitoring
cat <<\EOF >> NetworkAttachmentDefinition.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: istio-cni
EOF
oc -n logging create -f NetworkAttachmentDefinition.yaml
oc -n monitoring create -f NetworkAttachmentDefinition.yaml
```
1) Modify the OpenShift cluster(s) with the following scripts based on <https://istio.io/v1.7/docs/setup/platform-setup/openshift/>
```shell
# Istio Openshift configurations Post Install
oc -n istio-system expose svc/public-ingressgateway --port=http2
oc adm policy add-scc-to-user privileged -z istio-cni -n kube-system
oc adm policy add-scc-to-group privileged system:serviceaccounts:logging
oc adm policy add-scc-to-group anyuid system:serviceaccounts:logging
oc adm policy add-scc-to-group privileged system:serviceaccounts:monitoring
oc adm policy add-scc-to-group anyuid system:serviceaccounts:monitoring
cat <<\EOF >> NetworkAttachmentDefinition.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: istio-cni
EOF
oc -n logging create -f NetworkAttachmentDefinition.yaml
oc -n monitoring create -f NetworkAttachmentDefinition.yaml
```
### Konvoy
* [Prerequistes can be found here](https://repo1.dso.mil/platform-one/distros/d2iq/konvoy/konvoy/-/tree/master/docs/1.5.0#prerequisites)
* [Prerequisites can be found here](https://repo1.dso.mil/platform-one/distros/d2iq/konvoy/konvoy/-/tree/master/docs/1.5.0#prerequisites)
* Konvoy clusters need a [Metrics API Endpoint](https://github.com/kubernetes/metrics#resource-metrics-api) available within the cluster to allow Horizontal Pod Autoscalers to correctly fetch pod/deployment metrics.
* [Different Deployment Scenarios have been documented here](https://repo1.dso.mil/platform-one/distros/d2iq/konvoy/konvoy/-/tree/master/docs/1.4.4/install)
### RKE2
* RKE2 turns PSPs on by default (see above for tips on disabling)
* RKE2 sets selinux to enforcing by default ([see os_preconfiguration.md for selinux config](os_preconfiguration.md))
......@@ -115,4 +122,3 @@ cloud-provider-config: ...
For example, if using the aws terraform modules provided [on repo1](https://repo1.dso.mil/platform-one/distros/rancher-federal/rke2/rke2-aws-terraform), setting the variable `enable_ccm = true` will ensure all the necessary resource tags are applied.
In the absence of an in-tree cloud provider (such as on-prem), the requirements can be met by ensuring a default storage class and automatic load balancer provisioning exist.
# OS Configuration Pre-Requisites:
# OS Configuration Pre-Requisites
## Disable Swap (Kubernetes Best Practice)
## Disable swap (Kubernetes Best Practice)
1. Identify configured swap devices and files with `cat /proc/swaps`.
2. Turn off all swap devices and files with `swapoff -a`.
3. Remove any matching reference found in `/etc/fstab`.
(Credit: Above copy pasted from Aaron Copley of [Serverfault.com](https://serverfault.com/questions/684771/best-way-to-disable-swap-in-linux))
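A minimal sketch of those three steps (review `/etc/fstab` yourself before editing; the commands assume a standard Linux host):
```shell
cat /proc/swaps   # identify configured swap devices and files
sudo swapoff -a   # turn off all swap devices and files
# then remove or comment out any swap entries in /etc/fstab (edit the file manually)
```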
## ECK Specific Configuration (ECK Is a Core BB App)
## ECK specific configuration (ECK is a Core BB App):
Elastic Cloud on Kubernetes (Elasticsearch Operator) deployed by BigBang uses memory mapping by default. In most cases, the default address space is too low and must be configured.
To avoid running containers with unnecessary privilege escalation, these kernel settings should be applied before BigBang is deployed:
```bash
```shell
sudo sysctl -w vm.max_map_count=262144 #(ECK crash loops without this)
```
More information can be found from elasticsearch's documentation [here](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-virtual-memory.html#k8s-virtual-memory)
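To persist the setting across reboots, it can also be written to a sysctl configuration file (a sketch; assumes an OS that reads `/etc/sysctl.d`):
```shell
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-elasticsearch.conf
sudo sysctl --system
```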
### AKS Configuration
### AKS Configuration
Ensure this block is present in the terraform configuration for the `azurerm_kubernetes_cluster_node_pool` resource section for your AKS cluster:
......@@ -30,25 +30,25 @@ linux_os_config {
}
```
## SELinux Specific Configuration
## SELinux specific configuration:
* If SELinux is enabled and the OS hasn't received additional pre-configuration, then users will see istio init-container crash loop.
* Depending on security requirements it may be possible to set selinux in permissive mode: `sudo setenforce 0`.
* Additional OS and Kubernetes specific configuration are required for istio to work on systems with selinux set to `Enforcing`.
By default, BigBang will deploy istio configured to use `istio-init` (read more [here](https://istio.io/latest/docs/setup/additional-setup/cni/)). To ensure istio can properly initialize enovy sidecars without container privileged escalation permissions, several system kernel modules must be pre-loaded before installing BigBang:
By default, BigBang will deploy istio configured to use `istio-init` (read more [here](https://istio.io/latest/docs/setup/additional-setup/cni/)). To ensure istio can properly initialize envoy sidecars without container privileged escalation permissions, several system kernel modules must be pre-loaded before installing BigBang:
```bash
```shell
modprobe xt_REDIRECT
modprobe xt_owner
modprobe xt_statistic
```
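To persist module loading across reboots, the modules can be listed in a modules-load.d file (a sketch; assumes a systemd-based OS):
```shell
cat <<EOF | sudo tee /etc/modules-load.d/istio.conf
xt_REDIRECT
xt_owner
xt_statistic
EOF
```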
## Sonarqube Specific Configuration (Sonarqube Is a BB Addon App)
## Sonarqube specific configuration (Sonarqube is a BB Addon App):
Sonarqube requires the following kernel configurations set at the node level:
Sonarqube requires the following kernel configurations set at the node level:
```bash
```shell
sysctl -w vm.max_map_count=524288
sysctl -w fs.file-max=131072
ulimit -n 131072
......@@ -64,5 +64,5 @@ addons:
initSysctl:
enabled: true
```
**This is not the recommended solution as it requires running an init container as privileged.**
**This is not the recommended solution as it requires running an init container as privileged.**
# Credentials for Big Bang Packages
This document includes details on credentials to access each package in a default install (without SSO). It is safe to assume that any packages not listed in the two categories below either have no need for authentication or use different methods (ex: velero require kubectl access).
This document includes details on credentials to access each package in a default install (without SSO). It is safe to assume that any packages not listed in the two categories below either have no need for authentication or use different methods (ex: Velero requires kubectl access).
## Packages with no built in authentication
## Packages With No Built in Authentication
Although the below applications have no built in authentication, Big Bang's helm values can be configured to deploy authservice in front of these endpoints. Authservice is an Authentication Proxy that can integrate with SSO providers like Keycloak.
......@@ -10,7 +10,7 @@ Although the below applications have no built in authentication, Big Bang's helm
- Monitoring (Prometheus)
- Monitoring (Alertmanager)
## Packages with built in authentication
## Packages With Built in Authentication
The applications in the table below provide both SSO and built in auth. The table gives default credentials and ways to access and/or override those.
......
# ImagePullPolicy at Big Bang Level
# ImagePullPolicy for Big Bang
## Setting ImagePullPolicy at the Big Bang Level
Big Bang is currently working to standardize the adoption of a global image pull policy so that customers can set a single value and have it passed to all packages.
......@@ -6,7 +8,7 @@ The global image pull policy has been adopted in Big Bang for the core packages
We have also documented the package overrides required if you want to set a single package/pod with a different pull policy than the global.
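For reference, the global setting looks roughly like the following in the Big Bang values (a sketch; confirm the top-level key name against your chart version's values.yaml):
```yaml
# Assumed top-level Big Bang value for the global pull policy
imagePullPolicy: IfNotPresent
```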
# ImagePullPolicy per Package
## Setting ImagePullPolicy per Package
| Package | Default | Value Override |
|---|---|---|
......@@ -16,7 +18,7 @@ We have also documented the package overrides required if you want to set a sing
| Kiali | `IfNotPresent` | <pre lang="yaml">kiali:<br> values:<br> image:<br> pullPolicy: IfNotPresent<br> cr:<br> spec:<br> deployment:<br> image_pull_policy: IfNotPresent</pre> |
| Cluster Auditor | `Always` | <pre lang="yaml">clusterAuditor:<br> values:<br> image:<br> imagePullPolicy: IfNotPresent</pre> |
| OPA Gatekeeper | `IfNotPresent` | <pre lang="yaml">gatekeeper:<br> values:<br> postInstall:<br> labelNamespace:<br> image:<br> pullPolicy: IfNotPresent<br> postUpgrade:<br> cleanupCRD:<br> image:<br> pullPolicy: IfNotPresent<br> image:<br> pullPolicy: IfNotPresent</pre> |
| Kyverno | `IfNotPresent` | <pre lang="yaml">addons:<br> kyverno:<br> values:<br> image:<br> pullPolicy: IfNotPresent<br> initImage:<br> pullPolicy: IfNotPresent</pre> |
| Kyverno | `IfNotPresent` | <pre lang="yaml">addons:<br> kyverno:<br> values:<br> image:<br> pullPolicy: IfNotPresent<br> initImage:<br> pullPolicy: IfNotPresent</pre> |
| Elasticsearch / Kibana | `IfNotPresent` | <pre lang="yaml">logging:<br> values:<br> imagePullPolicy: IfNotPresent</pre> |
| ECK Operator | `IfNotPresent` | <pre lang="yaml">eckoperator:<br> values:<br> image:<br> pullPolicy: IfNotPresent</pre> |
| Fluentbit | `Always` | <pre lang="yaml">fluentbit:<br> values:<br> image:<br> pullPolicy: IfNotPresent</pre> |
......
......@@ -153,6 +153,7 @@ gatekeeper:
- kiali/kiali-operator-cypress-test
- mattermost/mattermost-cypress-test
- keycloak/keycloak-cypress-test
- jaeger/jaeger-operator-cypress-test
# Allow kyverno test vectors for Helm test
- default/restrict-host-path-mount-.?
- default/restrict-host-path-write-.?
......@@ -243,6 +244,7 @@ gatekeeper:
- kiali/kiali-operator-cypress-test
- mattermost/mattermost-cypress-test
- keycloak/keycloak-cypress-test
- jaeger/jaeger-operator-cypress-test
# Allow kyverno test vectors for Helm test
- default/restrict-host-path-mount-.?
- default/restrict-host-path-write-.?
......@@ -346,6 +348,7 @@ kyvernopolicies:
- mattermost
- nexus-repository-manager
- keycloak
- jaeger
names:
- "*-cypress-test*"
parameters:
......@@ -361,6 +364,7 @@ kyvernopolicies:
- mattermost
- nexus-repository-manager
- keycloak
- jaeger
names:
- "*-cypress-test*"
parameters:
......@@ -391,6 +395,7 @@ kyvernopolicies:
- mattermost
- nexus-repository-manager
- keycloak
- jaeger
names:
- "*-cypress-test*"
update-image-pull-policy:
......