Release 2.12.0


Release Process

Foreword

*** Important: Before you begin to work on a release be sure that you have all the tools listed here installed and that you're able to connect to the Dogfood cluster (see instructions here). It's a good idea to do this before the day you're scheduled to start working on a release in case you run into any problems. ***
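One way to confirm your tooling ahead of release day is a small preflight check. The sketch below is only illustrative: `missing_tools` is a hypothetical helper, and the tool names passed to it are an assumption inferred from the commands that appear later in this document, not the official tool list linked above.

```shell
#!/bin/sh
# missing_tools TOOL...: print each named command that is not found on PATH.
# Hypothetical helper; the caller decides which tools to check.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool names below are assumed from commands used in this document.
missing=$(missing_tools git kubectl flux sops docker jq velero psql)
if [ -n "$missing" ]; then
  printf 'Not found on PATH:\n%s\n' "$missing"
else
  echo "All listed tools found."
fi
```

This only checks that the binaries exist. It does not verify versions or that you can reach the Dogfood cluster.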

DO NOT blindly copy and paste commands prepended with $. Carefully read and evaluate these commands and execute them if applicable.
This upgrade process is applicable only to minor Big Bang version upgrades of Big Bang major versions 1 and 2. It is subject to change in future major versions. Patch version upgrades follow a different process.
You can use the R2D2 release automation tool to automate certain parts of this release process. Follow the README for instructions on how to clone and install R2D2. Usage of R2D2 is denoted in the document by the R2-D2 prefix.
The release branch format is as follows: release-<major>.<minor>.x where <minor> is the minor release number. Example: release-2.7.x. The <minor> string is a placeholder for the minor release number in this document.
The release tag format for major version 2 of Big Bang is: 2.<minor>.0.
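The two naming rules above can be sketched as a pair of small shell functions. The function names `release_branch` and `release_tag` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers that spell out the naming rules described above.

# release_branch MAJOR MINOR -> release-<major>.<minor>.x
release_branch() {
  printf 'release-%s.%s.x\n' "$1" "$2"
}

# release_tag MAJOR MINOR -> <major>.<minor>.0
release_tag() {
  printf '%s.%s.0\n' "$1" "$2"
}

release_branch 2 7   # prints: release-2.7.x
release_tag 2 7      # prints: 2.7.0
```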

NOTE: As you work through the release make a list of pain points / unclear steps. Once the release is complete, provide this feedback to the maintainers via MM and/or an MR to update the documentation. This allows us to continuously improve this process for future release engineers.

1. Release Prep

Check last release SHAs

Verify that the previous release branch commit hash matches the last release tag hash. Investigate with previous release engineer if they do not match. There are two ways you can do this:
1. Manually
  - Go to Branches in Big Bang (the product).
  - Find the latest release branch. It should be in the format release-<major_version>.<minor_version>.x (e.g. release-2.7.x). You will see an eight-character commit hash below it (e.g. 6c746edd)
  - Go to Tags in a separate tab. Here you will see the latest release tag listed first. Verify that this tag has the same hash you found in the previous step.
2. Automatically with R2D2.
  - Run R2D2 from a bash shell in your locally cloned Big Bang (the product) repo by entering the command r2d2.
  - Select the Check last release SHAs option with your up and down arrow keys, press the spacebar, then press Enter to run R2D2 with this option.
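If you prefer the command line to the GitLab UI for the manual check, the comparison can be sketched with plain git. This is not how R2D2 does it; `same_commit` is a hypothetical helper, and the ref names in the usage comment are examples only.

```shell
#!/bin/sh
# same_commit REF_A REF_B
# Exit 0 when both refs resolve to the same commit, 1 when they differ,
# and 2 when either ref cannot be resolved.
# "^{commit}" peels an annotated tag down to the commit it points at.
same_commit() {
  a=$(git rev-parse --verify --quiet "$1^{commit}") || return 2
  b=$(git rev-parse --verify --quiet "$2^{commit}") || return 2
  [ "$a" = "$b" ]
}

# Example (run inside your Big Bang clone after `git fetch --tags`):
#   same_commit origin/release-2.7.x 2.7.0 && echo "match" || echo "MISMATCH"
```

Comparing full commit hashes avoids any ambiguity from the abbreviated hashes shown in the UI.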
Create release branch in Big Bang (product). You will need to request permissions from the maintainers if you run into permissions issues. ¹
- R2-D2: run w/ the Create release branch option selected
  
  or
- In either the Gitlab UI or the Git CLI, create a new branch from master with the name release-2.<minor>.x.
  
  Important: The release branch name must end with x. For example, if the release version is 2.7.0, the release branch will be called release-2.7.x.
```
# CLI example, replace `<release-branch>` with the name of the release branch as specified above
$ cd bigbang
$ git checkout master
$ git pull
$ git checkout -b <release-branch>
$ git branch --show-current
$ git push --set-upstream origin <release-branch>
```
Check to see if any packages have been added or graduated from BETA. There is no single resource that lists this information, but you should generally be aware of packages that have been added or moved out of BETA from discussions with other team members during your workdays. You can also ask the Big Bang anchors if unsure. If any packages have been added or moved out of BETA, adjust your local R2D2 config as needed. Also open an MR for R2D2 with the change to the default config.

Build the release notes:

R2-D2: run with the Build release notes option selected

# clone the dogfood cluster repo
$ git clone https://repo1.dso.mil/big-bang/team/deployments/bigbang/ dogfood-bigbang
# cd into the dogfood-bigbang repo's release notes dir
$ cd dogfood-bigbang/docs/release
# install R2-D2
$ python3 -m pip install git+https://repo1.dso.mil/big-bang/team/tools/R2-D2.git
# Select the `Build release notes` option
$ r2d2
# commit the release notes
$ cd ../../
$ git add .
$ git commit -m "add initial release notes"

Upgrade Big Bang (product) version references

Tip: Make the following changes in a single commit so that they can be cherry-picked into master later.
- In base/gitrepository.yaml, update ref.tag to your current release version.
- Update the chart release version in chart/Chart.yaml.
- Update docs/packages.md: Add any new Packages. Review if any columns need updates (mTLS STRICT is a common change, follow the other examples where STRICT is noted). Also make sure to update and remove the BETA badge from any packages that have moved out of BETA.
- Add a new changelog entry for the release, ex:
```
## [2.<minor>.0]

- [!2.<minor>.0](https://repo1.dso.mil/big-bang/bigbang/-/merge_requests?scope=all&utf8=%E2%9C%93&state=merged&milestone_title=2.<minor>.0); List of merge requests in this release.


```
- Update /docs/understanding-bigbang/configuration/base-config.md using helm-docs.
```
# example release 2.<minor>.x
$ cd bigbang
$ git checkout release-2.<minor>.x
$ docker run -v "$(pwd):/helm-docs" -u $(id -u) jnorwood/helm-docs:v1.5.0 -s file -t .gitlab/base_config.md.gotmpl --dry-run > ./docs/understanding-bigbang/configuration/base-config.md
```
- Commit changes (git commit -am 'version updates for release <release-branch-name>')
- Push changes (git push)
- Reach out to Anchors @ryan.j.garcia @chris.oconnell to review your commits to the release branch prior to moving on to the next step

2. Upgrade and Debug Cluster

WARNING: Upgrade only, do not delete and redeploy.

Connect to the dogfood cluster

NOTE: If you have issues with the AWS CLI commands, adding via the AWS web console is another option. Reach out to core maintainers for assistance. ¹

Review Elasticsearch health and trial license status:

The url to do this is https://kibana.dogfood.bigbang.dev/login?next=%2F. Log on with the information for Logging (Kibana) here.

Run kubectl get pods -A and confirm that all the pods are less than 30 days old. Also, log in to Kibana via SSO. SSO is paywalled, so the login will fail if the license is expired. If the license is expired, follow the steps below to renew it.
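Scanning the AGE column by eye is error-prone on a large cluster, so the 30-day check can be sketched as a filter. This assumes the default `kubectl get pods -A --no-headers` column layout (AGE last) and kubectl's human-readable durations such as `45d`, `3h15m`, or `1y20d`. `older_than_days` is a hypothetical helper name.

```shell
#!/bin/sh
# older_than_days LIMIT
# Read `kubectl get pods -A --no-headers` lines on stdin and print
# "namespace/name age" for every pod whose AGE is at least LIMIT days.
# Only the year and day parts of the age are counted; hours and minutes
# are ignored, and a year is treated as 365 days.
older_than_days() {
  awk -v limit="$1" '{
    age = $NF; days = 0
    if (match(age, /[0-9]+y/)) days += 365 * substr(age, RSTART, RLENGTH - 1)
    if (match(age, /[0-9]+d/)) days += substr(age, RSTART, RLENGTH - 1)
    if (days >= limit) print $1 "/" $2, age
  }'
}

# Example:
#   kubectl get pods -A --no-headers | older_than_days 30
```

No output means every pod is younger than the limit.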

NOTE: Only run this if the trial is expired. After running it, you will need to reconfigure role mapping.

Renew ECK Trial

kubectl delete hr ek eck-operator -n bigbang
kubectl delete ns eck-operator logging
flux reconcile kustomization environment -n bigbang
flux suspend hr bigbang -n bigbang
flux resume hr bigbang -n bigbang
flux suspend hr loki -n bigbang
flux resume hr loki -n bigbang
flux suspend hr promtail -n bigbang
flux resume hr promtail -n bigbang
flux suspend hr fluentbit -n bigbang
flux resume hr fluentbit -n bigbang
flux suspend hr mattermost -n bigbang
flux resume hr mattermost -n bigbang
flux suspend hr jaeger -n bigbang
flux resume hr jaeger -n bigbang
kubectl delete pods -n mattermost --all
kubectl delete po -n jaeger -l app.kubernetes.io/component=all-in-one

NOTE: Suspend/Resume and pod cycling for Jaeger and Mattermost cycles the mounted Elastic certificates.
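The repeated suspend/resume pairs in the Renew ECK Trial commands can be expressed as a loop. The sketch below is an illustration, not part of the documented process: `cycle_helmreleases` and the `FLUX` override variable are hypothetical names, and setting `FLUX=echo` turns the loop into a dry run that prints the flux arguments instead of executing them.

```shell
#!/bin/sh
# FLUX names the flux binary; override it (e.g. FLUX=echo) for a dry run.
FLUX=${FLUX:-flux}

# cycle_helmreleases NAME...
# Suspend, then resume, each named HelmRelease in the bigbang namespace,
# in the order given.
cycle_helmreleases() {
  for hr in "$@"; do
    "$FLUX" suspend hr "$hr" -n bigbang
    "$FLUX" resume hr "$hr" -n bigbang
  done
}

# Dry run of the same sequence as the commands above: prints the arguments
# that would be passed to flux, one command per line.
FLUX=echo cycle_helmreleases bigbang loki promtail fluentbit mattermost jaeger
```

The kubectl delete steps before and after the loop are deliberately left out and still need to be run as listed.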

Review Mattermost Enterprise trial license status:
- Login to https://chat.dogfood.bigbang.dev as the "robot admin" (find credentials in encrypted values or with sops -d bigbang/prod/environment-bb-secret.enc.yaml | grep "Robot admin"), navigate to the System Console -> Edition and License tab. If the license is expired, follow the steps below to renew it.
  
  NOTE: only run if trial is expired.
  Renew Mattermost Enterprise Trial
  
  To "renew" Mattermost Enterprise trial license, connect to RDS postgres DB using psql. Follow the guide (guide will need to be sops decrypted) to connect to the DB. Contact the core maintainers if you need additional assistance. ¹
  
  Then run the commands below from within the psql connection, which will cycle the license:
```
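-- inside the psql session: inspect the stored license, then delete it
\c mattermost
select * from "public"."licenses";
delete from "public"."licenses";
\q
# after exiting psql, run the following from your shell
kubectl delete mattermost mattermost -n mattermost
flux suspend hr -n bigbang mattermost
flux resume hr -n bigbang mattermost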
```
  Validate that the new Mattermost pod rolls out successfully. If it hasn't reconciled, you may need to suspend/resume bigbang again. Log in as a system admin, navigate to the System Console -> Edition and License tab, and click the "Start trial" button.

If Flux has been updated in the latest release:

Run the Flux install script on your locally cloned Big Bang repo as in the example below

$ cd bigbang
$ git checkout release-2.<minor>.x
$ git pull
$ ./scripts/install_flux.sh -s
# the `-s` option will reuse the existing secret so you don't have to provide credentials
$ cd ../dogfood-bigbang
# go back to dogfood after flux upgrade

Before upgrading the cluster do a sanity check on health of existing cluster resources. Check pods, helmreleases, etc. You want to be aware of and possibly fix any issues before you upgrade the Big Bang deployment.
Upgrade the release branch on dogfood cluster master by completing the following.
- WIP: R2-D2: run w/ the Upgrade dogfood cluster option selected
  
  or
- Upgrade base kustomization ref=release-2.<minor>.x in bigbang/base/kustomization.yaml to the release branch.
- Upgrade prod kustomization branch: "release-2.<minor>.x" in bigbang/prod/kustomization.yaml to the release branch.
- Verify the changes above are correct, then:
```
$ git add bigbang/base bigbang/prod
$ git commit -m "upgrade kustomizations to release-2.<minor>.x"
$ git push
```
Verify cluster has updated to the new release
- Packages have fetched new git repository revisions and match the new release
  
  kubectl get gitrepositories -A
- Packages have reconciled
  - Watch the Release HRs, Gitrepos, and Kustomizations to check when all HRs have properly reconciled
```
# check release
watch kubectl get gitrepositories,kustomizations,hr,po -A
```
  - If flux has not updated after ten minutes:
```
flux reconcile hr -n bigbang bigbang --with-source
```
  - If flux is still not updating, delete the flux source controller:
```
kubectl get all -n flux-system
kubectl delete pod/source-controller-xxxxxxxx-xxxxx -n flux-system
```
  - If the helm release shows max retries exhausted, check a describe of the HR. If it shows "another operation (install/upgrade/rollback) is in progress", this is an issue caused by too many Flux reconciles happening at once and typically the Helm controller crashing. You will need to delete helm release secrets and reconcile in flux as follows. Note that ${HR_NAME} is the same HR you described which is in a bad state (typically a specific package and NOT bigbang itself).
```
# example w/ kiali
$ HR_NAME=kiali
$ kubectl get secrets -n bigbang | grep ${HR_NAME}
```
```
# example output:
# some hr names are duplicated w/ a dash in the middle, some are not
sh.helm.release.v1.kiali-kiali.v1                                       helm.sh/release.v1                    1      18h
sh.helm.release.v1.kiali-kiali.v2                                       helm.sh/release.v1                    1      17h
sh.helm.release.v1.kiali-kiali.v3                                       helm.sh/release.v1                    1      17m
```
```
# Delete the latest one:
$ kubectl delete secret -n bigbang sh.helm.release.v1.${HR_NAME}-${HR_NAME}.v3

# suspend/resume the hr
$ flux suspend hr -n bigbang ${HR_NAME}
$ flux resume hr -n bigbang ${HR_NAME}
```
  - If you see errors about no space left when you kubectl describe a failed deployment, the logs have probably filled the filesystem of the node. Determine which node the deployment was scheduled on, ssh to it, and delete the logs. You need to have sshuttle running in order to reach the IP of the nodes.
```
ssh -i ~/.ssh/dogfood.pem ec2-user@xx.x.x.xx
sudo -i
rm -rf /var/log/containers/*
rm -rf /var/log/pods/*
```
    Then the deployment should recover on its own
- Run kubectl get pods -A and verify that all Pods are in "Running" or "Completed" status.
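Two of the checks in this section lend themselves to small filters: finding pods that are not in "Running" or "Completed" status, and picking the latest Helm release secret for a stuck HR (the example above hardcodes `.v3`). The sketch below assumes the default `kubectl get pods -A --no-headers` column layout and secret names that end in `.v<revision>`, as in the example output. Both function names are hypothetical.

```shell
#!/bin/sh
# unhealthy_pods
# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/name status" for pods not in Running or Completed status.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# latest_release_secret
# Read `kubectl get secrets` lines on stdin (secret name in the first column)
# and print the name with the highest ".v<revision>" suffix. Revisions are
# compared numerically, so v10 ranks above v9.
latest_release_secret() {
  awk '{ n = split($1, part, /[.]v/); rev = part[n] + 0 }
       rev > max { max = rev; name = $1 }
       END { if (name != "") print name }'
}

# Examples:
#   kubectl get pods -A --no-headers | unhealthy_pods
#   kubectl get secrets -n bigbang | grep "${HR_NAME}" | latest_release_secret
```

Review the secret name that `latest_release_secret` prints before deleting anything. It only automates the "which one is latest" lookup.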

3. UI Tests

Important: When verifying each application UI is loading, also verify the website certificates are valid.

Logging

Login to kibana with SSO.
- Note: If you get "You do not have permission to access the requested page," follow the instructions under "Renew ECK Trial" in the 2. Upgrade and Debug Cluster section above. Don't forget to log in as admin and reconfigure role mapping after you run the kubectl and flux commands.
Verify that Kibana is actively indexing/logging.
- To do this, click on the drop-down menu in the upper-left corner, then under "Analytics" click Discover. Click "Create data view." In the "Index pattern" field, enter jaeger*. Set the "Timestamp field" to I don't want to use the time filter, then click "Use without saving". Log data should populate.

Monitoring

Log in to grafana with SSO.
Click on the drop-down menu in the upper-left and choose Dashboards. Verify that several Kubernetes and Istio dashboards are listed. Click on a few of these and verify that metrics are populating in them.
Login to prometheus

Go to Status > Targets. Validate that no unexpected services are marked as Unhealthy

KNOWN UNHEALTHY INCLUDE:
serviceMonitor/monitoring/monitoring-monitoring-kube-kube-controller-manager/0
serviceMonitor/monitoring/monitoring-monitoring-kube-kube-etcd/0
serviceMonitor/monitoring/monitoring-monitoring-kube-kube-scheduler/0

Loki

Login to grafana as admin. User name "admin". Retrieve password with sops -d bigbang/prod/environment-bb-secret.enc.yaml | grep -A 2 grafana
Click on the drop-down on upper-left, choose Dashboards, then Loki Dashboard quick search. Verify that this dashboard shows data/logs from the past few minutes and hours.
Click on the drop-down on upper-left, choose Dashboards, then Loki / Operational. Verify that both the Distributor Success Rate and Ingester Success rate are 100%, and that when you expand the Chunks item (toward the bottom of the page), data exists.
Click on the drop-down in the upper-left corner again and choose Data Sources, then choose Loki. Scroll down and click Save & Test. A message should appear that reads "Data source connected and labels found."

Tempo

Stay logged in to grafana as admin
Click on the drop-down menu on the upper-left corner, then choose Data Sources, then Tempo. Scroll down and click Save & Test. A message should appear that reads "Data source is working."
Visit tempo tracing & ensure Services are populating under Service drop down. For example, you might see jaeger-query and tempo-grpc-plugin listed as options.

Cluster Auditor

Login to grafana with SSO
OPA Violations dashboard is present and shows violations in namespaces

Kiali

Login to kiali with SSO
Validate graphs and traces are visible under applications/workloads
- Graphs at the overview
- Graphs for inbound metrics
- Graphs for outbound metrics
- Graphs for Traces
Validate no errors appear

Note: A red notification bell would be visible if there are errors. Errors on individual application listings for labels, etc. are expected and OK.

GitLab

Login to gitlab with SSO
Edit profile and change user avatar. If your avatar doesn't change right away, check again later. It can take several minutes or more.
Create new public group with release name, i.e. release-2-<minor>-x
Create new public project (under the group you just made), also with release name (e.g. release-2.7.x if you're working on release 2.7.0).
git clone project
Pick one of the project folders from Sonarqube Samples and copy all the files into your clone from dogfood
- Go to Security > Access Token and create a personal access token. Grant yourself the Developer role and select all scopes. Record the token. You'll need it for the next two steps.
- Git commit and push your changes to the repo. When prompted enter your username and the access token as your password.

Test a docker image push/pull to/from registry

docker pull alpine
docker tag alpine registry.dogfood.bigbang.dev/<GROUPNAMEHERE>/<PROJECTNAMEHERE>/alpine:latest
docker login registry.dogfood.bigbang.dev # Enter your gitlab username and personal access token
docker push registry.dogfood.bigbang.dev/<GROUPNAMEHERE>/<PROJECTNAMEHERE>/alpine:latest
docker image rm registry.dogfood.bigbang.dev/<GROUPNAMEHERE>/<PROJECTNAMEHERE>/alpine:latest
docker pull registry.dogfood.bigbang.dev/<GROUPNAMEHERE>/<PROJECTNAMEHERE>/alpine:latest

Sonarqube

Login to sonarqube with SSO
Add a project for your release. When prompted for how you want to analyze your repository, choose "Locally."
Generate a token for the project and copy the token somewhere safe for use later
When prompted to "Run analysis on your project" choose "Other (for JS, TS, Go, Python, PHP...)". For "What is your OS?" choose Linux. Copy everything that appears under "Running a SonarQube analysis is straightforward. You just need to execute the following commands in your project's folder" and save it somewhere secure for use later.
After completing the gitlab runner test return to sonar and check that your project now has analysis

Note: The project token and project key are different values.

Gitlab Runner

Log back into gitlab and navigate to your project
Under settings, CI/CD, variables add two vars:
- SONAR_HOST_URL set equal to https://sonarqube.dogfood.bigbang.dev/
- SONAR_TOKEN set equal to the token you copied from Sonarqube earlier (make this masked)
Under settings, CI/CD, deselect "Default to Auto DevOps pipeline" and click Save changes.
Add a .gitlab-ci.yml file to the root of the project, paste in the contents of sample_ci.yaml, replacing -Dsonar.projectKey=XXXXXXX with what you copied earlier
Commit, validate the pipeline runs and succeeds (may need to retry if there is a connection error), then return to the last step of the sonar test

Nexus

R2-D2: run w/ the Nexus Test option selected

or

Manually

Login to Nexus as admin, password is in the nexus-repository-manager-secret secret:

# username is admin, password is the output of this command
kubectl get secret -n nexus-repository-manager nexus-repository-manager-secret -o go-template='{{index .data "admin.password" | base64decode}}' ; echo

Validate there are no errors displaying in the UI (an "Available CPUs" error about the host system "allocating a maximum of 1 cores to the application" is acceptable).

Push/pull an image to/from the nexus registry

With the credentials from the encrypted values (or the admin user credentials) login to the nexus registry
```
$ docker login containers.dogfood.bigbang.dev
```

Tag and push an image to the registry:

# ex: <release> = `1-32-0`
$ docker tag alpine:latest containers.dogfood.bigbang.dev/alpine:<release>
$ docker push containers.dogfood.bigbang.dev/alpine:<release>

Pull down the image for the previous release

# ex: <last-release> = `1-31-0`
$ docker pull containers.dogfood.bigbang.dev/alpine:<last-release>
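The `<release>` and `<last-release>` placeholders use dashes in place of dots. A minimal sketch of deriving both from a version string follows. The helper names are hypothetical, and `previous_release` assumes the last release was the preceding minor version of the same major, which you should confirm against the actual tags.

```shell
#!/bin/sh
# dash_version X.Y.Z -> X-Y-Z (the tag style used in the examples above)
dash_version() {
  printf '%s\n' "$1" | tr '.' '-'
}

# previous_release X.Y.0 -> X.(Y-1).0
# Fails when Y is 0, because the previous release then belongs to an earlier
# major version and cannot be derived from the version string alone.
previous_release() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$minor" -gt 0 ] || return 1
  printf '%s.%s.0\n' "$major" "$((minor - 1))"
}

dash_version 2.7.0                          # prints: 2-7-0
dash_version "$(previous_release 2.7.0)"    # prints: 2-6-0
```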

Anchore

Login to Anchore with SSO
Log out and log back in as the admin user - password is in anchore-anchore-engine-admin-pass secret (admin will have pull credentials set up for the registries):
```
kubectl get secret anchore-anchore-engine-admin-pass -n anchore -o json | jq -r '.data.ANCHORE_ADMIN_PASSWORD' | base64 -d ; echo ' <- password'
```
Scan image in dogfood registry, registry.dogfood.bigbang.dev/GROUPNAMEHERE/PROJECTNAMEHERE/alpine:latest
- Go to Images Page, Analyze Tag
  - Registry: registry.dogfood.bigbang.dev
  - Repository: GROUPNAME/PROJECTNAME/alpine
  - Tag: latest
Scan image in nexus registry, containers.dogfood.bigbang.dev/alpine:<release-number> (use your release number, ex: 1-XX-0). If authentication fails, the Nexus credentials have changed. Retrieve the Nexus credentials using the instructions from Nexus above. Update creds in Anchore by clicking on menu Images > Analyze Repository > link at bottom "clicking here" > click on registry name > then edit the credentials.
Validate scans complete and Anchore displays data (click the SHA value for each image)

Argocd

Login to argocd with SSO

Log out and log in with username admin. The password is in the argocd-initial-admin-secret secret, which the command below decodes. If that doesn't work, attempt a password reset.

kubectl -n argocd get secret argocd-initial-admin-secret -o json |  jq '.data|to_entries|map({key, value:.value|@base64d})|from_entries'

Create application

Click [Create Application], fill in the below

| Setting | Value |
| --- | --- |
| Application Name | podinfo |
| Project | default |
| Sync Policy | Automatic |
| Sync Policy | check both boxes |
| Sync Options | check "auto-create namespace" |
| Repository URL | https://repo1.dso.mil/big-bang/apps/sandbox/podinfo.git |
| Revision | HEAD |
| Path | chart |
| Cluster URL | https://kubernetes.default.svc |
| Namespace | podinfo |

Click [Create] (top of page)

Validate app syncs/becomes healthy

WIP: Creating application with YAML template

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: podinfo
spec:
  destination:
    name: ''
    namespace: podinfo
    server: 'https://kubernetes.default.svc'
  source:
    path: chart
    repoURL: 'https://repo1.dso.mil/big-bang/apps/sandbox/podinfo.git'
    targetRevision: HEAD
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Delete app

Minio

R2-D2: run w/ the Minio Test option selected

or

Manually

Log into the Minio UI - access and secret key are in the minio-root-creds-secret secret

  kubectl -n minio get secret minio-creds-secret -o json | jq -r '.data.accesskey' | base64 -d ; echo ' <- access key'
  kubectl -n minio get secret minio-creds-secret -o json | jq -r '.data.secretkey' | base64 -d ; echo ' <- secret key'

Create bucket
Store file to bucket. To do this after creating the bucket, click on Object Browser, then click the bucket, then Upload.
Download file from bucket
Delete bucket and files

Mattermost

Login to mattermost with SSO.
Update/modify profile picture
Log out and log back in as admin. You can login as the robot admin if you do not have this access (find credentials in encrypted values or with sops -d bigbang/prod/environment-bb-secret.enc.yaml | grep "Robot admin")
Send chats and validate that chats from previous releases are visible.
Under System Console -> Environment, click Elasticsearch and then click Test Connection and Index Now. Validate that both are successful. If Elasticsearch does not appear as an option, go to About > Edition and License on the System Console menu and click the Start Trial button.

Twistlock

Validate that the Twistlock init job pod ran and completed. This should do all setup (license/user) and the required defender updates automatically (Pod is automatically removed after 30 minutes). If the pod is gone already, check the Prometheus target for Twistlock. Log in here and in the Targets drop down, choose serviceMonitor/twistlock/twistlock-console/0, then click the small "show more" button that appears. You should see a State of UP and no errors.

Login to twistlock/prisma cloud with the credentials in the secret:

kubectl get secret -n twistlock twistlock-console -o go-template='{{.data.TWISTLOCK_USERNAME | base64decode}}' ; echo ' <- username'
kubectl get secret -n twistlock twistlock-console -o go-template='{{.data.TWISTLOCK_PASSWORD | base64decode}}' ; echo ' <- password'

Under Manage -> Defenders, make sure the number of defenders online is equal to the number of nodes on the cluster. You can list cluster nodes with kubectl get nodes -A

Defenders will scale with the number of nodes in the cluster. If there is a defender that is offline, check whether the node exists in cluster anymore. Cluster autoscaler will often scale up/down nodes which can result in defenders spinning up and getting torn down. As long as the number of defenders online is equal to the number of nodes everything is working as expected.

Neuvector

Login to Neuvector with the default login (currently admin:admin, update this to something secure)
- If admin password is unknown reset it to default
Navigate to Assets -> System Components and validate that all components are showing as healthy
Under Assets -> Containers, click on any image and run a scan. When the scan finishes, click on the container. You'll see the results in the Compliance and Vulnerabilities tabs below.
Under Network Activity, validate that the graph loads and shows pods and traffic. This graph can take several minutes or more to load. You may want to leave the tab up and move on to the next UI test while it populates.

Kyverno

NOTE: If using macOS, make sure that you have GNU sed installed and added to your PATH variable (see the GNU sed instructions).

Test secret sync in new namespace

# create secret in kyverno NS
kubectl create secret generic \
  -n kyverno kyverno-bbtest-secret \
  --from-literal=username='username' \
  --from-literal=password='password'

# Create Kyverno Policy
kubectl apply -f https://repo1.dso.mil/big-bang/product/packages/kyverno/-/raw/main/chart/tests/manifests/sync-secrets.yaml

# Wait until the policy shows as ready before proceeding
kubectl get clusterpolicy sync-secrets

# Create a namespace with the correct label (essentially we are dry-running a namespace creation to get the yaml, adding the label, then applying)
kubectl create namespace kyverno-bbtest --dry-run=client -o yaml | sed '/^metadata:/a\ \ labels: {"kubernetes.io/metadata.name": "kyverno-bbtest"}' | kubectl apply -f -

# Check for the secret that should be synced - if it exists this test is successful
kubectl get secrets kyverno-bbtest-secret -n kyverno-bbtest

Delete the test resources

# If above is successful, delete test resources
kubectl delete -f https://repo1.dso.mil/big-bang/product/packages/kyverno/-/raw/main/chart/tests/manifests/sync-secrets.yaml
kubectl delete secret kyverno-bbtest-secret -n kyverno
kubectl delete ns kyverno-bbtest

Velero

Login to https://nexus.dogfood.bigbang.dev/#browse/welcome as admin. Confirm the following to allow the test deployment to pull the alpine image from Nexus without credentials
Verify that the velero-tests.dogfood.bigbang.dev/alpine:test repository exists on Nexus
Verify anonymous docker pull is enabled for velero-tests repository: Settings -> Repositories -> velero-tests -> Check Allow anonymous docker pull (Docker Bearer Token Realm required) -> Save
Go to Settings -> Security -> Realms and verify that Docker Bearer Token is in the Active column. If not, click on it to move it there, then click Save
Verify Anonymous Access is enabled. Settings -> Security -> Anonymous Access -> Check Allow anonymous users to access the server -> Save
Install the velero CLI on your workstation if you don't already have it (for MacOS, run brew install velero).
Then set VERSION to the release you are testing:
```
$ VERSION=2-<minor>-0
```
The following steps can be done via R2-D2: run w/ the Velero Test option selected. However, you will still need to verify that both old and new entries appear in logs, to confirm the backups were done correctly.

or

Manually
Backup PVCs using velero_test.yaml.

    $ kubectl apply -f ./docs/release/velero_test.yaml
    # wait 30s for velero to be ready then:
    # exec into velero_test container, check log
    $ veleropod=$(kubectl get pod -n velero-test -o json | jq -r '.items[].metadata.name')
    $ kubectl exec $veleropod -n velero-test -- tail /mnt/velero-test/test.log

Then set VERSION to the release you are testing and use the CLI to create a test backup:

$ VERSION=2-<minor>-0
$ velero backup create velero-test-backup-${VERSION} -l app=velero-test
$ velero backup get

Wait a bit, then re-run velero backup get. When it shows "Completed", delete the app.

    $ kubectl delete -f ./docs/release/velero_test.yaml
    # namespace "velero-test" deleted
    # persistentvolumeclaim "velero-test" deleted
    # deployment.apps "velero-test" deleted

Restore the test resources from the backup

    $ velero restore create velero-test-restore-${VERSION} --from-backup velero-test-backup-${VERSION}
    # exec into velero_test container
    $ kubectl exec $veleropod -n velero-test -- cat /mnt/velero-test/test.log

Confirm that both old and new log entries appear in the logs; this confirms the backup was done correctly.

Example output of container logs:

Running command: kubectl exec velero-test-6549b5768d-872jc -n velero-test -- tail /mnt/velero-test/test.log
Command output:
Fri Jul 21 13:47:00 UTC 2023
Fri Jul 21 13:47:10 UTC 2023

Running command: kubectl exec velero-test-6549b5768d-872jc -n velero-test -- cat /mnt/velero-test/test.log
Command output:
Fri Jul 21 13:47:00 UTC 2023
Fri Jul 21 13:47:10 UTC 2023
Fri Jul 21 13:47:20 UTC 2023
Fri Jul 21 13:47:30 UTC 2023
Fri Jul 21 13:50:45 UTC 2023
Fri Jul 21 13:50:55 UTC 2023
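The "old and new entries" comparison can be sketched as a check on two saved copies of the log, one captured before the backup and one after the restore. `restore_preserved_log` and the file names in the usage comment are hypothetical; R2-D2 may perform this check differently.

```shell
#!/bin/sh
# restore_preserved_log BEFORE AFTER
# BEFORE: file holding test.log output captured before the backup
# AFTER:  file holding test.log output captured after the restore
# Succeeds when every non-empty line of BEFORE is also present in AFTER
# (old entries survived) and AFTER has more lines than BEFORE (new entries
# were written after the restore).
restore_preserved_log() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    grep -Fxq -- "$line" "$2" || return 1
  done < "$1"
  [ "$(grep -c '' "$2")" -gt "$(grep -c '' "$1")" ]
}

# Example:
#   kubectl exec $veleropod -n velero-test -- tail /mnt/velero-test/test.log > before.log
#   ... create the backup, delete the app, restore ...
#   kubectl exec $veleropod -n velero-test -- cat /mnt/velero-test/test.log > after.log
#   restore_preserved_log before.log after.log && echo "backup verified"
```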

Cleanup test and delete resources

    $ kubectl delete -f ./docs/release/velero_test.yaml
    # namespace "velero-test" deleted
    # persistentvolumeclaim "velero-test" deleted
    # deployment.apps "velero-test" deleted

Keycloak

Login to Keycloak admin console. The credentials are in the keycloak-env secret:

kubectl get secret keycloak-env -n keycloak -o jsonpath="{.data.KEYCLOAK_USER}" | base64 -d ; echo " <- admin user"
kubectl get secret keycloak-env -n keycloak -o jsonpath="{.data.KEYCLOAK_ADMIN_PASSWORD}" | base64 -d ; echo " <- password"

Tracing (Jaeger)

Load tracing, login with SSO, and ensure there are no errors on main page and that traces can be found for apps

Alertmanager

Load alertmanager, login with SSO, and validate that the watchdog alert at minimum is firing

4. Create Release

Finalize the tag in chart/Chart.yaml (remove -rc.x if present), commit and push this change

Re-run helm docs to update to the latest version + latest package versions.

$ cd bigbang
$ git pull
# pull any last minute cherry picks, verify nothing has greatly changed
$ git checkout release-2.<minor>.x
$ docker run -v "$(pwd):/helm-docs" -u $(id -u) jnorwood/helm-docs:v1.5.0 -s file -t .gitlab/base_config.md.gotmpl --dry-run > ./docs/understanding-bigbang/configuration/base-config.md
# commit and push the changes (if any)

Create release candidate tag based on release branch, i.e. 2.<minor>.0-rc.0. Tagging will additionally create release artifacts, and the pipeline runs for more than an hour. You will need to request tag permissions from the maintainers ¹.
- To do this via the UI (generally preferred): tags page -> new tag, name: 2.<minor>.0-rc.0, create from: release-2.<minor>.x, message: "release candidate", release notes: leave empty
- To do this via git CLI:
```
$ git tag -a 2.<minor>.0-rc.0 -m "release candidate"
# list the tags to make sure you made the correct one
$ git tag -n
# push
$ git push --tags
```
Passed pipeline for Release Candidate tag.
Passed pipeline in bb-docs-compiler for latest RC tag (Gets created/scheduled towards the end of the bigbang release pipeline).

Review all pipeline output looking for failures or warnings. Reach out to the maintainers for a quick review before proceeding. ¹
Create release tag based on release branch. ie. 2.<minor>.0.
- To do this via the UI (generally preferred): tags page -> new tag, name: 2.<minor>.0, create from: release-2.<minor>.x, message: release 2.<minor>.0, release notes: leave empty
- To do this via git CLI:
```
$ git tag -a 2.<minor>.0 -m "release 2.<minor>.0"
# list the tags to make sure you made the correct one
$ git tag -n
# push
$ git push --tags
```
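Because a mistyped tag triggers a long pipeline, it can help to check the tag name before pushing. The sketch below validates the two shapes used in this section, a final tag `<major>.<minor>.0` and a release candidate `<major>.<minor>.0-rc.<n>`. `valid_release_tag` is a hypothetical helper and the pattern is inferred from the formats named in this document, not taken from any CI rule.

```shell
#!/bin/sh
# valid_release_tag TAG
# Succeed when TAG looks like <major>.<minor>.0 or <major>.<minor>.0-rc.<n>.
valid_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.0(-rc\.[0-9]+)?$'
}

for tag in 2.7.0 2.7.0-rc.0 2.7.x v2.7.0; do
  if valid_release_tag "$tag"; then
    echo "$tag: ok"
  else
    echo "$tag: unexpected format"
  fi
done
```

The check only covers the name. It does not confirm that the tag points at the release branch.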
Passed release pipeline.

Review all pipeline output looking for failures or warnings. Reach out to the maintainers for a quick review before proceeding. ¹
Add release notes to release. To do this:
- In the Dogfood release folder, click on the release notes you created above (e.g. release-notes-2-7-0.md if you're working on release 2.7.0). Click the </> button to display source, then copy the file contents to your clipboard.
- Go to https://repo1.dso.mil/big-bang/bigbang/-/releases
- Click New Release
- Tag name: Set this to the tag you created above
- Release title: Big Bang 2.7.0 (for example, if you're working on release 2.7.0)
- Milestones: No milestone
- Release date: today
- Release notes: Paste the release notes from Dogfood here.
- Click Create Release
Modify release notes:
- Create Upgrade Notices, based off of the listed notices in the release issue. Also review with maintainers to see if any other major changes should be noted.
- Move MRs from 'Big Bang MRs' into their specific sections below. If MR is for Big Bang (docs, CI, etc) and not a specific package, they can remain in place under 'Big Bang MRs'. General rule of thumb: if the git tag changes for a package - put the MR under that package; and if there is no git tag change for a package - put it under BB.
- Adjust known issues as needed: If an issue has been resolved it can be removed and if any new issues were discovered they should be added.
- Verify table contains all chart upgrades mentioned in the MR list below.
- Verify table contains all application versions. Compare with previous release notes. If a package was upgraded there will usually be a bump in application version and chart version in the table.
- Verify all internal comments are removed from Release Notes, such as comments directing the release engineer to copy/move things around

Cherry-pick release commit(s) as needed with merge request back to master branch. We do not ever merge release branch back to master. Instead, make a separate branch from master and cherry-pick the release commit(s) into it. Use the resulting MR to close the release issue.

# Navigate to your local clone of the BB repo
cd path/to/my/clone/of/bigbang
# Get the most up to date master
git checkout master
git pull
# Check out new branch for cherrypicking
git checkout -b 2.<minor>-cherrypick

# Repeat the following for whichever commits are required from release
# Typically this is the initial release commit (bumped GitRepo, Chart.yaml, CHANGELOG, README) and a final commit that re-ran helm-docs (if needed)
git cherry-pick <commit sha for Nth commit>

git push --set-upstream origin 2.<minor>-cherrypick
# Create an MR using Gitlab, merging this branch back to master, reach out to maintainers to review

Close Big Bang Milestone in GitLab.
Hand off the release to the maintainers ¹; they will review, then celebrate and announce the release in the public MM channel.

¹ @ryan.j.garcia @chris.oconnell

