Customer Feedback on Upgrades
Transparency & Clarity needed on what is being upgraded
- Release notes are sometimes unclear about what is being upgraded, or changes are missed.
- Unclear why some packages & images are being upgraded.
- Is it because of a security update?
- New features? If so, what are the features?
Potential Solutions
- Identify items that are being updated and document them in further detail in release notes.
- In cases where an update is applied by Iron Bank or the upstream vendor, highlight the appropriate links for the customer so they are aware of what was updated.
Manual steps needed to perform an upgrade add more time to the upgrade process.
- Updating kustomize overlays
- Running flux suspend when updating the package repositories. Flux has tended to delete local directories and cluster resources when new releases are pulled into air-gapped/high-side environment repositories. A flux suspend -> repository update -> flux resume workflow alleviates this but takes time and manual effort.
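The suspend -> update -> resume workflow described above could be scripted to cut down on the manual effort. The following is a minimal sketch, not a tested procedure: it assumes a single Flux Kustomization is the resource to suspend, and the names `environment` and `bigbang`, the mirror path, and the `git pull` update step are all hypothetical placeholders. Which resources actually need suspending in a given environment would have to be confirmed with the engineers.

```python
import subprocess
from typing import Callable, List

Command = List[str]


def flux_update_plan(kustomization: str, namespace: str,
                     update_cmd: Command) -> List[Command]:
    """Build the suspend -> repository update -> resume command sequence."""
    return [
        ["flux", "suspend", "kustomization", kustomization, "-n", namespace],
        list(update_cmd),  # site-specific step that refreshes the repository
        ["flux", "resume", "kustomization", kustomization, "-n", namespace],
    ]


def run_plan(plan: List[Command],
             runner: Callable[..., object] = subprocess.run) -> None:
    """Run each command in order and stop at the first failure.

    If the update step fails, Flux is deliberately left suspended so that
    it does not reconcile against a half-updated repository.
    """
    for cmd in plan:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical names and path; printed only, nothing is executed here.
    plan = flux_update_plan(
        "environment", "bigbang",
        ["git", "-C", "/path/to/mirror", "pull", "--ff-only"],
    )
    for cmd in plan:
        print(" ".join(cmd))
```

Calling `run_plan(plan)` would execute the three steps against the current cluster context. Stopping on failure rather than always resuming is a design choice in this sketch: it trades a longer suspension for the chance to inspect the repository before Flux acts on it.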
Keeping up with Upgrades
- It is easy to fall behind the release cadence
- Some users work in environments (air-gapped/high-side) where they are time-bound when performing installs and upgrades.
- Some users skip releases because of the issues they introduce, which compounds the problem of falling behind.
- The current (n-2) upgrade path is challenging to keep up with for some customers.
- It takes time to reconcile changes, more so in environments where access is restricted and time-bound and help cannot easily be requested.
- Customers need to catch up more quickly to the latest release. This pain point has greater impact in high-side, air-gapped environments, where releases are skipped to maintain stability or simply because teams cannot keep up with applying updates sequentially.
Potential Solutions
- Providing upgrade path documentation could alleviate this pain.
- Build a test environment that replicates these conditions and work with customer teams to recreate some of their issues here.
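As one illustration of what upgrade path documentation could contain, the sketch below lists the intermediate releases a team would apply to move from an installed version to a target version. It rests on a loud assumption: that the (n-2) path means a single upgrade may advance at most two releases at a time. That reading is not confirmed by the notes, so `max_jump` is a parameter, and the function name and the release list in the example are hypothetical.

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.14' into a sortable tuple (1, 14)."""
    return tuple(int(part) for part in version.split("."))


def upgrade_path(installed: str, target: str, available: list[str],
                 max_jump: int = 2) -> list[str]:
    """Return the releases to apply, in order, to reach the target.

    max_jump is the assumed maximum number of releases one upgrade may
    advance; 2 reflects one possible reading of the n-2 policy.
    """
    if max_jump < 1:
        raise ValueError("max_jump must be at least 1")
    ordered = sorted(set(available) | {installed, target}, key=parse)
    current = ordered.index(installed)
    goal = ordered.index(target)
    if goal < current:
        raise ValueError("target is older than the installed release")
    path = []
    while current < goal:
        # Advance as far as allowed without overshooting the target.
        current = min(current + max_jump, goal)
        path.append(ordered[current])
    return path


if __name__ == "__main__":
    releases = ["1.12", "1.13", "1.14", "1.15", "1.16", "1.17"]
    print(upgrade_path("1.12", "1.17", releases))
```

With the hypothetical release list above, a team on 1.12 would apply 1.14, 1.16, and then 1.17. A table generated this way for each supported starting version could be published alongside the release notes.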
Feature completeness & stability
- Long-term stability for upgrades is needed
- Does Big Bang keep pushing updates, or is there a goalpost for what is considered "feature complete"?
General issues with upgrades
- Between versions 1.12 and 1.14
- The Monitoring packages, SonarQube, Logging, and others exhibit a number of issues when Flux Resume is run following a Suspend to update the releases on the high-side.
- Ran into a webhook issue after Flux Resume; the solution was to turn off admission webhooks.
- In one release, a SonarQube update had a downstream impact: OpenID authentication didn't work.
- TBD (follow-up sessions will be needed with engineers to drill down to specific errors).