Provide customer with projected availability date for BB AirGap deployments
2 Desired Outcomes of this ticket:
1. Prioritize AirGap by staffing the effort/throwing more bodies at it
2. Start to come up with timelines even if it's just a timeline for a game plan.
(Actual Ask is at the bottom)
Useful Background Context:
https://confluence.il4.dso.mil/display/BB/Customers+BB+Compatibility+Matrix
All 3 Red Carpet Customers + 2 other customers are asking for an Internet Disconnected Deployment Mechanism.
- Either because they want a bare metal deployment
- or because they want to deploy to IL6, which implicitly requires an internet disconnected deployment mechanism.
- So far 3 customers have each implemented their own unique air gap deployment with semi automation and manual steps, because an official deployment solution doesn't exist.
- I want to make sure everyone realizes that this means the feature is well behind schedule and needs to be prioritized. I'm mentioning this because we only have 2 people assigned to this epic, when we should probably have 4 engineers assigned to it full time for at least months, plus any spare capacity from customer integration engineers whose customers are slightly spinning down.
- Keycloak being figured out is a hard dependency
- Cloud Agnostic Container Attached Storage, Storage Class is a soft dependency
Customer Context Info:
- Every customer's AO is different: some require more stringent criteria, others allow more flexibility, like going forward with their own airgap deployment mechanism.
- One of the Red Carpet Customers has an AO that requires very strict/stringent criteria, and they have a hard requirement to use an AirGap deployment mechanism that is officially supported by BigBang.
- It was said they originally wanted to deploy in March but would push back to May 2021.
- ChrisM, Toby, and several others on the BB team set the expectation that AirGap wouldn't be ready by May, because we don't want to rush and we want to come up with a long term maintainable solution. The customer's response was: fine, that makes a ton of sense, but I need dates and timelines for when to expect it. We set the expectation that we get to say when it'll be ready rather than being told when it has to be ready by. That being said, it's not a blank check to take as long as we want. It's a give and take: we got that extra time flexibility in exchange for promising rough timelines of when to expect it. "We need to commit to a timeline of when it will be ready by, but we get to decide what that timeline is."
Acceptance Criteria:
- Assume an internet disconnected AWS VPC where no pre-existing infra exists and only EC2 instances + KMS are available
- Doesn't need to be 100% E2E automated with hand holding; it'll be deployed by highly technical people. However, if 10 different people follow a set of documentation guidelines + semi automation to deploy from scratch to the exact same environment, all 10 people should end up at the same result.
- Guidance and documentation on an official pattern in the absence of automation is requested, with the hope that automation can be added as the solution matures, but initially documented reproducibility is more important than automation.
- Guidance on how to set up an airgap repo (should be able to have logic for both BB supplied images + customer supplied images)
- "Easy to Change", "Easy to Maintain", "Consistently reproducible regardless of who implements it via docs and semi-automation" should be our Guiding Principle with this. (An evolvable maintainable solution is better than a perfect solution)
- Solution needs to support Private PKI / customer supplied CA.crt and HTTPS.crts
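To make the "BB supplied images + customer supplied images" criterion a bit more concrete, here is a minimal sketch (not an official BB tool) of how the two image lists could be merged into one reproducible mirror plan. Everything named here is hypothetical: `MirrorStep`, `retarget`, `build_mirror_plan`, and the example registry hostnames are assumptions for illustration only. Sorting and de-duplicating the list is there so that 10 different people starting from the same inputs end up with the same plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorStep:
    source: str  # image reference as pulled on the internet connected side
    target: str  # image reference as pushed to the airgap registry


def retarget(image: str, airgap_registry: str) -> str:
    """Rewrite an image reference so it points at the airgap registry."""
    first, sep, rest = image.partition("/")
    # Docker convention: the first path component is a registry host only
    # if it contains "." or ":" or is exactly "localhost".
    has_host = bool(sep) and ("." in first or ":" in first or first == "localhost")
    path = rest if has_host else image
    return f"{airgap_registry}/{path}"


def build_mirror_plan(bb_images, customer_images, airgap_registry):
    """Merge both image lists into one de-duplicated, sorted plan."""
    merged = sorted(set(bb_images) | set(customer_images))
    return [MirrorStep(img, retarget(img, airgap_registry)) for img in merged]


if __name__ == "__main__":
    # Hypothetical inputs; real lists would come from BB and from the customer.
    plan = build_mirror_plan(
        ["registry.example.com/bigbang/app:1.0"],
        ["customer/tool:2.1"],
        "airgap.example.internal:5000",
    )
    for step in plan:
        print(f"{step.source} -> {step.target}")
```

The plan is only data; the actual pull/tag/push (or save/load across the gap) would still be done with whatever tooling the documented pattern settles on.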
Ideas of how this could be implemented:
- Let's start KISS and go with non HA docker / docker-compose solutions that are cloud agnostic, deployable to a single box/minimize dependencies, and focus on consistent reproducibility, instead of getting stuck in analysis paralysis trying to come up with some perfect over engineered solution that's backed by a HA cluster and somehow solves the chicken and egg problems associated with that.
- If there's a utility/deployment/management server that kicks off automation, don't have the registry exist on that server; allow the registry to exist as a standalone VM. (This way manual CI can reuse said standalone VM / no need to constantly make a docker registry from scratch on 100% of CI runs.)
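As a rough illustration of the KISS idea above, a single non HA registry on its own VM could be described by a small generated Compose file. This is a sketch under assumptions, not a proposed implementation: the function names, the host paths, and the cert file names `https.crt` / `https.key` are hypothetical. The `registry:2` image, its `/var/lib/registry` data path, and the `REGISTRY_HTTP_*` environment variables are the ones from the upstream Docker registry documentation. The output is JSON, which Compose should accept because JSON is valid YAML.

```python
import json


def registry_compose(cert_dir: str, data_dir: str, port: int = 443) -> dict:
    """Build a Compose definition for one standalone, non HA registry."""
    return {
        "services": {
            "registry": {
                # In an airgap this image has to be carried in and loaded first.
                "image": "registry:2",
                "restart": "always",
                "ports": [f"{port}:{port}"],
                "environment": {
                    "REGISTRY_HTTP_ADDR": f"0.0.0.0:{port}",
                    # Customer supplied cert/key (Private PKI); names assumed.
                    "REGISTRY_HTTP_TLS_CERTIFICATE": "/certs/https.crt",
                    "REGISTRY_HTTP_TLS_KEY": "/certs/https.key",
                },
                "volumes": [
                    f"{cert_dir}:/certs:ro",
                    f"{data_dir}:/var/lib/registry",
                ],
            }
        }
    }


def render(compose: dict) -> str:
    """Serialize with sorted keys so identical inputs give identical files."""
    return json.dumps(compose, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Hypothetical host paths on the standalone registry VM.
    print(render(registry_compose("/opt/airgap/certs", "/opt/airgap/data")), end="")
```

Because the rendering is deterministic, the generated file can be diffed or checksummed between deployments, which lines up with the "all 10 people should end up at the same result" criterion.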
The Ask: Chunk the effort into subtasks and Provide Projected availability date for components of BB AirGap Deployment strategy:
- I'm 100% aware that there are too many unknowns to communicate a delivery date to the customer.
- We should be able to chunk this and at least come up with timelines for:
- Chunked components: maybe we don't have a gameplan/timeline for the holistic solution, but we're confident on a piece of the solution and can come up with a timeline for that piece (like airgap registry in isolation).
- A date by which we can expect to have a full holistic gameplan, or at least a gameplan for the chunked components/pieces of the solution that are still to be determined (like a timeline for airgap registry in isolation).
- Maybe some of these chunked components can start to be worked on immediately, while the big picture holistic gameplan gets figured out in parallel.
Edited by Christopher McGrath