And here is the full stack trace from the keycloak-0 pod:
2024-03-28 16:29:04,882 ERROR [org.keycloak.services.error.KeycloakErrorHandler] (executor-thread-4) Uncaught server error: java.lang.IllegalArgumentException: issuerCertificate cannot be null
    at org.keycloak.utils.OCSPProvider.check(OCSPProvider.java:84)
    at dod.p1.keycloak.registration.X509Tools.getX509IdentityFromCertChain(X509Tools.java:212)
    at dod.p1.keycloak.registration.X509Tools.getX509Identity(X509Tools.java:293)
    at dod.p1.keycloak.registration.X509Tools.getX509Username(X509Tools.java:112)
    at dod.p1.keycloak.registration.X509Tools.getX509Username(X509Tools.java:125)
    at dod.p1.keycloak.registration.RegistrationValidation.buildPage(RegistrationValidation.java:181)
    at org.keycloak.authentication.FormAuthenticationFlow.renderForm(FormAuthenticationFlow.java:304)
    at org.keycloak.authentication.FormAuthenticationFlow.processFlow(FormAuthenticationFlow.java:285)
    at org.keycloak.authentication.DefaultAuthenticationFlow.processSingleFlowExecutionModel(DefaultAuthenticationFlow.java:380)
    at org.keycloak.authentication.DefaultAuthenticationFlow.processFlow(DefaultAuthenticationFlow.java:249)
    at org.keycloak.authentication.AuthenticationProcessor.authenticateOnly(AuthenticationProcessor.java:1028)
    at org.keycloak.authentication.AuthenticationProcessor.authenticate(AuthenticationProcessor.java:885)
    at org.keycloak.protocol.oidc.endpoints.AuthorizationEndpoint.buildRegister(AuthorizationEndpoint.java:349)
    at org.keycloak.protocol.oidc.endpoints.AuthorizationEndpoint.process(AuthorizationEndpoint.java:198)
    at org.keycloak.protocol.oidc.endpoints.AuthorizationEndpoint.buildGet(AuthorizationEndpoint.java:113)
    at org.keycloak.protocol.oidc.endpoints.AuthorizationEndpoint$quarkusrestinvoker$buildGet_4b690b27439f19dd29733dc5fd4004f24de0adb6.invoke(Unknown Source)
    at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:145)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$14.runWith(VertxCoreRecorder.java:576)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
Switched to the bigbang master branch and pulled latest changes
Changed all instances of bigbang.dev to dev.bigbang.mil in my values
Redeployed
I'm still experiencing the same issue. These same values work fine for us with the keycloak chart bundled with bigbang 2.22.0, so there's definitely some sort of issue. We've reproduced it with multiple CACs.
According to that logic, issuerCertificate would be set to null if there is only one certificate in the certificate chain. Then, the OCSP check itself seems to always throw an exception if the passed issuerCertificate is null.
I don't know why there would only be one certificate in the certificate chain. Maybe this logic doesn't work with all CACs all the time?
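To make the failure mode concrete, here is a minimal Java sketch of the logic as I read it. The class and method names (`IssuerSketch`, `pickIssuer`, `check`) are hypothetical stand-ins, not the plugin's or Keycloak's actual code; only the exception message is taken from the log.

```java
import java.security.cert.X509Certificate;

final class IssuerSketch {

    // Hypothetical stand-in for the plugin logic: the issuer is read from the
    // second position of the chain, so a chain that holds only the client
    // certificate yields null.
    public static X509Certificate pickIssuer(X509Certificate[] chain) {
        return (chain != null && chain.length > 1) ? chain[1] : null;
    }

    // Hypothetical stand-in for the OCSP entry point: a null issuer is
    // rejected with the same message that appears in the stack trace.
    public static void check(X509Certificate cert, X509Certificate issuer) {
        if (issuer == null) {
            throw new IllegalArgumentException("issuerCertificate cannot be null");
        }
        // A real implementation would build and send the OCSP request here.
    }
}
```

With a one-element chain, `pickIssuer` returns null and `check` throws, which would match what we see during registration.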
We just tried @ryan.j.garcia's CAC against my "broken" instance, and he was able to register without any issue. So, the issue doesn't exist for all CACs.
More information: This seems to be a problem with macOS. Everyone who has reproduced the problem (several) is using macOS. It seems that macOS does not send the entire certificate chain up, but only the client certificate. The new OCSP checking code in p1-auth-plugin's user registration workflow throws an exception if there is only one certificate in the certificate chain. This means macOS users cannot register using a CAC.
I created a Windows virtual machine to test. Using Windows, everything worked just fine, and I was able to complete user registration with a CAC. With debug logging, I could see that the entire certificate chain was sent to the server, which allowed the new changes in p1-auth-plugin to get through the OCSP check without error.
Hey guys, thanks for everyone's patience on this. I was finally able to carve out some time this evening to look at the issue at the code level. From everyone's feedback, it appears we might have an issue with the OCSP check we introduced in 3.3.0. That check was added to stop the activecac attribute from being set when a user attempts a CAC login that fails the OCSP check but then logs in with user/pw in the same session. The feature was tested with Keycloak in proxy mode behind Istio, so my suspicion is that Istio adds the chain to the XFCC header that Keycloak reads in proxy mode. This does not appear to happen when Keycloak is in edge/passthrough mode, so when certain browsers don't provide the full chain, we hit this error.
The two solutions discussed were:
1. Toggle this feature off via config (pending).
2. Check against the Keycloak truststore for the issuer if the chain read from the httpRequest does not include it ("X509Certificate[] certs = provider.getCertificateChain(httpRequest);").
I updated the plugin code in a new branch with solution 2 implemented. This means your truststore must be complete and loaded, which should already be the case when using CAC.
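For anyone reviewing, here is a rough Java sketch of what a truststore fallback could look like. It is not the code in the branch: `TruststoreIssuerLookup` and `resolveIssuer` are hypothetical names, and I'm assuming the truststore is available as a loaded `java.security.KeyStore`.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.util.Collections;

final class TruststoreIssuerLookup {

    // Returns the issuer from the request chain when it is present. Otherwise
    // scans the truststore for a CA whose subject matches the client
    // certificate's issuer and whose key verifies the client certificate.
    // Returns null when no issuer can be found.
    public static X509Certificate resolveIssuer(X509Certificate[] chain, KeyStore truststore)
            throws KeyStoreException {
        if (chain == null || chain.length == 0) {
            return null;
        }
        if (chain.length > 1) {
            return chain[1];
        }
        X509Certificate client = chain[0];
        for (String alias : Collections.list(truststore.aliases())) {
            Certificate candidate = truststore.getCertificate(alias);
            if (!(candidate instanceof X509Certificate)) {
                continue;
            }
            X509Certificate ca = (X509Certificate) candidate;
            if (!ca.getSubjectX500Principal().equals(client.getIssuerX500Principal())) {
                continue;
            }
            try {
                client.verify(ca.getPublicKey());
                return ca;
            } catch (GeneralSecurityException e) {
                // Same subject but a different key: keep looking.
            }
        }
        return null;
    }
}
```

Matching on the subject alone would accept any CA that happens to share a name, so the sketch also verifies the signature before trusting the candidate.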
I don't have a CAC so I can't test, but I pushed it to a container for testers. I did load it in our P1/CNAP staging cluster and it loaded fine, but staging is behind Istio proxy mode, so this was more to confirm it loaded and didn't blow up anything. Once we confirm this works we will add the needed test code coverage and get it pushed to IB.
I tested the beta plugin in my environment.
registry.dso.mil/big-bang/product/packages/keycloak/cnap-keycloak/p1-keycloak-plugin:3.3.1-1-beta
CAC login/authentication works with Linux Ubuntu
Will get a team member to test with Mac
I wonder if the truststore.jks needs to be updated with new DoD CA certs. With Linux Ubuntu CAC authentication I see this in the logs
2024-04-04 16:16:33,919 ERROR [dod.p1.keycloak.registration.X509Tools] (executor-thread-50) No trusted CA in certificate found: CN=DOD ID CA-65, OU=PKI, OU=DoD, O=U.S. Government, C=US. Add it to truststore SPI if valid.
Just to add content here from our testing together @kevin.wilder, we have 2 Mac users that it does work for on your dev Keycloak, so we will wait for your feedback once your Mac users re-test with your updated truststore dev Keycloak. Luis on our team will also test himself (Mac User) on his own dev Keycloak and once we get a confirmation we will work to push it into IB.
There is still an error in the logs from the OCSP check
2024-04-04 19:37:12,749 ERROR [dod.p1.keycloak.registration.X509Tools] (executor-thread-17) No trusted CA in certificate found: CN=DOD ID CA-70, OU=PKI, OU=DoD, O=U.S. Government, C=US. Add it to truststore SPI if valid.
@kevin.wilder do you still have access to create an MR? I went through the DOD certs and truststore process the other week, but the script kept failing and I couldn't tell if there were any actual changes.
Note: CNAP team has plans to fix the OCSP check error that shows in the logs. Not sure on the turnaround time.
2024-04-04 19:37:12,749 ERROR [dod.p1.keycloak.registration.X509Tools] (executor-thread-17) No trusted CA in certificate found: CN=DOD ID CA-70, OU=PKI, OU=DoD, O=U.S. Government, C=US. Add it to truststore SPI if valid.
The fix now supports toggling the feature on/off. By default it will be off.
To enable:
via command arg: "--spi-baby-yoda-ocsp-enabled=true" or via ENV: KC_SPI_BABY_YODA_OCSP_ENABLED: "true"
These also need to be set so the code can find the truststore when the bundle isn't presented by CAC:
KC_SPI_TRUSTSTORE_FILE_FILE: "/opt/keycloak/certs/truststore.jks"
KC_SPI_TRUSTSTORE_FILE_PASSWORD: "trust_pw"
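As a sanity check on how these settings combine, here is a small Java sketch. It only illustrates the defaults described above (off unless explicitly "true") and is not how the plugin reads its configuration; `OcspToggleSketch` and its methods are hypothetical, and the environment is passed in as a map so it can be exercised without a running Keycloak.

```java
import java.util.Map;

final class OcspToggleSketch {

    // OCSP checking is off unless the variable is explicitly set to "true".
    public static boolean ocspEnabled(Map<String, String> env) {
        return Boolean.parseBoolean(env.getOrDefault("KC_SPI_BABY_YODA_OCSP_ENABLED", "false"));
    }

    // The truststore fallback needs both the file path and the password.
    public static boolean truststoreConfigured(Map<String, String> env) {
        String file = env.get("KC_SPI_TRUSTSTORE_FILE_FILE");
        String password = env.get("KC_SPI_TRUSTSTORE_FILE_PASSWORD");
        return file != null && !file.isBlank() && password != null && !password.isBlank();
    }
}
```

Calling `OcspToggleSketch.ocspEnabled(System.getenv())` inside the container would report whether the toggle is on for that pod.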
I can't fully test as I don't have a revoked CAC or active CAC, but hopefully this is good to go. We will wait for confirmation from testers before moving forward though.
I just ran some tests with registry.dso.mil/big-bang/product/packages/keycloak/cnap-keycloak/p1-keycloak-plugin:3.3.1-3-beta. I think this probably still isn't functioning quite the way you intended.
When OCSP checking is enabled and the issuer certificate can't be found, the OCSP check is skipped, and user registration is still allowed to continue. I would expect that when OCSP checking is enabled, failure to find the issuer certificate would behave the same as a failed OCSP check, blocking user registration.
Here are the three tests I ran. I strongly suggest writing unit tests for these conditions.
PASS Test OCSP checking disabled and no trust store set.
Environment variables:
KC_SPI_BABY_YODA_OCSP_ENABLED unset
KC_SPI_TRUSTSTORE_FILE_FILE unset
KC_SPI_TRUSTSTORE_FILE_PASSWORD unset
Expected behavior:
Keycloak will not attempt to find my issuer certificate.
Keycloak will not attempt OCSP checking.
I will be able to register.
Actual behavior:
PASS Keycloak did not attempt to find my issuer certificate.
PASS Keycloak did not attempt OCSP checking.
PASS I was able to register.
Logs
2024-04-12 15:10:02,193 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-2) ZacsOCSPProvider Mode Set: false
2024-04-12 15:10:02,202 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-2) P1_X509_TOOLS_GET_X509_IDENTITY_FROM_CHAIN_5dff3e80-327b-40a2-9af3-e67bdb8dad0c checking cert policy 2.16.840.1.101.2.1.11.42
2024-04-12 15:10:02,376 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-2) ZacsOCSPProvider Mode Set: false
2024-04-12 15:10:02,377 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-2) P1_X509_TOOLS_GET_X509_IDENTITY_FROM_CHAIN_5dff3e80-327b-40a2-9af3-e67bdb8dad0c checking cert policy 2.16.840.1.101.2.1.11.42
FAIL Test OCSP checking enabled, but trust store is not set.
Environment variables:
KC_SPI_BABY_YODA_OCSP_ENABLED: "true"
KC_SPI_TRUSTSTORE_FILE_FILE unset
KC_SPI_TRUSTSTORE_FILE_PASSWORD unset
Expected behavior:
Keycloak will be unable to find my issuer certificate.
OCSP checking will be attempted and ultimately fail because my issuer certificate is not present.
I will be unable to register.
Actual behavior:
PASS Keycloak was unable to find my issuer certificate.
FAIL OCSP checking did not occur.
FAIL I was still able to register.
Logs
2024-04-12 14:19:39,354 WARN [org.keycloak.services] (executor-thread-4) KC-SERVICES0091: Request is missing scope 'openid' so it's not treated as OIDC, but just pure OAuth2 request.
2024-04-12 14:19:39,367 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-4) ZacsOCSPProvider Mode Set: true
2024-04-12 14:19:39,375 WARN [dod.p1.keycloak.registration.X509Tools] (executor-thread-4) P1_X509_TOOLS_GET_X509_IDENTITY_FROM_CHAIN_07aa2456-eed9-4667-993a-148659f900c1 No trusted CA in certificate found: CN=DOD ID CA-64, OU=PKI, OU=DoD, O=U.S. Government, C=US
2024-04-12 14:19:39,376 INFO [dod.p1.keycloak.registration.X509Tools] (executor-thread-4) ZacsOCSPProvider Mode Set: true
2024-04-12 14:19:39,376 WARN [dod.p1.keycloak.registration.X509Tools] (executor-thread-4) P1_X509_TOOLS_GET_X509_IDENTITY_FROM_CHAIN_07aa2456-eed9-4667-993a-148659f900c1 No trusted CA in certificate found: CN=DOD ID CA-64, OU=PKI, OU=DoD, O=U.S. Government, C=US
PASS Test OCSP checking enabled, and trust store variables are set.
The CAC Auth issue seems to have been solved by Zac Williamson. Some had issues with their CAC on 3.3.0, but it is working on CNAP's 3.3.1-3-beta image -- closing this issue.
Maybe it was premature to close this issue. We have the beta plugin image, but Big Bang has not yet released an official 3.3.1 image. I am sure they are working on it.
Re-opening this issue, since the official 3.3.1 plugin has not been fully tested and rolled out.
I hit this testing SSO on Linux yesterday. Hacking the plugin, I saw the full chain wasn't being sent. @jrb pointed me to this existing issue -- glad it was a known issue.
@luis.lahoz I'll test the changes on your Keycloak P1 Auth Plugin / fix-ocsp-check branch and review the code too.
I can verify on Linux. Looks like @jrb can verify on Mac.
Linux / Firefox --> Intermediate passed through if loaded into Firefox
Linux / Chrome --> Intermediate NOT passed through even if loaded
Windows / ? --> Passed Through
SOLUTION
I found out Firefox was NOT passing the Issuer, because the issuer wasn't trusted.
When I loaded in the DODIDCA_63.cer (the Authority for my cert) into Firefox, the intermediate issuer was passed through, and I hit the registration page.
Chrome on Linux still doesn't pass the intermediate cert.
@kevin.wilder @jrb can you test the existing 3.3.0 and verify the CA is loaded/recognized as an authority in your browser, if you're using Firefox?
Just synced up with @michaelmartin and @zacw. We determined the fix in registry.dso.mil/big-bang/product/packages/keycloak/cnap-keycloak/p1-keycloak-plugin:3.3.1-3-beta is working as intended.
The auth plugin is correctly performing an OCSP check when the issuer cert is found in the trust store.
When the issuer cert is not found in the trust store, it's correctly creating the user without the activecac attribute.
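To pin the agreed behavior down for future unit tests, here is a minimal Java sketch of the decision as I understand it from this thread. `RegistrationOutcomeSketch`, `Outcome`, and `decide` are hypothetical names, and the sketch deliberately says nothing about what happens after an OCSP check runs, since that wasn't part of what we verified.

```java
final class RegistrationOutcomeSketch {

    enum Outcome {
        NO_OCSP_CHECK,               // toggle off: no issuer lookup, no OCSP check
        OCSP_CHECK_PERFORMED,        // toggle on and issuer found: OCSP check runs
        CREATED_WITHOUT_ACTIVECAC    // toggle on, issuer missing: user created, no activecac
    }

    // Maps the two inputs discussed in this thread to the behavior we confirmed.
    public static Outcome decide(boolean ocspEnabled, boolean issuerFound) {
        if (!ocspEnabled) {
            return Outcome.NO_OCSP_CHECK;
        }
        return issuerFound ? Outcome.OCSP_CHECK_PERFORMED : Outcome.CREATED_WITHOUT_ACTIVECAC;
    }
}
```

Each branch lines up with one of the manual test scenarios run against the beta image, so the same three cases could become the plugin's unit tests.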
What do we need to do to push this across the finish line?
@luis.lahoz I don't have access to the code.il2 link you provided above. Can you confirm the new tag version? I think we may need to create an issue for IB to publish the corresponding image, which I will do once you confirm the version.
Who or what is responsible for tagging? Is there anyone we can poke to help move things forward? This is blocking a few issues and it would be great to get it closed out.
Hi Fellas. You need to do more than just open an Iron Bank issue. Iron Bank does not own the container hardening project, the Big Bang Team owns it. The detailed documentation was removed from the maintenance doc sometime in the past year, not sure why. I went back 1 year to an old release tag and found the steps. See step 19 here: https://repo1.dso.mil/big-bang/product/packages/keycloak/-/blob/18.4.3-bb.0/docs/DEVELOPMENT_MAINTENANCE.md
19. After all testing locally and k8s testing with a BigBang deployment has been completed, publish an official plugin image in [IronBank](https://repo1.dso.mil/dsop/big-bang/p1-keycloak-plugin). The order that things should happen with BigBang MRs, BigBang release, and publishing plugin image in IronBank is still unclear. Brief instructions for IronBank pipeline maintenance:
    1. The PartyBus IL2 MDO pipeline publishes the p1-keycloak-plugin-X.X.X.jar artifact back to the [P1 Keycloak Package Registry](https://repo1.dso.mil/big-bang/apps/product-tools/keycloak-p1-auth-plugin/-/packages). If the PartyBus IL2 MDO pipeline is failing for reasons outside of our control, worst case, we can [manually publish](https://docs.gitlab.com/ee/user/packages/generic_packages/) to the plugin package registry. The BigBang MR will need the `tests/test-values.yaml` updated to use the new plugin init-container tag. The test values can use the image that was created in [Big Bang Keycloak plugin container registry](https://repo1.dso.mil/big-bang/apps/product-tools/keycloak-p1-auth-plugin/container_registry/). There is no hard requirement to use the IronBank plugin image for the test values.
    2. At the [IronBank repo](https://repo1.dso.mil/dsop/big-bang/p1-keycloak-plugin) create a new issue with template "application update".
    3. Create a branch and MR from the issue.
    4. Update hardening_manifest.yaml:
        - the tag version
        - the label org.opencontainers.image.version
        - resources.url
        - resources.validation.value (use sha256sum to get the hash value of the jar file)
    5. Commit and push code changes.
    6. Verify that the pipeline passes.
    7. Complete checkbox items in the issue. You will need to request [VAT](https://vat.dso.mil/) "Vendor Contributor" access if there are any new findings that need justification.
    8. Mark the MR as ready and apply label "Hardening:Review" to the MR and the issue.
    9. Monitor the issue to make sure it keeps moving with the Container Hardening Team (CHT).
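For the resources.validation.value step, sha256sum is the tool the doc names. If only a JVM is handy, the same hash can be computed with a few lines of Java; this is just a sketch, and `JarSha256` is a hypothetical helper name, not something in the repo.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

final class JarSha256 {

    // Streams the file through SHA-256 and returns the lowercase hex digest,
    // the same format sha256sum prints.
    public static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return HexFormat.of().formatHex(digest.digest());
    }
}
```

Usage would be something like `JarSha256.sha256Hex(Path.of("p1-keycloak-plugin-X.X.X.jar"))`, with the real version in place of X.X.X. `HexFormat` requires Java 17 or newer.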