coco: initial integration for Confidential Containers and Trustee operators #80

Open
beraldoleal wants to merge 8 commits into validatedpatterns:main from beraldoleal:integration-v2

@beraldoleal beraldoleal commented Dec 8, 2025

See the individual commits for messages.

@beraldoleal beraldoleal force-pushed the integration-v2 branch 13 times, most recently from 29c9c84 to 341c962 Compare December 10, 2025 23:06
@beraldoleal beraldoleal force-pushed the integration-v2 branch 10 times, most recently from 5074bb3 to 74e2c74 Compare December 17, 2025 01:08
@beraldoleal beraldoleal marked this pull request as ready for review December 17, 2025 01:08
@beraldoleal
Author

ZTWIM GA reconciles changes so the imperative configurations applied here are reverted immediately.

We fixed it by adding the CREATE_ONLY_MODE=true environment variable to the ZTWIM operator via the OLM subscription config in values-coco-dev.yaml.

There is no mention of applying labels to nodes. Otherwise the sample workload fails to be scheduled.
We removed the nodeSelector entirely. Peer pods run as VMs, not directly on worker nodes, so the label is unnecessary for now.

There should be a callout about the instance types that may need to be configured. I tested in the eastasia region and the configured instance was not available.

I will add a proper CONFIDENTIAL-CONTAINERS.md file.
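
For readers unfamiliar with the CREATE_ONLY_MODE fix mentioned above: OLM lets a Subscription inject environment variables into the operator deployment through `spec.config.env`. The sketch below builds such a Subscription manifest as a plain dictionary and prints it as JSON. It is only an illustration, not the content of values-coco-dev.yaml: the helper name `subscription_with_env`, the package name, the namespace, and the catalog source are assumptions, and only the `stable-v1` channel and the `CREATE_ONLY_MODE=true` variable come from this conversation.

```python
import json


def subscription_with_env(package, namespace, channel, env):
    """Build an OLM Subscription manifest that passes env vars to the operator.

    Hypothetical helper for illustration; the catalog source below is assumed.
    """
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": package, "namespace": namespace},
        "spec": {
            "name": package,
            "channel": channel,
            "source": "redhat-operators",  # assumed catalog source
            "sourceNamespace": "openshift-marketplace",
            # OLM copies these variables into the operator's containers.
            "config": {
                "env": [{"name": key, "value": value} for key, value in env.items()]
            },
        },
    }


if __name__ == "__main__":
    manifest = subscription_with_env(
        package="openshift-zero-trust-workload-identity-manager",  # assumed name
        namespace="zero-trust-workload-identity-manager",  # assumed namespace
        channel="stable-v1",
        env={"CREATE_ONLY_MODE": "true"},
    )
    print(json.dumps(manifest, indent=2))
```

The value is passed as the string "true" because Kubernetes environment variable values are always strings.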

@beraldoleal
Author

Hey @sabre1041, @butler54, @bpradipt... let's give this a second shot! I addressed all the comments from the previous review. Feel free to reopen any of them or add new ones.

This was tested on Azure with AMD SEV-SNP (DCasv6 / Genoa), OCP 4.20.8, using the ZTWIM operator stable-v1 channel, sandbox operator v1.11.0, and trustee operator v1.0.0.

The chart references still point to custom branches while we wait for @butler54's PRs. Once those PRs merge, I will update the references. Hopefully that won't be a blocker for review.

@beraldoleal beraldoleal force-pushed the integration-v2 branch 2 times, most recently from a67a701 to c40eff1 Compare March 6, 2026 14:58
@beraldoleal
Author

beraldoleal commented Mar 6, 2026

@sabre1041 @butler54 @bpradipt no more fork references. It is now using the official validatedpatterns/charts release versions.

Collaborator

@butler54 butler54 left a comment

There are a few hardcoded vars that definitely need to change.
The biggest question is the use of the imperative framework to generate certificates. If this can be moved to generation in cert-manager, I think that would be more 'kube friendly'.

The justification for this is that the imperative framework is always the second-to-last option (the last option being work done on the developer workstation).

Comment on lines +1 to +2
---

Collaborator

Flag for future work - We should create an issue to add this playbook to the VP ansible collection. @mhjacks

# Generate SPIRE x509pop certificates for CoCo integration
# Creates CA certificate and agent certificates for all workloads

- name: Generate SPIRE x509pop certificates
Collaborator

I want to check whether we should be using cert-manager or the Ansible approach here.

To me it would make a lot more sense (if we can) to use cert-manager; then we'd have less janking around to get things done (still a non-zero amount of janking).

Author

@beraldoleal beraldoleal Mar 11, 2026

Agreed, cert-manager would be cleaner. I went the imperative route to unblock testing faster. The main friction is that SPIRE expects the CA as a ConfigMap while cert-manager outputs Secrets. I will explore this in a follow-up PR.

Comment on lines +55 to +59
ansible.builtin.shell: |
  hash=$(sha256sum "{{ rendered_path }}" | cut -d' ' -f1)
  initial_pcr=0000000000000000000000000000000000000000000000000000000000000000
  echo -n "$initial_pcr$hash" | python3 -c "import sys,hashlib; print(hashlib.sha256(bytes.fromhex(sys.stdin.read())).hexdigest())"
register: pcr8_hash
Collaborator

Did you get the python script to demonstrably work? This needs to be backported into the coco-pattern to avoid a custom container.

Author

Yes, this works, and that is the plan.
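
For reference, the shell snippet discussed in this thread performs a single PCR extend: it hashes the rendered file with SHA-256, concatenates that digest to an all-zero 32-byte register, and hashes the result again. Below is a minimal pure-Python sketch of the same calculation, assuming the file is read as raw bytes exactly as `sha256sum` would read it. The names `extend_pcr`, `pcr8_for_file`, and `INITIAL_PCR_HEX` are illustrative and are not taken from the PR.

```python
import hashlib

# A fresh SHA-256 PCR bank starts as 32 zero bytes (64 hex characters).
INITIAL_PCR_HEX = "00" * 32


def extend_pcr(pcr_hex: str, measured: bytes) -> str:
    """Return SHA256(old_pcr || SHA256(measured)) as a hex string."""
    digest = hashlib.sha256(measured).digest()
    return hashlib.sha256(bytes.fromhex(pcr_hex) + digest).hexdigest()


def pcr8_for_file(path: str) -> str:
    """Extend the zeroed PCR once with the contents of the file at `path`."""
    with open(path, "rb") as handle:
        return extend_pcr(INITIAL_PCR_HEX, handle.read())


if __name__ == "__main__":
    # Example input only; the real input is the rendered file from the playbook.
    print(extend_pcr(INITIAL_PCR_HEX, b"example rendered content"))
```

Keeping the extend step in a function makes it easy to chain several measurements by feeding each result back in as `pcr_hex`, although the snippet in the PR extends only once.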

@beraldoleal beraldoleal force-pushed the integration-v2 branch 3 times, most recently from ee81a09 to 107cf46 Compare March 12, 2026 13:28
Collaborator

@sabre1041 sabre1041 left a comment

Initial set of feedback

@beraldoleal
Author

@sabre1041 thanks for the review, sent a force push with your suggestions. Feel free to reopen or send more comments.

Collaborator

@butler54 butler54 left a comment

@beraldoleal.

The original intent of doing this offline was it was an 'as expected' deployment measured from a trusted zone as to what the RVPS values and the deployment should be.

I'm wondering whether rather than changing the script we could do this via a --online flag (e.g. check as deployed).

WDYT

@beraldoleal
Author

beraldoleal commented Mar 16, 2026

@beraldoleal.

The original intent of doing this offline was it was an 'as expected' deployment measured from a trusted zone as to what the RVPS values and the deployment should be.

I'm wondering whether rather than changing the script we could do this via a --online flag (e.g. check as deployed).

WDYT

I see. In that case I would suggest using --osc-version instead: it doesn't query the cluster and it is intent-based. But then we would start creating two things to do the same calculation, so maybe we should evaluate the use of veritas here? If that is straightforward, good; if not, we can stick with get-pcr.sh --osc-version for now and evaluate veritas later. WDYT?

@beraldoleal beraldoleal force-pushed the integration-v2 branch 2 times, most recently from 198af38 to 007efa8 Compare March 17, 2026 15:07
beraldoleal and others added 8 commits March 20, 2026 17:38
This adds initial integration for Confidential Containers and Trustee
Operators as a separate clustergroup.

Co-authored-by: Chris Butler <chris.butler@redhat.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Add automated configuration for SPIRE Server x509pop NodeAttestor plugin
required for CoCo peer-pods attestation.

CoCo peer-pods run on untrusted cloud infrastructure. Using k8s_psat
would require trusting the cloud provider's cluster. Instead, pods
perform hardware TEE attestation to KBS to obtain x509 certificates as
cryptographic proof of running in genuine confidential hardware, then
use x509pop to register with SPIRE.

The Red Hat SPIRE Operator's SpireServer CRD does not expose x509pop
configuration, requiring a ConfigMap patch via this imperative job.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
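
To illustrate the kind of ConfigMap patch the commit message above describes, here is a minimal sketch that adds an x509pop NodeAttestor entry to a SPIRE server configuration held as JSON. It assumes the operator renders the server configuration as a JSON document with a `plugins.NodeAttestor` list; that layout, the helper name `add_x509pop_attestor`, and the example CA path are assumptions, not details from the PR. `ca_bundle_path` is the x509pop server plugin option that points at the CA used to verify agent certificates.

```python
import copy
import json


def add_x509pop_attestor(server_conf: dict, ca_bundle_path: str) -> dict:
    """Return a copy of server_conf with an x509pop NodeAttestor added.

    Hypothetical helper; assumes plugins are stored as
    {"plugins": {"NodeAttestor": [{"<plugin name>": {...}}, ...]}}.
    """
    conf = copy.deepcopy(server_conf)
    attestors = conf.setdefault("plugins", {}).setdefault("NodeAttestor", [])
    # Leave the config untouched if x509pop is already there, so reruns are idempotent.
    if not any("x509pop" in entry for entry in attestors):
        attestors.append(
            {"x509pop": {"plugin_data": {"ca_bundle_path": ca_bundle_path}}}
        )
    return conf


if __name__ == "__main__":
    # Placeholder standing in for the configuration read from the ConfigMap.
    existing = {"plugins": {"NodeAttestor": [{"k8s_psat": {"plugin_data": {}}}]}}
    patched = add_x509pop_attestor(existing, "/tmp/x509pop-ca/ca.crt")  # assumed path
    print(json.dumps(patched, indent=2))
```

Working on a deep copy and checking for an existing entry keeps the patch safe to run repeatedly, which matters for a job that may be re-executed.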
Add hello-coco Helm chart demonstrating SPIRE agent deployment in
confidential containers using x509pop node attestation. The chart
deploys a test pod in a CoCo peer-pod (confidential VM with AMD SNP or
Intel TDX) that fetches SPIRE agent certificates from KBS after TEE
attestation, establishing hardware as the root of trust instead of
Kubernetes.

The pod contains three containers: init container fetches sealed
secrets from KBS, SPIRE agent uses x509pop for node attestation, and
test workload receives SPIFFE SVIDs via unix attestation. This
validates the complete integration flow between ZTVP and CoCo
components.

Note: This could be dropped, if we stick with only the todoapp.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Basic markdown file with deployment steps.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
Peer-pods don't have access to the node's pull-secret, needed for
private repos. Use ESO kubernetes provider to sync pull-secret from
openshift-config to the workload namespace.

Signed-off-by: Beraldo Leal <bleal@redhat.com>