7edfb48 to
b271b83
Compare
Collaborator
jonathanmetzman
left a comment
left some very surface level comments.
d5684e5 to
44737d3
Compare
This commit introduces the Kubernetes job client and service, providing a mechanism to schedule tasks on Kubernetes clusters (including GKE and Kind), supporting both standard and Kata Containers.
Key Features & Changes:
- **Kubernetes Service**: Implemented `KubernetesService` in `clusterfuzz._internal.k8s.service` to manage job creation.
- **Kata Support**: Added specialized job creation for Kata Containers (`create_kata_container_job`) with required security context (`privileged`, `capabilities: ALL`), networking (`hostNetwork: True`), and environment variables (`HOST_UID`).
- **Dependency Management**: Added `kubernetes` and necessary Google Cloud dependencies (`google-api-python-client`, `google-cloud-storage`, `google-cloud-ndb`, etc.) to `Pipfile`.
- **E2E Testing**:
- Created `tests.core.k8s.k8s_service_e2e_test` to verify job lifecycle on a local Kind cluster.
- Updated `local/tests/kubernetes_e2e_test.bash` to provision the test environment.
- Updated CI workflow (`.github/workflows/kubernetes-e2e-tests.yaml`) to install JDK 21 (required for Datastore emulator).
- Tests now verify job "Running" status to avoid timeouts with long-running commands.
- `KubernetesService` skips default credential loading when `K8S_E2E` is set to utilize the test-provided kubeconfig.
- **Unit Tests**: Added comprehensive unit tests in `tests.core.k8s.k8s_service_test` and `tests.core.kubernetes.kubernetes_test`, including mocking of `load_kube_config` and `_load_gke_credentials` to ensure robust testing without external dependencies.
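As a rough illustration of the Kata settings listed above, the sketch below builds a plain-dict Job manifest carrying the same fields (`privileged`, `capabilities: ALL`, `hostNetwork: True`, `HOST_UID`). It is not the actual `create_kata_container_job` implementation: the function name `build_kata_job_manifest`, the container name, the `restartPolicy`, and the `runtimeClassName` value `"kata"` are assumptions made for the example.

```python
def build_kata_job_manifest(job_name, image, host_uid, runtime_class="kata"):
  """Returns a batch/v1 Job manifest (as a dict) with Kata-style settings.

  Hypothetical helper: names and defaults here are illustrative only.
  """
  container = {
      "name": "clusterfuzz-task",  # Assumed container name.
      "image": image,
      "env": [{"name": "HOST_UID", "value": str(host_uid)}],
      "securityContext": {
          "privileged": True,
          "capabilities": {"add": ["ALL"]},
      },
  }
  pod_spec = {
      "runtimeClassName": runtime_class,  # Assumed RuntimeClass name.
      "hostNetwork": True,
      "restartPolicy": "Never",  # Assumed; Jobs allow Never or OnFailure.
      "containers": [container],
  }
  return {
      "apiVersion": "batch/v1",
      "kind": "Job",
      "metadata": {"name": job_name},
      "spec": {"template": {"spec": pod_spec}},
  }
```

A dict like this can be passed as the `body` of a Job creation call in the Kubernetes Python client, or dumped to YAML for inspection.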
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
decoNR
reviewed
Jan 19, 2026
src/clusterfuzz/_internal/tests/core/k8s/k8s_service_e2e_test.py
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
9ba397a to
5a92336
Compare
decoNR
reviewed
Jan 20, 2026
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
74cf7da to
5a3a5a2
Compare
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
7c57a49 to
1aa9ae5
Compare
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
30cc0ae to
506f583
Compare
Collaborator
Author
We're not, and using it for the clusters is not part of the plan. You can see more details at go/clusterfuzz-to-kubernetes
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
506f583 to
43ddec1
Compare
decoNR
reviewed
Jan 21, 2026
def _get_config_names(remote_tasks: typing.List[types.RemoteTask]):
  """"Gets the name of the configs for each batch_task. Returns a dict
Collaborator
""""Gets
should be """Gets
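The effect of the extra quote can be checked directly. In the minimal sketch below (the function names are made up for the demonstration), the first three quotes open the docstring and the fourth becomes part of its text:

```python
def with_typo():
  """"Gets the name of the configs."""


def fixed():
  """Gets the name of the configs."""


# The stray fourth quote ends up as the first character of the docstring.
print(repr(with_typo.__doc__))
print(repr(fixed.__doc__))
```

The code still runs with the typo, which is why it is easy to miss, but the docstring shown by `help()` and documentation tools starts with a stray `"`.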
ca_cert = base64.b64decode(cluster['masterAuth']['clusterCaCertificate'])

# Write CA cert to a temporary file.
fd, ca_cert_path = tempfile.mkstemp()
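The hunk above is only partially visible, so the following is a sketch of one way the temporary-file step might be completed, not the code in the PR. The helper name `write_ca_cert` is hypothetical. The point it illustrates is that `tempfile.mkstemp()` returns an already-open file descriptor, which should be wrapped or closed so it does not leak:

```python
import base64
import os
import tempfile


def write_ca_cert(encoded_cert: str) -> str:
  """Decodes a base64 CA certificate and writes it to a temporary file.

  Hypothetical helper. Returns the file path; the caller is responsible for
  deleting the file when it is no longer needed.
  """
  ca_cert = base64.b64decode(encoded_cert)
  fd, ca_cert_path = tempfile.mkstemp()
  # os.fdopen takes ownership of the descriptor and closes it on exit.
  with os.fdopen(fd, 'wb') as ca_cert_file:
    ca_cert_file.write(ca_cert)
  return ca_cert_path
```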
jonathanmetzman
approved these changes
Jan 22, 2026
ViniciustCosta
approved these changes
Jan 22, 2026
81816be to
fd8f6b3
Compare
Signed-off-by: Javan Lacerda <javanlacerda@google.com>
vitaliset
added a commit
that referenced
this pull request
Jan 22, 2026
This reverts commit c97d8cc.
This PR introduces full support for scheduling and managing fuzzing tasks on Kubernetes clusters,
specifically targeting GKE. It implements a new KubernetesService to
handle batch job creation, supports Kata Containers for isolation, and includes robust testing
and configuration mechanisms.
Key Features:
- [...] Jobs. It supports both standard and Kata Container runtimes, automatic Service Account creation with Workload Identity, and intelligent job limiting to prevent cluster overload.
- [...] routes tasks between the legacy GCP Batch service and the new Kubernetes service based on configurable probabilities, allowing for a gradual, controlled migration.
- [...] behaviors like job concurrency limits.
Detailed Changes by Module:
Kubernetes Integration (src/clusterfuzz/_internal/k8s/):
- [...] monitoring, limiting). Includes GKE credential loading, Kata Container spec generation, and Service Account provisioning.
- [...] k8s_service_e2e_test.py (integration on Kind).
Remote Task Management (src/clusterfuzz/_internal/remote_task/):
- [...] RemoteTaskInterface. It initializes both GcpBatchService and KubernetesService and distributes tasks between them based on probabilities defined in job_frequency.py. This enables traffic splitting (e.g., 10% to K8s, 90% to Batch) for safe rollout.
- [...] abstractions.
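The probability-based split described above can be sketched as follows. This is a minimal illustration and not the code in the PR: the function name `choose_backend` and the backend labels are assumptions, and how job_frequency.py actually stores and looks up the probabilities is not shown in this description.

```python
import random


def choose_backend(k8s_probability, rng=random):
  """Picks a backend for one task given the probability of using Kubernetes.

  Hypothetical helper: returns 'kubernetes' with probability k8s_probability
  and 'gcp_batch' otherwise.
  """
  if not 0.0 <= k8s_probability <= 1.0:
    raise ValueError('k8s_probability must be between 0 and 1')
  # rng.random() is in [0.0, 1.0), so 0.0 never selects Kubernetes and 1.0
  # always does.
  if rng.random() < k8s_probability:
    return 'kubernetes'
  return 'gcp_batch'


# Example: route 10,000 tasks with a 10% Kubernetes share.
rng = random.Random(0)
choices = [choose_backend(0.1, rng) for _ in range(10000)]
print(choices.count('kubernetes'), choices.count('gcp_batch'))
```

Deciding per task keeps the split stateless, so the share sent to Kubernetes can be raised gradually by changing a single configured value.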
Datastore & Configuration (src/clusterfuzz/_internal/datastore/):
- [...] K8S_PENDING_JOBS_LIMITER).
Batch & Legacy Refactoring (src/clusterfuzz/_internal/batch/):
- [...] structure.
Infrastructure & CI:
- [...] cluster.
Bot & Metrics:
- [...] gate.
Evidence: