feat(helm): Add Presto query engine support to the Helm chart.#2004

Open
junhaoliao wants to merge 5 commits into y-scope:main from junhaoliao:k8s-presto

Conversation


@junhaoliao junhaoliao commented Feb 17, 2026

Description

Add Presto coordinator and worker deployments to the CLP Helm chart, gated by
clpConfig.package.query_engine: "presto". When Presto mode is enabled, the chart deploys a Presto coordinator (single replica) and configurable Presto worker(s) instead of the clp-s query pipeline (query scheduler, query workers, reducers). The api-server is also excluded in Presto mode.
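
The gating described above can be sketched as a Helm template conditional. This is a hypothetical excerpt for illustration only; the chart's actual template names and structure may differ, though the `ne .Values.clpConfig.package.query_engine "presto"` condition matches the gating discussed in the review below.

```yaml
{{- /* Render the Presto coordinator only in Presto mode. */}}
{{- if eq .Values.clpConfig.package.query_engine "presto" }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-presto-coordinator
spec:
  replicas: 1  # single coordinator replica, per the description above
{{- end }}

{{- /* Conversely, clp-s pipeline components are skipped in Presto mode. */}}
{{- if ne .Values.clpConfig.package.query_engine "presto" }}
# ... query-scheduler / query-worker / reducer resources ...
{{- end }}
```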

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and it has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

1. Helm template renders correctly (Presto mode)

Task: Verify the Helm chart renders all Presto resources and excludes query/reducer/api-server
templates when query_engine is "presto".

Command:

cd tools/deployment/package-helm
helm template clp . --set clpConfig.package.query_engine="presto" 2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/compression-scheduler-deployment.yaml
# Source: clp/templates/compression-scheduler-logs-pvc.yaml
# Source: clp/templates/compression-scheduler-logs-pv.yaml
# Source: clp/templates/compression-scheduler-user-logs-pvc.yaml
# Source: clp/templates/compression-scheduler-user-logs-pv.yaml
# Source: clp/templates/compression-worker-deployment.yaml
# Source: clp/templates/compression-worker-logs-pvc.yaml
# Source: clp/templates/compression-worker-logs-pv.yaml
# Source: clp/templates/compression-worker-tmp-pvc.yaml
# Source: clp/templates/compression-worker-tmp-pv.yaml
# Source: clp/templates/configmap.yaml
# Source: clp/templates/database-data-pv.yaml
# Source: clp/templates/database-logs-pv.yaml
# Source: clp/templates/database-secret.yaml
# Source: clp/templates/database-service.yaml
# Source: clp/templates/database-statefulset.yaml
# Source: clp/templates/db-table-creator-job.yaml
# Source: clp/templates/garbage-collector-deployment.yaml
# Source: clp/templates/garbage-collector-logs-pvc.yaml
# Source: clp/templates/garbage-collector-logs-pv.yaml
# Source: clp/templates/job-watcher-rolebinding.yaml
# Source: clp/templates/job-watcher-role.yaml
# Source: clp/templates/job-watcher-serviceaccount.yaml
# Source: clp/templates/presto-coordinator-deployment.yaml
# Source: clp/templates/presto-coordinator-service.yaml
# Source: clp/templates/presto-worker-deployment.yaml
# Source: clp/templates/queue-logs-pv.yaml
# Source: clp/templates/queue-secret.yaml
# Source: clp/templates/queue-service.yaml
# Source: clp/templates/queue-statefulset.yaml
# Source: clp/templates/redis-data-pv.yaml
# Source: clp/templates/redis-logs-pv.yaml
# Source: clp/templates/redis-secret.yaml
# Source: clp/templates/redis-service.yaml
# Source: clp/templates/redis-statefulset.yaml
# Source: clp/templates/results-cache-data-pv.yaml
# Source: clp/templates/results-cache-indices-creator-job.yaml
# Source: clp/templates/results-cache-logs-pv.yaml
# Source: clp/templates/results-cache-service.yaml
# Source: clp/templates/results-cache-statefulset.yaml
# Source: clp/templates/shared-data-archives-pvc.yaml
# Source: clp/templates/shared-data-archives-pv.yaml
# Source: clp/templates/shared-data-streams-pvc.yaml
# Source: clp/templates/shared-data-streams-pv.yaml
# Source: clp/templates/storage-class.yaml
# Source: clp/templates/webui-deployment.yaml
# Source: clp/templates/webui-service.yaml

Explanation: The Presto coordinator, service, and worker templates are rendered. The api-server, query scheduler, query worker, and reducer templates are correctly excluded.
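
The exclusion check can also be scripted. A minimal self-contained sketch (hypothetical; it operates on a captured `# Source:` list rather than a live chart, so the sample list here is abbreviated):

```shell
# Given a captured "# Source:" list from `helm template` in Presto mode,
# assert that no clp-s pipeline templates leaked in.
rendered='# Source: clp/templates/presto-coordinator-deployment.yaml
# Source: clp/templates/presto-worker-deployment.yaml
# Source: clp/templates/webui-deployment.yaml'

status=OK
for excluded in query-scheduler query-worker reducer api-server; do
  case "$rendered" in
    *"$excluded"*) status="FAIL: $excluded rendered in presto mode" ;;
  esac
done
echo "$status"
```

In a CI job, `rendered` would instead be populated from `helm template ... | grep "^# Source:"`.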

2. Helm template renders correctly (clp-s mode)

Task: Verify the clp-s query pipeline resources are rendered when query_engine is "clp-s", and Presto resources are excluded.

Command:

helm template clp . --set clpConfig.package.query_engine="clp-s" 2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/api-server-deployment.yaml
# Source: clp/templates/api-server-logs-pvc.yaml
# Source: clp/templates/api-server-logs-pv.yaml
# Source: clp/templates/api-server-service.yaml
...
# Source: clp/templates/query-scheduler-deployment.yaml
# Source: clp/templates/query-scheduler-logs-pvc.yaml
# Source: clp/templates/query-scheduler-logs-pv.yaml
# Source: clp/templates/query-scheduler-service.yaml
# Source: clp/templates/query-worker-deployment.yaml
# Source: clp/templates/query-worker-logs-pvc.yaml
# Source: clp/templates/query-worker-logs-pv.yaml
...
# Source: clp/templates/reducer-deployment.yaml
# Source: clp/templates/reducer-logs-pvc.yaml
# Source: clp/templates/reducer-logs-pv.yaml
# Source: clp/templates/reducer-service.yaml
# Source: clp/templates/results-cache-data-pv.yaml
# Source: clp/templates/results-cache-indices-creator-job.yaml
# Source: clp/templates/results-cache-logs-pv.yaml
# Source: clp/templates/results-cache-service.yaml
# Source: clp/templates/results-cache-statefulset.yaml
...

Explanation: The api-server, query scheduler, query worker, reducer, and results cache templates are rendered. The Presto templates are correctly excluded. (Shared infrastructure templates omitted for brevity.)

3. Helm template renders correctly using the docs' presto-values.yaml

Task: Verify that a presto-values.yaml file created by following the guides-using-presto.md documentation renders without errors.

presto-values.yaml (following the docs, with placeholders replaced):

clpConfig:
  package:
    query_engine: "presto"

  results_cache:
    retention_period: null

  presto:
    split_filter:
      clp.default.default:
        - columnName: "timestamp"
          customOptions:
            rangeMapping:
              lowerBound: "begin_timestamp"
              upperBound: "end_timestamp"
          required: false

Command:

helm template clp . -f /tmp/presto-values.yaml 2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/compression-scheduler-deployment.yaml
...
# Source: clp/templates/presto-coordinator-deployment.yaml
# Source: clp/templates/presto-coordinator-service.yaml
# Source: clp/templates/presto-worker-deployment.yaml
...
# Source: clp/templates/webui-deployment.yaml
# Source: clp/templates/webui-service.yaml

Explanation: The template renders without errors using the exact values file from the docs
(with placeholders filled in). The garbage-collector is correctly excluded because
results_cache.retention_period is set to null and archive_output.retention_period defaults to null, so there is nothing to garbage-collect.
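
The garbage-collector gating implied here could be expressed as follows. This is a hypothetical sketch of the condition described above (deploy only when at least one retention period is set); the chart's actual condition may differ.

```yaml
{{- /* Deploy the garbage collector only if something has a retention period. */}}
{{- if or .Values.clpConfig.results_cache.retention_period
          .Values.clpConfig.archive_output.retention_period }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-garbage-collector
# ...
{{- end }}
```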

4. Multi-node cluster: clp-s mode (default)

Task: Verify that the Helm chart deploys correctly in clp-s mode on a multi-node kind cluster with dedicated worker nodes.

Command:

cd tools/deployment/package-helm
bash set-up-multi-dedicated-test.sh

Output (final pod status):

NAME                                              READY   STATUS      RESTARTS   AGE
test-clp-api-server-bbcc6684-nsx5k                1/1     Running     0          2m43s
test-clp-compression-scheduler-5c74dfd974-vtdk8   1/1     Running     0          2m43s
test-clp-compression-worker-57c58768b-k8gs5       1/1     Running     0          2m43s
test-clp-compression-worker-57c58768b-kgx4x       1/1     Running     0          2m43s
test-clp-database-0                               1/1     Running     0          2m43s
test-clp-db-table-creator-77qv6                   0/1     Completed   0          2m43s
test-clp-garbage-collector-595f847dc-gqqsw        1/1     Running     0          2m43s
test-clp-query-scheduler-57879c8c7b-lw2fj         1/1     Running     0          2m43s
test-clp-query-worker-7fc9f65988-flmrm            1/1     Running     0          2m43s
test-clp-query-worker-7fc9f65988-ndp9x            1/1     Running     0          2m43s
test-clp-queue-0                                  1/1     Running     0          2m43s
test-clp-redis-0                                  1/1     Running     0          2m43s
test-clp-reducer-664c8545c8-dwhgm                 1/1     Running     0          2m43s
test-clp-reducer-664c8545c8-qzs9q                 1/1     Running     0          2m43s
test-clp-results-cache-0                          1/1     Running     0          2m43s
test-clp-results-cache-indices-creator-8wdmk      0/1     Completed   0          2m43s
test-clp-webui-dc49878f7-nfdpl                    1/1     Running     0          2m43s

Explanation: All clp-s pipeline pods are running (api-server, query-scheduler, 2x query-worker, 2x reducer, results-cache, etc.). No Presto pods are present.

Node scheduling verification:

=== Pod scheduling ===
test-clp-compression-worker-57c58768b-k8gs5   clp-test-worker    (compression node)
test-clp-compression-worker-57c58768b-kgx4x   clp-test-worker    (compression node)
test-clp-query-worker-7fc9f65988-flmrm        clp-test-worker4   (query node)
test-clp-query-worker-7fc9f65988-ndp9x        clp-test-worker4   (query node)

Compression workers are scheduled on compression-labeled nodes and query workers on query-labeled
nodes.
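
The label-based scheduling verified above would typically be driven by per-component `nodeSelector` values. The keys and label names below are assumptions for illustration, not the chart's actual schema:

```yaml
# Hypothetical values excerpt: pin compression and query workers to
# differently labeled nodes. Label names are illustrative only.
compressionWorker:
  nodeSelector:
    workload: compression
queryWorker:
  nodeSelector:
    workload: query
```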

5. Multi-node cluster: Presto mode

Task: Verify that the Helm chart deploys correctly in Presto mode on a multi-node kind cluster.
The test script was temporarily modified to add -f /tmp/presto-values.yaml to the helm install
command and remove the query worker/reducer --set flags (since those components are not deployed in
Presto mode).

Command:

# Temporarily modified set-up-multi-dedicated-test.sh helm install to:
#   helm install test "${script_dir}" \
#       -f /tmp/presto-values.yaml \
#       --set "distributedDeployment=true" \
#       ...
bash set-up-multi-dedicated-test.sh

Output (final pod status):

NAME                                              READY   STATUS      RESTARTS   AGE
test-clp-compression-scheduler-5c74dfd974-jvlq8   1/1     Running     0          2m28s
test-clp-compression-worker-57c58768b-d26nl       1/1     Running     0          2m28s
test-clp-compression-worker-57c58768b-pqwvh       1/1     Running     0          2m28s
test-clp-database-0                               1/1     Running     0          2m28s
test-clp-db-table-creator-kz8dv                   0/1     Completed   0          2m28s
test-clp-presto-coordinator-77897595b5-jns8j      1/1     Running     0          2m28s
test-clp-presto-worker-794b756d5d-bs5hs           1/1     Running     0          2m28s
test-clp-queue-0                                  1/1     Running     0          2m28s
test-clp-redis-0                                  1/1     Running     0          2m28s
test-clp-results-cache-0                          1/1     Running     0          2m28s
test-clp-results-cache-indices-creator-dm4p8      0/1     Completed   0          2m28s
test-clp-webui-dc49878f7-nzx7h                    1/1     Running     0          2m28s
All jobs completed and services are ready.

Explanation: The Presto coordinator and worker pods are running. The api-server, query-scheduler,
query-worker, and reducer pods are correctly absent. The Presto worker's init container successfully detected the coordinator version and initialized its config.

Presto pod verification:

=== Presto pods running ===
test-clp-presto-coordinator-77897595b5-jns8j      1/1     Running
test-clp-presto-worker-794b756d5d-bs5hs           1/1     Running

=== Query/reducer pods (should be NONE) ===
(none - correct)

=== Api-server pods (should be NONE) ===
(none - correct)

Compressed the sample log file postgresql.jsonl, performed queries in the web UI, and observed that results were returned correctly.

Summary by CodeRabbit

  • New Features

    • Added Presto as an alternative query engine with coordinator/worker components and configurable worker scheduling and storage options.
  • Documentation

    • Added comprehensive Presto deployment guides for Kubernetes (Helm) and Docker Compose, including example values, deployment steps, scheduling guidance and configuration references.

@junhaoliao junhaoliao requested a review from a team as a code owner February 17, 2026 20:40

coderabbitai bot commented Feb 17, 2026

Walkthrough

Adds Presto support to the Helm deployment: new Presto coordinator/worker templates, Presto-specific config and values, and conditional gating that disables default query-scheduler/worker/reducer (and related PV/PVC/Service) when .Values.clpConfig.package.query_engine == "presto". Documentation for Kubernetes and Docker Compose is added.

Changes

  • Documentation (docs/src/user-docs/guides-k8s-deployment.md, docs/src/user-docs/guides-using-presto.md): Adds Presto deployment guidance for Kubernetes (Helm) and Docker Compose, example values (presto-values.yaml), scheduling notes, anchors and cross-references.
  • Helm: API Server (tools/deployment/package-helm/templates/api-server-deployment.yaml, tools/deployment/package-helm/templates/api-server-logs-pv.yaml, tools/deployment/package-helm/templates/api-server-logs-pvc.yaml, tools/deployment/package-helm/templates/api-server-service.yaml): Adds conditional checks to only render API server resources when clpConfig.api_server is true and package.query_engine != "presto".
  • Helm: Query Scheduler (tools/deployment/package-helm/templates/query-scheduler-deployment.yaml, .../query-scheduler-logs-pv.yaml, .../query-scheduler-logs-pvc.yaml, .../query-scheduler-service.yaml): Wraps scheduler Deployment, PV, PVC and Service in conditionals so they are omitted when query_engine == "presto".
  • Helm: Query Worker & Storage (tools/deployment/package-helm/templates/query-worker-deployment.yaml, .../query-worker-logs-pv.yaml, .../query-worker-logs-pvc.yaml, .../query-worker-staged-streams-pv.yaml, .../query-worker-staged-streams-pvc.yaml): Guards worker Deployment and related PV/PVCs behind query_engine != "presto"; staged-streams PV creation call augmented with explicit metadata fields.
  • Helm: Reducer (tools/deployment/package-helm/templates/reducer-deployment.yaml, .../reducer-logs-pv.yaml, .../reducer-logs-pvc.yaml, .../reducer-service.yaml): Adds conditional wrapper to omit reducer Deployment, PV, PVC and Service when using Presto.
  • Helm: Presto resources (tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml, .../presto-coordinator-service.yaml, .../presto-worker-deployment.yaml): Introduces new templates for Presto coordinator and worker (deployments, service); includes init containers, readiness/liveness probes, configMap/volume mounts and optional archive volume handling.
  • Helm: Config & Values (tools/deployment/package-helm/templates/configmap.yaml, tools/deployment/package-helm/values.yaml, tools/deployment/package-helm/Chart.yaml): Adds Presto blocks to the ConfigMap (coordinator/worker, catalog, JVM, node properties, split-filter) gated by package.query_engine == "presto"; adds Presto image entries, prestoWorker values, memory settings and bumps chart version.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Helm as Helm chart renderer
    participant Values as Values.yaml
    participant ConfigMap as clp-config ConfigMap
    participant PrestoCoord as Presto Coordinator (Deployment + Service)
    participant PrestoWorker as Presto Worker (Deployment)
    participant Legacy as Default Scheduler/Worker/Reducer

    Values->>Helm: set package.query_engine="presto"
    Helm->>ConfigMap: render Presto config (catalog, JVM, node props)
    Helm->>PrestoCoord: create Deployment + Service (presto-coordinator)
    PrestoWorker->>PrestoCoord: init waits for coordinator endpoint
    Helm->>PrestoWorker: create Presto worker Deployment (setup init containers, mount configs)
    Helm->>Legacy: do not render query-scheduler/worker/reducer resources (conditional skip)
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately describes the main change: adding Presto query engine support to the Helm chart.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files; docstring coverage check skipped.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/user-docs/guides-k8s-deployment.md`:
- Around line 359-362: Update the note text in the docs to correct the list of
components replaced when `query_engine` = "presto": remove "results cache" from
the excluded list and add "api-server"; reference the Helm gating logic when
editing the sentence so it reads that Presto replaces the "query scheduler,
query workers, reducers, and api-server" (since `query_scheduler`,
`query_workers`, and `reducers` are gated by `{{- if ne
.Values.clpConfig.package.query_engine \"presto\" }}` and the `api-server`
deployment is gated by `{{- if and .Values.clpConfig.api_server (ne
.Values.clpConfig.package.query_engine \"presto\") }}`), while `results-cache-*`
manifests are not conditionally removed.
- Around line 311-315: The prestoCoordinator and prestoWorker image definitions
currently use mutable tag values (tag: "dev"); update these to pinned/stable
release tags (e.g., tag: "vX.Y.Z") for both prestoCoordinator and prestoWorker
so deployments pull a fixed image; locate the tag fields under the
prestoCoordinator and prestoWorker blocks and replace "dev" with the chosen
semantic version string (or a chart value reference like {{
.Values.image.prestoCoordinator.tag }} and {{ .Values.image.prestoWorker.tag }}
if you prefer templating) to ensure reproducible deployments.

In `@docs/src/user-docs/guides-using-presto.md`:
- Around line 66-70: The docs show only a helm install command which will fail
when CLP is already deployed; update the step under the Presto install to
include the alternative helm upgrade invocation so users can perform an upgrade
when a release exists—specifically add guidance to run `helm upgrade clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` as the alternative to `helm
install` (referencing the existing `helm install clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` and the `presto-values.yaml`
values file).

In `@tools/deployment/package-helm/templates/configmap.yaml`:
- Around line 369-378: The JVM Xmx in presto-coordinator-config-jvm.config is
hardcoded as "-Xmx4G"; change it to render a Helm value so operators can
configure it: replace the literal "-Xmx4G" with a template that reads
.Values.clpConfig.presto.coordinator.jvm_max_heap (with a sensible default like
"4G"), ensure consumers pass the unit (e.g., "4G") or document it, and add the
new key plus a descriptive default and comment to values.yaml so the chart is
configurable and documented.
- Around line 347-348: The ConfigMap is embedding sensitive values
(clp.metadata-db-user, clp.metadata-db-password, clp.s3-access-key-id,
clp.s3-secret-access-key); remove these keys from templates/configmap.yaml and
instead create a Helm Secret template that stores those values (e.g.,
templates/secret.yaml with .Values.credentials.database and
.Values.credentials.s3 entries), then update consumers to read them from the
Secret (use env valueFrom.secretKeyRef, secret projection into a volume, or
template lookup/include to inject non-plain text values) so credentials are
stored as Kubernetes Secrets rather than literal ConfigMap entries.
- Line 397: The ConfigMap currently hardcodes AWS hostnames for the Presto S3
path via the clp.s3-end-point entry using {{ .s3_config.bucket }} and {{
.s3_config.region_code }}, which breaks S3-compatible stores; change the
template to prefer the configured endpoint_url on .s3_config (e.g.,
archive_output.storage.s3_config.endpoint_url / .s3_config.endpoint_url) when
present and fall back to the existing AWS-format URL only if endpoint_url is
empty, so Presto uses the explicit endpoint for MinIO/other S3-compatible
services.

In `@tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml`:
- Around line 33-63: The presto-coordinator container lacks Kubernetes resource
requests/limits; add a configurable resources block for the container named
"presto-coordinator" (e.g., read values from a new or existing Helm key such as
.Values.prestoCoordinator.resources or .Values.resources.prestoCoordinator) and
populate both requests and limits for cpu and memory so they can be tuned
per-environment; update the chart values.yaml to include sensible defaults and
reference those values in the presto-coordinator deployment template.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml`:
- Around line 62-93: Add Kubernetes resource requests and limits to the
presto-worker container (container name "presto-worker") so CPU/memory are
enforced; update the deployment template to include a resources: block under
that container using Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.
- Around line 38-55: The startup script that sets DISCOVERY_URI, calls wget into
VERSION and extracts PRESTO_VERSION can produce an empty PRESTO_VERSION (e.g.,
coordinator not ready) and currently would write a blank presto.version; update
the boot script (the block using set -e, DISCOVERY_URI, VERSION, PRESTO_VERSION)
to validate the coordinator response: implement a short retry loop with a
timeout that retries wget on ${DISCOVERY_URI}/v1/info a few times (or at least
reject empty output), extract PRESTO_VERSION only after non-empty VERSION, and
if PRESTO_VERSION is still empty log an error and exit non-zero (do not append a
blank presto.version); ensure the script does not proceed to echo
"presto.version=${PRESTO_VERSION}" unless PRESTO_VERSION is non-empty.
- Around line 69-86: The volume mount ordering for the Presto worker container
is subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must
be mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
- Around line 33-55: The init container "setup-configs" is using an unpinned
busybox image and a brittle sed-based JSON parse for PRESTO_VERSION; change the
image (in the "setup-configs" container spec) from busybox:latest to a pinned
image that includes jq (or make the tag configurable via values.yaml), and
replace the sed pipeline that sets PRESTO_VERSION (the commands that assign
VERSION and PRESTO_VERSION) to use a robust JSON parser like jq (e.g., fetch
DISCOVERY_URI "/v1/info" and pipe to jq -r .version) and add a simple check to
fail or default if PRESTO_VERSION is empty so the echo "presto.version=..." step
is reliable.

In `@tools/deployment/package-helm/values.yaml`:
- Around line 20-27: The Helm values set mutable dev tags for Presto images:
update prestoCoordinator.tag and prestoWorker.tag to use a stable release tag
convention (e.g., match clpPackage.tag which uses "main" or a semantic version)
so defaults are deterministic; modify the values for prestoCoordinator and
prestoWorker to a stable tag string and document in values.yaml comment that
operators should override tag for development workflows, ensuring consistency
across clpPackage, prestoCoordinator, and prestoWorker.
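
The retry-and-validate behavior suggested in the findings above might be sketched as follows. This is a hedged illustration, not the chart's actual init script: `fetch_info` is a stub standing in for the real `wget -qO- "${DISCOVERY_URI}/v1/info"` call, and the version string is invented.

```shell
#!/bin/sh
set -e

# Stub for illustration; a real init container would call
#   wget -qO- "${DISCOVERY_URI}/v1/info"
fetch_info() { echo '{"nodeVersion":{"version":"0.293"}}'; }

PRESTO_VERSION=""
attempt=0
while [ "$attempt" -lt 5 ]; do
  VERSION_JSON="$(fetch_info || true)"
  if [ -n "$VERSION_JSON" ]; then
    # A robust JSON parser (e.g. jq -r '.nodeVersion.version') is preferable;
    # sed is shown here only to keep the sketch dependency-free.
    PRESTO_VERSION="$(printf '%s' "$VERSION_JSON" \
      | sed -n 's/.*"version":"\([^"]*\)".*/\1/p')"
  fi
  [ -n "$PRESTO_VERSION" ] && break
  attempt=$((attempt + 1))
  sleep 1
done

# Fail loudly rather than writing a blank presto.version.
if [ -z "$PRESTO_VERSION" ]; then
  echo "ERROR: could not determine coordinator version" >&2
  exit 1
fi
echo "presto.version=${PRESTO_VERSION}"
```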

Comment on lines +311 to +315
repository: "ghcr.io/y-scope/presto/coordinator"
tag: "dev"
prestoWorker:
repository: "ghcr.io/y-scope/presto/prestissimo-worker"
tag: "dev"

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's run the provided shell script to check Helm chart values
fd "values.yaml" tools/deployment/package-helm \
  --exec grep -n "prestoCoordinator\|prestoWorker\|presto.*tag\|tag.*presto" {}

Repository: y-scope/clp

Length of output: 114


🏁 Script executed:

# Let's also check the full context around the documentation lines mentioned
cd docs/src/user-docs
cat -n guides-k8s-deployment.md | sed -n '300,330p'

Repository: y-scope/clp

Length of output: 1087


🏁 Script executed:

# Check if DOCS_VAR_CLP_GIT_REF is used in the documentation
grep -n "DOCS_VAR_CLP_GIT_REF" docs/src/user-docs/guides-k8s-deployment.md | head -20

Repository: y-scope/clp

Length of output: 37


🏁 Script executed:

# Let's explore the Helm chart structure
find tools/deployment/package-helm -type f -name "*.yaml" -o -name "*.yml" | head -20

Repository: y-scope/clp

Length of output: 1474


🏁 Script executed:

# Read the Helm chart values.yaml to see the actual configuration
cd tools/deployment/package-helm
cat values.yaml | head -150

Repository: y-scope/clp

Length of output: 4185


🏁 Script executed:

# Check specifically for image tag configurations in values.yaml
grep -A 5 -B 5 "prestoCoordinator\|prestoWorker" tools/deployment/package-helm/values.yaml

Repository: y-scope/clp

Length of output: 730


🏁 Script executed:

# Look at other image configurations in the documentation to see how they handle tags
grep -n "tag:" docs/src/user-docs/guides-k8s-deployment.md | head -20

Repository: y-scope/clp

Length of output: 114


Replace "dev" image tags with stable version tags for Presto components.

The prestoCoordinator and prestoWorker images use tag: "dev", which is mutable and pulled fresh on every deployment (due to pullPolicy: "Always"). This can lead to unexpected breaking changes between deployments.

Use a stable version tag (e.g., a specific release version) instead of the floating "dev" tag to ensure consistent, predictable deployments. This aligns with the stability provided by other pinned image tags in the Helm chart.
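
A pinned-tag variant of the values shown above might look like this; the version string is illustrative only, not a real published release:

```yaml
# Hypothetical: pin the Presto images to a fixed release tag.
prestoCoordinator:
  repository: "ghcr.io/y-scope/presto/coordinator"
  tag: "v0.1.0"  # illustrative; use an actual release tag
prestoWorker:
  repository: "ghcr.io/y-scope/presto/prestissimo-worker"
  tag: "v0.1.0"  # illustrative; use an actual release tag
```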

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-k8s-deployment.md` around lines 311 - 315, The
prestoCoordinator and prestoWorker image definitions currently use mutable tag
values (tag: "dev"); update these to pinned/stable release tags (e.g., tag:
"vX.Y.Z") for both prestoCoordinator and prestoWorker so deployments pull a
fixed image; locate the tag fields under the prestoCoordinator and prestoWorker
blocks and replace "dev" with the chosen semantic version string (or a chart
value reference like {{ .Values.image.prestoCoordinator.tag }} and {{
.Values.image.prestoWorker.tag }} if you prefer templating) to ensure
reproducible deployments.

Comment on lines +359 to +362
:::{note}
When `query_engine` is set to `"presto"`, the chart deploys a Presto coordinator and Presto
worker(s) instead of the query scheduler, query workers, reducers, and results cache.
:::

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for conditional logic around results_cache, api-server, and other components
# in the Helm templates to determine what's excluded when query_engine="presto"

echo "=== Searching for results_cache conditional logic ==="
rg -n 'results.cache|resultsCache' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

echo -e "\n=== Searching for api-server conditional logic ==="
rg -n 'api.server|apiServer|api_server' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

echo -e "\n=== Searching for query_engine conditional logic ==="
rg -n 'query_engine|queryEngine' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

Repository: y-scope/clp

Length of output: 50367


Note is inaccurate: results_cache should not be listed as excluded, and api-server is missing.

The note states Presto replaces "the query scheduler, query workers, reducers, and results cache", but:

  1. The Helm templates show no conditional gating on query_engine for any results-cache-* manifests (service, statefulset, indices-creator job), confirming it remains deployed when Presto is enabled — only with retention disabled.
  2. The api-server deployment is actually gated by {{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}, so it is excluded and should be mentioned.
  3. The query scheduler, query workers, and reducers are all gated by {{- if ne .Values.clpConfig.package.query_engine "presto" }}, confirming they are excluded.

Update the note to list: query scheduler, query workers, reducers, and api-server—not results cache.
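
With that correction applied, the note would read:

```markdown
:::{note}
When `query_engine` is set to `"presto"`, the chart deploys a Presto coordinator and Presto
worker(s) instead of the query scheduler, query workers, reducers, and api-server.
:::
```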

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-k8s-deployment.md` around lines 359 - 362, Update
the note text in the docs to correct the list of components replaced when
`query_engine` = "presto": remove "results cache" from the excluded list and add
"api-server"; reference the Helm gating logic when editing the sentence so it
reads that Presto replaces the "query scheduler, query workers, reducers, and
api-server" (since `query_scheduler`, `query_workers`, and `reducers` are gated
by `{{- if ne .Values.clpConfig.package.query_engine \"presto\" }}` and the
`api-server` deployment is gated by `{{- if and .Values.clpConfig.api_server (ne
.Values.clpConfig.package.query_engine \"presto\") }}`), while `results-cache-*`
manifests are not conditionally removed.

Comment on lines +66 to +70
2. Install (or upgrade) the Helm chart with the Presto values:

```bash
helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
```

⚠️ Potential issue | 🟡 Minor

helm install will fail for users with an existing CLP deployment.

The requirement at line 33 explicitly states the user already has "a running CLP Kubernetes deployment," yet step 2 only shows helm install, which would error out with release already exists. The parenthetical "(or upgrade)" acknowledges the upgrade path but never shows the corresponding command.

💡 Suggested fix
-2. Install (or upgrade) the Helm chart with the Presto values:
+2. If you haven't installed CLP yet, install it with Presto enabled:
 
    ```bash
    helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
    ```
+
+   If CLP is already installed, upgrade the release instead:
+
+   ```bash
+   helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
+   ```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
2. Install (or upgrade) the Helm chart with the Presto values:
```bash
helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
```
2. If you haven't installed CLP yet, install it with Presto enabled:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-using-presto.md` around lines 66 - 70, The docs
show only a helm install command which will fail when CLP is already deployed;
update the step under the Presto install to include the alternative helm upgrade
invocation so users can perform an upgrade when a release exists—specifically
add guidance to run `helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f
presto-values.yaml` as the alternative to `helm install` (referencing the
existing `helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f
presto-values.yaml` and the `presto-values.yaml` values file).
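The two paths could also be collapsed into Helm's idempotent form, `helm upgrade --install`. As a minimal sketch of the decision logic, with `helm` stubbed out by a shell function so it runs without a cluster (release name and values file taken from the docs above):

```shell
# Stub for the real helm binary so this sketch runs anywhere.
# `helm status <release>` exits non-zero when the release does not exist.
helm() {
  case "$1" in
    status) return 1 ;;    # pretend no release named "clp" exists yet
    *) echo "helm $*" ;;   # echo the command instead of running it
  esac
}

release="clp"
if helm status "$release" >/dev/null 2>&1; then
  action="upgrade"         # release exists: upgrade in place
else
  action="install"         # first deployment: fresh install
fi
helm "$action" "$release" clp/clp -f presto-values.yaml
# prints: helm install clp clp/clp -f presto-values.yaml
```

In practice, `helm upgrade --install clp clp/clp -f presto-values.yaml` covers both cases in a single command.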

Comment on lines +347 to +348
clp.metadata-db-user={{ $.Values.credentials.database.username }}
clp.metadata-db-password={{ $.Values.credentials.database.password }}

⚠️ Potential issue | 🟠 Major

Credentials stored in plaintext in a ConfigMap — use a Secret instead.

clp.metadata-db-password (and clp.metadata-db-user) are written directly into the ConfigMap. ConfigMaps are stored unencrypted in etcd by default, are readable by any principal with get/list on ConfigMaps in the namespace, and are fully visible in kubectl get configmap -o yaml. Kubernetes Secrets are the appropriate resource for sensitive values: they can be encrypted at rest, carry separate RBAC controls, and are omitted from audit logs by default.

The same concern applies to the S3 credentials at lines 396/398 (clp.s3-access-key-id, clp.s3-secret-access-key).

Consider rendering these files from a Kubernetes Secret (or multiple projected volume mounts) rather than embedding credentials in the ConfigMap.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/configmap.yaml` around lines 347 -
348, The ConfigMap is embedding sensitive values (clp.metadata-db-user,
clp.metadata-db-password, clp.s3-access-key-id, clp.s3-secret-access-key);
remove these keys from templates/configmap.yaml and instead create a Helm Secret
template that stores those values (e.g., templates/secret.yaml with
.Values.credentials.database and .Values.credentials.s3 entries), then update
consumers to read them from the Secret (use env valueFrom.secretKeyRef, secret
projection into a volume, or template lookup/include to inject non-plain text
values) so credentials are stored as Kubernetes Secrets rather than literal
ConfigMap entries.
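One possible shape for such a Secret template (the file name, Secret name suffix, and key names below are illustrative, not existing chart conventions):

```yaml
# templates/secret.yaml (hypothetical): keeps credentials out of the ConfigMap.
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "clp.fullname" . }}-credentials
type: Opaque
stringData:
  metadata-db-user: {{ .Values.credentials.database.username | quote }}
  metadata-db-password: {{ .Values.credentials.database.password | quote }}
```

Consumers could then reference the keys via `valueFrom.secretKeyRef` env entries or project the Secret as a volume next to the rest of the Presto config files.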

Comment on lines +369 to +378
presto-coordinator-config-jvm.config: |
-server
-Xmx4G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.attach.allowAttachSelf=true

🧹 Nitpick | 🔵 Trivial

Hardcoded -Xmx4G should be a configurable value.

The coordinator JVM heap is fixed at 4 GB regardless of the resource limits/requests set on the coordinator pod. If the pod is scheduled with less memory, it will OOM-kill; if scheduled with more, the heap cannot grow to use it. Expose this as a value (e.g., .Values.clpConfig.presto.coordinator.jvm_max_heap) so operators can tune it alongside query.max-memory and query.max-memory-per-node.

♻️ Suggested approach
  presto-coordinator-config-jvm.config: |
    -server
-   -Xmx4G
+   -Xmx{{ .coordinator.jvm_max_heap_gb }}G
    -XX:+UseG1GC
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/configmap.yaml` around lines 369 -
378, The JVM Xmx in presto-coordinator-config-jvm.config is hardcoded as
"-Xmx4G"; change it to render a Helm value so operators can configure it:
replace the literal "-Xmx4G" with a template that reads
.Values.clpConfig.presto.coordinator.jvm_max_heap (with a sensible default like
"4G"), ensure consumers pass the unit (e.g., "4G") or document it, and add the
new key plus a descriptive default and comment to values.yaml so the chart is
configurable and documented.
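A sketch of the wiring, assuming a new `jvm_max_heap_gb` key under `clpConfig.presto.coordinator` (this key does not exist in the chart today):

```yaml
# values.yaml (hypothetical key):
clpConfig:
  presto:
    coordinator:
      jvm_max_heap_gb: 4   # coordinator JVM heap, in GiB

# templates/configmap.yaml would then render the flag as:
#   -Xmx{{ .Values.clpConfig.presto.coordinator.jvm_max_heap_gb | default 4 }}G
```

Whatever key is chosen should be documented alongside `query.max-memory` and `query.max-memory-per-node` so operators tune them together.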

Comment on lines +33 to +55
- name: "setup-configs"
image: "busybox:latest"
command:
- "/bin/sh"
- "-c"
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties

⚠️ Potential issue | 🟠 Major

Fragile JSON parsing with sed and unpinned busybox:latest image.

Two concerns in this init container:

  1. busybox:latest — Using the latest tag hurts reproducibility and can introduce unexpected breakages. Pin to a specific version (e.g., busybox:1.37), or make it configurable via values.

  2. JSON parsing via sed (Lines 46–48) — Extracting version from a JSON response using sed 's/.*"version":"//' | sed 's/".*//' is brittle. If the JSON structure changes (e.g., field ordering, nested version keys, or whitespace variations), parsing will silently produce incorrect results. Since busybox doesn't include jq, consider switching to an image that has jq, or using a more robust awk/grep pattern.

Proposed fix for image tag pinning and safer parsing
         - name: "setup-configs"
-          image: "busybox:latest"
+          image: "busybox:1.37"
           command:
             - "/bin/sh"
             - "-c"
             - |
               set -e
               cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
               cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

               # Query coordinator for version and add it to config
               DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
               VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
-              PRESTO_VERSION=$(echo "$VERSION" \
-                | sed 's/.*"version":"//' \
-                | sed 's/".*//')
+              PRESTO_VERSION=$(echo "$VERSION" \
+                | grep -o '"version":"[^"]*"' \
+                | head -1 \
+                | sed 's/"version":"//;s/"//')
               echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 33 - 55, The init container "setup-configs" is using an unpinned busybox
image and a brittle sed-based JSON parse for PRESTO_VERSION; change the image
(in the "setup-configs" container spec) from busybox:latest to a pinned image
that includes jq (or make the tag configurable via values.yaml), and replace the
sed pipeline that sets PRESTO_VERSION (the commands that assign VERSION and
PRESTO_VERSION) to use a robust JSON parser like jq (e.g., fetch DISCOVERY_URI
"/v1/info" and pipe to jq -r .version) and add a simple check to fail or default
if PRESTO_VERSION is empty so the echo "presto.version=..." step is reliable.

Comment on lines +38 to +55
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties

⚠️ Potential issue | 🟡 Minor

No error handling if the coordinator version query fails.

If wget on Line 45 fails (e.g., coordinator not yet ready, network issue), $VERSION will be empty and $PRESTO_VERSION will be an empty string. This would write presto.version= to the config, likely causing the worker to fail with a confusing error. While set -e is present, wget returning an empty body with HTTP 200 would not trigger it.

Consider adding a validation check:

Proposed validation
               VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
               PRESTO_VERSION=$(echo "$VERSION" \
                 | sed 's/.*"version":"//' \
                 | sed 's/".*//')
+              if [ -z "$PRESTO_VERSION" ]; then
+                echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
+                exit 1
+              fi
               echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

Suggested change
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties
# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties
# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
if [ -z "$PRESTO_VERSION" ]; then
echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
exit 1
fi
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 38 - 55, The startup script that sets DISCOVERY_URI, calls wget into
VERSION and extracts PRESTO_VERSION can produce an empty PRESTO_VERSION (e.g.,
coordinator not ready) and currently would write a blank presto.version; update
the boot script (the block using set -e, DISCOVERY_URI, VERSION, PRESTO_VERSION)
to validate the coordinator response: implement a short retry loop with a
timeout that retries wget on ${DISCOVERY_URI}/v1/info a few times (or at least
reject empty output), extract PRESTO_VERSION only after non-empty VERSION, and
if PRESTO_VERSION is still empty log an error and exit non-zero (do not append a
blank presto.version); ensure the script does not proceed to echo
"presto.version=${PRESTO_VERSION}" unless PRESTO_VERSION is non-empty.

Comment on lines +62 to +93
containers:
- name: "presto-worker"
image: "{{ .Values.image.prestoWorker.repository }}:{{ .Values.image.prestoWorker.tag }}"
imagePullPolicy: "{{ .Values.image.prestoWorker.pullPolicy }}"
ports:
- name: "presto-worker"
containerPort: 8080
volumeMounts:
- name: "presto-catalog"
mountPath: "/opt/presto-server/etc/catalog"
readOnly: true
- name: "presto-config"
mountPath: "/opt/presto-server/etc/velox.properties"
subPath: "presto-worker-config-velox.properties"
readOnly: true
- name: "presto-etc"
mountPath: "/opt/presto-server/etc"
{{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
"name" "archives"
) | quote }}
mountPath: "/var/data/archives"
readOnly: true
{{- end }}
readinessProbe:
{{- include "clp.readinessProbeTimings" . | nindent 12 }}
tcpSocket: &presto-worker-health-check
port: "presto-worker"
livenessProbe:
{{- include "clp.livenessProbeTimings" . | nindent 12 }}
tcpSocket: *presto-worker-health-check

🧹 Nitpick | 🔵 Trivial

No resources (requests/limits) defined on the worker container.

Same concern as the coordinator: the worker container has no CPU/memory resource requests or limits. This is especially important for Presto workers, which can be memory-intensive. The system_memory_gb and query_memory_gb values are set in the Presto config but not enforced at the Kubernetes level, which could lead to OOMKills or overcommitment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 62 - 93, Add Kubernetes resource requests and limits to the presto-worker
container (container name "presto-worker") so CPU/memory are enforced; update
the deployment template to include a resources: block under that container using
Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.

Comment on lines +69 to +86
volumeMounts:
- name: "presto-catalog"
mountPath: "/opt/presto-server/etc/catalog"
readOnly: true
- name: "presto-config"
mountPath: "/opt/presto-server/etc/velox.properties"
subPath: "presto-worker-config-velox.properties"
readOnly: true
- name: "presto-etc"
mountPath: "/opt/presto-server/etc"
{{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
"name" "archives"
) | quote }}
mountPath: "/var/data/archives"
readOnly: true
{{- end }}

🧹 Nitpick | 🔵 Trivial

Volume mount layering at /opt/presto-server/etc — verify correctness.

The main container mounts:

  1. presto-catalog (configMap) at /opt/presto-server/etc/catalog
  2. presto-config (configMap, subPath) at /opt/presto-server/etc/velox.properties
  3. presto-etc (emptyDir) at /opt/presto-server/etc

Kubernetes sorts mounts by path length (parent before child), so the emptyDir is mounted first, then the catalog and velox.properties are overlaid. The init container's writes (config.properties, node.properties) live in the emptyDir and should remain visible. This layering should work, but it's subtle and easy to break during future refactors.

Consider adding a comment in the template explaining the mount ordering dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 69 - 86, The volume mount ordering for the Presto worker container is
subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must be
mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
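The suggested inline comment might look like the following (mount names and paths taken from the template above):

```yaml
volumeMounts:
  # NOTE: "presto-etc" (emptyDir) is the parent mount at /opt/presto-server/etc.
  # Kubernetes mounts parents before children, so "presto-catalog" and
  # "presto-config" overlay it while the init container's config.properties /
  # node.properties in the emptyDir stay visible. Do not reorder these mounts
  # or replace "presto-etc" with a deeper mount path.
  - name: "presto-etc"
    mountPath: "/opt/presto-server/etc"
```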

Comment on lines +20 to +27
prestoCoordinator:
repository: "ghcr.io/y-scope/presto/coordinator"
pullPolicy: "Always"
tag: "dev"
prestoWorker:
repository: "ghcr.io/y-scope/presto/prestissimo-worker"
pullPolicy: "Always"
tag: "dev"

🧹 Nitpick | 🔵 Trivial

Presto image tags default to "dev" — consider aligning with a release tag convention.

The clpPackage image defaults to "main", while both Presto images default to "dev". This is fine for development but worth noting that users deploying from the defaults will pull a mutable, potentially unstable tag.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/values.yaml` around lines 20 - 27, The Helm
values set mutable dev tags for Presto images: update prestoCoordinator.tag and
prestoWorker.tag to use a stable release tag convention (e.g., match
clpPackage.tag which uses "main" or a semantic version) so defaults are
deterministic; modify the values for prestoCoordinator and prestoWorker to a
stable tag string and document in values.yaml comment that operators should
override tag for development workflows, ensuring consistency across clpPackage,
prestoCoordinator, and prestoWorker.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/deployment/package-helm/Chart.yaml`:
- Line 3: The Chart.yaml pre-release version was mistakenly advanced to
"0.1.4-dev.4", skipping "0.1.4-dev.3"; update the version field in Chart.yaml
from "0.1.4-dev.4" to "0.1.4-dev.3" so the pre-release sequence remains
contiguous (look for the existing version: "0.1.4-dev.4" and replace it with
"0.1.4-dev.3").

apiVersion: "v2"
name: "clp"
version: "0.1.4-dev.2"
version: "0.1.4-dev.4"

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check whether 0.1.4-dev.3 is used in any branch or tag
git grep "0.1.4-dev.3" $(git for-each-ref --format='%(refname)' refs/heads refs/remotes refs/tags) -- "tools/deployment/package-helm/Chart.yaml" 2>/dev/null || echo "Not found on any ref"

Repository: y-scope/clp

Length of output: 75


Change version to 0.1.4-dev.3 to maintain sequential pre-release versioning.

The version skips from 0.1.4-dev.2 directly to 0.1.4-dev.4, but 0.1.4-dev.3 does not exist on any branch or tag in the repository. This skip appears unintentional and breaks the contiguous pre-release sequence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/Chart.yaml` at line 3, The Chart.yaml
pre-release version was mistakenly advanced to "0.1.4-dev.4", skipping
"0.1.4-dev.3"; update the version field in Chart.yaml from "0.1.4-dev.4" to
"0.1.4-dev.3" so the pre-release sequence remains contiguous (look for the
existing version: "0.1.4-dev.4" and replace it with "0.1.4-dev.3").

```diff
@@ -1,4 +1,4 @@
-{{- if .Values.clpConfig.api_server }}
+{{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}
```

why do we disable api server? Shouldn't we also disable query scheduler and worker if clp search is not needed?

Member Author

currently, the API server does not work with Presto.

we do have this specified in tools/deployment/package-helm/templates/query-scheduler-deployment.yaml as well

{{- if ne .Values.clpConfig.package.query_engine "presto" }}

though, given some users may want to use both the API Server and Presto, i think we may discard .Values.clpConfig.package.query_engine values and simply check whether query_scheduler / query_worker / API is null before launching the services. what do you think?

cc @kirkrodrigues for discussion since this is interface related

```diff
@@ -17,6 +17,14 @@ image:
   repository: "ghcr.io/y-scope/clp/clp-package"
```

I'm getting a Presto internal error, possibly due to the recent timestamp changes in f82e611. Can you reproduce the error?

Member Author

🤦 right. i can reproduce once i merge main into the current branch.

@coderabbitai note #1788 is a BREAKING change that broke CLP Presto, as in the CLP Presto Connector cannot handle the new Timestamp column type at the moment. submit an issue to track


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Get details about PR `#1788` to understand the breaking change
gh pr view 1788 --json title,url,mergedAt,body --repo y-scope/clp

# Create an issue to track the breaking change
gh issue create \
  --repo y-scope/clp \
  --title "Presto Connector cannot handle new Timestamp column type from PR `#1788`" \
  --body "## Issue Description

The CLP Presto Connector currently cannot handle the new Timestamp column type introduced in PR `#1788`, causing internal errors when querying with Presto.

## Context

- **Breaking PR**: `#1788` (commit f82e6114160a6addd4727259906bcf621ac9912c)
- **Affected Component**: CLP Presto Connector
- **Impact**: Presto queries fail with internal errors

The issue was discovered during testing of PR `#2004` (Presto Helm chart support). The error can be reproduced after merging main into the Presto feature branch.

## References

- PR `#1788`: https://github.com/y-scope/clp/pull/1788
- PR `#2004`: https://github.com/y-scope/clp/pull/2004
- Comment: https://github.com/y-scope/clp/pull/2004#discussion_r2821243815

Reported by: `@junhaoliao`" \
  --assignee junhaoliao

Length of output: 5337


@junhaoliao, I've created issue #2008 to track this breaking change. The issue documents that PR #1788 (which introduced the new Timestamp column type to replace DateString and bumped the archive version to 0.5.0) broke the CLP Presto Connector's ability to handle the new column type. The issue has been assigned to you and includes references to both PRs and the relevant commit.



🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1818
File: tools/deployment/package-helm/templates/configmap.yaml:12-12
Timestamp: 2025-12-19T05:03:42.629Z
Learning: In the y-scope/clp Helm chart, the `api_server.host` configuration field in the ConfigMap (tools/deployment/package-helm/templates/configmap.yaml) is unused because no other k8s internal services need to reach the API server—it's only accessed from outside the cluster via NodePort.


Comment on lines +36 to +55
- "/bin/sh"
- "-c"
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties

@hoophalab hoophalab Feb 18, 2026


This script looks scary. Does it mimic the functionality of tools/deployment/presto-clp/worker/scripts/generate-configs.sh? I guess we don't use the script directly because of missing dependencies?

- Replace `query_engine` conditions with modular flags in Helm templates.
- Refactor Presto configurations to use dedicated flags and scripts.
- Enhance worker and reducer node scheduling logic.
- Update documentation to reflect Presto integration and better configuration flexibility.
@junhaoliao
Member Author

TODO: address #1957 since we're updating the Presto setup docs in this PR.

@junhaoliao junhaoliao added this to the March 2026 milestone Mar 7, 2026