feat(helm): Add Presto query engine support to the Helm chart. #2004

junhaoliao wants to merge 5 commits into y-scope:main
Conversation
Walkthrough

Adds Presto support to the Helm deployment: new Presto coordinator/worker templates, Presto-specific config and values, and conditional gating that disables the default query-scheduler/worker/reducer (and related PV/PVC/Service) when `.Values.clpConfig.package.query_engine == "presto"`. Documentation for Kubernetes and Docker Compose is added.

Changes: (summary table omitted)

Sequence Diagram(s): (mermaid diagram omitted)

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes

🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 12
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/src/user-docs/guides-k8s-deployment.md`:
- Around line 359-362: Update the note text in the docs to correct the list of
components replaced when `query_engine` = "presto": remove "results cache" from
the excluded list and add "api-server"; reference the Helm gating logic when
editing the sentence so it reads that Presto replaces the "query scheduler,
query workers, reducers, and api-server" (since `query_scheduler`,
`query_workers`, and `reducers` are gated by `{{- if ne
.Values.clpConfig.package.query_engine \"presto\" }}` and the `api-server`
deployment is gated by `{{- if and .Values.clpConfig.api_server (ne
.Values.clpConfig.package.query_engine \"presto\") }}`), while `results-cache-*`
manifests are not conditionally removed.
- Around line 311-315: The prestoCoordinator and prestoWorker image definitions
currently use mutable tag values (tag: "dev"); update these to pinned/stable
release tags (e.g., tag: "vX.Y.Z") for both prestoCoordinator and prestoWorker
so deployments pull a fixed image; locate the tag fields under the
prestoCoordinator and prestoWorker blocks and replace "dev" with the chosen
semantic version string (or a chart value reference like {{
.Values.image.prestoCoordinator.tag }} and {{ .Values.image.prestoWorker.tag }}
if you prefer templating) to ensure reproducible deployments.
In `@docs/src/user-docs/guides-using-presto.md`:
- Around line 66-70: The docs show only a helm install command which will fail
when CLP is already deployed; update the step under the Presto install to
include the alternative helm upgrade invocation so users can perform an upgrade
when a release exists—specifically add guidance to run `helm upgrade clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` as the alternative to `helm
install` (referencing the existing `helm install clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` and the `presto-values.yaml`
values file).
In `@tools/deployment/package-helm/templates/configmap.yaml`:
- Around line 369-378: The JVM Xmx in presto-coordinator-config-jvm.config is
hardcoded as "-Xmx4G"; change it to render a Helm value so operators can
configure it: replace the literal "-Xmx4G" with a template that reads
.Values.clpConfig.presto.coordinator.jvm_max_heap (with a sensible default like
"4G"), ensure consumers pass the unit (e.g., "4G") or document it, and add the
new key plus a descriptive default and comment to values.yaml so the chart is
configurable and documented.
- Around line 347-348: The ConfigMap is embedding sensitive values
(clp.metadata-db-user, clp.metadata-db-password, clp.s3-access-key-id,
clp.s3-secret-access-key); remove these keys from templates/configmap.yaml and
instead create a Helm Secret template that stores those values (e.g.,
templates/secret.yaml with .Values.credentials.database and
.Values.credentials.s3 entries), then update consumers to read them from the
Secret (use env valueFrom.secretKeyRef, secret projection into a volume, or
template lookup/include to inject non-plain text values) so credentials are
stored as Kubernetes Secrets rather than literal ConfigMap entries.
- Line 397: The ConfigMap currently hardcodes AWS hostnames for the Presto S3
path via the clp.s3-end-point entry using {{ .s3_config.bucket }} and {{
.s3_config.region_code }}, which breaks S3-compatible stores; change the
template to prefer the configured endpoint_url on .s3_config (e.g.,
archive_output.storage.s3_config.endpoint_url / .s3_config.endpoint_url) when
present and fall back to the existing AWS-format URL only if endpoint_url is
empty, so Presto uses the explicit endpoint for MinIO/other S3-compatible
services.
In `@tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml`:
- Around line 33-63: The presto-coordinator container lacks Kubernetes resource
requests/limits; add a configurable resources block for the container named
"presto-coordinator" (e.g., read values from a new or existing Helm key such as
.Values.prestoCoordinator.resources or .Values.resources.prestoCoordinator) and
populate both requests and limits for cpu and memory so they can be tuned
per-environment; update the chart values.yaml to include sensible defaults and
reference those values in the presto-coordinator deployment template.
In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml`:
- Around line 62-93: Add Kubernetes resource requests and limits to the
presto-worker container (container name "presto-worker") so CPU/memory are
enforced; update the deployment template to include a resources: block under
that container using Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.
- Around line 38-55: The startup script that sets DISCOVERY_URI, calls wget into
VERSION and extracts PRESTO_VERSION can produce an empty PRESTO_VERSION (e.g.,
coordinator not ready) and currently would write a blank presto.version; update
the boot script (the block using set -e, DISCOVERY_URI, VERSION, PRESTO_VERSION)
to validate the coordinator response: implement a short retry loop with a
timeout that retries wget on ${DISCOVERY_URI}/v1/info a few times (or at least
reject empty output), extract PRESTO_VERSION only after non-empty VERSION, and
if PRESTO_VERSION is still empty log an error and exit non-zero (do not append a
blank presto.version); ensure the script does not proceed to echo
"presto.version=${PRESTO_VERSION}" unless PRESTO_VERSION is non-empty.
- Around line 69-86: The volume mount ordering for the Presto worker container
is subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must
be mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
- Around line 33-55: The init container "setup-configs" is using an unpinned
busybox image and a brittle sed-based JSON parse for PRESTO_VERSION; change the
image (in the "setup-configs" container spec) from busybox:latest to a pinned
image that includes jq (or make the tag configurable via values.yaml), and
replace the sed pipeline that sets PRESTO_VERSION (the commands that assign
VERSION and PRESTO_VERSION) to use a robust JSON parser like jq (e.g., fetch
DISCOVERY_URI "/v1/info" and pipe to jq -r .version) and add a simple check to
fail or default if PRESTO_VERSION is empty so the echo "presto.version=..." step
is reliable.
In `@tools/deployment/package-helm/values.yaml`:
- Around line 20-27: The Helm values set mutable dev tags for Presto images:
update prestoCoordinator.tag and prestoWorker.tag to use a stable release tag
convention (e.g., match clpPackage.tag which uses "main" or a semantic version)
so defaults are deterministic; modify the values for prestoCoordinator and
prestoWorker to a stable tag string and document in values.yaml comment that
operators should override tag for development workflows, ensuring consistency
across clpPackage, prestoCoordinator, and prestoWorker.
```yaml
  repository: "ghcr.io/y-scope/presto/coordinator"
  tag: "dev"
prestoWorker:
  repository: "ghcr.io/y-scope/presto/prestissimo-worker"
  tag: "dev"
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's run the provided shell script to check Helm chart values
fd "values.yaml" tools/deployment/package-helm \
  --exec grep -n "prestoCoordinator\|prestoWorker\|presto.*tag\|tag.*presto" {}
```

Repository: y-scope/clp
Length of output: 114
🏁 Script executed:
```shell
# Let's also check the full context around the documentation lines mentioned
cd docs/src/user-docs
cat -n guides-k8s-deployment.md | sed -n '300,330p'
```

Repository: y-scope/clp
Length of output: 1087
🏁 Script executed:
```shell
# Check if DOCS_VAR_CLP_GIT_REF is used in the documentation
grep -n "DOCS_VAR_CLP_GIT_REF" docs/src/user-docs/guides-k8s-deployment.md | head -20
```

Repository: y-scope/clp
Length of output: 37
🏁 Script executed:
```shell
# Let's explore the Helm chart structure
find tools/deployment/package-helm -type f -name "*.yaml" -o -name "*.yml" | head -20
```

Repository: y-scope/clp
Length of output: 1474
🏁 Script executed:
```shell
# Read the Helm chart values.yaml to see the actual configuration
cd tools/deployment/package-helm
cat values.yaml | head -150
```

Repository: y-scope/clp
Length of output: 4185
🏁 Script executed:
```shell
# Check specifically for image tag configurations in values.yaml
grep -A 5 -B 5 "prestoCoordinator\|prestoWorker" tools/deployment/package-helm/values.yaml
```

Repository: y-scope/clp
Length of output: 730
🏁 Script executed:
```shell
# Look at other image configurations in the documentation to see how they handle tags
grep -n "tag:" docs/src/user-docs/guides-k8s-deployment.md | head -20
```

Repository: y-scope/clp
Length of output: 114
Replace "dev" image tags with stable version tags for Presto components.
The prestoCoordinator and prestoWorker images use tag: "dev", which is mutable and pulled fresh on every deployment (due to pullPolicy: "Always"). This can lead to unexpected breaking changes between deployments.
Use a stable version tag (e.g., a specific release version) instead of the floating "dev" tag to ensure consistent, predictable deployments. This aligns with the stability provided by other pinned image tags in the Helm chart.
```markdown
:::{note}
When `query_engine` is set to `"presto"`, the chart deploys a Presto coordinator and Presto
worker(s) instead of the query scheduler, query workers, reducers, and results cache.
:::
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Search for conditional logic around results_cache, api-server, and other components
# in the Helm templates to determine what's excluded when query_engine="presto"
echo "=== Searching for results_cache conditional logic ==="
rg -n 'results.cache|resultsCache' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml
echo -e "\n=== Searching for api-server conditional logic ==="
rg -n 'api.server|apiServer|api_server' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml
echo -e "\n=== Searching for query_engine conditional logic ==="
rg -n 'query_engine|queryEngine' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml
```

Repository: y-scope/clp
Length of output: 50367
Note is inaccurate: results_cache should not be listed as excluded, and api-server is missing.
The note states Presto replaces "the query scheduler, query workers, reducers, and results cache", but:
- The Helm templates show no conditional gating on `query_engine` for any `results-cache-*` manifests (service, statefulset, indices-creator job), confirming it remains deployed when Presto is enabled (only with retention disabled).
- The api-server deployment is actually gated by `{{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}`, so it is excluded and should be mentioned.
- The query scheduler, query workers, and reducers are all gated by `{{- if ne .Values.clpConfig.package.query_engine "presto" }}`, confirming they are excluded.

Update the note to list: query scheduler, query workers, reducers, and api-server, not results cache.
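Per that gating, a corrected note could read:

```markdown
:::{note}
When `query_engine` is set to `"presto"`, the chart deploys a Presto coordinator and Presto
worker(s) instead of the query scheduler, query workers, reducers, and api-server.
:::
```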
2. Install (or upgrade) the Helm chart with the Presto values:

   ```bash
   helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
   ```
helm install will fail for users with an existing CLP deployment.
The requirement at line 33 explicitly states the user already has "a running CLP Kubernetes deployment," yet step 2 only shows `helm install`, which would error out with a "release already exists" error. The parenthetical "(or upgrade)" acknowledges the upgrade path but never shows the corresponding command.
💡 Suggested fix
````diff
-2. Install (or upgrade) the Helm chart with the Presto values:
+2. If you haven't installed CLP yet, install it with Presto enabled:

    ```bash
    helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
    ```
+
+   If CLP is already installed, upgrade the release instead:
+
+   ```bash
+   helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
+   ```
````

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
````markdown
2. If you haven't installed CLP yet, install it with Presto enabled:

   ```bash
   helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
   ```

   If CLP is already installed, upgrade the release instead:

   ```bash
   helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
   ```
````
```
clp.metadata-db-user={{ $.Values.credentials.database.username }}
clp.metadata-db-password={{ $.Values.credentials.database.password }}
```
Credentials stored in plaintext in a ConfigMap — use a Secret instead.
clp.metadata-db-password (and clp.metadata-db-user) are written directly into the ConfigMap. ConfigMaps are stored unencrypted in etcd by default, are readable by any principal with get/list on ConfigMaps in the namespace, and are fully visible in kubectl get configmap -o yaml. Kubernetes Secrets are the appropriate resource for sensitive values: they can be encrypted at rest, carry separate RBAC controls, and are omitted from audit logs by default.
The same concern applies to the S3 credentials at lines 396/398 (clp.s3-access-key-id, clp.s3-secret-access-key).
Consider rendering these files from a Kubernetes Secret (or multiple projected volume mounts) rather than embedding credentials in the ConfigMap.
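A sketch of what such a Secret template might look like (the resource name and the `credentials.s3.*` value paths are illustrative assumptions, not names from the chart):

```yaml
# templates/secret.yaml (hypothetical sketch)
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "clp.fullname" . }}-credentials
type: Opaque
stringData:
  metadata-db-user: {{ $.Values.credentials.database.username | quote }}
  metadata-db-password: {{ $.Values.credentials.database.password | quote }}
  s3-access-key-id: {{ $.Values.credentials.s3.access_key_id | quote }}
  s3-secret-access-key: {{ $.Values.credentials.s3.secret_access_key | quote }}
```

Consumers could then project these keys into the Presto config directory as a Secret volume, or reference them via `valueFrom.secretKeyRef`, instead of reading them from the ConfigMap.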
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/configmap.yaml` around lines 347 -
348, The ConfigMap is embedding sensitive values (clp.metadata-db-user,
clp.metadata-db-password, clp.s3-access-key-id, clp.s3-secret-access-key);
remove these keys from templates/configmap.yaml and instead create a Helm Secret
template that stores those values (e.g., templates/secret.yaml with
.Values.credentials.database and .Values.credentials.s3 entries), then update
consumers to read them from the Secret (use env valueFrom.secretKeyRef, secret
projection into a volume, or template lookup/include to inject non-plain text
values) so credentials are stored as Kubernetes Secrets rather than literal
ConfigMap entries.
```yaml
presto-coordinator-config-jvm.config: |
  -server
  -Xmx4G
  -XX:+UseG1GC
  -XX:G1HeapRegionSize=32M
  -XX:+UseGCOverheadLimit
  -XX:+ExplicitGCInvokesConcurrent
  -XX:+HeapDumpOnOutOfMemoryError
  -XX:+ExitOnOutOfMemoryError
  -Djdk.attach.allowAttachSelf=true
```
🧹 Nitpick | 🔵 Trivial
Hardcoded -Xmx4G should be a configurable value.
The coordinator JVM heap is fixed at 4 GB regardless of the resource limits/requests set on the coordinator pod. If the pod is scheduled with less memory, it will OOM-kill; if scheduled with more, the heap cannot grow to use it. Expose this as a value (e.g., .Values.clpConfig.presto.coordinator.jvm_max_heap) so operators can tune it alongside query.max-memory and query.max-memory-per-node.
♻️ Suggested approach
```diff
 presto-coordinator-config-jvm.config: |
   -server
-  -Xmx4G
+  -Xmx{{ .coordinator.jvm_max_heap_gb }}G
   -XX:+UseG1GC
```
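A matching values entry could look like this (the key name `jvm_max_heap_gb` follows the suggestion above and is not an existing chart value):

```yaml
# values.yaml (hypothetical addition)
clpConfig:
  presto:
    coordinator:
      # Max JVM heap for the Presto coordinator, in GiB.
      # Keep this below the pod's memory limit.
      jvm_max_heap_gb: 4
```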
```yaml
- name: "setup-configs"
  image: "busybox:latest"
  command:
    - "/bin/sh"
    - "-c"
    - |
      set -e
      cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
      cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

      # Query coordinator for version and add it to config
      DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
      VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
      PRESTO_VERSION=$(echo "$VERSION" \
        | sed 's/.*"version":"//' \
        | sed 's/".*//')
      echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

      # Set node identity from hostname
      HOSTNAME=$(hostname)
      HOST_IP=$(hostname -i)
      echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
      echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
```
Fragile JSON parsing with sed and unpinned busybox:latest image.
Two concerns in this init container:
- `busybox:latest`: using the `latest` tag hurts reproducibility and can introduce unexpected breakages. Pin to a specific version (e.g., `busybox:1.37`), or make it configurable via values.
- JSON parsing via `sed` (lines 46–48): extracting `version` from a JSON response using `sed 's/.*"version":"//' | sed 's/".*//'` is brittle. If the JSON structure changes (e.g., field ordering, nested `version` keys, or whitespace variations), parsing will silently produce incorrect results. Since `busybox` doesn't include `jq`, consider switching to an image that has `jq`, or using a more robust `awk`/`grep` pattern.
Proposed fix for image tag pinning and safer parsing
```diff
       - name: "setup-configs"
-        image: "busybox:latest"
+        image: "busybox:1.37"
         command:
           - "/bin/sh"
           - "-c"
           - |
             set -e
             cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
             cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties
             # Query coordinator for version and add it to config
             DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
             VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
-            PRESTO_VERSION=$(echo "$VERSION" \
-              | sed 's/.*"version":"//' \
-              | sed 's/".*//')
+            PRESTO_VERSION=$(echo "$VERSION" \
+              | grep -o '"version":"[^"]*"' \
+              | head -1 \
+              | sed 's/"version":"//;s/"//')
             echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
```
```yaml
- |
  set -e
  cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
  cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

  # Query coordinator for version and add it to config
  DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
  VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
  PRESTO_VERSION=$(echo "$VERSION" \
    | sed 's/.*"version":"//' \
    | sed 's/".*//')
  echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

  # Set node identity from hostname
  HOSTNAME=$(hostname)
  HOST_IP=$(hostname -i)
  echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
  echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
```
No error handling if the coordinator version query fails.
If wget on Line 45 fails (e.g., coordinator not yet ready, network issue), $VERSION will be empty and $PRESTO_VERSION will be an empty string. This would write presto.version= to the config, likely causing the worker to fail with a confusing error. While set -e is present, wget returning an empty body with HTTP 200 would not trigger it.
Consider adding a validation check:
Proposed validation

```diff
             VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
             PRESTO_VERSION=$(echo "$VERSION" \
               | sed 's/.*"version":"//' \
               | sed 's/".*//')
+            if [ -z "$PRESTO_VERSION" ]; then
+              echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
+              exit 1
+            fi
             echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
```

📝 Committable suggestion
```yaml
- |
  set -e
  cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
  cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties
  # Query coordinator for version and add it to config
  DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
  VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
  PRESTO_VERSION=$(echo "$VERSION" \
    | sed 's/.*"version":"//' \
    | sed 's/".*//')
  if [ -z "$PRESTO_VERSION" ]; then
    echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
    exit 1
  fi
  echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
  # Set node identity from hostname
  HOSTNAME=$(hostname)
  HOST_IP=$(hostname -i)
  echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
  echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
```
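The retry behaviour this comment asks for can be sketched as a small POSIX shell helper (illustrative only; the function names, retry count, and sleep interval are assumptions, not chart code):

```shell
# Extract the "version" field from the coordinator's /v1/info JSON body.
# Returns an empty string when the field is absent, instead of garbage.
parse_version() {
    echo "$1" | grep -o '"version":"[^"]*"' | head -1 \
        | sed 's/"version":"//;s/"$//'
}

# Retry fetching the coordinator's info endpoint before giving up, so a
# not-yet-ready coordinator doesn't produce a blank presto.version entry.
fetch_presto_version() {
    uri="$1"
    for _attempt in 1 2 3 4 5; do
        body=$(wget -q -O - "${uri}/v1/info" 2>/dev/null || true)
        version=$(parse_version "$body")
        if [ -n "$version" ]; then
            echo "$version"
            return 0
        fi
        sleep 2
    done
    echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
    return 1
}
```

In the init container, the direct `wget`/`sed` pipeline would be replaced by a call to `fetch_presto_version`, with the script exiting non-zero on failure rather than appending an empty `presto.version=` line.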
```yaml
containers:
  - name: "presto-worker"
    image: "{{ .Values.image.prestoWorker.repository }}:{{ .Values.image.prestoWorker.tag }}"
    imagePullPolicy: "{{ .Values.image.prestoWorker.pullPolicy }}"
    ports:
      - name: "presto-worker"
        containerPort: 8080
    volumeMounts:
      - name: "presto-catalog"
        mountPath: "/opt/presto-server/etc/catalog"
        readOnly: true
      - name: "presto-config"
        mountPath: "/opt/presto-server/etc/velox.properties"
        subPath: "presto-worker-config-velox.properties"
        readOnly: true
      - name: "presto-etc"
        mountPath: "/opt/presto-server/etc"
      {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
      - name: {{ include "clp.volumeName" (dict
          "component_category" "shared-data"
          "name" "archives"
        ) | quote }}
        mountPath: "/var/data/archives"
        readOnly: true
      {{- end }}
    readinessProbe:
      {{- include "clp.readinessProbeTimings" . | nindent 12 }}
      tcpSocket: &presto-worker-health-check
        port: "presto-worker"
    livenessProbe:
      {{- include "clp.livenessProbeTimings" . | nindent 12 }}
      tcpSocket: *presto-worker-health-check
```
🧹 Nitpick | 🔵 Trivial
No resources (requests/limits) defined on the worker container.
Same concern as the coordinator: the worker container has no CPU/memory resource requests or limits. This is especially important for Presto workers, which can be memory-intensive. The system_memory_gb and query_memory_gb values are set in the Presto config but not enforced at the Kubernetes level, which could lead to OOMKills or overcommitment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 62 - 93, Add Kubernetes resource requests and limits to the presto-worker
container (container name "presto-worker") so CPU/memory are enforced; update
the deployment template to include a resources: block under that container using
Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.
```yaml
volumeMounts:
  - name: "presto-catalog"
    mountPath: "/opt/presto-server/etc/catalog"
    readOnly: true
  - name: "presto-config"
    mountPath: "/opt/presto-server/etc/velox.properties"
    subPath: "presto-worker-config-velox.properties"
    readOnly: true
  - name: "presto-etc"
    mountPath: "/opt/presto-server/etc"
  {{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
  - name: {{ include "clp.volumeName" (dict
      "component_category" "shared-data"
      "name" "archives"
    ) | quote }}
    mountPath: "/var/data/archives"
    readOnly: true
  {{- end }}
```
🧹 Nitpick | 🔵 Trivial
Volume mount layering at /opt/presto-server/etc — verify correctness.
The main container mounts:
- `presto-catalog` (configMap) at `/opt/presto-server/etc/catalog`
- `presto-config` (configMap, subPath) at `/opt/presto-server/etc/velox.properties`
- `presto-etc` (emptyDir) at `/opt/presto-server/etc`
Kubernetes sorts mounts by path length (parent before child), so the emptyDir is mounted first, then the catalog and velox.properties are overlaid. The init container's writes (config.properties, node.properties) live in the emptyDir and should remain visible. This layering should work, but it's subtle and easy to break during future refactors.
Consider adding a comment in the template explaining the mount ordering dependency.
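Such a comment might read as follows (a sketch of the suggestion, reusing the mount names from the template):

```yaml
# NOTE: mount-ordering dependency. Kubernetes mounts parent paths before
# children, so the "presto-etc" emptyDir is mounted at /opt/presto-server/etc
# first and the "presto-catalog" / "presto-config" mounts overlay it. The init
# container's config.properties and node.properties live in the emptyDir and
# must stay visible; do not replace "presto-etc" with a deeper mount.
- name: "presto-etc"
  mountPath: "/opt/presto-server/etc"
```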
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 69 - 86, The volume mount ordering for the Presto worker container is
subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must be
mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
```yaml
prestoCoordinator:
  repository: "ghcr.io/y-scope/presto/coordinator"
  pullPolicy: "Always"
  tag: "dev"
prestoWorker:
  repository: "ghcr.io/y-scope/presto/prestissimo-worker"
  pullPolicy: "Always"
  tag: "dev"
```
🧹 Nitpick | 🔵 Trivial
Presto image tags default to "dev" — consider aligning with a release tag convention.
The clpPackage image defaults to "main", while both Presto images default to "dev". This is fine for development but worth noting that users deploying from the defaults will pull a mutable, potentially unstable tag.
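For example, an operator-side override pinning both images to an immutable tag could look like this sketch (the tag `v0.1.0` is purely illustrative, not a published release):

```yaml
# Hypothetical values override; the tag names are illustrative.
image:
  prestoCoordinator:
    tag: "v0.1.0"
    pullPolicy: "IfNotPresent"
  prestoWorker:
    tag: "v0.1.0"
    pullPolicy: "IfNotPresent"
```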
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/values.yaml` around lines 20 - 27, The Helm
values set mutable dev tags for Presto images: update prestoCoordinator.tag and
prestoWorker.tag to use a stable release tag convention (e.g., match
clpPackage.tag which uses "main" or a semantic version) so defaults are
deterministic; modify the values for prestoCoordinator and prestoWorker to a
stable tag string and document in values.yaml comment that operators should
override tag for development workflows, ensuring consistency across clpPackage,
prestoCoordinator, and prestoWorker.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/deployment/package-helm/Chart.yaml`:
- Line 3: The Chart.yaml pre-release version was mistakenly advanced to
"0.1.4-dev.4", skipping "0.1.4-dev.3"; update the version field in Chart.yaml
from "0.1.4-dev.4" to "0.1.4-dev.3" so the pre-release sequence remains
contiguous (look for the existing version: "0.1.4-dev.4" and replace it with
"0.1.4-dev.3").
```diff
 apiVersion: "v2"
 name: "clp"
-version: "0.1.4-dev.2"
+version: "0.1.4-dev.4"
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Check whether 0.1.4-dev.3 is used in any branch or tag
git grep "0.1.4-dev.3" $(git for-each-ref --format='%(refname)' refs/heads refs/remotes refs/tags) -- "tools/deployment/package-helm/Chart.yaml" 2>/dev/null || echo "Not found on any ref"
```

Repository: y-scope/clp
Length of output: 75
Change version to 0.1.4-dev.3 to maintain sequential pre-release versioning.
The version skips from 0.1.4-dev.2 directly to 0.1.4-dev.4, but 0.1.4-dev.3 does not exist on any branch or tag in the repository. This skip appears unintentional and breaks the contiguous pre-release sequence.
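As a side note, SemVer pre-release precedence (dot-separated identifiers, numeric ones compared numerically and ranking below alphanumeric ones) can be sketched as follows; this toy key ignores SemVer's rule that a plain release outranks its pre-releases:

```python
def prerelease_key(version: str):
    """Sortable key for versions like '0.1.4-dev.2' (pre-release ordering only)."""
    core, _, pre = version.partition("-")
    core_ids = tuple(int(p) for p in core.split("."))
    # Numeric identifiers compare numerically and rank below alphanumeric ones.
    pre_ids = tuple(
        (0, int(p), "") if p.isdigit() else (1, 0, p)
        for p in pre.split(".")
    ) if pre else ()
    return core_ids, pre_ids

versions = ["0.1.4-dev.4", "0.1.4-dev.2", "0.1.4-dev.3"]
print(sorted(versions, key=prerelease_key))
# → ['0.1.4-dev.2', '0.1.4-dev.3', '0.1.4-dev.4']
```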
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/Chart.yaml` at line 3, The Chart.yaml
pre-release version was mistakenly advanced to "0.1.4-dev.4", skipping
"0.1.4-dev.3"; update the version field in Chart.yaml from "0.1.4-dev.4" to
"0.1.4-dev.3" so the pre-release sequence remains contiguous (look for the
existing version: "0.1.4-dev.4" and replace it with "0.1.4-dev.3").
```diff
@@ -1,4 +1,4 @@
-{{- if .Values.clpConfig.api_server }}
+{{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}
```
why do we disable api server? Shouldn't we also disable query scheduler and worker if clp search is not needed?
currently, the API server does not work with Presto.
we do have this specified in tools/deployment/package-helm/templates/query-scheduler-deployment.yaml as well
{{- if ne .Values.clpConfig.package.query_engine "presto" }}
though, given some users may want to use both the API Server and Presto, i think we may discard the .Values.clpConfig.package.query_engine checks and simply check whether query_scheduler / query_worker / API is null before launching the services. what do you think?
cc @kirkrodrigues for discussion since this is interface related
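A minimal sketch of that alternative gating (illustrative only; whether these clpConfig blocks can actually be null is an assumption to be settled in the discussion):

```yaml
{{- /* Hypothetical: deploy each service only when its config block is non-null,
       instead of switching on clpConfig.package.query_engine. */}}
{{- if .Values.clpConfig.query_scheduler }}
# ... query-scheduler Deployment ...
{{- end }}
{{- if .Values.clpConfig.api_server }}
# ... api-server Deployment ...
{{- end }}
```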
```diff
@@ -17,6 +17,14 @@ image:
   repository: "ghcr.io/y-scope/clp/clp-package"
```
I'm getting Presto internal error, possibly due to the recent timestamp changes f82e611. Can you reproduce the error?
🤦 right. i can reproduce once i merge main into the current branch.
@coderabbitai note #1788 is a BREAKING change that broke CLP Presto, as in the CLP Presto Connector cannot handle the new Timestamp column type at the moment. submit an issue to track
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Get details about PR #1788 to understand the breaking change
gh pr view 1788 --json title,url,mergedAt,body --repo y-scope/clp

# Create an issue to track the breaking change
gh issue create \
  --repo y-scope/clp \
  --title "Presto Connector cannot handle new Timestamp column type from PR #1788" \
  --body "## Issue Description
The CLP Presto Connector currently cannot handle the new Timestamp column type introduced in PR #1788, causing internal errors when querying with Presto.

## Context
- **Breaking PR**: #1788 (commit f82e6114160a6addd4727259906bcf621ac9912c)
- **Affected Component**: CLP Presto Connector
- **Impact**: Presto queries fail with internal errors

The issue was discovered during testing of PR #2004 (Presto Helm chart support). The error can be reproduced after merging main into the Presto feature branch.

## References
- PR #1788: https://github.com/y-scope/clp/pull/1788
- PR #2004: https://github.com/y-scope/clp/pull/2004
- Comment: https://github.com/y-scope/clp/pull/2004#discussion_r2821243815

Reported by: @junhaoliao" \
  --assignee junhaoliao
```
Length of output: 5337
@junhaoliao, I've created issue #2008 to track this breaking change. The issue documents that PR #1788 (which introduced the new Timestamp column type to replace DateString and bumped the archive version to 0.5.0) broke the CLP Presto Connector's ability to handle the new column type. The issue has been assigned to you and includes references to both PRs and the relevant commit.
🐰✨
/_/
( o.o )
^ <
"Issue tracked!"
🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1818
File: tools/deployment/package-helm/templates/configmap.yaml:12-12
Timestamp: 2025-12-19T05:03:42.629Z
Learning: In the y-scope/clp Helm chart, the `api_server.host` configuration field in the ConfigMap (tools/deployment/package-helm/templates/configmap.yaml) is unused because no other k8s internal services need to reach the API server—it's only accessed from outside the cluster via NodePort.
```yaml
- "/bin/sh"
- "-c"
- |
  set -e
  cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
  cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

  # Query coordinator for version and add it to config
  DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
  VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
  PRESTO_VERSION=$(echo "$VERSION" \
    | sed 's/.*"version":"//' \
    | sed 's/".*//')
  echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

  # Set node identity from hostname
  HOSTNAME=$(hostname)
  HOST_IP=$(hostname -i)
  echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
  echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
```
This script looks scary. Does it mimic the functionality of tools/deployment/presto-clp/worker/scripts/generate-configs.sh? I guess we don't use the script directly because of missing dependencies?
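On the fragility point: the sed pipeline grabs the first `"version":"` match in the raw response body, which breaks if the JSON layout shifts. A JSON-aware alternative, sketched in Python purely for illustration (the payload shape shown is an assumption about the coordinator's `/v1/info` response), would be less brittle:

```python
import json

# Hypothetical /v1/info payload; the real response shape may differ.
sample = '{"nodeVersion": {"version": "0.293"}, "coordinator": true}'

def extract_version(body: str) -> str:
    """Pull the server version out of the /v1/info JSON instead of sed string surgery."""
    info = json.loads(body)
    return info["nodeVersion"]["version"]

print(extract_version(sample))  # → 0.293
```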
- Replace `query_engine` conditions with modular flags in Helm templates.
- Refactor Presto configurations to use dedicated flags and scripts.
- Enhance worker and reducer node scheduling logic.
- Update documentation to reflect Presto integration and better configuration flexibility.
TODO: address #1957 since we're updating the Presto setup docs in this PR.
Description
Add Presto coordinator and worker deployments to the CLP Helm chart, gated by
clpConfig.package.query_engine: "presto". When Presto mode is enabled, the chart deploys a Presto coordinator (single replica) and configurable Presto worker(s) instead of the clp-s query pipeline (query scheduler, query workers, reducers). The api-server is also excluded in Presto mode.

Checklist
breaking change.
Validation performed
1. Helm template renders correctly (Presto mode)
Task: Verify the Helm chart renders all Presto resources and excludes query/reducer/api-server templates when `query_engine` is `"presto"`.

Command:
Output:
Explanation: Presto coordinator, service, and worker templates are rendered. Api-server, query scheduler, query worker, and reducer templates are correctly excluded.
2. Helm template renders correctly (clp-s mode)
Task: Verify the clp-s query pipeline resources are rendered when `query_engine` is `"clp-s"`, and Presto resources are excluded.

Command:
Output:
Explanation: Api-server, query scheduler, query worker, reducer, and results cache templates are rendered. Presto templates are correctly excluded. (Shared infrastructure templates omitted for brevity.)
3. Helm template renders correctly using the docs' presto-values.yaml
Task: Verify that a `presto-values.yaml` file created by following the `guides-using-presto.md` documentation renders without errors.

presto-values.yaml (following the docs, with placeholders replaced):
Command:
Output:
Explanation: The template renders without errors using the exact values file from the docs
(with placeholders filled in). The garbage-collector is correctly excluded because `results_cache.retention_period` is set to `null` and `archive_output.retention_period` defaults to `null`, so there is nothing to garbage-collect.

4. Multi-node cluster: clp-s mode (default)
Task: Verify that the Helm chart deploys correctly in clp-s mode on a multi-node kind cluster with dedicated worker nodes.
Command:
Output (final pod status):
Explanation: All clp-s pipeline pods are running (api-server, query-scheduler, 2x query-worker, 2x reducer, results-cache, etc.). No Presto pods are present.
Node scheduling verification:
Compression workers are scheduled on compression-labeled nodes and query workers on query-labeled nodes.
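The label-based scheduling above is typically expressed with nodeSelector entries along these lines (a sketch; the label keys shown are assumptions, not necessarily the chart's actual interface):

```yaml
# Illustrative nodeSelector fragments; the label keys are hypothetical.
queryWorker:
  nodeSelector:
    clp.yscope.com/query-worker: "true"
compressionWorker:
  nodeSelector:
    clp.yscope.com/compression-worker: "true"
```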
5. Multi-node cluster: Presto mode
Task: Verify that the Helm chart deploys correctly in Presto mode on a multi-node kind cluster.
The test script was temporarily modified to add `-f /tmp/presto-values.yaml` to the `helm install` command and remove the query worker/reducer `--set` flags (since those components are not deployed in Presto mode).
Command:
Output (final pod status):
Explanation: Presto coordinator and worker pods are running. Api-server, query-scheduler,
query-worker, and reducer pods are correctly absent. The Presto worker init container successfully detected the coordinator version and initialized its config.
Presto pod verification:
Compressed the sample log file postgresql.jsonl, performed queries in the WebUI, and observed that results were correctly returned.
Summary by CodeRabbit
New Features
Documentation