✨ Add llm-proxy deployment #496
Conversation
Signed-off-by: Fabian von Feilitzsch <[email protected]>
Walkthrough

Adds an LLM proxy feature: new defaults and provider mappings, Ansible tasks to deploy the proxy and update Kai CR status, and multiple Kubernetes templates (ConfigMaps, Deployment, Service, client config) with Keycloak and multi-provider LLM wiring.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Ansible as Ansible Role
    participant K8s as Kubernetes API
    participant Operator as Operator (CR updates)
    Ansible->>K8s: create ConfigMap (llm-proxy), Deployment, Service, client ConfigMap (when enabled)
    opt ansible_operator_meta defined
        Ansible->>Operator: include kai-status.yml
        Operator->>K8s: get Deployment llm-proxy
        Operator->>K8s: get Deployment kai-api
        Operator->>Operator: compute readiness flags
        Operator->>K8s: update CR conditions (LLMProxyReady, KaiSolutionServerReady, KaiAPIKeysConfigured)
    end
```
```mermaid
sequenceDiagram
    participant Client as Client
    participant Proxy as LLM Proxy
    participant Keycloak as Keycloak
    participant Provider as LLM Provider
    participant DB as PostgreSQL
    Client->>Proxy: request + JWT
    Proxy->>Keycloak: validate token
    alt token valid
        Keycloak-->>Proxy: validated
        Proxy->>Provider: forward request (provider-specific credentials)
        Provider-->>Proxy: LLM response
        Proxy->>DB: store/track request
        Proxy-->>Client: return response
    else token invalid
        Keycloak-->>Proxy: invalid
        Proxy-->>Client: 401/deny
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 5 passed
Actionable comments posted: 3
🧹 Nitpick comments (7)
roles/tackle/tasks/main.yml (1)
914-917: Debug logging is safe; consider gating if it becomes noisy

This debug task mirrors the existing `when: kai_solution_server_enabled | bool` usage and doesn't introduce new failure modes, but it will log on every run. If logs get too chatty over time, consider wrapping it in a `when: ansible_verbosity | int >= 2` (or similar) so it only appears in more verbose runs.
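For illustration, a minimal sketch of the verbosity-gated variant (the task name and debugged variable are assumed for the example, not copied from the PR):

```yaml
- name: Debug Kai solution server settings
  debug:
    var: kai_solution_server_enabled
  when:
    - kai_solution_server_enabled | bool
    - ansible_verbosity | int >= 2  # only log on -vv or more verbose runs
```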
roles/tackle/templates/kai/kai-api-deployment.yaml.j2 (1)

29-37: Good defensive guard around secret-derived env vars

The `if` around `kai_api_key_secret_status.resources | length > 0` correctly prevents indexing `resources.0` when the secret is missing, avoiding template failures while keeping behavior unchanged when it exists. If you want to tighten it further, you could also guard `resources` with a `| default([])` to handle the (unlikely) case where `kai_api_key_secret_status` is unset, but that's optional.
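A minimal sketch of that extra guard (the surrounding template content is assumed):

```jinja
{% if (kai_api_key_secret_status | default({})).resources | default([]) | length > 0 %}
  {# safe to reference kai_api_key_secret_status.resources.0 here #}
{% endif %}
```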
roles/tackle/templates/kai/kai-db-deployment.yaml.j2 (1)

31-45: Adding POSTGRES_* aliases is reasonable; confirm image expectations

Mirroring the existing `POSTGRESQL_*` vars with `POSTGRES_*` aliases wired to the same secret keys is a sensible way to support images/tooling expecting either naming scheme. Just confirm that the `kai_database_image_fqin` image actually consumes the desired variant(s) so you're not carrying unused env vars.
roles/tackle/templates/kai/llm-proxy-ingress.yaml.j2 (1)

1-45: Ingress wiring is sound; consider refining TLS placeholder

The ingress correctly targets `llm-proxy:8321`, parametrizes `ingressClassName`, and sets reasonable nginx/alb annotations. The `tls: - {}` placeholder, though valid YAML, may behave inconsistently across controllers; you might prefer to either omit `tls` entirely when no secret/hosts are configured or model it after whatever pattern you use in the existing UI ingress template. A sketch of the conditional form follows.
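For example, a hedged sketch that only emits a `tls` section when a secret is configured (`llm_proxy_tls_secret_name` and `llm_proxy_ingress_host` are hypothetical variables, not defined in this PR):

```jinja
{% if llm_proxy_tls_secret_name is defined %}
  tls:
    - hosts:
        - {{ llm_proxy_ingress_host }}
      secretName: {{ llm_proxy_tls_secret_name }}
{% endif %}
```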
roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 (1)

62-68: Normalize `feature_auth_required` to a boolean in Jinja conditionals

Here you use:

```jinja
{% if feature_auth_required and keycloak_sso_proto == 'https' %} ... {% endif %}
```

and similarly for the trusted-ca volume. Elsewhere in the role, `feature_auth_required` is typically used as `feature_auth_required | bool` because it's derived via a Jinja expression and may be a string value.

For consistency (and to avoid surprises where a string `"false"` still evaluates truthy), consider:

```diff
-{% if feature_auth_required and keycloak_sso_proto == 'https' %}
+{% if feature_auth_required | bool and keycloak_sso_proto == 'https' %}
 ...
-{% if feature_auth_required and keycloak_sso_proto == 'https' %}
+{% if feature_auth_required | bool and keycloak_sso_proto == 'https' %}
```

This keeps the behavior aligned with the rest of the templates and ensures the trusted CA wiring only kicks in when auth is actually required.
Also applies to: 89-115
roles/tackle/defaults/main.yml (1)
123-123: LLM proxy defaults are coherent; consider guardingkai_llm_modelin computed model idThe new defaults for
keycloak_api_audience,kai_llm_params_proxy, and thellm_proxy_*provider/API-key mappings line up well with the deployment and ConfigMap templates.One small edge case: with the current definition
kai_llm_params_proxy: model: "{{ llm_proxy_provider_id }}/{{ kai_llm_model }}"and
kai_llm_modeldefaulting tonull, you’ll end up with values likeopenai/Noneif someone enables the proxy without explicitly setting a model. You might want a safer default or explicit guard, for example:-kai_llm_params_proxy: - model: "{{ llm_proxy_provider_id }}/{{ kai_llm_model }}" +kai_llm_params_proxy: + model: >- + {{ (llm_proxy_provider_id ~ '/' ~ kai_llm_model) + if kai_llm_model else None }}or simply document that
kai_llm_modelis required wheneverkai_llm_proxy_enabledis set.Also applies to: 294-351
roles/tackle/tasks/kai-status.yml (1)
22-39: Downgrade severity: This is optional refactoring, not a critical issueThe code currently works correctly due to Jinja2's reliable short-circuit evaluation of the
andoperator. Whenkai_llm_proxy_enabledis false (the default), the first part of the AND expression fails andllm_proxy_deployment.resourcesis never evaluated—preventing any undefined variable error.However, the review's suggestion is valid as a defensive programming practice to make the guard explicit rather than relying on implicit short-circuit behavior. This improves code clarity and maintainability.
The suggested fix is sound, but this is a code quality improvement recommendation, not a critical bug:
- name: Set LLM proxy ready status when: kai_llm_proxy_enabled | bool and (llm_proxy_deployment is defined) set_fact: llm_proxy_ready: "{{ llm_proxy_deployment.resources | length > 0 and llm_proxy_deployment.resources[0].status.readyReplicas | default(0) > 0 and llm_proxy_deployment.resources[0].status.readyReplicas | default(0) == llm_proxy_deployment.resources[0].status.replicas | default(1) and llm_proxy_deployment.resources[0].status.updatedReplicas | default(0) == llm_proxy_deployment.resources[0].status.replicas | default(1) }}"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)

- roles/tackle/defaults/main.yml (2 hunks)
- roles/tackle/tasks/kai-status.yml (1 hunks)
- roles/tackle/tasks/kai.yml (2 hunks)
- roles/tackle/tasks/main.yml (1 hunks)
- roles/tackle/templates/deployment-hub.yml.j2 (1 hunks)
- roles/tackle/templates/kai/kai-api-deployment.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/kai-db-deployment.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-ingress.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-service.yaml.j2 (1 hunks)
🔇 Additional comments (3)
roles/tackle/templates/kai/llm-proxy-service.yaml.j2 (1)

1-19: Service wiring looks correct; ensure pod labels match selector

ClusterIP service on port 8321 with selector `app: llm-proxy` aligns with the ingress/backend usage. Just double-check the llm-proxy Deployment labels its pods with `app: llm-proxy` so this selector actually picks them up.
roles/tackle/templates/deployment-hub.yml.j2 (1)

101-134: KEYCLOAK_AUDIENCE wiring looks consistent; ensure it matches Keycloak client config

Adding `KEYCLOAK_AUDIENCE` sourced from `keycloak_api_audience` under the existing Keycloak-only block is consistent with how the same variable is used in the LLM proxy config. Just make sure the value (`"konveyor-api"` by default) actually matches the audience/client settings in Keycloak so JWT validation is coherent across Hub, Kai, and the proxy.
roles/tackle/tasks/kai.yml (1)

149-152: Status update hook for Kai components looks correct

Importing `kai-status.yml` only when `ansible_operator_meta` is defined is a clean way to keep CR status updates operator-only while reusing the same Kai deployment tasks elsewhere. A minimal sketch of the pattern follows.
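A sketch of that gating, assuming the hook sits at the end of kai.yml:

```yaml
- name: Update Kai CR status (operator-managed runs only)
  include_tasks: kai-status.yml
  when: ansible_operator_meta is defined
```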
roles/tackle/tasks/kai.yml
Outdated
```yaml
- name: Get Keycloak external URL
  when:
    - kai_llm_proxy_enabled | bool
    - feature_auth_required | bool
    - keycloak_external_url is not defined
  block:
    - name: "Get Keycloak route on OpenShift"
      when: openshift_cluster
      k8s_info:
        api_version: route.openshift.io/v1
        kind: Route
        name: "{{ keycloak_sso_name }}"
        namespace: "{{ app_namespace }}"
      register: keycloak_route

    - name: "Set Keycloak external URL from route"
      when:
        - openshift_cluster
        - keycloak_route.resources | length > 0
      set_fact:
        keycloak_external_url: "https://{{ keycloak_route.resources[0].spec.host }}"

    - name: "Get Keycloak ingress on Kubernetes"
      when: not openshift_cluster
      k8s_info:
        api_version: networking.k8s.io/v1
        kind: Ingress
        name: "{{ keycloak_sso_name }}"
        namespace: "{{ app_namespace }}"
      register: keycloak_ingress

    - name: "Set Keycloak external URL from ingress"
      when:
        - not openshift_cluster
        - keycloak_ingress.resources | length > 0
      set_fact:
        keycloak_external_url: "http://{{ keycloak_ingress.resources[0].spec.rules[0].host }}/auth"

- name: Deploy LLM Proxy
  when: kai_llm_proxy_enabled | bool
  block:
    - name: Create LLM Proxy ConfigMap
      k8s:
        state: present
        template: kai/llm-proxy-configmap.yaml.j2

    - name: Create LLM Proxy Deployment
      k8s:
        state: present
        template: kai/llm-proxy-deployment.yaml.j2

    - name: Create LLM Proxy Service
      k8s:
        state: present
        template: kai/llm-proxy-service.yaml.j2

    - name: Create LLM Proxy Ingress
      k8s:
        state: present
        template: kai/llm-proxy-ingress.yaml.j2
      when: not openshift_cluster

    # TODO: Determine LLM proxy external URL
    # After creating the ingress/route, we should:
    # 1. Query the ingress/route to get the actual external URL
    # 2. Store it as llm_proxy_external_url
    # 3. Make it available to users (perhaps in status or ConfigMap)
    # This URL is needed for external clients to connect to the proxy
```
🧩 Analysis chain
Keycloak external URL discovery is good, but currently unused, and path/scheme may not match real Keycloak

The route/ingress discovery logic is a solid start, but a couple of points to tighten:

- `keycloak_external_url` isn't used in the provided `llm-proxy-configmap.yaml.j2`; the proxy auth section still points at the internal service. If the JWT `iss` claim is based on the external URL, you'll likely want to plumb `keycloak_external_url` into the LLM proxy's issuer/`jwks.uri` instead of (or in addition to) the cluster-internal host.
- The Kubernetes Ingress branch hardcodes `http://{{ keycloak_ingress.resources[0].spec.rules[0].host }}/auth`. That may not match clusters where:
  - TLS is terminated at the ingress (should be `https://`), or
  - the Keycloak path isn't `/auth` (or is already included in the ingress path).

You might want to:

- Normalize `keycloak_external_url` to a single base (scheme + host [+ optional base path]), and
- Derive the issuer/JWKS URLs from that in the proxy config, with a fallback to the internal service if discovery fails, as sketched below.
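A hedged sketch of that plumbing in the proxy ConfigMap template (variable names taken from this PR; the fallback expression itself is illustrative, not a definitive implementation):

```jinja
{% set keycloak_base = keycloak_external_url
     | default(keycloak_sso_proto ~ '://' ~ keycloak_sso_service_name ~ '.'
               ~ app_namespace ~ '.svc:' ~ keycloak_sso_port ~ '/auth') %}
auth:
  provider_config:
    type: "oauth2_token"
    jwks:
      uri: "{{ keycloak_base }}/realms/{{ keycloak_sso_realm }}/protocol/openid-connect/certs"
    issuer: "{{ keycloak_base }}/realms/{{ keycloak_sso_realm }}"
    audience: "{{ keycloak_api_audience }}"
```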
🏁 Script executed:

```bash
#!/bin/bash
# Find the LLM proxy configuration template
fd -t f "llm-proxy" --type file | head -20
```

Length of output: 216

🏁 Script executed:

```bash
#!/bin/bash
# Search for references to keycloak_external_url
rg -n "keycloak_external_url" --type yaml --type j2
```

Length of output: 26

🏁 Script executed:

```bash
#!/bin/bash
# Find and examine the llm-proxy-configmap template
find . -name "llm-proxy-configmap.yaml.j2" -o -name "*llm*proxy*" -type f 2>/dev/null | head -20
```

Length of output: 224

🏁 Script executed:

```bash
#!/bin/bash
# Search for keycloak references in proxy or auth config
rg -n -C2 "keycloak|issuer|jwks" --type yaml --type j2 | grep -A2 -B2 "proxy\|auth\|config" | head -50
```

Length of output: 26

🏁 Script executed:

```bash
#!/bin/bash
# Look at the kai.yml file around lines 71-139 for context
cat roles/tackle/tasks/kai.yml | sed -n '60,150p'
```

Length of output: 2917

🏁 Script executed:

```bash
#!/bin/bash
# Examine the LLM proxy configmap template
cat roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2
```

Length of output: 3765

🏁 Script executed:

```bash
#!/bin/bash
# Search for keycloak_external_url in all files
rg -n "keycloak_external_url" --no-type
```

Length of output: 435

🏁 Script executed:

```bash
#!/bin/bash
# Search for keycloak references in templates (no type filter)
rg -n "keycloak" roles/tackle/templates/kai/ -A2 -B2
```

Length of output: 2310

🏁 Script executed:

```bash
#!/bin/bash
# Search for issuer and jwks patterns
rg -n "issuer|jwks|iss_uri|token_endpoint" roles/tackle/templates/kai/
```

Length of output: 277

🏁 Script executed:

```bash
#!/bin/bash
# Look at all template files to understand structure
ls -la roles/tackle/templates/kai/ | grep -E "llm-proxy|kai-api"
```

Length of output: 453

🏁 Script executed:

```bash
#!/bin/bash
# Search entire codebase for any reference to keycloak_external_url
rg "keycloak_external_url"
```

Length of output: 320

🏁 Script executed:

```bash
#!/bin/bash
# Check if keycloak_sso_service_name and keycloak_sso_port are defined
rg -n "keycloak_sso_service_name|keycloak_sso_port" roles/tackle/defaults/ roles/tackle/vars/ 2>/dev/null | head -20
```

Length of output: 916

🏁 Script executed:

```bash
#!/bin/bash
# Verify what keycloak_sso_service_name typically resolves to
rg -n "keycloak_sso_service_name.*=" roles/tackle/
```

Length of output: 0
Keycloak external URL is discovered but unused; JWT validation will fail

The review comment is accurate. Verification confirms:

- `keycloak_external_url` is computed in `roles/tackle/tasks/kai.yml` (lines ~94, ~104) but has zero references anywhere else in the codebase; it is never passed to any template.
- The LLM proxy auth config (`roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2`, lines 113-114) hardcodes internal Kubernetes service URLs (`keycloak_sso_service_name`) instead of using the external URL.
- If Keycloak issues JWTs with an external issuer (standard for ingress/route deployments), the proxy will reject them because the issuer claim won't match the internal service URL configured in the proxy.
- The hardcoded `http://` scheme (lines 104 and 113) may be incorrect if TLS terminates at the ingress.
- The hardcoded `/auth` path (lines 104 and 113) assumes a fixed Keycloak path structure that may not hold across deployments.

Recommendation: plumb `keycloak_external_url` into the proxy config template as the issuer/JWKS base, and derive the full URIs from it, falling back to the internal service if discovery fails.
🤖 Prompt for AI Agents
In roles/tackle/tasks/kai.yml lines 71-139, keycloak_external_url is computed
but never used; update the task flow to pass keycloak_external_url into the LLM
Proxy ConfigMap template (e.g., add vars to the k8s template call or set a fact
consumed by the template) so the proxy can use it as the issuer/JWKS base;
modify the value you pass to derive full URIs from keycloak_external_url
(respecting its scheme/host/path) and only fall back to the internal service
name (keycloak_sso_service_name and port) if discovery is missing, and avoid
hardcoding "http://" or the "/auth" path in the task—let the template construct
issuer/jwks endpoints from the provided external URL with a safe fallback to
internal service when needed.
```yaml
server:
  port: 8321
{% if feature_auth_required %}
  auth:
    provider_config:
      type: "oauth2_token"
      jwks:
        uri: "http://{{ keycloak_sso_service_name }}:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}/protocol/openid-connect/certs"
      issuer: "http://{{ keycloak_sso_service_name }}.{{ app_namespace }}.svc:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}"
      audience: "{{ keycloak_api_audience }}"
{% endif %}
telemetry:
  enabled: false
```
🧩 Analysis chain
Align OAuth2 issuer/jwks.uri with Keycloak scheme and external URL

The auth config is currently:

```yaml
auth:
  provider_config:
    type: "oauth2_token"
    jwks:
      uri: "http://{{ keycloak_sso_service_name }}:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}/protocol/openid-connect/certs"
    issuer: "http://{{ keycloak_sso_service_name }}.{{ app_namespace }}.svc:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}"
    audience: "{{ keycloak_api_audience }}"
```

Issues:

- The scheme is hardcoded to `http://` even though `keycloak_sso_port` is `8443` when TLS is enabled. That will fail against a TLS-only Keycloak.
- Both `issuer` and `jwks.uri` point at the cluster-internal service name, while Keycloak typically mints tokens with `iss` pointing at the external URL (route/ingress, including `KEYCLOAK_FRONTEND_URL`). This mismatch can cause issuer validation to fail in the LLM proxy.

At minimum, it's safer to respect `keycloak_sso_proto`:

```diff
-    jwks:
-      uri: "http://{{ keycloak_sso_service_name }}:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}/protocol/openid-connect/certs"
-    issuer: "http://{{ keycloak_sso_service_name }}.{{ app_namespace }}.svc:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}"
+    jwks:
+      uri: "{{ keycloak_sso_proto }}://{{ keycloak_sso_service_name }}.{{ app_namespace }}.svc:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}/protocol/openid-connect/certs"
+    issuer: "{{ keycloak_sso_proto }}://{{ keycloak_sso_service_name }}.{{ app_namespace }}.svc:{{ keycloak_sso_port }}/auth/realms/{{ keycloak_sso_realm }}"
```

Given you already compute `keycloak_external_url` in `roles/tackle/tasks/kai.yml` when the LLM proxy + auth are enabled, you may want to go further and base issuer/jwks.uri on that external URL (with a fallback to the internal service) so they match the `iss` claim seen by external clients.

I'd recommend validating this end-to-end with a real JWT from your Keycloak to ensure the proxy's issuer/audience checks pass as intended.
I'll verify the review comment by examining the codebase for the relevant configuration variables and patterns.
🏁 Script executed:

```bash
#!/bin/bash
# 1. Display the template file to verify current state
echo "=== Template content ==="
cat roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2
echo -e "\n=== Searching for keycloak_sso_proto definition ==="
rg -n "keycloak_sso_proto" --type yaml --type yml -C 3
echo -e "\n=== Searching for keycloak_external_url ==="
rg -n "keycloak_external_url" -C 3
echo -e "\n=== Searching for keycloak_sso_port definition ==="
rg -n "keycloak_sso_port" --type yaml --type yml -C 2
```

Length of output: 5442
I need to search without the file type filter that caused the error:
🏁 Script executed:

```bash
#!/bin/bash
# Search for keycloak_sso_proto definition
echo "=== Searching for keycloak_sso_proto ==="
rg -n "keycloak_sso_proto" -C 3
echo -e "\n=== Searching for keycloak_sso_port ==="
rg -n "keycloak_sso_port" -C 3
echo -e "\n=== Looking for variable defaults in defaults/main.yml ==="
find . -name "defaults" -o -name "main.yml" | head -20
rg -n "keycloak" roles/tackle/defaults/main.yml 2>/dev/null | head -30
echo -e "\n=== Checking if keycloak_sso_proto is used elsewhere ==="
rg -n "keycloak_sso_proto\|keycloak_sso_port" --type py --type yml --type yaml --type j2 | head -40
```

Length of output: 9740
Use `keycloak_sso_proto` for OAuth2 auth URLs to fix TLS mismatch

The review identifies a confirmed bug in `roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2` lines 113-114. The auth config hardcodes the scheme to `http://` even though:

- `keycloak_sso_proto` is defined as `'https'` when TLS is enabled and `'http'` when disabled
- `keycloak_sso_port` is `8443` when TLS is enabled

This causes OAuth2 issuer validation to fail in HTTPS/TLS deployments. Additionally, `keycloak_external_url` is already computed in the role when LLM proxy + auth are enabled, so using it for the issuer and `jwks.uri` would align token validation with external client expectations.

At minimum, update lines 113-114 to use `{{ keycloak_sso_proto }}` instead of the hardcoded `http://`. Ideally, base both URLs on the computed `keycloak_external_url` to match Keycloak's token issuer claim.
🤖 Prompt for AI Agents
In roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2 lines 106 to 118: the OAuth2 URLs are hardcoded with "http://", which breaks issuer/jwks validation in TLS deployments. Replace the hardcoded scheme with the proper templated value and preferably use the precomputed keycloak_external_url: update jwks.uri and issuer to build URLs using {{ keycloak_sso_proto }}, {{ keycloak_sso_service_name }}, {{ keycloak_sso_port }}, and {{ keycloak_sso_realm }} or, better, use {{ keycloak_external_url }} (ensuring it includes proto, host, and port) so both jwks.uri and issuer match Keycloak's external issuer value.
Signed-off-by: Fabian von Feilitzsch <[email protected]>
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: llm-proxy-config
```
`-config` seems redundant on a ConfigMap. Like having all the pod resources carry a `-pod` suffix in the name. Just a nit.
fixed
Actionable comments posted: 2
🧹 Nitpick comments (3)
roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 (1)
94-100: Review resource limits for production workloads.The current resource configuration:
- Requests: 256Mi RAM, 100m CPU
- Limits: 512Mi RAM, 500m CPU
These limits may be insufficient for LLM proxy operations under load. Consider:
- Whether 512Mi RAM is adequate for the proxy's memory footprint
- If 500m CPU (0.5 cores) provides acceptable response times
- Whether these should be configurable via variables (similar to other components in the codebase)
Consider making resources configurable:
In
roles/tackle/defaults/main.yml, add:kai_llm_proxy_container_requests_cpu: "100m" kai_llm_proxy_container_requests_memory: "256Mi" kai_llm_proxy_container_limits_cpu: "500m" kai_llm_proxy_container_limits_memory: "512Mi"Then update the template:
resources: requests: - memory: "256Mi" - cpu: "100m" + memory: "{{ kai_llm_proxy_container_requests_memory }}" + cpu: "{{ kai_llm_proxy_container_requests_cpu }}" limits: - memory: "512Mi" - cpu: "500m" + memory: "{{ kai_llm_proxy_container_limits_memory }}" + cpu: "{{ kai_llm_proxy_container_limits_cpu }}"roles/tackle/defaults/main.yml (2)
312-319: LGTM: LLM proxy feature flags and service configuration.

The defaults provide sensible values:

- `kai_llm_proxy_enabled: false` (opt-in)
- Internal service URL with explicit port
- Versioned container image reference

Note that `llamastack/distribution-starter:latest` uses a mutable tag. Consider pinning to a specific version for production deployments to ensure reproducibility, for example as sketched below.
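For instance, a pinned default might look like this (the variable name and version are hypothetical, not from the PR):

```yaml
kai_llm_proxy_image: "llamastack/distribution-starter:0.2.0"  # pinned tag instead of :latest
```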
337-348: Review API key environment variable mapping.

The `kai_llm_proxy_api_key_env_map` provides mappings for common providers. The fallback logic on line 348 constructs `<PROVIDER>_API_KEY` by uppercasing the provider name.

Consider edge cases:

- Provider names with hyphens (e.g., "azure-openai") would become "AZURE-OPENAI_API_KEY" (an invalid env var name)
- Multi-word providers may not follow the simple uppercase pattern

For robustness, document the expected provider name format or add validation.
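As an illustrative guard (not in the PR), the fallback could normalize the provider name before building the env var:

```jinja
{# e.g. "azure-openai" -> "AZURE_OPENAI_API_KEY" #}
{% set api_key_env = kai_llm_proxy_api_key_env_map.get(
     kai_llm_provider,
     kai_llm_provider | upper | regex_replace('[^A-Z0-9]', '_') ~ '_API_KEY') %}
```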
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)

- roles/tackle/defaults/main.yml (2 hunks)
- roles/tackle/tasks/kai-status.yml (1 hunks)
- roles/tackle/tasks/kai.yml (2 hunks)
- roles/tackle/templates/deployment-hub.yml.j2 (2 hunks)
- roles/tackle/templates/deployment-ui.yml.j2 (1 hunks)
- roles/tackle/templates/kai/kai-db-deployment.yaml.j2 (2 hunks)
- roles/tackle/templates/kai/llm-proxy-client-configmap.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2 (1 hunks)
- roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- roles/tackle/tasks/kai.yml
- roles/tackle/templates/kai/llm-proxy-configmap.yaml.j2
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run-ci / e2e-api-integration-tests
🔇 Additional comments (14)
roles/tackle/templates/kai/kai-db-deployment.yaml.j2 (2)
6-6: LGTM: Namespace formatting change.

The unquoted namespace is consistent with standard Jinja2 templating practices.

31-45: The review comment's premise is incorrect; these are not duplicate environment variables.

The code shows an intentional mapping, not duplication:

- The secret stores only `POSTGRESQL_*` keys (`POSTGRESQL_DATABASE`, `POSTGRESQL_PASSWORD`, `POSTGRESQL_USER`, `POSTGRESQL_HOST`)
- The deployment surfaces these as `POSTGRES_*` environment variables, which the official PostgreSQL Docker image expects

This is a necessary conversion between naming conventions. The secret uses `POSTGRESQL_` prefixes (common in Keycloak-based setups), while the PostgreSQL image requires `POSTGRES_`-prefixed environment variables. No action is needed.

Likely an incorrect or invalid review comment.
roles/tackle/templates/deployment-ui.yml.j2 (1)
97-100: LGTM: Clean conditional environment variable.

The `KAI_LLM_PROXY_URL` environment variable is properly gated by `kai_llm_proxy_enabled` and follows the established pattern for conditional feature configuration.

roles/tackle/templates/deployment-hub.yml.j2 (2)
132-133: LGTM: Keycloak audience configuration.

The `KEYCLOAK_AUDIENCE` environment variable is properly placed within the Keycloak configuration block and sourced from the `keycloak_api_audience` default value.

189-192: LGTM: LLM proxy configuration mapping.

The `LLM_PROXY_CONFIG_MAP` environment variable is properly gated by `kai_llm_proxy_enabled` and references the client ConfigMap name, enabling the Hub to serve the LLM proxy configuration to clients.

roles/tackle/tasks/kai-status.yml (3)
23-29: Review complex readiness logic.

The readiness check ensures all replicas are ready and updated. However, the expression is complex and spans multiple lines. Consider the following:

- Line 28: The condition checks `readyReplicas == replicas`, but defaults replicas to 1 if undefined. Verify this default is appropriate; typically deployment replicas should be explicitly defined.
- Line 29: Similar default for the `updatedReplicas` comparison.
- The compound boolean spanning multiple lines may benefit from intermediate variables for clarity.

The logic appears correct, but confirm the default values align with expected deployment behavior.

32-38: LGTM: Blocking and deployment status logic.

The logic correctly determines if Kai is blocked by the proxy (proxy enabled but not ready) and separately tracks Kai API deployment readiness. The compound conditions properly cascade to the status updates.
40-87: LGTM: Comprehensive status condition updates.

The three status conditions (`LLMProxyReady`, `KaiSolutionServerReady`, `KaiAPIKeysConfigured`) are properly structured with:

- Correct status, reason, and message fields
- Proper timestamp formatting
- Appropriate gating (LLMProxyReady only when enabled)
- Clear user-facing messages for troubleshooting

The implementation aligns with Kubernetes status conventions; see the sketch of a rendered condition below.
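For reference, a condition produced by this pattern might render in the CR status roughly like this (field values illustrative):

```yaml
status:
  conditions:
    - type: LLMProxyReady
      status: "True"
      reason: DeploymentReady
      message: "llm-proxy deployment has all replicas ready"
      lastTransitionTime: "2024-01-01T00:00:00Z"
```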
roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 (3)
24-26: LGTM: Config-driven pod restart mechanism.

The checksum annotation ensures pods restart when the ConfigMap changes, which is essential for configuration updates. This is a standard pattern for configuration-driven deployments.

62-68: LGTM: Self-signed certificate handling.

The conditional SSL certificate configuration properly handles self-signed certificates when Keycloak uses HTTPS, setting both `SSL_CERT_FILE` and `REQUESTS_CA_BUNDLE` environment variables.

74-82: LGTM: Dynamic API key injection.

The template dynamically injects all keys from the `kai-api-keys` secret as environment variables, providing flexibility for different LLM providers. This pattern correctly handles the case where no secret exists.

roles/tackle/defaults/main.yml (3)
123-123: LGTM: Keycloak API audience default.

The `keycloak_api_audience` default value aligns with the PR objectives to integrate JWT validation against Keycloak.

294-303: LGTM: Proxy-specific LLM parameters.

The `kai_llm_params_proxy` structure properly mirrors `kai_llm_params` while pointing to the local proxy service. The model format (`provider_id/model`) and `model_provider: "openai_api"` indicate OpenAI-compatible API usage through the proxy.

321-335: Verify that the silent default on line 335 doesn't mask configuration errors.

The fallback format `remote::<provider_name>` is correct for llama-stack. However, line 335 defaults `kai_llm_provider` to `"openai"` when null, which silently masks missing or invalid configuration:

```yaml
kai_llm_proxy_provider_id: "{{ kai_llm_provider | default('openai') }}"
```

Consider raising an error or requiring explicit configuration instead of silently defaulting to `"openai"`, as sketched below.
roles/tackle/templates/kai/llm-proxy-client-configmap.yaml.j2

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ kai_llm_proxy_client_config_name }}
  namespace: {{ app_namespace }}
  labels:
    app.kubernetes.io/name: llm-proxy-client-config
    app.kubernetes.io/component: kai
    app.kubernetes.io/part-of: tackle
data:
  config.json: |
    {
      "model": "{{ kai_llm_proxy_provider_id }}/{{ kai_llm_model }}"
    }
```
🧩 Analysis chain
Verify the model format matches LLM proxy expectations.

The ConfigMap defines a simple client configuration with the model field formatted as `"{{ kai_llm_proxy_provider_id }}/{{ kai_llm_model }}"`. Ensure this format (`provider/model`) is compatible with the LLM proxy service and client implementation.

Additionally, confirm that `kai_llm_model` will always be defined when `kai_llm_proxy_enabled` is true, as this template doesn't handle the null case.
🏁 Script executed:

```bash
#!/bin/bash
# Search for LLM proxy client usage patterns and model format expectations
echo "Checking for LLM proxy model format references..."
rg -nC5 'kai_llm_proxy_provider_id|kai_llm_model|config\.json'
echo -e "\nChecking for null handling in related templates..."
rg -nC3 'kai_llm_model.*null|kai_llm_model\s*\|'
```

Length of output: 9643
Looking at the script output, I need to verify when the llm-proxy-client-configmap.yaml.j2 template is rendered and whether it's conditionally protected.
🏁 Script executed:

```bash
#!/bin/bash
# Find where llm-proxy-client-configmap.yaml.j2 is referenced/used
echo "=== Searching for llm-proxy-client-configmap template references ==="
rg -nC5 'llm-proxy-client-configmap'
echo -e "\n=== Searching for kai_llm_proxy_enabled definition and usage ==="
rg -nC3 'kai_llm_proxy_enabled'
echo -e "\n=== Checking conditionals in KAI-related role files ==="
fd -e yml -e j2 'kai|llm' roles/tackle/templates/kai/ --exec grep -l 'if kai_llm' {} \;
```

Length of output: 5586
🏁 Script executed:

```bash
#!/bin/bash
# Also check how this template is registered in the role
echo "=== Checking role tasks for template handling ==="
fd -e yml 'main\.yml|tasks' roles/tackle/ --exec grep -l 'llm-proxy-client' {} \;
```

Length of output: 80
🏁 Script executed:

```bash
#!/bin/bash
# Find where this template is used in role tasks
echo "=== Searching for template task that renders llm-proxy-client-configmap ==="
rg -n 'llm-proxy-client-configmap' roles/tackle/tasks/
echo -e "\n=== Searching all tasks/main.yml files in tackle role ==="
cat roles/tackle/tasks/main.yml | head -100
echo -e "\n=== Looking for kai_llm_proxy_enabled in tasks ==="
rg -n 'kai_llm_proxy_enabled' roles/tackle/tasks/
```

Length of output: 3921
🏁 Script executed:

```bash
#!/bin/bash
# Search for validation or interdependencies between kai_llm_proxy_enabled and kai_llm_model
echo "=== Looking for validation logic or guards ==="
rg -nC5 'kai_llm_proxy_enabled.*kai_llm_model|kai_llm_model.*kai_llm_proxy_enabled'
echo -e "\n=== Checking if kai_llm_model has other conditional checks ==="
rg -nB3 -A3 'kai_llm_model' roles/tackle/tasks/kai.yml
echo -e "\n=== Searching in entire kai.yml for related context ==="
cat roles/tackle/tasks/kai.yml | grep -A20 'when: kai_llm_proxy_enabled'
```

Length of output: 796
Add a null check for kai_llm_model to prevent invalid JSON when the proxy is enabled without a model specified.

The ConfigMap template uses `kai_llm_model` without null protection, unlike the related `llm-proxy-configmap.yaml.j2` template, which wraps the model reference in a `{% if kai_llm_model %}` check. If `kai_llm_proxy_enabled` is true but `kai_llm_model` remains null (from defaults), the rendered JSON becomes invalid: `"model": "openai/None"`.

Add the same null-checking pattern from the sibling template to ensure the ConfigMap is only created when both conditions are met.
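A minimal sketch of the guarded data section, mirroring the sibling template's `{% if kai_llm_model %}` pattern:

```jinja
data:
  config.json: |
{% if kai_llm_model %}
    {
      "model": "{{ kai_llm_proxy_provider_id }}/{{ kai_llm_model }}"
    }
{% else %}
    {}
{% endif %}
```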
```yaml
startupProbe:
  httpGet:
    path: /v1/health
    port: 8321
  failureThreshold: 30
  periodSeconds: 10
  initialDelaySeconds: 10
livenessProbe:
  httpGet:
    path: /v1/health
    port: 8321
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /v1/health
    port: 8321
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
```
🧩 Analysis chain
Verify health endpoint and probe timings.

The deployment uses `/v1/health` for all probes with the following configuration:

- Startup: 30 failures × 10s = 5 minutes max startup time
- Liveness: 30s period, 3 failures before restart
- Readiness: 10s period

Confirm that:

- The `/v1/health` endpoint exists in the `llamastack/distribution-starter:latest` image
- The 5-minute startup window is sufficient for model initialization
- The liveness failure threshold (90 seconds) allows for temporary slow responses without unnecessary restarts
🌐 Web query:
What health check endpoints does llama-stack distribution-starter provide?
💡 Result:
The starter distribution exposes HTTP health endpoints under /v1, specifically:
- GET /v1/health (generic health). [1]
- GET /v1/health/ready (readiness). [2]
- GET /v1/health/live (liveness). [2]
Sources: starter/bring-your-own docs showing /v1/health and the API reference showing /v1/health/ready and /v1/health/live. [1][2]
Use dedicated health check endpoints for readiness and liveness probes.

The `/v1/health` endpoint exists, but llama-stack distribution-starter also provides dedicated `/v1/health/ready` and `/v1/health/live` endpoints. Update the configuration to use:

- `readinessProbe` → `/v1/health/ready`
- `livenessProbe` → `/v1/health/live`
- `startupProbe` → `/v1/health` (or `/v1/health/live`)

Also verify through deployment testing that:

- The 5-minute startup window is sufficient for model initialization
- The 90-second liveness failure threshold (30s period × 3 failures) allows for temporary slow responses

A sketch of the adjusted probes follows.
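The sketch below uses the endpoints from the llama-stack docs quoted above, with timings carried over from the current template:

```yaml
readinessProbe:
  httpGet:
    path: /v1/health/ready
    port: 8321
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 3
livenessProbe:
  httpGet:
    path: /v1/health/live
    port: 8321
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3
```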
🤖 Prompt for AI Agents
In roles/tackle/templates/kai/llm-proxy-deployment.yaml.j2 around lines 35 to
56, the probes all point to /v1/health; change them so readinessProbe uses
/v1/health/ready, livenessProbe uses /v1/health/live, and keep startupProbe
pointed at /v1/health (or /v1/health/live if preferred), then ensure the
existing timing values reflect desired behavior (confirm startupProbe
initialDelaySeconds and failureThreshold provide ~5 minutes for model init, and
that liveness periodSeconds × failureThreshold (~90s) tolerates transient
slowness) and run deployment tests to validate the timings and endpoints return
expected HTTP success codes.
Signed-off-by: Fabian von Feilitzsch <[email protected]>
Signed-off-by: Fabian von Feilitzsch <[email protected]>
```yaml
- name: KAI_URL
  value: "{{ kai_url }}"
{% if kai_llm_proxy_enabled %}
- name: LLM_PROXY_CONFIG_MAP
```
This isn't necessary. The hub endpoint serves the configmap by name.
Yep, missed that on cleanup.
```yaml
- name: TACKLE_HUB_URL
  value: "{{ hub_url }}"
{% if kai_llm_proxy_enabled|bool %}
- name: KAI_LLM_PROXY_URL
```
I thought we decided the new llm proxy service would be mapped in the main rev proxy instead of brokering through the hub.
Is this not setting it in the main reverse proxy? I set this envvar to be used here: https://github.com/konveyor/tackle2-ui/pull/2743/files
Yes. My mistake. Thought it was the hub deployment.
Signed-off-by: Fabian von Feilitzsch <[email protected]>
jortel
left a comment
LGTM
Defer to @jmontleon regarding general ansible-operator things.
Closes #487
Depends on konveyor/tackle2-hub#953
Still TODO: figure out the proper way to set the ingress host, etc., and create a provider-settings configmap that the hub can serve to users.