cld2labs/TinyLlama-1.1B-Chat-v1.0 by arpannookala-12 · Pull Request #94 · opea-project/Enterprise-Inference

arpannookala-12 · 2026-04-21T19:44:23Z

Summary

Adds model card for TinyLlama-1.1B-Chat-v1.0 (TinyLlama Project) under third_party/Dell/model-deployment/TinyLlama-1.1B-Chat-v1.0/
Adds Helm-based deployment guide for deploying TinyLlama-1.1B-Chat-v1.0 via vLLM on CPU (Xeon) with Keycloak OIDC and APISIX ingress


alexsin368

Reviewed TinyLlama. Suggested methods for cleaner user experience.

alexsin368 · 2026-05-05T23:07:27Z

+## Step 2: Deploy Tinyllama-1.1b-chat-v1.0 Model
+
+```bash
+helm install tinyllama-1-1b-cpu ./core/helm-charts/vllm \


Getting an error saying the ingress for this model already exists. Even after uninstalling the model with helm and confirming the ingress for tinyllama is deleted, rerunning the helm install command results in this error:

Error: INSTALLATION FAILED: 1 error occurred:
* ingresses.networking.k8s.io "tinyllama-1-1b-cpu-vllm-ingress" already exists

Alex, please verify that the ingress is deleted by running `kubectl get ingress` after `helm uninstall`.
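A minimal sketch of that check, for anyone who wants to script it. The helper name `ingress_listed` is hypothetical and not part of this PR, and the sketch assumes the release was installed in the default namespace. It reads `kubectl get ingress --no-headers` style output on stdin and succeeds only if the first column matches the given ingress name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): exit 0 if the ingress named in $1
# appears in the NAME column of `kubectl get ingress --no-headers` output
# supplied on stdin, exit 1 otherwise (including when stdin is empty).
ingress_listed() {
  awk -v n="$1" '$1 == n { found = 1 } END { exit !found }'
}

# Usage against a live cluster (default namespace assumed):
#   helm uninstall tinyllama-1-1b-cpu
#   if kubectl get ingress --no-headers | ingress_listed tinyllama-1-1b-cpu-vllm-ingress; then
#     echo "ingress still present"
#   else
#     echo "ingress removed"
#   fi
```

If the ingress is still listed after the uninstall, it may be worth checking whether it lives in a different namespace than the one the release was removed from; that is a guess, not something confirmed in this thread.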

I tried to replicate the issue. I deployed TinyLlama with the helm command, and running `helm uninstall` also removed the ingress.

Redeploying TinyLlama a second time didn't give me any error. Below is the output:

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get ingress
NAME                              CLASS   HOSTS             ADDRESS   PORTS     AGE
keycloak                          nginx   api.example.com             80, 443   28m
tinyllama-1-1b-cpu-vllm-ingress   alb     api.example.com             80, 443   6m39s

user@ubuntuxeon2:~/Enterprise-Inference$ helm uninstall tinyllama-1-1b-cpu
release "tinyllama-1-1b-cpu" uninstalled

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get ingress
NAME       CLASS   HOSTS             ADDRESS   PORTS     AGE
keycloak   nginx   api.example.com             80, 443   36m

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get pods
NAME                    READY   STATUS    RESTARTS   AGE
keycloak-0              1/1     Running   0          36m
keycloak-postgresql-0   1/1     Running   0          36m

user@ubuntuxeon2:~/Enterprise-Inference$ helm install tinyllama-1-1b-cpu ./core/helm-charts/vllm --values ./core/helm-charts/vllm/xeon-values.yaml --set LLM_MODEL_ID="TinyLlama/TinyLlama-1.1B-Chat-v1.0" --set global.HUGGINGFACEHUB_API_TOKEN="$HUGGING_FACE_HUB_TOKEN" --set ingress.enabled=true --set ingress.secretname="${BASE_URL}" --set ingress.host="${BASE_URL}" --set oidc.client_id="$KEYCLOAK_CLIENT_ID" --set oidc.client_secret="$KEYCLOAK_CLIENT_SECRET" --set apisix.enabled=true --set tensor_parallel_size="1" --set pipeline_parallel_size="1"
NAME: tinyllama-1-1b-cpu
LAST DEPLOYED: Wed May 6 15:42:36 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
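The install command in the transcript reads several environment variables (`HUGGING_FACE_HUB_TOKEN`, `BASE_URL`, `KEYCLOAK_CLIENT_ID`, `KEYCLOAK_CLIENT_SECRET`). If one of them is unset, the shell passes an empty value to `--set` without complaint. A small guard like the one below could rule that out when comparing runs between environments. The function name `require_env` is hypothetical and not part of the PR or the chart.

```shell
#!/bin/sh
# Hypothetical guard (not part of the PR): returns 1 and reports on stderr
# if any of the named environment variables is unset or empty.
require_env() {
  missing=0
  for var in "$@"; do
    # Look up the variable whose name is stored in $var.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage before the install shown in the transcript:
#   require_env HUGGING_FACE_HUB_TOKEN BASE_URL KEYCLOAK_CLIENT_ID KEYCLOAK_CLIENT_SECRET \
#     || exit 1
#   # then run the same helm install command as above
```

This only confirms the variables are non-empty; it does not validate their values against Keycloak or the cluster.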

feat: add TinyLlama-1.1B-Chat-v1.0 model card and deployment guide fo…

3c03b81

…r Dell EI Signed-off-by: arpannookala-12 <ganesh.arpan.nookala@cloud2labs.com>

alexsin368 self-requested a review April 29, 2026 04:17