cld2labs/TinyLlama-1.1B-Chat-v1.0#94

Open
arpannookala-12 wants to merge 6 commits into opea-project:main from cld2labs:cld2labs/TinyLlama-1.1B-Chat-v1.0

@arpannookala-12
Contributor

Summary

  • Adds model card for TinyLlama-1.1B-Chat-v1.0 (TinyLlama Project) under third_party/Dell/model-deployment/TinyLlama-1.1B-Chat-v1.0/
  • Adds Helm-based deployment guide for deploying TinyLlama-1.1B-Chat-v1.0 via vLLM on CPU (Xeon) with Keycloak OIDC and APISIX ingress

…r Dell EI

Signed-off-by: arpannookala-12 <ganesh.arpan.nookala@cloud2labs.com>
@alexsin368 alexsin368 self-requested a review April 29, 2026 04:17
Comment thread third_party/Dell/model-deployment/TinyLlama-1.1B-Chat-v1.0/deployment.md Outdated
Collaborator

@alexsin368 alexsin368 left a comment

Reviewed TinyLlama. Suggested methods for a cleaner user experience.

Comment thread third_party/Dell/model-deployment/TinyLlama-1.1B-Chat-v1.0/deployment.md Outdated
## Step 2: Deploy Tinyllama-1.1b-chat-v1.0 Model

```bash
helm install tinyllama-1-1b-cpu ./core/helm-charts/vllm \
```
Collaborator

Getting an error saying the ingress for this model already exists. Even after uninstalling the model with helm and confirming the ingress for tinyllama is deleted, rerunning the helm install command results in this error:

Error: INSTALLATION FAILED: 1 error occurred:
* ingresses.networking.k8s.io "tinyllama-1-1b-cpu-vllm-ingress" already exists

Contributor

Alex, please verify that the ingress is deleted by running `kubectl get ingress` after `helm uninstall`.
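
One way to run that check is sketched below, under the assumption that a leftover object could live in a namespace other than the current one. Plain `kubectl get ingress` lists only the current namespace, so this helper searches all namespaces with `-A`. The function name `find_ingress` and the `KUBECTL` override are hypothetical, introduced only for illustration; they are not part of this PR or of the Helm chart.

```bash
# Hypothetical helper: print "namespace/name" for every ingress with the
# given name, across all namespaces. KUBECTL can be overridden (assumption
# made so the function can be exercised without a cluster).
KUBECTL="${KUBECTL:-kubectl}"

find_ingress() {
  local name
  name="$1"
  # With -A and --no-headers, column 1 is the namespace and column 2 the name.
  "$KUBECTL" get ingress -A --no-headers 2>/dev/null |
    awk -v n="$name" '$2 == n { print $1 "/" $2 }'
}
```

Calling `find_ingress tinyllama-1-1b-cpu-vllm-ingress` prints one line per match and nothing when no ingress with that name remains.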

Contributor

I tried to replicate the issue. I deployed TinyLlama with the helm command, and running `helm uninstall` also removed the ingress.

Redeploying TinyLlama a second time didn't give me any error. Below is the output:

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
keycloak nginx api.example.com 80, 443 28m
tinyllama-1-1b-cpu-vllm-ingress alb api.example.com 80, 443 6m39s

user@ubuntuxeon2:~/Enterprise-Inference$ helm uninstall tinyllama-1-1b-cpu
release "tinyllama-1-1b-cpu" uninstalled

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
keycloak nginx api.example.com 80, 443 36m

user@ubuntuxeon2:~/Enterprise-Inference$ kubectl get pods
NAME READY STATUS RESTARTS AGE
keycloak-0 1/1 Running 0 36m
keycloak-postgresql-0 1/1 Running 0 36m

user@ubuntuxeon2:~/Enterprise-Inference$ helm install tinyllama-1-1b-cpu ./core/helm-charts/vllm --values ./core/helm-charts/vllm/xeon-values.yaml --set LLM_MODEL_ID="TinyLlama/TinyLlama-1.1B-Chat-v1.0" --set global.HUGGINGFACEHUB_API_TOKEN="$HUGGING_FACE_HUB_TOKEN" --set ingress.enabled=true --set ingress.secretname="${BASE_URL}" --set ingress.host="${BASE_URL}" --set oidc.client_id="$KEYCLOAK_CLIENT_ID" --set oidc.client_secret="$KEYCLOAK_CLIENT_SECRET" --set apisix.enabled=true --set tensor_parallel_size="1" --set pipeline_parallel_size="1"
NAME: tinyllama-1-1b-cpu
LAST DEPLOYED: Wed May 6 15:42:36 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
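
The sequence in the output above (uninstall, confirm the ingress is gone, then reinstall) can be guarded with a small polling helper. This is a sketch, not something from the PR: the function name `wait_ingress_gone` and the `KUBECTL` and `POLL_SECONDS` variables are hypothetical. It assumes the ingress lives in the current namespace and treats any non-zero exit from `kubectl get` as "not found", so a connection error would also count as gone.

```bash
# Hypothetical helper: poll until the named ingress no longer exists.
# Returns 0 once it is gone, 1 if it is still present after all attempts.
KUBECTL="${KUBECTL:-kubectl}"

wait_ingress_gone() {
  local name tries
  name="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # `kubectl get` exits non-zero when the object cannot be fetched.
    "$KUBECTL" get ingress "$name" >/dev/null 2>&1 || return 0
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-2}"
  done
  return 1
}
```

A possible use is `helm uninstall tinyllama-1-1b-cpu && wait_ingress_gone tinyllama-1-1b-cpu-vllm-ingress` before rerunning the `helm install` command, so the reinstall starts only after the old ingress has disappeared.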

3 participants