Add Qwen3-TTS VoiceDesign vLLM-Omni launcher by yfchoco208 · Pull Request #135 · swiss-ai/model-launch

yfchoco208 · 2026-05-20T04:36:17Z

Adds examples/clariden/cli/qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign-vllm-omni.sh, a single-node launcher for Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign that serves text-to-speech via vLLM-Omni on Clariden GH200.

Adds images/vllm_qwen3_tts_cuda13/Dockerfile and src/swiss_ai_model_launch/assets/envs/vllm_qwen3_tts_cuda13.toml, a CUDA13 vLLM-Omni TTS environment with vllm==0.20.2, vllm-omni==0.20.0, transformers==5.8.0, and audio dependencies such as FFmpeg, libsndfile, and soundfile.

Adds Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign to src/swiss_ai_model_launch/assets/models.json, an interactive SML catalog entry using vLLM-Omni with --max-model-len 8192 and --gpu-memory-utilization 0.40. VoiceDesign was tested with task_type=VoiceDesign and text instructions rather than preset CustomVoice speakers.
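For readers unfamiliar with the two flags, here is a minimal sketch of how the serve command for this entry might be assembled. The helper name build_serve_command is hypothetical and is not part of SML; the actual catalog entry lives in models.json and may be structured differently. Only the model name, `vllm serve`, and the two flag values come from this PR.

```python
import shlex


def build_serve_command(model: str, max_model_len: int, gpu_memory_utilization: float) -> str:
    """Assemble a `vllm serve` command line (hypothetical helper, for illustration only)."""
    args = [
        "vllm",
        "serve",
        model,
        "--max-model-len",
        str(max_model_len),
        "--gpu-memory-utilization",
        f"{gpu_memory_utilization:.2f}",
    ]
    # shlex.join quotes any argument that would be unsafe in a shell.
    return shlex.join(args)


command = build_serve_command("Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign", 8192, 0.40)
print(command)
```

Any additional flags that vLLM-Omni may need are deliberately left out, since they are not stated in this description.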

Also adds vllm-omni as a supported framework where required, matching the existing vLLM-Omni serving pattern.

Validated from a clean checkout:

sml advanced launch works
interactive sml catalog launch works

AryanAhadinia

Thanks a lot for your contribution! We would love to merge your PR once the comments below are addressed. Keep up the great work!

Please also note that your PR now has conflicts that must be resolved before merging.

AryanAhadinia · 2026-05-20T19:02:42Z

+    vllm-omni)
+        FRAMEWORK_ENV_SETUP="export RAY_CGRAPH_get_timeout=1800; export no_proxy=\"0.0.0.0,\$no_proxy\"; export NO_PROXY=\"0.0.0.0,\$NO_PROXY\""
+        FRAMEWORK_LAUNCH="vllm serve"
+        ;;


This case seems redundant to me, as it is identical to the vLLM case. We may replace python3 -m vllm.entrypoints.openai.api_server with vllm serve, since the two are identical and the former is deprecated. Please note, however, that we massively refactored the codebase in #100 and the template.jinja file has been removed entirely. The job script is now rendered at runtime in framework.py.

AryanAhadinia · 2026-05-20T19:04:26Z


    model: str
-    framework: Literal["sglang", "vllm"]
+    framework: Literal["sglang", "vllm", "vllm-omni"]


Adding vLLM-Omni beside vLLM as a new framework should be well justified. Our long-term vision is to have two golden base images, one for vLLM and one for SGL (ref: #118). I would therefore suggest dropping vllm-omni as a new framework for now and just using (--environment/--slurm-environment) to specify which toml file you want to use.

Thank you for clarifying. I will remove vllm-omni as a new framework and stick to the original vllm.
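The suggestion above can be sketched as follows: the framework stays one of the two existing values, and the TTS-specific setup is selected through a separate environment file. This is a minimal sketch under assumptions. LaunchSpec and its environment field are hypothetical names and do not mirror the real SML class; only the two Literal values and the idea of pointing at a toml file come from the review.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

# The framework set stays as it was before this PR.
Framework = Literal["sglang", "vllm"]


@dataclass(frozen=True)
class LaunchSpec:
    """Hypothetical launch description, for illustration only."""

    model: str
    framework: Framework
    # Assumed stand-in for the value passed via --environment/--slurm-environment.
    environment: Optional[str] = None

    def __post_init__(self) -> None:
        # Literal types are not enforced at runtime, so check explicitly.
        if self.framework not in get_args(Framework):
            raise ValueError(f"unsupported framework: {self.framework!r}")


spec = LaunchSpec(
    model="Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign",
    framework="vllm",
    environment="vllm_qwen3_tts_cuda13.toml",
)
```

With this shape, supporting a model that needs extra dependencies means pointing at a different environment file rather than widening the framework type.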

AryanAhadinia · 2026-05-20T19:10:04Z

Isn't it possible to patch the current vLLM image?

Just to clarify, what do you mean by “patch the current vLLM image”?

I'm not sure if you meant one of the following:

Use the existing Docker vLLM CUDA13 base image, if it exists, and build a derived image, vllm_qwen3_tts_cuda13, that only adds vllm-omni and the audio dependencies.

Modify the current vllm_cuda13 Dockerfile (image) itself to include vllm-omni and audio dependencies.

The second one. In general, we are working toward keeping the number of images and environments as small as possible, so adding a new image and environment for only a small class of models is not well aligned with our long-term goals.

I worked on patching the existing images/vllm_cuda13/Dockerfile.

Removed the idea of adding vllm-omni as a new SML framework.

Switched the Qwen3-TTS entry to use the existing framework: vllm and made the launch path use the existing vllm environment pattern.

One compatibility issue I found is that adding vllm-omni==0.20.0 on top of the current vllm cuda13 image caused an import failure, because the current image has vllm 0.21.1rc..., whereas vllm-omni==0.20.0 expects the vLLM 0.20 API layout. I therefore validated the following combination:

vllm==0.20.2

vllm-omni==0.20.0

transformers==5.8.0
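The mismatch described above could be caught before building an image with a small version check. This is a sketch under one assumption inferred from this comment, namely that vllm-omni needs a vllm release with the same major.minor version. That rule is not a documented guarantee, and the helper names are hypothetical.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.20.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_release_line(vllm_version: str, omni_version: str) -> bool:
    # Assumed rule: the two packages must agree on major.minor.
    return major_minor(vllm_version) == major_minor(omni_version)


# The combination validated in this comment.
validated_pair_ok = same_release_line("0.20.2", "0.20.0")
```

A check like this only flags an obvious release-line mismatch; it does not replace actually importing both packages in the built image.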

I tested the patched vllm_cuda13 image using a temporary .sqsh so that I did not overwrite the shared image. Qwen3-TTS VoiceDesign starts and /v1/audio/speech generates WAV output. A normal vLLM text model (swiss-ai/Apertus-8B-Instruct-2509) also starts with --enforce-eager, and /v1/chat/completions returns a valid response.
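A smoke test along these lines might build a VoiceDesign request body and check that the returned bytes look like a WAV file. This is a minimal sketch that makes no network call. The field names "instructions" and "response_format" are assumptions about the request schema, not confirmed by this PR; only the endpoint path, the model name, and task_type=VoiceDesign come from the thread. The RIFF/WAVE header check follows the standard WAV container layout.

```python
import io
import wave


def build_speech_request(text: str, voice_description: str) -> dict:
    """Request body for POST /v1/audio/speech (field names partly assumed)."""
    return {
        "model": "Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign",
        "input": text,
        "task_type": "VoiceDesign",
        "instructions": voice_description,  # assumed field name
        "response_format": "wav",  # assumed field name
    }


def looks_like_wav(data: bytes) -> bool:
    # A WAV file starts with 'RIFF', a 4-byte size, then 'WAVE'.
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def make_silent_wav(num_frames: int = 160, sample_rate: int = 16000) -> bytes:
    """Produce a tiny in-memory WAV, standing in for a server response."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


payload = build_speech_request("Hello from Clariden.", "A calm, low-pitched voice.")
response_is_wav = looks_like_wav(make_silent_wav())
```

In a real check, the bytes passed to looks_like_wav would be the HTTP response body from the running server rather than the synthetic sample.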

Since the patch seems to be working with Qwen3-TTS and other models that use the docker image, can I proceed with patching the existing vllm_cuda13 Dockerfile?

I just want to verify whether it is okay to pin the shared vllm cuda13 image to vllm==0.20.2 for compatibility with vllm-omni==0.20.0.

Thanks a lot! I would suggest creating a new Dockerfile, vllm_cuda13_v2, and putting your patched Dockerfile there. Since the changes are now minimal, it should hopefully be easy for us to replace the original image with yours.

There are ongoing efforts to fix some bugs we have with vLLM (#126) and to build golden Docker images (#118 and #93). The complete replacement of the current vLLM image with yours will take place once those issues and PRs are resolved.

Please note that the CI pipeline should automatically build the Docker image for you and place it alongside the other Docker images.

Again, thanks a lot for your great work!

Hi, I've pushed the updated version, but the two remaining CI failures seem likely to be related to repository CI configuration/key.

For Docker Build vllm_cuda13_v2, the job seems to fail during FirecREST initialization, before the image build, with this error message:

requests.exceptions.MissingSchema: Invalid URL '': No scheme supplied. Perhaps you meant https://?

I manually built and tested vllm_cuda13_v2 on Clariden using a temporary sqsh and toml, but the official GitHub image build seems to require repository FirecREST secrets to build vllm_cuda13_v2.sqsh.
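The MissingSchema error above is what the requests library raises when it is handed a URL with no scheme, and the empty string in the message is consistent with a CI secret that was not available to the job. A minimal sketch of a guard that would turn this into a clearer failure is shown below. The function name and the FIRECREST_URL setting name are hypothetical and are not taken from the repository's CI scripts.

```python
from urllib.parse import urlparse


def validate_base_url(url: str, name: str) -> str:
    """Fail early with a readable message if a configured URL is unusable."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"{name} is empty or lacks an http(s) scheme: {url!r}. "
            "Check that the corresponding CI secret is available to this job."
        )
    return url


# Hypothetical setting name; an unset secret typically arrives as ''.
try:
    validate_base_url("", "FIRECREST_URL")
    guard_message = ""
except ValueError as error:
    guard_message = str(error)
```

A guard like this does not fix the missing configuration, but it points at the likely cause instead of surfacing a low-level URL parsing error.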

For SonarCloud / analyze, the message also suggests a missing token/project permission issue:

Warning: Running this GitHub Action without SONAR_TOKEN is not recommended

Project not found. Please check the 'sonar.projectKey' and 'sonar.organization' properties, the 'SONAR_TOKEN' environment variable, or contact the project administrator to check the permissions of the user the token belongs to

I would greatly appreciate your help in resolving this, whether it turns out to be a problem with the Dockerfile or a missing secret/key. Thank you!

yfchoco208 force-pushed the add-qwen3-tts-voicedesign branch 2 times, most recently from d1fca8c to b7f21f4 Compare May 20, 2026 05:06

AryanAhadinia requested changes May 20, 2026


AryanAhadinia assigned yfchoco208 May 20, 2026

AryanAhadinia added the model-support Adding support for a new model label May 20, 2026

yfchoco208 force-pushed the add-qwen3-tts-voicedesign branch from b7f21f4 to 94adb31 Compare May 23, 2026 16:55

Added Qwen3-TTS VoiceDesign vLLM-Omni launcher

6510e4a

yfchoco208 force-pushed the add-qwen3-tts-voicedesign branch from 94adb31 to 6510e4a Compare May 23, 2026 17:04

yfchoco208 wants to merge 1 commit into swiss-ai:main from yfchoco208:add-qwen3-tts-voicedesign