Support vLLM v0.12.0 #438

pan-x-c · 2025-12-12T06:48:20Z

Description

This PR adds support for vLLM v0.12.0.

However, we found a significant performance drop in v0.12.0 compared to v0.11.0, with the test time increasing from 9 minutes to 12 minutes.

So, we still recommend using v0.10.2 or v0.11.0.

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2025-12-12T06:48:56Z

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on a significant upgrade of the vLLM library to version 0.12.0 and its Ray dependency to 2.50.0. This update is accompanied by an upgrade of the underlying CUDA environment in the Docker images to 13.0.2. To ensure seamless integration and continued functionality with the new vLLM version, the changes include the implementation of version-specific patching mechanisms for vLLM's API server and worker components. Additionally, a new enable_thinking parameter has been introduced in the ModelWrapper for enhanced model control, particularly for Qwen3 models, and an existing patch for prompt logprobs calculation has been refined for accuracy.

Highlights

vLLM and Ray Dependency Upgrade: The core vLLM library has been updated to version 0.12.0, and the Ray dependency has been bumped to 2.50.0, ensuring compatibility with the latest features and bug fixes.
CUDA Base Image Update: All Dockerfiles (Dockerfile, Dockerfile.megatron, Dockerfile.uv) now use the nvcr.io/nvidia/cuda:13.0.2-cudnn-devel-ubuntu22.04 base image, upgrading the CUDA environment.
Version-Specific vLLM Patching: New conditional logic and a dedicated patch file (api_patch_v12.py) have been introduced to handle API server and worker component differences between vLLM versions, specifically for v0.12.0 and older versions.
New enable_thinking Parameter: A new enbale_thinking (likely a typo for enable_thinking) parameter has been added to the ModelWrapper for potential use with Qwen3 series models, allowing for more nuanced control over model behavior.
Improved Prompt Logprobs Calculation: The vLLM worker patch now correctly applies temperature scaling to logits when calculating prompt logprobs, addressing a potential inaccuracy in previous vLLM versions.
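
To make the last highlight concrete, here is a minimal sketch of what applying temperature scaling to prompt logprobs means: each logit vector is divided by the temperature before the log-softmax is taken. This is not the code from worker_patch.py; the function names `log_softmax` and `prompt_logprobs` are hypothetical, and plain Python lists stand in for tensors.

```python
import math
from typing import List


def log_softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Return log-probabilities of `logits` after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def prompt_logprobs(
    step_logits: List[List[float]],
    token_ids: List[int],
    temperature: float = 1.0,
) -> List[float]:
    """Pick the logprob of each observed prompt token.

    Assumption (illustrative only): step_logits[i] holds the logits that
    predict token_ids[i].
    """
    return [
        log_softmax(logits, temperature)[tok]
        for logits, tok in zip(step_logits, token_ids)
    ]


if __name__ == "__main__":
    logits = [[2.0, 0.0]]
    print(prompt_logprobs(logits, [0], temperature=1.0))
    print(prompt_logprobs(logits, [0], temperature=2.0))
```

Skipping the division by the temperature yields logprobs for temperature 1.0 regardless of the sampling setting, which is the kind of mismatch the highlight describes.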

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request updates vLLM to v0.12.0 and ray to v2.50.0, which also involves upgrading the CUDA version in the Dockerfiles. The code modifications are mainly to ensure compatibility with the new vLLM version, introducing version-specific logic for API server and worker patches. My review has identified a few areas for improvement: there's a recurring typo in a new parameter name, some type hints could be restored for better code quality, a docstring needs correction, and there's a significant amount of duplicated code that could be refactored. Apart from these points, the changes are logical and well-aligned with the goal of the dependency update.
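
The version-specific logic mentioned above could look roughly like the sketch below, which picks a patch module name based on the installed vLLM version. It is an illustration under stated assumptions, not the PR's implementation: `parse_version`, `select_api_patch`, and the fallback module name `api_patch` are hypothetical; only `api_patch_v12` is named in this PR.

```python
from itertools import takewhile
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse 'X.Y.Z' into a tuple of ints.

    Local build tags ('+cu130') and pre-release suffixes ('rc1') are ignored.
    """
    core = version.split("+", 1)[0]
    numbers = []
    for piece in core.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def select_api_patch(vllm_version: str) -> str:
    """Return the name of the patch module to apply for this vLLM version."""
    if parse_version(vllm_version) >= (0, 12, 0):
        return "api_patch_v12"
    # Hypothetical name for the patch used with older versions.
    return "api_patch"


if __name__ == "__main__":
    for v in ("0.10.2", "0.11.0", "0.12.0"):
        print(v, "->", select_api_patch(v))
```

Keeping the version check in one small function means the call sites for the API server and worker patches do not each need their own comparison, which also helps with the duplicated code noted in the review.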

trinity/common/models/model.py

trinity/common/models/vllm_model.py

trinity/common/models/vllm_patch/api_patch_v12.py

trinity/common/models/vllm_patch/worker_patch.py

pan-x-c · 2025-12-12T07:37:08Z

/unittest-module-common

github-actions · 2025-12-12T07:48:02Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	36	4	0	0	0	8m 43s

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/common/vllm_test.py::TestAPIServer::test_api	The test failed in the call phase
❌ tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	The test failed in the call phase due to an assertion error
❌ tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	The test failed in the call phase
❌ tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	The test failed in the call phase

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	42.2s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	375ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	193ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	93ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	95ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	93ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.7s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	15ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	58.5s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	32.3s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	43.8s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	15.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	15.3s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	15.2s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	15.6s
tests/common/vllm_test.py::TestAPIServer::test_api	❌	15.9s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	15.8s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	❌	19.3s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	270ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	241ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	❌	14.8s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	❌	15.0s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	3m 8s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-12T08:04:45Z

/unittest-module-common

github-actions · 2025-12-12T08:15:54Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	40	0	0	0	0	8m 54s

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	43.4s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	373ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	194ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	93ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	98ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	93ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.8s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	15ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	58.6s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	32.0s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	43.9s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	15.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	15.7s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	15.2s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	15.5s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	20.7s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	15.7s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	20.7s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	234ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	231ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	17.7s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	15.8s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	3m 8s

pan-x-c · 2025-12-12T08:17:35Z

/unittest-module-common

github-actions · 2025-12-12T08:27:05Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	39	1	0	0	0	7m 25s

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	The test failed in the call phase due to an exception

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	33.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	97ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	43ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	196ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	94ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.6s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	96ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	95ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.7s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	16ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	1m 18s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	39.3s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	53.2s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	23.8s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	23.6s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	23.3s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	23.1s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	28.9s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	23.9s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	29.0s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	269ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	240ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	25.3s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	23.2s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	❌	5.2s

pan-x-c · 2025-12-12T09:02:24Z

/unittest-module-common

pan-x-c · 2025-12-12T09:06:55Z

/unittest-module-common

github-actions · 2025-12-12T09:20:44Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	40	0	0	0	0	11m 32s

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	33.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	97ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	197ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	95ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.4s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	99ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	96ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1.7s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	16ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	2ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	1m 17s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	40.8s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	50.8s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	23.4s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	23.1s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	23.1s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	22.8s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	28.7s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	23.9s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	28.9s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	233ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	227ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	25.7s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	23.6s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	4m 16s

pan-x-c · 2025-12-12T09:22:11Z

/unittest-module-trainer

pan-x-c · 2025-12-12T10:18:40Z

/unittest-all

pan-x-c · 2025-12-12T11:49:32Z

/unittest-all

github-actions · 2025-12-12T13:32:37Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
210	206	1	3	0	0	1h 40m

Failed Tests

Failed Tests ❌	Fail Message
❌ tests/explorer/explorer_test.py::ServeTest::test_serve	The test failed in the call phase due to an exception

Skipped

Tests	Status
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_std_grpo	✅	42ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_step_wise_grpo_advantage	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo	✅	5ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage	✅	3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std	✅	1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage	✅	2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_with_std_threshold	✅	2ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_abs_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_fallback	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_loss	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_same_policy	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_with_old_logprob	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_dummy_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k1_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k2_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k3_kl_fn	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_kl_loss_aggregation_modes	✅	1ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_low_var_kl_fn	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss_with_sequence_masking	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sapo_policy_loss	✅	1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss	✅	1ms
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	26.3s
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	19.8s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	5.0s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	6.9s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	7.4s
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	258ms
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	5.3s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	528ms
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	447ms
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	855ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	948ms
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	698ms
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	215ms
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	10.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	5.9s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	8.0s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	7.0s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	7.1s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	7.5s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	613ms
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	6ms
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write	✅	4.9s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	5.3s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	91ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	72ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	111ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	112ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	111ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	116ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	132ms
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	58ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	72ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	5.3s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	53ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	5.2s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	52ms
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	5.5s
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode	✅	1m 13s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	6.9s
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	1.4s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	317ms
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	1.8s
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	33.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	96ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	193ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	94ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	10.1s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	98ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	94ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	158ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	18ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	1m 33s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	39.2s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	51.8s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	24.9s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	23.5s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	24.6s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	24.0s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	29.0s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	24.3s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	29.2s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	239ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	235ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	25.9s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	24.0s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	4m 16s
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	1m 29s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 31s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 46s
tests/explorer/explorer_test.py::ServeTest::test_serve	❌	2m 45s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	14.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	14.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	22.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	29.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	14.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	14.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	14.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	14.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	14.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	14.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	18.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	24.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	27.9s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	18.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	34.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	18.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	23.2s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	19.1s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	17ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	27ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	268ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	4ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	17ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	10ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	119ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	22.9s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	23.2s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	⏭️	3ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	303ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	25.3s
tests/manager/synchronizer_test.py::TestSynchronizerExit::test_synchronizer	✅	1m 10s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer	✅	1m 55s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer	✅	2m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer	✅	2m 39s
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer	✅	2m 52s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer	✅	1m 54s
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer	✅	1m 55s
tests/service/data_juicer_test.py::TestDataJuicer::test_config	✅	1.9s
tests/service/data_juicer_test.py::TestDataJuicer::test_server_start	✅	22.0s
tests/service/data_juicer_test.py::TestDataJuicerExperiencePipeline::test_data_juicer_operators	✅	33.8s
tests/service/data_juicer_test.py::TestDataJuicerTaskPipeline::test_data_juicer_task_pipeline	✅	14.6s
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	3m 30s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	5m 16s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 50s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 54s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 44s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 46s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	2m 1s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	3m 6s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	1m 15s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	1m 11s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	1m 13s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	2m 24s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	2m 23s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	3m 2s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	3m 2s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 44s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	3m 20s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	811ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	808ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	3m 44s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 46s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	1m 25s
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_equivalent	✅	15ms
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_ground_truth	✅	2ms
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_solution_string	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_multiple_boxed_answers_in_solution	✅	2ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_not_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_not_boxed	✅	1ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_raw_and_ground_truth_boxed_equivalent	✅	1ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_extract_answer	✅	4ms
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_verify_math_answer	✅	77ms
tests/utils/eval_utils_test.py::TestEvalUtils::test_is_equiv	✅	6ms
tests/utils/log_test.py::LogTest::test_actor_log	✅	5.1s
tests/utils/log_test.py::LogTest::test_group_by_node	✅	4.9s
tests/utils/log_test.py::LogTest::test_no_actor_log	✅	908ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_0__workspace_tests_utils_plugins	✅	99ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_1_tests_utils_plugins	✅	98ms
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_0__workspace_tests_utils_plugins	✅	25.3s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_1_tests_utils_plugins	✅	25.5s
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_0__workspace_tests_utils_plugins	✅	13.8s
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_1_tests_utils_plugins	✅	13.3s
tests/utils/registry_test.py::TestRegistry::test_dynamic_import	✅	5.1s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-15T02:24:00Z

/unittest-module-explorer

github-actions · 2025-12-15T02:42:33Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
46	45	0	1	0	0	16m 28s

Skipped

Tests	Status
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	skipped ⏭️

Tests

Test Name	Status	Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	2m 1s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	1m 38s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	3m 44s
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	1m 30s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	14.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	14.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	22.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	29.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	14.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	14.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	14.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	14.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	14.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	14.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	18.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	24.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	27.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	18.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	34.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	17.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	22.9s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	19.4s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	2ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	33ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	25ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	559ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	4ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	14ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	8ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow	✅	133ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	22.8s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	22.8s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter	⏭️	1ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	299ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	25.2s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-16T04:30:39Z

/unittest-module-common

github-actions · 2025-12-16T04:44:51Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	40	0	0	0	0	12m 5s

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	42.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	96ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	42ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	194ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	94ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	32.5s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	98ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	95ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	2.0s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	16ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	1m 36s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	40.2s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	55.0s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	24.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	24.6s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	23.6s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	24.0s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	29.6s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	24.3s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	30.1s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	246ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	248ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	26.3s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	24.7s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	3m 36s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-16T07:43:21Z

/unittest-module-common

github-actions · 2025-12-16T07:54:29Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
40	40	0	0	0	0	8m 50s

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	44.0s
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	96ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	41ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	193ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	93ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3.3s
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	95ms
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	94ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	2.0s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	16ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	2ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57.8s
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	31.6s
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	43.8s
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	15.5s
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	16.0s
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	15.5s
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	16.1s
tests/common/vllm_test.py::TestAPIServer::test_api	✅	21.5s
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	16.1s
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	20.8s
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	261ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	256ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	18.3s
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	15.8s
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	3m 2s

Github Test Reporter by CTRF 💚

pan-x-c · 2025-12-16T08:53:26Z

/unittest-all

pan-x-c · 2025-12-16T09:21:10Z

/unittest-module-trainer

github-actions · 2025-12-16T10:07:17Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
22	20	0	2	0	0	43m 54s

Skipped

Tests	Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	skipped ⏭️

Tests

Test Name	Status	Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer	✅	3m 17s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer	✅	5m 1s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer	✅	1m 33s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer	✅	1m 27s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer	✅	1m 20s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer	✅	1m 20s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer	✅	1m 35s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer	✅	2m 31s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer	✅	1m 2s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer	✅	57.8s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools	✅	59.3s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode	✅	1m 53s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode	✅	1m 53s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode	✅	2m 38s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer	✅	2m 17s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer	✅	4m 27s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer	✅	2m 32s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer	⏭️	809ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer	⏭️	808ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer	✅	4m 11s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer	✅	1m 20s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer	✅	1m 10s

Github Test Reporter by CTRF 💚

pan-x-c added 2 commits December 12, 2025 14:19

support vllm v0.12

24660de

fix dockerfile

f67a537

fix logprobs test

b6b1512

gemini-code-assist bot reviewed Dec 12, 2025

pan-x-c added 2 commits December 12, 2025 14:53

fix cuda version

979468a

fix comments

a391663

pan-x-c added 3 commits December 12, 2025 15:52

fix openai client

8afaa1f

fix comments

ec329f7

fix tests

62cca82

update docker image

5df7567

fix rope parameters

46a5f83

update doc

ff843be

pan-x-c added 2 commits December 12, 2025 19:45

fix docker file

c3ff69b

update unittest test docker image

a8847dd

fix api

ee97590

fix logprobs patch for vllm 0.10.2

832d5aa

pan-x-c added 2 commits December 16, 2025 07:19

fix debug mode

1aff31a

use old docker

1392645

pan-x-c changed the title ~~Update vLLM to v0.12.0~~ Support vLLM v0.12.0 Dec 16, 2025

fix pyproject.toml

2fe15a0

chenyushuo approved these changes Dec 16, 2025

pan-x-c merged commit 8be19eb into modelscope:main Dec 16, 2025
1 check passed
