33 changes: 19 additions & 14 deletions README.md
@@ -123,13 +123,30 @@ source .venv/bin/activate
pip install -e .[dev]
# for zsh
pip install -e .\[dev\]

# Install flash-attn after all dependencies are installed
# Note: flash-attn takes a long time to compile; please be patient.
pip install flash-attn -v
# Try the following command if you encounter errors during installation
# pip install flash-attn -v --no-build-isolation
```

Installation from docker:

We provide a Dockerfile for Trinity-RFT (trinity):

Installation with pip:
(coming soon)
```shell
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# build the docker image
# Note: you can edit the dockerfile to customize the environment
# e.g., use pip mirrors or set api key
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

# run the docker image
docker run -it --gpus all --shm-size="64g" --rm -v $PWD:/workspace -v <root_path_of_data_and_checkpoints>:/data trinity-rft:latest
```
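
The `docker run` invocation above can also be assembled from a script, which avoids shell-quoting problems when the mount paths contain spaces. The following is a minimal sketch and not part of this PR: the function name `build_docker_run_cmd` and the example data path `/mnt/data` are hypothetical, while the flags and the image tag are copied from the command above.

```python
import shlex
from pathlib import Path


def build_docker_run_cmd(workspace: Path, data_root: Path,
                         image: str = "trinity-rft:latest",
                         shm_size: str = "64g") -> list[str]:
    """Return the argv list for the `docker run` command shown above."""
    return [
        "docker", "run", "-it",
        "--gpus", "all",
        f"--shm-size={shm_size}",
        "--rm",
        "-v", f"{workspace}:/workspace",  # repository checkout
        "-v", f"{data_root}:/data",       # data and checkpoints
        image,
    ]


# "/mnt/data" is a hypothetical stand-in for <root_path_of_data_and_checkpoints>.
cmd = build_docker_run_cmd(Path.cwd(), Path("/mnt/data"))
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)` without going through a shell.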


### Step 2: prepare dataset and model
@@ -263,18 +280,6 @@ Please refer to [this document](./docs/sphinx_doc/source/tutorial/trinity_progra
This project is currently under active development, and we welcome contributions from the community!



Installation for development:

```shell
# for bash
pip install -e .[dev]
# for zsh
pip install -e .\[dev\]
```



Code style check:

```shell
35 changes: 21 additions & 14 deletions docs/sphinx_doc/source/main.md
@@ -115,12 +115,31 @@ source .venv/bin/activate
pip install -e .[dev]
# for zsh
pip install -e .\[dev\]

# Install flash-attn after all dependencies are installed
# Note: flash-attn takes a long time to compile; please be patient.
pip install flash-attn -v
# Try the following command if you encounter errors during installation
# pip install flash-attn -v --no-build-isolation
```


Installation from docker:

We provide a Dockerfile for Trinity-RFT (trinity):

```shell
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# build the docker image
# Note: you can edit the dockerfile to customize the environment
# e.g., use pip mirrors or set api key
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

Installation with pip:
(coming soon)
# run the docker image
docker run -it --gpus all --shm-size="64g" --rm -v $PWD:/workspace -v <root_path_of_data_and_checkpoints>:/data trinity-rft:latest
```



@@ -255,18 +274,6 @@ Please refer to [this document](tutorial/trinity_programming_guide.md).
This project is currently under active development, and we welcome contributions from the community!



Installation for development:

```shell
# for bash
pip install -e .[dev]
# for zsh
pip install -e .\[dev\]
```



Code style check:

```shell
1 change: 0 additions & 1 deletion pyproject.toml
@@ -23,7 +23,6 @@ dependencies = [
"ray==2.43.0",
"vllm==0.8.3",
"tensordict==0.6.2",
"redis",
"wandb",
"omegaconf",
"sqlalchemy",
44 changes: 44 additions & 0 deletions scripts/docker/Dockerfile
@@ -0,0 +1,44 @@
# Build and run the docker image with the following commands:
#
# cd <Trinity-RFT root dir>
# docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .
# docker run -it --gpus all --shm-size="64g" --rm -v $PWD:/workspace -v <root_path_of_data_and_checkpoints>:/data trinity-rft:latest


FROM nvcr.io/nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04

WORKDIR /workspace

RUN apt update && apt install -y \
build-essential \
curl \
git \
wget \
vim \
tmux \
python3 \
python3-pip \
python3-dev \
python3-packaging \
&& rm -rf /var/lib/apt/lists/* \
&& ln -sf /usr/bin/python3 /usr/bin/python


# For Aliyun users: update pip mirror to aliyun to speed up pip install
RUN pip config set global.index-url http://mirrors.cloud.aliyuncs.com/pypi/simple/ \
&& pip config set global.trusted-host mirrors.cloud.aliyuncs.com

# copy the Trinity-RFT dir into the workspace
COPY . .

RUN pip install --upgrade pip && pip install -e .[dev] && pip install flash-attn

# Set Env variables

# WANDB
# ENV WANDB_API_KEY=
# ENV WANDB_BASE_URL=

# LLM API
# ENV OPENAI_API_KEY=
# ENV DASH_API_KEY=
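
Because the `ENV` lines above are commented out, the variables may be unset when the container starts. A small check run inside the container can report which ones are missing. This is a sketch under assumptions: the helper `missing_env_vars` is hypothetical, and which of these variables a given run actually needs depends on its configuration.

```python
import os
from typing import Mapping, Sequence

# Variable names taken from the commented-out ENV lines in the Dockerfile.
EXPECTED_VARS = ("WANDB_API_KEY", "WANDB_BASE_URL", "OPENAI_API_KEY", "DASH_API_KEY")


def missing_env_vars(names: Sequence[str] = EXPECTED_VARS,
                     environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names that are unset or empty in `environ`."""
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Unset variables:", ", ".join(missing))
```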
1 change: 1 addition & 0 deletions trinity/common/verl_config.py
@@ -268,6 +268,7 @@ def synchronize_config(self, config: Config) -> None:
self.trainer.nnodes = config.cluster.node_num - rollout_node_num
self.actor_rollout_ref.model.path = config.model.model_path
self.critic.model.path = config.model.critic_model_path
self.critic.model.tokenizer_path = config.model.critic_model_path

if config.cluster.node_num == 1:
# for single node scenarios, rollout and training are on the same node
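
The added line sets the critic's tokenizer path to the same value as the critic model path. A minimal sketch of that behavior follows; `ModelConfig`, `CriticModel` and `sync_critic_paths` are hypothetical stand-ins for illustration, not the real Trinity-RFT config classes.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:  # hypothetical stand-in for `config.model`
    model_path: str = ""
    critic_model_path: str = ""


@dataclass
class CriticModel:  # hypothetical stand-in for `self.critic.model`
    path: str = ""
    tokenizer_path: str = ""


def sync_critic_paths(critic_model: CriticModel, model_cfg: ModelConfig) -> None:
    """Mirror the two critic assignments shown in the hunk above."""
    critic_model.path = model_cfg.critic_model_path
    critic_model.tokenizer_path = model_cfg.critic_model_path


# A stale tokenizer path is overwritten so both fields point at the same checkpoint.
critic = CriticModel(tokenizer_path="stale/tokenizer")
sync_critic_paths(critic, ModelConfig(critic_model_path="/data/models/critic"))
print(critic)
```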