docs/getting_started/installation/gpu/rocm.inc.md (40 changes: 15 additions & 25 deletions)
@@ -1,6 +1,6 @@
# --8<-- [start:installation]

-vLLM supports AMD GPUs with ROCm 6.3.
+vLLM supports AMD GPUs with ROCm 6.3 or above.

!!! tip
[Docker](#set-up-using-docker) is the recommended way to use vLLM on ROCm.
@@ -11,8 +11,9 @@ vLLM supports AMD GPUs with ROCm 6.3.
# --8<-- [end:installation]
# --8<-- [start:requirements]

-- GPU: MI200s (gfx90a), MI300 (gfx942), Radeon RX 7900 series (gfx1100/1101), Radeon RX 9000 series (gfx1200/1201)
-- ROCm 6.3
+- GPU: MI200s (gfx90a), MI300 (gfx942), MI350 (gfx950), Radeon RX 7900 series (gfx1100/1101), Radeon RX 9000 series (gfx1200/1201)
+- ROCm 6.3 or above
+- MI350 requires ROCm 7.0 or above
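As a quick preliminary check (an editorial addition, assuming a standard ROCm installation under `/opt/rocm`), you can confirm your gfx architecture and ROCm version before proceeding:

```bash
# Print the gfx architecture of each detected GPU
rocminfo | grep gfx

# Print the installed ROCm version (path may vary by installation method)
cat /opt/rocm/.info/version
```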

# --8<-- [end:requirements]
# --8<-- [start:set-up-using-python]
@@ -32,45 +33,45 @@ Currently, there are no pre-built ROCm wheels.
- [ROCm](https://rocm.docs.amd.com/en/latest/deploy/linux/index.html)
- [PyTorch](https://pytorch.org/)

-For installing PyTorch, you can start from a fresh docker image, e.g, `rocm/pytorch:rocm6.3_ubuntu24.04_py3.12_pytorch_release_2.4.0`, `rocm/pytorch-nightly`. If you are using docker image, you can skip to Step 3.
+For installing PyTorch, you can start from a fresh docker image, e.g., `rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0` or `rocm/pytorch-nightly`. If you are using a docker image, you can skip to Step 3.
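For instance, a container can be started from that image roughly as follows (an illustrative sketch, not part of the original instructions; the `--device` flags match the `docker run` command shown later on this page, while `--ipc=host` and `--group-add video` are commonly required for ROCm containers):

```bash
docker pull rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0

docker run -it \
    --ipc=host \
    --group-add video \
    --device /dev/kfd \
    --device /dev/dri \
    rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0
```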

Alternatively, you can install PyTorch using PyTorch wheels. You can check the PyTorch installation guide in PyTorch [Getting Started](https://pytorch.org/get-started/locally/). Example:

```bash
# Install PyTorch
pip uninstall torch -y
-pip install --no-cache-dir --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm6.3
+pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/rocm6.4
```
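A quick sanity check (an added suggestion, not one of the original steps) confirms that the ROCm build of PyTorch is active; `torch.version.hip` is only set on ROCm builds:

```bash
python3 -c 'import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())'
```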

-1. Install [Triton flash attention for ROCm](https://github.com/ROCm/triton)
+1. Install [Triton for ROCm](https://github.com/triton-lang/triton)

-Install ROCm's Triton flash attention (the default triton-mlir branch) following the instructions from [ROCm/triton](https://github.com/ROCm/triton/blob/triton-mlir/README.md)
+Install ROCm's Triton (the default triton-mlir branch) following the instructions from [ROCm/triton](https://github.com/ROCm/triton/blob/triton-mlir/README.md)

```bash
python3 -m pip install ninja cmake wheel pybind11
pip uninstall -y triton
-git clone https://github.com/OpenAI/triton.git
+git clone https://github.com/triton-lang/triton.git
cd triton
git checkout e5be006
-cd python
-pip3 install .
+if [ ! -f setup.py ]; then cd python; fi
+python3 setup.py install
Comment on lines +56 to +57

Contributor:

Severity: high

The command `python3 setup.py install` is a legacy command and is deprecated. It's recommended to use `pip install .` instead, which is the modern standard for installing packages from source. This will correctly handle dependencies and use the PEP 517 build process. The previous version of the documentation correctly used `pip3 install .`, so this change is a regression.

Suggested change:

-if [ ! -f setup.py ]; then cd python; fi
-python3 setup.py install
+if [ ! -f setup.py ]; then cd python; fi
+pip install .

Collaborator (Author):

This mimics the way it is installed in the ROCm docker image, for consistency.

cd ../..
```

!!! note
If you see an HTTP issue related to downloading packages while building triton, please try again, as the HTTP error is intermittent.
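Assuming the build above completed, a minimal import check (an editorial addition) verifies the installation:

```bash
python3 -c 'import triton; print(triton.__version__)'
```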

-2. Optionally, if you choose to use CK flash attention, you can install [flash attention for ROCm](https://github.com/ROCm/flash-attention)
+2. Optionally, if you choose to use CK flash attention, you can install [flash attention for ROCm](https://github.com/Dao-AILab/flash-attention)

Install ROCm's flash attention (v2.7.2) following the instructions from [ROCm/flash-attention](https://github.com/ROCm/flash-attention#amd-rocm-support)
Alternatively, wheels intended for vLLM use are available under the repository's releases.

For example, for ROCm 6.3, suppose your gfx arch is `gfx90a`. To get your gfx architecture, run `rocminfo | grep gfx`.
Contributor:

Severity: high

The documentation has been updated to use ROCm 6.4 in the PyTorch installation step, but this example for flash attention still refers to ROCm 6.3. To maintain consistency throughout the document and avoid confusion for users, please update this to refer to ROCm 6.4.

Suggested change:

-For example, for ROCm 6.3, suppose your gfx arch is `gfx90a`. To get your gfx architecture, run `rocminfo | grep gfx`.
+For example, for ROCm 6.4, suppose your gfx arch is `gfx90a`. To get your gfx architecture, run `rocminfo | grep gfx`.

Collaborator (Author):

ROCm support goes 2 versions back, so 6.3 and above are all currently supported. The installation instructions assume the lowest supported version.


```bash
-git clone https://github.com/ROCm/flash-attention.git
+git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
-git checkout b7d29fb
+git checkout 1a7f4dfa
git submodule update --init
GPU_ARCHS="gfx90a" python3 setup.py install
cd ..
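# Optional sanity check (an added suggestion, assuming the build above
# succeeded): the module should import and report its version.
python3 -c 'import flash_attn; print(flash_attn.__version__)'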
```

@@ -194,16 +195,6 @@

To build vllm on ROCm 6.3 for MI200 and MI300 series, you can use the default:

```bash
DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile.rocm -t vllm-rocm .
```

-To build vllm on ROCm 6.3 for Radeon RX7900 series (gfx1100), you should pick the alternative base image:
-
-```bash
-DOCKER_BUILDKIT=1 docker build \
-    --build-arg BASE_IMAGE="rocm/vllm-dev:navi_base" \
-    -f docker/Dockerfile.rocm \
-    -t vllm-rocm \
-    .
-```

To run the above docker image `vllm-rocm`, use the below command:

??? console "Command"
@@ -218,8 +209,7 @@
--device /dev/kfd \
--device /dev/dri \
-v <path/to/model>:/app/model \
-vllm-rocm \
-bash
+vllm-rocm
```

Where `<path/to/model>` is the location where the model is stored, for example, the weights for llama2 or llama3 models.
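For example, if you start the container with an interactive shell instead (appending `bash` as in the previous version of the command), you can serve the mounted weights directly; a minimal sketch, assuming the vLLM CLI is available on PATH inside the image:

```bash
# Hypothetical invocation; adjust host/port to your setup
vllm serve /app/model --host 0.0.0.0 --port 8000
```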