[DOCs] Update ROCm installation docs section #24691
@@ -1,6 +1,6 @@
 # --8<-- [start:installation]
 
-vLLM supports AMD GPUs with ROCm 6.3.
+vLLM supports AMD GPUs with ROCm 6.3 or above.
 
 !!! tip
     [Docker](#set-up-using-docker) is the recommended way to use vLLM on ROCm.
 
@@ -11,8 +11,9 @@ vLLM supports AMD GPUs with ROCm 6.3.
 # --8<-- [end:installation]
 # --8<-- [start:requirements]
 
-- GPU: MI200s (gfx90a), MI300 (gfx942), Radeon RX 7900 series (gfx1100/1101), Radeon RX 9000 series (gfx1200/1201)
-- ROCm 6.3
+- GPU: MI200s (gfx90a), MI300 (gfx942), MI350 (gfx950), Radeon RX 7900 series (gfx1100/1101), Radeon RX 9000 series (gfx1200/1201)
+- ROCm 6.3 or above
+- MI350 requires ROCm 7.0 or above
 
 # --8<-- [end:requirements]
 # --8<-- [start:set-up-using-python]
 
@@ -32,45 +33,45 @@ Currently, there are no pre-built ROCm wheels.
 - [ROCm](https://rocm.docs.amd.com/en/latest/deploy/linux/index.html)
 - [PyTorch](https://pytorch.org/)
 
-For installing PyTorch, you can start from a fresh docker image, e.g, `rocm/pytorch:rocm6.3_ubuntu24.04_py3.12_pytorch_release_2.4.0`, `rocm/pytorch-nightly`. If you are using docker image, you can skip to Step 3.
+For installing PyTorch, you can start from a fresh docker image, e.g, `rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0`, `rocm/pytorch-nightly`. If you are using docker image, you can skip to Step 3.
 
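As an aside (not part of this PR's diff), launching a container from the ROCm PyTorch image usually looks something like the sketch below; the device and group flags mirror the `docker run` options shown later in this document, and your environment may need additional mounts or flags.

```bash
# Minimal sketch: start an interactive container from the ROCm PyTorch image,
# exposing the AMD GPU devices to the container.
docker run -it \
    --device /dev/kfd \
    --device /dev/dri \
    --group-add video \
    --ipc=host \
    rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0
```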
 Alternatively, you can install PyTorch using PyTorch wheels. You can check PyTorch installation guide in PyTorch [Getting Started](https://pytorch.org/get-started/locally/). Example:
 
 ```bash
 # Install PyTorch
 pip uninstall torch -y
-pip install --no-cache-dir --pre torch --index-url https://download.pytorch.org/whl/nightly/rocm6.3
+pip install --no-cache-dir torch torchvision --index-url https://download.pytorch.org/whl/rocm6.4
 ```
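A quick sanity check, assuming the wheel installed correctly, is to confirm the ROCm (HIP) build of PyTorch is active and that a GPU is visible:

```bash
# Sketch: verify that the ROCm (HIP) build of PyTorch is installed and can see a GPU.
python3 -c "import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())"
```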
 
-1. Install [Triton flash attention for ROCm](https://github.com/ROCm/triton)
+1. Install [Triton for ROCm](https://github.com/triton-lang/triton)
 
-Install ROCm's Triton flash attention (the default triton-mlir branch) following the instructions from [ROCm/triton](https://github.com/ROCm/triton/blob/triton-mlir/README.md)
+Install ROCm's Triton (the default triton-mlir branch) following the instructions from [ROCm/triton](https://github.com/ROCm/triton/blob/triton-mlir/README.md)
 
 ```bash
 python3 -m pip install ninja cmake wheel pybind11
 pip uninstall -y triton
-git clone https://github.com/OpenAI/triton.git
+git clone https://github.com/triton-lang/triton.git
 cd triton
-git checkout e5be006
-cd python
-pip3 install .
+if [ ! -f setup.py ]; then cd python; fi
+python3 setup.py install
 cd ../..
 ```
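A brief check that the Triton build is importable (a sketch, assuming the install above completed):

```bash
# Sketch: confirm Triton imports and report its version.
python3 -c "import triton; print(triton.__version__)"
```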
 
 !!! note
     If you see HTTP issue related to downloading packages during building triton, please try again as the HTTP error is intermittent.
 
-2. Optionally, if you choose to use CK flash attention, you can install [flash attention for ROCm](https://github.com/ROCm/flash-attention)
+2. Optionally, if you choose to use CK flash attention, you can install [flash attention for ROCm](https://github.com/Dao-AILab/flash-attention)
 
-Install ROCm's flash attention (v2.7.2) following the instructions from [ROCm/flash-attention](https://github.com/ROCm/flash-attention#amd-rocm-support)
+Alternatively, wheels intended for vLLM use can be accessed under the releases.
 
 For example, for ROCm 6.3, suppose your gfx arch is `gfx90a`. To get your gfx architecture, run `rocminfo |grep gfx`.
> **Review comment:** The documentation has been updated to use ROCm 6.4 in the PyTorch installation step, but this example for flash attention still refers to ROCm 6.3. To maintain consistency throughout the document and avoid confusion for users, please update this to refer to ROCm 6.4.
>
> **Reply:** ROCm support goes 2 versions back, so 6.3 and above are all currently supported. The installation instructions assume the lowest […]
 
 ```bash
-git clone https://github.com/ROCm/flash-attention.git
+git clone https://github.com/Dao-AILab/flash-attention.git
 cd flash-attention
-git checkout b7d29fb
+git checkout 1a7f4dfa
 git submodule update --init
 GPU_ARCHS="gfx90a" python3 setup.py install
 cd ..
 ```
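If you prefer not to hard-code the architecture, a small sketch like the following can derive it from `rocminfo` before running the build above (the `grep` pattern is an assumption and may need adjusting on your system):

```bash
# Sketch: detect the first gfx target reported by rocminfo,
# then use it in place of the hard-coded "gfx90a" above.
GPU_ARCHS=$(rocminfo | grep -o -m 1 'gfx[0-9a-f]*')
echo "Detected GPU architecture: ${GPU_ARCHS}"
```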
@@ -194,16 +195,6 @@ To build vllm on ROCm 6.3 for MI200 and MI300 series, you can use the default:
 DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile.rocm -t vllm-rocm .
 ```
 
-To build vllm on ROCm 6.3 for Radeon RX7900 series (gfx1100), you should pick the alternative base image:
-
-```bash
-DOCKER_BUILDKIT=1 docker build \
-    --build-arg BASE_IMAGE="rocm/vllm-dev:navi_base" \
-    -f docker/Dockerfile.rocm \
-    -t vllm-rocm \
-    .
-```
-
 To run the above docker image `vllm-rocm`, use the below command:
 
 ??? console "Command"
 
@@ -218,8 +209,7 @@ To run the above docker image `vllm-rocm`, use the below command:
     --device /dev/kfd \
     --device /dev/dri \
     -v <path/to/model>:/app/model \
-    vllm-rocm \
-    bash
+    vllm-rocm
 ```
 
 Where the `<path/to/model>` is the location where the model is stored, for example, the weights for llama2 or llama3 models.
 
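Assuming the container leaves you at a shell, a typical next step (a sketch, not part of this PR) is to serve the mounted weights and confirm the OpenAI-compatible endpoint responds; `/app/model` matches the `-v` mount above:

```bash
# Sketch: serve the mounted model weights and check that the server responds.
vllm serve /app/model --host 0.0.0.0 --port 8000 &
# After startup, list the served models through the OpenAI-compatible API.
curl http://localhost:8000/v1/models
```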
> **Review comment:** The command `python3 setup.py install` is a legacy command and is deprecated. It's recommended to use `pip install .` instead, which is the modern standard for installing packages from source. This will correctly handle dependencies and use the PEP 517 build process. The previous version of the documentation correctly used `pip3 install .`, so this change is a regression.
>
> **Reply:** This mimics the way it is installed in the ROCm docker image for consistency.