export NPROC_PER_NODE=1
export CUDA_VISIBLE_DEVICES=1
export MAX_PARTITION=4
vllm serve ${MODEL_PATH} --host 0.0.0.0 --tensor-parallel-size 1 --trust-remote-code --port 8056 --max-model-len 32768 --gpu-memory-utilization 0.9 --max-num-batched-tokens 8192
For the Ovis model, what is the correct way to pass MAX_PARTITION? Setting it as an environment variable before launching with vllm serve has no effect.