File tree Expand file tree Collapse file tree 1 file changed +1
-1
lines changed Expand file tree Collapse file tree 1 file changed +1
-1
lines changed Original file line number Diff line number Diff line change @@ -89,7 +89,7 @@ git clone https://github.com/NVIDIA/TensorRT-LLM.git
89
89
90
90
Navigate to the examples\llama directory and run the following script:
91
91
```
92
- python build.py --model_dir <path to llama13_chat model> --quant_ckpt_path <path to model.pt> --dtype float16 --use_gpt_attention_plugin float16 --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --enable_context_fmha --max_batch_size 1 --max_input_len 3500 --max_output_len 1024 --output_dir <TRT engine folder>
92
+ python build.py --model_dir <path to llama13_chat model> --quant_ckpt_path <path to model.pt> --dtype float16 --use_gpt_attention_plugin float16 --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --enable_context_fmha --max_batch_size 1 --max_input_len 3000 --max_output_len 1024 --output_dir <TRT engine folder>
93
93
```
94
94
95
95
## Adding your own data
You can’t perform that action at this time.
0 commit comments