Commit 755b0a3

Update README with Nsight Systems profiling command
Added command for profiling with Nsight Systems.
1 parent 1d4ac90 commit 755b0a3

1 file changed: +2 -0 lines changed

language/llama3.1-8b/README_mahmood.md

Lines changed: 2 additions & 0 deletions
@@ -35,4 +35,6 @@ python -u main.py --scenario Offline --model-path ${CHECKPOINT_PATH} --batch-siz
 inteactive job command:
 ```
 srun --mpi=pmix --job-name="int_gpu_job" --partition=gpu-a100-small --time=01:00:00 --ntasks=1 --cpus-per-task=2 --gpus-per-task=1 --mem-per-cpu=5G --account=research-eemcs-qce --pty /bin/bash -il
+
+/scratch/mnaderantahan/nsight-systems-2025.5.1/bin/nsys profile --output nsys.out --trace=cuda,cublas,cudnn,osrt,nvtx --sample cpu --cpuctxsw process-tree python -u main.py --scenario Offline --model-path $CHECKPOINT_PATH --batch-size $BATCH_SIZE --dtype bfloat16 --user-conf user.conf --total-sample-count 1 --dataset-path $DATASET_PATH --output-log-dir output --tensor-parallel-size $GPU_COUNT --vllm
 ```
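
The added profiling command expands `$CHECKPOINT_PATH`, `$BATCH_SIZE`, `$DATASET_PATH`, and `$GPU_COUNT` unquoted. If one of them is unset in the interactive shell, its value silently disappears from the argument list. A minimal sketch of a pre-flight check follows. The function name `check_profile_env` is hypothetical and not part of this commit.

```shell
# Hypothetical helper (not part of the commit): report every variable that the
# profiling command expands but that is unset or empty in the current shell.
check_profile_env() {
    missing=0
    for name in CHECKPOINT_PATH BATCH_SIZE DATASET_PATH GPU_COUNT; do
        # Indirect lookup of the variable named by $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    # Return 0 when all variables are set, 1 otherwise.
    return "$missing"
}
```

One possible use is to run `check_profile_env` inside the `srun` shell and start the `nsys profile` command only if it returns success.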
