Fix docs (mlcommons#1853)

arjunsuresh · nathanwasson · anandhu-eng · web-flow · commit 3ef1249b7f50 · 2024-10-01T17:45:41.000+01:00
* Support batch-size in llama2 run * Add Rclone-Cloudflare download instructions to README.md * Add Rclone-Cloudflare download instructiosn to README.md * Minor wording edit to README.md * Add Rclone-Cloudflare download instructions to README.md * Add Rclone-GDrive download instructions to README.md * Add new and old instructions to README.md * Tweak language in README.md * Language tweak in README.md * Minor language tweak in README.md * Fix typo in README.md * Count error when logging errors: submission_checker.py * Fixes mlcommons#1648, restrict loadgen uncommitted error message to within the loadgen directory * Update test-rnnt.yml (mlcommons#1688) Stopping the github action for rnnt * Added docs init Added github action for website publish Update benchmark documentation Update publish.yaml Update publish.yaml Update benchmark documentation Improved the submission documentation Fix taskname Removed unused images * Fix benchmark URLs * Fix links * Add _full variation to run commands * Added script flow diagram * Added docker setup command for CM, extra run options * Added support for docker options in the docs * Added --quiet to the CM run_cmds in docs * Fix the test query count for cm commands * Support ctuning-cpp implementation * Added commands for mobilenet models * Docs cleanup * Docs cleanup * Added separate files for dataset and models in the docs * Remove redundant tab in the docs * Fixes some WIP models in the docs * Use the official docs page for CM installation * Fix the deadlink in docs * Fix indendation issue in docs * Added dockerinfo for nvidia implementation * Added run options for gptj * Added execution environment tabs * Cleanup of the docs * Cleanup of the docs * Reordered the sections of the docs page * Removed an unnecessary heading in the docs * Fixes the commands for datacenter * Fix the build --sdist for loadgen * Fixes mlcommons#1761, llama2 and mixtral runtime error on CPU systems * Added mixtral to the benchmark list, improved benchmark docs * Update docs for MLPerf inference v4.1 * Update docs for MLPerf inference v4.1 * Fix typo * Gave direct link to implementation readmes * Added tables detailing implementations * Update vision README.md, split the frameworks into separate rows * Update README.md * pointed links to specific frameworks * pointed links to specific frameworks * Update Submission_Guidelines.md * Update Submission_Guidelines.md * Update Submission_Guidelines.md * api support llama2 * Added request module and reduced max token len * Fix for llama2 api server * Update SUT_API offline to work for OpenAI * Update SUT_API.py * Minor fixes * Fix json import in SUT_API.py * Fix llama2 token length * Added model name verification with server * clean temp files * support num_workers in LLAMA2 SUTs * Remove batching from Offline SUT_API.py * Update SUT_API.py * Minor fixes for llama2 API * Fix for llama2 API * removed table of contents * enabled llama2-nvidia + vllm-NM : WIP * enabled dlrm for intel * lower cased implementation * added raw data input * corrected data download commands * renamed filename * changes for bert and vllm * documentation to work on custom repo and branch * benchmark index page update * enabled sdxl for nvidia and intel * updated vllm server run cmd * benchmark page information addition * fix indendation issue * Added submission categories * update submission page - generate submission with or w/o using CM for benchmarking * Updated kits dataset documentation * Updated model parameters * updation of information * updated non cm based benchmark * added info about hf password * added links to model and access tokens * Updated reference results structuree tree * submission docs cleanup * Some cleanups for benchmark info * Some cleanups for benchmark info * Some cleanups for benchmark info * added generic stubs deepsparse * Some cleanups for benchmark info * Some cleanups for benchmark info * Some cleanups for benchmark info * Some cleanups for benchmark info (FID and CLIP data added) * typo fix for bert deepsparse framework * added min system requirements for models * fixed code version * changes for displaying reference and intel implementation tip * added reference to installation page * updated neural magic documentation * Added links to the install page, redirect benchmarks page * added tips about batch size and dataset for nvidia llama2 * fix conditions logic * modified tips and additional run cmds * sentence corrections * Minor fix for the documentation * fixed bug in deepsparse generic model stubs + styling * added more information to stubs * Added SCC24 readme, support reproducibility in the docs * Made clear the custom CM repo URL format * Support conditional implementation, setup and run tips * Support rocm for sdxl * Fix _short tag support * Fix install URL * Expose bfloat16 and float16 options for sdxl * Expose download model to host option for sdxl * IndySCC24 documentation added * Improve the SCC24 docs * Improve the support of short variation * Improved the indyscc24 documentation * Updated scc run commands * removed test_query_count option for scc * Remove scc24 in the main docs * Remove scc24 in the main docs * Fix docs: indendation issue on the submission page * generalised code for skipping test query count * Fixes for SCC24 docs * Fix scenario text in main.py * Fix links for scc24 * Fix links for scc24 * Improve the general docs * Fix links for scc24 * Use float16 in scc24 doc * Improve scc24 docs * Improve scc24 docs * Use float16 in scc24 doc * fixed command bug * Fix typo in docs * Fix typo in docs * Remove unnecessary indendation in docs * initial commit for tip - native run CUDA * Updated tip --------- Co-authored-by: Nathan Wasson <nathanw@mlcommons.org> Co-authored-by: anandhu-eng <anandhukicks@gmail.com> Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>
diff --git a/docs/install/index.md b/docs/install/index.md
@@ -12,8 +12,8 @@ CM needs `git`, `python3-pip` and `python3-venv` installed on your system. If an
 This step is not mandatory as CM can use separate virtual environment for MLPerf inference. But the latest `pip` install requires this or else will need the `--break-system-packages` flag while installing `cm4mlops`.
 
 ```bash
-   python3 -m venv cm
-   source cm/bin/activate
+python3 -m venv cm
+source cm/bin/activate
 ```
 
 ## Install CM and pulls any needed repositories
diff --git a/main.py b/main.py
@@ -140,6 +140,9 @@ def mlperf_inference_implementation_readme(spaces, model, implementation, *, imp
                         # ref to cm installation
                         content += f"{cur_space3}Please refer to the [installation page](site:inference/install/) to install CM for running the automated benchmark commands.\n\n"
                         test_query_count=get_test_query_count(model, implementation, device.lower())
+                        if device.lower() == "cuda" and execution_env.lower() == "native":
+                            content += f"\n{cur_space3}!!! tip\n\n"
+                            content += f"{cur_space3}    - It is advisable to use the commands in the Docker tab for CUDA. Run the below native command only if you are already on a CUDA setup with cuDNN and TensorRT installed.\n\n"
 
                         if "99.9" not in model: #not showing docker command as it is already done for the 99% variant
                             if implementation == "neuralmagic":
@@ -442,7 +445,8 @@ def mlperf_inference_run_command(spaces, model, implementation, framework, categ
             if "short" in extra_variation_tags:
                 full_ds_needed_tag = ""
             else:
-                full_ds_needed_tag = ",_full"
+                full_ds_needed_tag = "_full,"
+
 
             docker_setup_cmd = f"""\n
 {f_pre_space}```bash