Add usage instructions for accuracy and performance modes.

psyhtest · psyhtest · commit 5f9cbd3ee32c · 2020-07-20T16:17:04.000+01:00
diff --git a/v0.7/speech_recognition/rnnt/optional_harness_ck/README.md b/v0.7/speech_recognition/rnnt/optional_harness_ck/README.md
@@ -1,6 +1,6 @@
 # MLPerf Inference - Speech Recognition
 
-Below we give an _essential_ sequence of steps that should result in a successful setup 
+Below we give an _essential_ sequence of steps that should result in a successful setup
 of the RNN-T workflow on Linux systems.
 
 The steps are extracted from a [minimalistic Amazon Linux
@@ -58,6 +58,9 @@ of Docker images for this workflow including Ubuntu, Debian and CentOS.
     1. Detect [Python](#detect_python)
     1. Install [Python dependencies](#install_python_deps)
     1. Install a branch of the [MLPerf Inference](#install_inference_repo) repo
+1. [Usage](#usage)
+    1. [Performance](#usage_performance)
+    1. [Accuracy](#usage_accuracy)
 
 <a name="install"></a>
 ## Installation
@@ -177,3 +180,118 @@ $ ck install package --tags=python-package,absl
 $ ck install package --tags=mlperf,inference,source,dividiti.rnnt
 ```
 **NB:** This source will be used for building LoadGen as well.
+
+
+<a name="usage"></a>
+## Usage
+
+<a name="usage_performance"></a>
+### Running a performance test
+
+The first run resolves all the remaining explicit dependencies:
+- preprocessing the LibriSpeech Dev-Clean dataset to WAV;
+- building the LoadGen API;
+- downloading the PyTorch model.
+
+A performance run should print something like:
+```
+$ ck run program:speech-recognition-pytorch-loadgen --cmd_key=performance --skip_print_timers
+...
+Dataset loaded with 4.36 hours. Filtered 1.02 hours. Number of samples: 2513
+Running Loadgen test...
+Average latency (ms) per query:
+7335.167247106061
+Median latency (ms):
+7391.662108
+90 percentile latency (ms):
+13347.925176
+================================================
+MLPerf Results Summary
+================================================
+SUT name : PySUT
+Scenario : Offline
+Mode     : Performance
+Samples per second: 4.63626
+Result is : INVALID
+  Min duration satisfied : NO
+  Min queries satisfied : Yes
+Recommendations:
+ * Increase expected QPS so the loadgen pre-generates a larger (coalesced) query.
+
+================================================
+Additional Stats
+================================================
+Min latency (ns)                : 278432559
+Max latency (ns)                : 14235613054
+Mean latency (ns)               : 7335167247
+50.00 percentile latency (ns)   : 7521181269
+90.00 percentile latency (ns)   : 13402430910
+95.00 percentile latency (ns)   : 13723706550
+97.00 percentile latency (ns)   : 14054764438
+99.00 percentile latency (ns)   : 14235613054
+99.90 percentile latency (ns)   : 14235613054
+
+================================================
+Test Parameters Used
+================================================
+samples_per_query : 66
+target_qps : 1
+target_latency (ns): 0
+max_async_queries : 1
+min_duration (ms): 60000
+max_duration (ms): 0
+min_query_count : 1
+max_query_count : 0
+qsl_rng_seed : 3133965575612453542
+sample_index_rng_seed : 665484352860916858
+schedule_rng_seed : 3622009729038561421
+accuracy_log_rng_seed : 0
+accuracy_log_probability : 0
+print_timestamps : false
+performance_issue_unique : false
+performance_issue_same : false
+performance_issue_same_index : 0
+performance_sample_count : 2513
+
+No warnings encountered during test.
+
+No errors encountered during test.
+Done!
+
+Execution time: 38.735 sec.
+```
+
+The part of the above output between the `MLPerf Results Summary` banner and `No errors encountered during test.` is the contents of `mlperf_log_summary.txt`, one of the log files generated by LoadGen. All LoadGen log files can be found in the program's temporary directory:
+```bash
+$ cd `ck find program:speech-recognition-pytorch-loadgen`/tmp && ls -la mlperf_log_*
+-rw-r--r-- 1 anton eng      4 Jul  3 18:06 mlperf_log_accuracy.json
+-rw-r--r-- 1 anton eng  20289 Jul  3 18:06 mlperf_log_detail.txt
+-rw-r--r-- 1 anton eng   1603 Jul  3 18:06 mlperf_log_summary.txt
+-rw-r--r-- 1 anton eng 860442 Jul  3 18:06 mlperf_log_trace.json
+```
+
+<a name="usage_accuracy"></a>
+### Running an accuracy test
+
+```
+$ ck run program:speech-recognition-pytorch-loadgen --cmd_key=accuracy --skip_print_timers
+...
+Dataset loaded with 4.36 hours. Filtered 1.02 hours. Number of samples: 2513
+Running Loadgen test...
+
+No warnings encountered during test.
+
+No errors encountered during test.
+Running accuracy script: /usr/bin/python3 /disk1/homes/anton/CK-TOOLS/mlperf-inference-dividiti.rnnt/inference/v0.7/speech_recognition/rnnt/accuracy_eval.py --log_dir /disk1/homes/anton/CK/ck-mlperf/program/speech-recognition-pytorch-loadgen/tmp --dataset_dir /homes/anton/CK-TOOLS/dataset-librispeech-preprocessed-to-wav-dev-clean/../ --manifest /homes/anton/CK-TOOLS/dataset-librispeech-preprocessed-to-wav-dev-clean/wav-list.json
+Dataset loaded with 4.36 hours. Filtered 1.02 hours. Number of samples: 2513
+Word Error Rate: 0.07452253714852645
+Done!
+
+Execution time: 502.197 sec.
+
+$ cd `ck find program:speech-recognition-pytorch-loadgen`/tmp && ls -la mlperf_log_*
+-rw-r--r-- 1 anton eng  3862427 Jul  3 18:00 mlperf_log_accuracy.json
+-rw-r--r-- 1 anton eng    20126 Jul  3 18:00 mlperf_log_detail.txt
+-rw-r--r-- 1 anton eng       74 Jul  3 18:00 mlperf_log_summary.txt
+-rw-r--r-- 1 anton eng 29738248 Jul  3 18:00 mlperf_log_trace.json
+```