Commit 09715a3

Add find peak performance documentation (mlcommons#2186)
1 parent 836f22c commit 09715a3

1 file changed

+53
-0
lines changed

loadgen/README.md

Lines changed: 53 additions & 0 deletions
@@ -168,3 +168,56 @@ It is not specified here how the QDL would query and configure the SUT to execut
### Example

Refer to [LON demo](demos/lon) for a reference example illustrating usage of Loadgen over the network.

## Find Peak Performance Mode

The Find Peak Performance mode searches for the highest target queries per second (QPS) at which the server scenario still passes performance mode.

### Setup

You can set up LoadGen to run this mode by setting the `mode` field of the `TestSettings` used to run the test. Using the Python API:

```python
import mlperf_loadgen

settings = mlperf_loadgen.TestSettings()
# Initial target QPS; the search starts from this value.
settings.server_target_qps = 100
# This mode is meant for the server scenario.
settings.scenario = mlperf_loadgen.TestScenario.Server
settings.mode = mlperf_loadgen.TestMode.FindPeakPerformance
...

mlperf_loadgen.StartTest(sut, qsl, settings)
```

Using the C/C++ API:

```CPP
mlperf::TestSettings settings;
// Initial target QPS; the search starts from this value.
settings.server_target_qps = 100;
settings.scenario = mlperf::TestScenario::Server;
settings.mode = mlperf::TestMode::FindPeakPerformance;
mlperf::LogSettings log_settings;
/*
Construct QSL and SUT
*/
mlperf::StartTest(&sut, &qsl, settings, log_settings);
```

**Note:** Make sure the scenario is set to `Server` and that you provide an initial target QPS (`server_target_qps`).

### Description

The Find Peak Performance mode first finds a lower and an upper bound for the optimal QPS, then performs a binary search between those bounds to find the optimal QPS.

#### Finding lower and upper boundary

LoadGen begins by running performance mode at the specified target QPS. If the test passes, this value is used as the lower bound; otherwise, an error is raised. The algorithm then guesses the upper bound as twice the target QPS.

LoadGen then runs performance mode at the upper bound guess. If the test passes, both the lower bound and the upper bound are doubled, so the previous upper bound becomes the new lower bound. This repeats until the upper bound guess fails the test.

```
[initial_target_qps, 2*initial_target_qps] -> [2*initial_target_qps, 4*initial_target_qps] -> [4*initial_target_qps, 8*initial_target_qps]...
```

When the doubling stops, the current values become the final lower and upper bounds. This process ensures that the lower bound passes performance mode and the upper bound does not.
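
The doubling procedure described above can be sketched in a few lines of Python. This is an illustration only, not LoadGen's actual implementation: `find_bounds` and the `passes` callback are hypothetical names, and `passes(qps)` stands in for a full performance-mode run at the given target QPS.

```python
from typing import Callable, Tuple


def find_bounds(initial_qps: float,
                passes: Callable[[float], bool]) -> Tuple[float, float]:
    """Return (lower, upper) such that lower passes and upper fails.

    `passes` is a hypothetical stand-in for one performance-mode run.
    """
    if not passes(initial_qps):
        # The initial target must pass, otherwise there is no lower bound.
        raise ValueError("initial target QPS does not pass performance mode")
    lower, upper = initial_qps, 2 * initial_qps
    while passes(upper):
        # The guess passed: it becomes the lower bound and the guess doubles.
        lower, upper = upper, 2 * upper
    return lower, upper


# Toy system that sustains at most 730 QPS (an assumed value for the demo).
print(find_bounds(100, lambda qps: qps <= 730))  # prints (400, 800)
```

In this toy run the intervals visited are `[100, 200]`, `[200, 400]` and `[400, 800]`; the search stops at the last one because 800 QPS is the first guess that fails.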

#### Binary Search

Once the lower and upper bounds are set, binary search can be performed over the range `[lower, upper]` to find the optimal QPS. If a given QPS fails in performance mode, the optimal value lies below it; if it passes, the optimal value is at or above it.
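
A minimal sketch of this narrowing step follows, again as an illustration rather than LoadGen's code. The names `binary_search_peak` and `passes` are hypothetical, and the stopping rule (stop once the interval is at most `tolerance` QPS wide) is an assumption made here, since this section does not state how LoadGen decides when to stop.

```python
from typing import Callable


def binary_search_peak(lower: float, upper: float,
                       passes: Callable[[float], bool],
                       tolerance: float = 1.0) -> float:
    """Narrow [lower, upper] and return the highest QPS known to pass.

    Invariant: `lower` passes and `upper` fails.
    `passes` is a hypothetical stand-in for one performance-mode run;
    `tolerance` is an assumed stopping criterion.
    """
    while upper - lower > tolerance:
        mid = (lower + upper) / 2
        if passes(mid):
            lower = mid  # mid passes: the optimum is at or above mid
        else:
            upper = mid  # mid fails: the optimum lies below mid
    return lower


# Toy system that sustains at most 730 QPS (an assumed value for the demo).
peak = binary_search_peak(400, 800, lambda qps: qps <= 730)
print(peak)
```

With the default tolerance, the returned value is a passing QPS within 1 QPS of the toy system's true limit. Each iteration costs one full performance run, so a looser tolerance trades precision for a shorter search.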
