
Conversation

@keithachorn-intel
Contributor

The updated value represents the dataset size, not the 'performance_sample_count_override' value set in mlperf.conf. After adding the new submission checker test, this mismatch raises an error that was not previously raised.
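(Editorial note: for context, below is a minimal sketch of the kind of consistency check involved. The function name `check_sample_count` and the numeric values are illustrative only, not the actual submission checker code; the real expected values live in mlperf.conf and the rules table.)

```python
# Illustrative only: a simplified version of the kind of check that can
# flag a mismatch between the count reported in the logs and the value
# expected from the rules table / mlperf.conf. Names are hypothetical.

def check_sample_count(reported_total_count: int, expected_count: int) -> bool:
    """Return True if the reported QSL total count matches the expected value."""
    if reported_total_count != expected_count:
        print(
            f"ERROR: qsl_reported_total_count={reported_total_count} "
            f"does not match expected dataset size {expected_count}"
        )
        return False
    return True

# Example: a run that reports the aggregated-sample count (~330k) while the
# rules table lists ~204k would fail this check.
check_sample_count(330_000, 204_000)
```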

keithachorn-intel requested a review from a team as a code owner on June 24, 2025, 21:36
@github-actions
Contributor

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

@keithachorn-intel
Contributor Author

If this could possibly be merged today, it would be very helpful in shipping our workload containers. Thanks! @arjunsuresh

@attafosu
Contributor

@arjunsuresh There seems to be a mismatch between the dataset size (204k) in the MLCommons rules table and the value generated (330k) by multihot_criteo.py's num_aggregated_samples.
Note that this count is also used by loadgen as the total_samples_count, and it makes its way into mlperf_log_detail.txt as qsl_reported_total_count.
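(Editorial note: as a reference for this comparison, here is a small sketch of how qsl_reported_total_count could be read back out of mlperf_log_detail.txt to check it against the rules-table value. It assumes the usual one-entry-per-line ':::MLLOG {json}' format and is not part of this PR.)

```python
import json


def read_qsl_reported_total_count(detail_log_path: str) -> int | None:
    """Scan an mlperf_log_detail.txt file for the qsl_reported_total_count entry.

    Assumes each log line has the form ':::MLLOG {json}'.
    """
    prefix = ":::MLLOG"
    with open(detail_log_path) as f:
        for line in f:
            line = line.strip()
            if not line.startswith(prefix):
                continue
            entry = json.loads(line[len(prefix):])
            if entry.get("key") == "qsl_reported_total_count":
                return int(entry["value"])
    return None


# Hypothetical usage: compare against the dataset size listed in the rules table.
# count = read_qsl_reported_total_count("mlperf_log_detail.txt")
```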

@arjunsuresh
Contributor

@pgmpablo157321 Can you please make the relevant change in the rules?

arjunsuresh merged commit 064db01 into mlcommons:master on Jun 24, 2025
21 checks passed
github-actions bot locked and limited the conversation to collaborators on Jun 24, 2025