Adding Reliability Scoring notebook - GTC #279

soares-f · 2025-03-16T13:37:08Z

This PR adds a notebook with an example of how to perform reliability scoring analysis in a win-tie-loss human evaluation.

Adds a notebook supporting the GTC talk about the human touch. It serves as a basis for producing a REL score for a win-tie-loss human evaluation comparing 2 models.
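For readers unfamiliar with reliability scoring, here is a minimal sketch of one common agreement statistic for categorical win-tie-loss labels, Fleiss' kappa. This is an illustration only: the PR text does not define how the notebook computes its REL score, so the function name `fleiss_kappa`, the `LABELS` tuple, and the input layout (one list of annotator labels per compared item) are all assumptions and not the notebook's API.

```python
from collections import Counter

# Hypothetical label set for a two-model comparison (assumption, not from the notebook).
LABELS = ("win", "tie", "loss")


def fleiss_kappa(ratings):
    """Fleiss' kappa for items that were each labeled by the same number of annotators.

    ratings: list of lists, e.g. [["win", "win", "loss"], ["tie", "tie", "tie"]].
    Returns a float; 1.0 means perfect agreement, values near 0 mean chance-level agreement.
    """
    if not ratings:
        raise ValueError("ratings must contain at least one item")
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("each item needs at least two annotators")
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item must have the same number of annotators")
    if any(label not in LABELS for item in ratings for label in item):
        raise ValueError(f"labels must be one of {LABELS}")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: fraction of agreeing annotator pairs, averaged over items.
    per_item = [
        (sum(c[label] ** 2 for label in LABELS) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall share of each label.
    shares = [
        sum(c[label] for c in counts) / (n_items * n_raters) for label in LABELS
    ]
    p_expected = sum(share ** 2 for share in shares)

    if p_expected == 1.0:
        # Every annotation used the same label; kappa is undefined here,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    example = [["win", "win", "loss"], ["tie", "tie", "tie"]]
    print(round(fleiss_kappa(example), 4))
```

A low value on a batch of items would suggest that the win-tie-loss judgments are too noisy to support a confident ranking of the two models, which is the kind of question a reliability score is meant to answer.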

soares-f · 2025-03-16T13:45:24Z

Tag @fsoares on Slack if needed.

* Example of reliability scoring in human eval

  Added notebook supporting GTC talk about the human touch. This serves as basis to produce a REL score for win-tie-loss human evaluation with 2 models.

* moved files around and created folder

soares-f added 2 commits March 14, 2025 13:16

Example of reliability scoring in human eval

1e8f5c0

Added notebook supporting GTC talk about the human touch. This serves as basis to produce a REL score for win-tie-loss human evaluation with 2 models.

moved files around and created folder

16d3b06

dglogo self-requested a review March 17, 2025 23:44

dglogo approved these changes Mar 17, 2025

dglogo merged commit d8882c5 into NVIDIA:main Mar 17, 2025
