-
Notifications
You must be signed in to change notification settings - Fork 3k
AzureOpenAI model grader support in evals #41599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Implement AzureOpenAIScoreModelGrader in _aoai/score_model_grader.py - Update module exports in _aoai/__init__.py and __init__.py - Register grader in _evaluate/_evaluate_aoai.py grader registry - Add comprehensive sample script with real credentials support - Include integration plan documentation - Support conversation-style input, score ranges, and sampling parameters - Handle template variables using {{ item.field }} syntax - Provide fallback demo mode for configuration testing
API Change CheckAPIView identified API level changes in this PR and created the following API reviews |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request adds support for grading using the AzureOpenAIScoreModelGrader by integrating it into the evaluator registry and providing corresponding test data. Key changes include:
- Updating test files and evaluator registries to include the new grader.
- Implementing the new AzureOpenAIScoreModelGrader wrapper in the SDK.
- Updating module init files to export the new grader.
Reviewed Changes
Copilot reviewed 6 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
sdk/evaluation/azure-ai-evaluation/tests/unittests/test_save_eval.py | Added new grader to evaluator list. |
sdk/evaluation/azure-ai-evaluation/tests/unittests/data/score_model_test_data.jsonl | Added new test data for score model evaluations. |
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_evaluate_aoai.py | Registered the new grader in the evaluator mapping. |
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/score_model_grader.py | Implemented the new AzureOpenAIScoreModelGrader. |
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/init.py | Updated exports to include the new grader. |
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/init.py | Updated public API to expose the new grader. |
Comments suppressed due to low confidence (1)
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/score_model_grader.py:64
- [nitpick] The parameter name 'range' shadows a built-in function. Consider renaming it to 'score_range' to avoid potential confusion.
range: Optional[List[float]] = None,
Description
Please add an informative description that covers that changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines