Fix tool_call_accuracy evaluator sample format causing "Tool definition not found" error #41620

Draft
wants to merge 2 commits into
base: main

@Copilot Copilot AI commented Jun 17, 2025

The sample code for ToolCallAccuracyEvaluator in evaluation_samples_evaluate.py was using incorrect parameter formats that caused a "Tool definition not found" error when users tried to run it.

Issue

The sample had two format problems:

  1. tool_calls format: passed a single nested dict (with the call wrapped under "tool_call" and "function" keys) instead of the expected flat list of tool-call dicts
  2. tool_definitions format: passed a single dict instead of a list of dicts, and omitted the required "type" field

Before (broken sample):

tool_calls={
    "type": "tool_call",
    "tool_call": {
        "id": "call_eYtq7fMyHxDWIgeG2s26h0lJ",
        "type": "function",
        "function": {
            "name": "fetch_weather",
            "arguments": {"location": "New York"}
        }
    }
},
tool_definitions={
    "id": "fetch_weather",
    "name": "fetch_weather",
    "description": "Fetches the weather information for the specified location.",
    "parameters": {...}
}

After (working sample):

tool_calls=[
    {
        "type": "tool_call",
        "tool_call_id": "call_eYtq7fMyHxDWIgeG2s26h0lJ",
        "name": "fetch_weather",
        "arguments": {"location": "New York"}
    }
],
tool_definitions=[
    {
        "name": "fetch_weather",
        "type": "function",
        "description": "Fetches the weather information for the specified location.",
        "parameters": {...}
    }
]
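
For readers who already have data in the old nested shape, the difference between the two samples above can be sketched as a small conversion. This is only an illustration: `flatten_tool_call` and `normalize_tool_definition` are hypothetical helper names, not part of the evaluator or the SDK, and the "parameters" entry is left out because it is elided in the samples above.

```python
from typing import Any, Dict


def flatten_tool_call(nested: Dict[str, Any]) -> Dict[str, Any]:
    """Convert the nested 'before' tool call into the flat 'after' shape."""
    inner = nested["tool_call"]
    function = inner["function"]
    return {
        "type": nested.get("type", "tool_call"),
        "tool_call_id": inner["id"],
        "name": function["name"],
        "arguments": function["arguments"],
    }


def normalize_tool_definition(definition: Dict[str, Any]) -> Dict[str, Any]:
    """Drop the "id" key and add "type", mirroring the fixed sample."""
    normalized = {k: v for k, v in definition.items() if k != "id"}
    # Assumption: "function" is the only type shown in the fixed sample.
    normalized.setdefault("type", "function")
    return normalized


old_tool_call = {
    "type": "tool_call",
    "tool_call": {
        "id": "call_eYtq7fMyHxDWIgeG2s26h0lJ",
        "type": "function",
        "function": {
            "name": "fetch_weather",
            "arguments": {"location": "New York"},
        },
    },
}
old_tool_definition = {
    "id": "fetch_weather",
    "name": "fetch_weather",
    "description": "Fetches the weather information for the specified location.",
    # "parameters" omitted here: it is elided in the PR description.
}

# Both arguments become lists, as in the working sample.
tool_calls = [flatten_tool_call(old_tool_call)]
tool_definitions = [normalize_tool_definition(old_tool_definition)]
```

The resulting `tool_calls` and `tool_definitions` values have the same keys as the working sample, apart from the omitted "parameters" entry.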

Validation

  • ✅ The fixed sample now parses without errors
  • ✅ The existing unit test format continues to work (no regressions)
  • ✅ The original problematic format still fails as expected, confirming that malformed input is still rejected

The sample now matches the format expected by the evaluator implementation and demonstrated in the unit tests.
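
As a rough sketch of the checks described above, the snippet below flags the two format problems and any tool call that has no matching definition. It is a standalone illustration under stated assumptions: `find_format_problems` is a hypothetical helper, and the evaluator's real validation logic and error messages may differ.

```python
from typing import Any, List


def find_format_problems(tool_calls: Any, tool_definitions: Any) -> List[str]:
    """Return human-readable problems; an empty list means the shapes look fine."""
    problems: List[str] = []

    if not isinstance(tool_calls, list):
        problems.append("tool_calls should be a list of dicts")
        tool_calls = []
    if not isinstance(tool_definitions, list):
        problems.append("tool_definitions should be a list of dicts")
        tool_definitions = []

    defined_names = set()
    for definition in tool_definitions:
        if "type" not in definition:
            problems.append(
                f"definition {definition.get('name')!r} is missing the 'type' field"
            )
        defined_names.add(definition.get("name"))

    # Assumed lookup: each call is matched to a definition by its "name".
    for call in tool_calls:
        if call.get("name") not in defined_names:
            problems.append(f"no tool definition found for {call.get('name')!r}")

    return problems


good_calls = [
    {
        "type": "tool_call",
        "tool_call_id": "call_eYtq7fMyHxDWIgeG2s26h0lJ",
        "name": "fetch_weather",
        "arguments": {"location": "New York"},
    }
]
good_definitions = [
    {
        "name": "fetch_weather",
        "type": "function",
        "description": "Fetches the weather information for the specified location.",
    }
]

ok_report = find_format_problems(good_calls, good_definitions)
# Passing single dicts, as the broken sample did, trips both list checks.
bad_report = find_format_problems(good_calls[0], good_definitions[0])
```

Running a check like this before calling the evaluator is one way to surface shape mistakes early, though it is no substitute for the evaluator's own validation.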

Fixes #41543.


@Copilot Copilot AI changed the title from "[WIP] [evaluation] sample for tool_call_accuracy raises exception" to "Fix tool_call_accuracy evaluator sample format causing "Tool definition not found" error" Jun 17, 2025
@Copilot Copilot AI requested a review from singankit June 17, 2025 17:38
Copilot finished work on behalf of singankit June 17, 2025 17:38
Development

Successfully merging this pull request may close these issues.

[evaluation] sample for tool_call_accuracy raises exception
2 participants