Error when using the ToolBench dataset #3947

Open
Bearx666 opened this issue Apr 21, 2025 · 1 comment

Comments

@Bearx666

When training on agent datasets, ReAct-format datasets such as ms-agent train normally, but after switching the dataset to toolbench I get the following error:
ValueError: Keys mismatch: between {'default': {'features': {'id': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'tools': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'conversations': {'_type': Value(dtype='string', id=None), 'feature': {'_type': Value(dtype='string', id=None), 'features': {'from': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'value': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}}}}}, 'splits': {'train': {'name': Value(dtype='string', id=None), 'dataset_name': Value(dtype='string', id=None)}}}} (source) and {'default': {'features': {'id': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'tools': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'conversations': {'_type': Value(dtype='string', id=None), 'feature': {'_type': Value(dtype='string', id=None), 'features': {'from': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}, 'value': {'dtype': Value(dtype='string', id=None), '_type': Value(dtype='string', id=None)}}}}}, 'splits': {'train': {'name': Value(dtype='string', id=None), 'dataset_name': Value(dtype='string', id=None)}}}, 'id': Value(dtype='string', id=None), 'tools': Value(dtype='string', id=None), 'conversations': [{'from': Value(dtype='string', id=None), 'value': Value(dtype='string', id=None)}]} (target).
set() are missing from target and {'conversations', 'id', 'tools'} are missing from source
The 'source' features come from dataset_info.json, and the 'target' ones are those of the dataset arrow file.

How can I resolve this?
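
For anyone reading the traceback: the last two lines of the error describe a comparison of top-level keys between two feature mappings. Below is a minimal sketch of that comparison using plain dicts. It is not the `datasets` library's internal code; the helper name `diff_top_level_keys` is hypothetical, and the keys are copied from the error message with the values elided.

```python
def diff_top_level_keys(source: dict, target: dict) -> tuple[set, set]:
    """Return (keys missing from target, keys missing from source)."""
    missing_from_target = set(source) - set(target)
    missing_from_source = set(target) - set(source)
    return missing_from_target, missing_from_source


# Top-level keys taken from the error message; values are elided as None
# because only the key names matter for this comparison.
source = {"default": None}  # features attributed to dataset_info.json
target = {"default": None, "id": None, "tools": None, "conversations": None}  # arrow file

missing_from_target, missing_from_source = diff_top_level_keys(source, target)
print(missing_from_target)
print(sorted(missing_from_source))
```

The only key on the source side is `default`, which looks like a config name rather than a data column. Checking how `dataset_info.json` is structured, and where it sits relative to the data files, may therefore be a reasonable first step. That is an assumption drawn from the message alone, not a confirmed diagnosis.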

@Jintao-Huang
Collaborator

Could you post the shell command you ran so we can take a look?
