Open
Labels
bug: Something isn't working
Description of the bug
Line 73 of process_annotation_direct_attribution.py builds each reward record like this:
sotopia_pi_utterance_reward.append(
    {
        "instruction": d['prompt'],
        "output": d['result'],
        "value": calc_reward(d['attribution']['attribution'], d['goal_score']),
    }
)
However, for training, train_rm.py uses data.py, where line 100 does this:
rendered_text = self.template.render(
    messages=[
        {"role": "user", "content": item["input"]},
        {"role": "assistant", "content": item["output"]}
    ]
I'm not sure whether this is an issue in the data processing: the records are written with an "instruction" key, but data.py reads item["input"].
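To make the concern concrete, here is a minimal sketch of the apparent key mismatch. The record values are made up, and `has_input_key` and `rename_instruction_to_input` are hypothetical helpers written only for this illustration; they are not part of the repository. The sketch also assumes that no other step renames the keys between the two scripts, which may not be true.

```python
# Hypothetical record shaped like the dicts appended in
# process_annotation_direct_attribution.py (values are invented).
record = {
    "instruction": "example prompt",
    "output": "example reply",
    "value": 0.5,
}


def has_input_key(item: dict) -> bool:
    """Return True if the item can be read the way data.py reads it."""
    try:
        item["input"]  # same lookup as in the render() call
        return True
    except KeyError:
        return False


def rename_instruction_to_input(item: dict) -> dict:
    """Hypothetical adapter: copy the item and move "instruction" to "input"."""
    converted = dict(item)  # shallow copy, the original record is untouched
    if "input" not in converted and "instruction" in converted:
        converted["input"] = converted.pop("instruction")
    return converted


print(has_input_key(record))
print(has_input_key(rename_instruction_to_input(record)))
```

If the mismatch is real, either writing "input" in the processing script or reading "instruction" in data.py would make the two sides agree; which one is intended is for the maintainers to confirm.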
Steps To Reproduce
1
Additional Information
No response