Skip to content

推理型大模型多轮对话SFT数据集构造问题请教 #3627

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
YihengLiu1996 opened this issue Mar 24, 2025 · 1 comment
Closed

推理型大模型多轮对话SFT数据集构造问题请教 #3627

YihengLiu1996 opened this issue Mar 24, 2025 · 1 comment

Comments

@YihengLiu1996
Copy link

对于非推理大模型,多轮对话数据是user1, answer1, user2, answer2, ....这么构造,其中answer1和answer2..等回答都会作为预测项参与到loss计算中,swift应该是这么处理的吧?
对于推理型大模型,answer分为think和正式回答两部分吗。这里感觉存在一个矛盾,若answer都参与训练,则每个answer都需要包含think内容,而不只是最后一个answer包含think。但这样设置的话,在推理的时候,就需要把每轮对话的think内容也放到上下文中,这与当前中间对话只放正式答案不放思考过程的做法不符。
请问swift有针对这个情况做处理吗?比如训练第二轮对话的时候,第一轮的think过程不引入到上下文中。

@Jintao-Huang
Copy link
Collaborator

  1. 其中answer1和answer2..等回答都会作为预测项参与到loss计算中,swift应该是这么处理的吧?
    是的

  2. 请问swift有针对这个情况做处理吗?比如训练第二轮对话的时候,第一轮的think过程不引入到上下文中。
    是全部进行训练的

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants