How should SFT data be prepared for a model like DeepSeek-R1-Distill-Qwen-1.5B? #3996

Closed
vibe-viscot opened this issue Apr 25, 2025 · 2 comments

Comments

@vibe-viscot

vibe-viscot commented Apr 25, 2025

Is this the right format?

{"messages": [{"role": "user", "content": "<query1>"}, {"role": "assistant", "content": "<think>\n?????????</think>\n<answer>????????</answer>"}]}

The main question is whether the <think></think> and <answer></answer> tags should be included at all.

@Jintao-Huang
Collaborator

Yes, they need to be included.

Otherwise the trained model will not think.
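As a minimal sketch of the format discussed above, the following Python builds one record in the messages layout from the question and serializes it as a JSONL line. The helper names `build_sft_record` and `to_jsonl_line` and the `wrap_answer` flag are hypothetical, introduced only for illustration; they are not part of any library. The flag exists because the thread confirms the tags are needed but does not settle whether the final answer must be wrapped in <answer></answer>.

```python
import json


def build_sft_record(query, reasoning, answer, wrap_answer=True):
    """Build one messages-style SFT record (hypothetical helper).

    The assistant turn holds the reasoning inside <think>...</think>,
    followed by the final answer. wrap_answer controls whether the answer
    is additionally wrapped in <answer>...</answer> (an assumption; the
    thread leaves this point open).
    """
    assistant = f"<think>\n{reasoning}</think>\n"
    if wrap_answer:
        assistant += f"<answer>{answer}</answer>"
    else:
        assistant += answer
    return {
        "messages": [
            {"role": "user", "content": query},
            {"role": "assistant", "content": assistant},
        ]
    }


def to_jsonl_line(record):
    """Serialize a record as one JSONL line, keeping non-ASCII text readable."""
    return json.dumps(record, ensure_ascii=False)


# Example with made-up content, for illustration only.
record = build_sft_record("What is 2 + 3?", "2 + 3 = 5.\n", "5")
line = to_jsonl_line(record)
print(line)
```

Writing one such line per training example produces a JSONL file in the same shape as the record shown in the question.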

@vibe-viscot
Author

vibe-viscot commented Apr 25, 2025

OK, so the content after </think> should be wrapped in <answer></answer>, right? But then why do I only see </think> at inference time, and no <answer></answer>?
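The thread does not explain why <answer></answer> is missing at inference time. As a hedged sketch, the following Python separates reasoning from the final answer in a generated string, and works whether or not the answer is wrapped in <answer></answer>. The function name `split_reasoning` is hypothetical, and the sample strings are made up for illustration.

```python
import re


def split_reasoning(output):
    """Return (reasoning, answer) from a generated string (hypothetical helper).

    Splits on the first </think>. If no </think> is present, the whole
    output is treated as the answer and reasoning is None. Any
    <answer>...</answer> wrapper around the remainder is stripped.
    """
    head, sep, tail = output.partition("</think>")
    if not sep:
        return None, output.strip()
    # Drop an opening <think> tag if the output includes one.
    reasoning = head.replace("<think>", "", 1).strip()
    match = re.fullmatch(r"\s*<answer>(.*)</answer>\s*", tail, flags=re.DOTALL)
    answer = match.group(1).strip() if match else tail.strip()
    return reasoning, answer


# Output that ends the reasoning with </think> and has no <answer> tags.
print(split_reasoning("<think>\n2 + 3 = 5.\n</think>\n5"))
# Output that wraps the final answer in <answer></answer>.
print(split_reasoning("2 + 3 = 5.\n</think>\n<answer>5</answer>"))
```

Running this over a few real generations is one way to see which of the two shapes a given checkpoint produces before deciding how to format the training data.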
