PR type
PR information
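Running GRPO with the deepseek-ai/deepseek-coder-6.7b-base model.
The symptom is the same as https://github.com/modelscope/ms-swift/issues/4785, but the error is different.
The current error is raised in the _swift_encode function in swift/llm/template/base.py; the lines involved are: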
`
`
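Debugging shows that prefix is [[32013]], which corresponds to the token ‘<|begin▁of▁sentence|>’. The cause lies in init() in swift/llm/template/template_meta.py, which converts values such as prefix to token ids and then overwrites the attribute with them.
To build a complete and correct prompt, token_id=32013 needs to be decoded back to ‘<|begin▁of▁sentence|>’, so that the resulting prompt is: ‘<|begin▁of▁sentence|>User:xx\nAssistant:xx’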
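A minimal sketch of the decode step described above, not the actual patch. It assumes that prefix is a list whose elements are either strings or lists of token ids; `prefix_to_text`, `toy_decode`, and `TOY_VOCAB` are hypothetical names introduced for illustration, and only the id 32013 and its token come from this PR.

```python
from typing import Callable, List, Sequence, Union

# Assumption: each prefix element is either plain text or a list of token ids.
PromptPart = Union[str, List[int]]

# Toy vocabulary standing in for a real tokenizer (hypothetical).
TOY_VOCAB = {32013: "<|begin▁of▁sentence|>"}


def toy_decode(token_ids: Sequence[int]) -> str:
    """Map token ids back to text using the toy vocabulary."""
    return "".join(TOY_VOCAB[i] for i in token_ids)


def prefix_to_text(prefix: Sequence[PromptPart],
                   decode: Callable[[Sequence[int]], str]) -> str:
    """Join prefix parts into one string, decoding any token-id lists."""
    parts = []
    for part in prefix:
        if isinstance(part, str):
            parts.append(part)  # already text, keep as-is
        else:
            parts.append(decode(part))  # token ids, turn back into text
    return "".join(parts)


prompt = prefix_to_text([[32013]], toy_decode) + "User:xx\nAssistant:xx"
```

With a real Hugging Face tokenizer, `tokenizer.decode` could presumably be passed in place of `toy_decode`.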
The code is from the latest main branch: swift.version: 3.6.0.dev0
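Solution:
Convert prefix=[[32013]] back to its token and concatenate it into the prompt. Note that this is a simple approach.
A thorough, general fix would store the prefix's token text in swift/llm/template/template_meta.py (and, if necessary, its token_id as well), and handle suffix and similar fields the same way. To avoid unpredictable side effects from modifying that code directly, this PR takes the minimal-change approach to resolve the issue first.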
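For reference, the more general direction mentioned above (keeping both the token text and its id) might look roughly like the sketch below. It does not reflect how template_meta.py is actually structured; `SpecialPart`, `resolve`, `toy_encode`, and `TOY_IDS` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SpecialPart:
    """Hypothetical container (not an ms-swift class) keeping both forms."""
    text: str                              # original token text
    token_ids: Optional[List[int]] = None  # ids resolved from the text, if any


# Toy lookup standing in for a real tokenizer (hypothetical).
TOY_IDS: Dict[str, List[int]] = {"<|begin▁of▁sentence|>": [32013]}


def toy_encode(text: str) -> List[int]:
    """Return the token ids for a known special token."""
    return TOY_IDS[text]


def resolve(text: str, encode: Callable[[str], List[int]]) -> SpecialPart:
    """Resolve ids without discarding the original text."""
    return SpecialPart(text=text, token_ids=encode(text))


bos = resolve("<|begin▁of▁sentence|>", toy_encode)
```

Callers that build a text prompt would read `bos.text`, while callers that need ids would read `bos.token_ids`, so neither form has to be reconstructed from the other.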