convert: Swap GLM4 EOS / EOT token #13505

Open: wants to merge 1 commit into base: master

@henk717 commented May 13, 2025

When testing GLM4 models I noticed that they leak <|endoftext|>. This happens because the converter sets the EOS token to <|endoftext|> and the EOT token to <|user|>. The Hugging Face config.json defines the EOS as <|user|>, which overrides the converter's EOS, so <|user|> ends up being used for both and <|endoftext|> is not treated as an end-of-text token.

This PR simply swaps the two definitions. Finetunes can still override the EOS should they need to, while <|endoftext|> is treated as an EOT token.

In my test conversion this gives the following result:
INFO:gguf.vocab:Setting special token type eos to 151336
INFO:gguf.vocab:Setting special token type pad to 151329
INFO:gguf.vocab:Setting special token type eot to 151329
INFO:gguf.vocab:Setting special token type unk to 151329
INFO:gguf.vocab:Setting special token type bos to 151329

Before this PR the result is:
INFO:gguf.vocab:Setting special token type eos to 151336
INFO:gguf.vocab:Setting special token type pad to 151329
INFO:gguf.vocab:Setting special token type eot to 151336
INFO:gguf.vocab:Setting special token type unk to 151329
INFO:gguf.vocab:Setting special token type bos to 151329
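
The difference between the two logs can be illustrated with a minimal Python sketch. This is not the converter code: `resolve_special_tokens` and `stop_tokens` are hypothetical helpers, and the sketch assumes that a value from config.json replaces the converter's default and that 151329 is <|endoftext|> and 151336 is <|user|>, as the logs above suggest.

```python
# Token IDs inferred from the conversion logs in this PR (assumption).
ENDOFTEXT_ID = 151329  # <|endoftext|>
USER_ID = 151336       # <|user|>


def resolve_special_tokens(converter_defaults, config_overrides):
    """Hypothetical model: config.json values win over converter defaults."""
    resolved = dict(converter_defaults)
    resolved.update(config_overrides)
    return resolved


def stop_tokens(resolved):
    """Token IDs that end generation in this simplified model: EOS and EOT."""
    return {resolved["eos"], resolved["eot"]}


# The Hugging Face config.json defines the EOS as <|user|>.
config_overrides = {"eos": USER_ID}

# Before this PR: converter default EOS = <|endoftext|>, EOT = <|user|>.
before = resolve_special_tokens(
    {"eos": ENDOFTEXT_ID, "eot": USER_ID}, config_overrides
)

# After this PR: the two converter defaults are swapped.
after = resolve_special_tokens(
    {"eos": USER_ID, "eot": ENDOFTEXT_ID}, config_overrides
)

print("before:", before, "stops on <|endoftext|>:", ENDOFTEXT_ID in stop_tokens(before))
print("after: ", after, "stops on <|endoftext|>:", ENDOFTEXT_ID in stop_tokens(after))
```

In this model the EOS ends up as 151336 in both cases because config.json overrides it; only the EOT changes, which is what makes <|endoftext|> a stop token again.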

If you prefer to solve this in a different manner (such as forcing the EOS to be set according to the converter's internal definition), feel free to reject this PR and we can open an issue instead.

@github-actions bot added the python (python script changes) label May 13, 2025