Skip to content

Add special tokens flag to C API tokenizer #940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 2, 2025

Conversation

sayanshaw24
Copy link
Contributor

@sayanshaw24 sayanshaw24 commented May 1, 2025

Exposes add_special_tokens to C API to solve chat template issue in GenAI with extra BOS tokens.

See huggingface/transformers#37686 for more context.

@sayanshaw24 sayanshaw24 marked this pull request as ready for review May 1, 2025 23:53
@sayanshaw24 sayanshaw24 requested a review from a team as a code owner May 1, 2025 23:53
@sayanshaw24 sayanshaw24 merged commit fc00485 into main May 2, 2025
46 checks passed
@sayanshaw24 sayanshaw24 deleted the sayanshaw/add-special-tokens branch May 2, 2025 19:28
@microsoft microsoft deleted a comment from azure-pipelines bot May 2, 2025
RyanUnderhill added a commit to microsoft/onnxruntime-genai that referenced this pull request May 2, 2025
Sets `add_special_tokens` from `OrtxTokenizeWithOptions` added in
microsoft/onnxruntime-extensions#940 to false to
solve chat template issue in GenAI with extra BOS tokens.

See huggingface/transformers#37686 for more
context.

---------

Co-authored-by: Sayan Shaw <[email protected]>
Co-authored-by: Ryan Hill <[email protected]>
RyanUnderhill added a commit to microsoft/onnxruntime-genai that referenced this pull request May 6, 2025
Sets `add_special_tokens` from `OrtxTokenizeWithOptions` added in
microsoft/onnxruntime-extensions#940 to false to
solve chat template issue in GenAI with extra BOS tokens.

See huggingface/transformers#37686 for more
context.

---------

Co-authored-by: Sayan Shaw <[email protected]>
Co-authored-by: Ryan Hill <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants