[release/4.0] Moved SpecialTokens assignment after the modification to avoid "Collection Modified" error #7330
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 suggestions.
Co-authored-by: Copilot <[email protected]>
@ericstj @michaelgsharp could you please help approve this one? Thanks!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## release/4.0 #7330 +/- ##
============================================
Coverage 68.88% 68.89%
============================================
Files 1470 1470
Lines 274005 274081 +76
Branches 28403 28405 +2
============================================
+ Hits 188752 188828 +76
Misses 77936 77936
Partials 7317 7317
Backport of #7328 to release/4.0
/cc @tarekgh @shaltielshmid
Customer Impact
Users of the BERT Tokenizer who provide a custom list of special tokens during tokenizer creation may encounter exceptions if the lowercasing option is enabled.
Testing
Manually tested. New tests were added, and all regression tests pass.
Risk
Low. This change does not alter any behavior or logic; it simply ensures that the supplied special tokens are handled correctly.
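For readers unfamiliar with this class of bug, the failure mode described under Customer Impact can be sketched roughly as follows. This is an illustrative Python analogue, not the actual tokenizer code: the function names, the token values, and the assumption that lowercasing adds lowercased variants of each special token are all hypothetical. It only shows why modifying a collection while it is being enumerated fails, and why finishing the modification on a separate copy before assigning the result avoids the error.

```python
def build_special_tokens_buggy(special_tokens, lower_case):
    """Hypothetical buggy variant: mutates the mapping while iterating over it."""
    if lower_case:
        for token, token_id in special_tokens.items():
            # Adding a new key during iteration invalidates the iterator;
            # Python raises RuntimeError on the next iteration step.
            special_tokens[token.lower()] = token_id
    return special_tokens


def build_special_tokens_fixed(special_tokens, lower_case):
    """Hypothetical fixed variant: modify a copy first, hand it out afterwards."""
    result = dict(special_tokens)  # leave the caller's mapping untouched
    if lower_case:
        # Iterate over a snapshot so additions to `result` are safe.
        for token, token_id in list(result.items()):
            result.setdefault(token.lower(), token_id)
    return result


if __name__ == "__main__":
    tokens = {"[CLS]": 101, "[SEP]": 102}  # illustrative values only
    try:
        build_special_tokens_buggy(dict(tokens), lower_case=True)
    except RuntimeError as err:
        print("buggy variant failed:", err)
    print("fixed variant:", build_special_tokens_fixed(tokens, lower_case=True))
```

In .NET the equivalent symptom is an `InvalidOperationException` stating that the collection was modified during enumeration, which matches the "Collection Modified" wording in the title.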