WordPieceTokenizer inconsistent lowercase behavior #599

Closed
@chenmoneygithub

Description

The pretokenize method defaults to lowercase=True, while WordPieceTokenizer defaults to lowercase=False. The functionality is fine, so this is non-blocking, but should we make the two defaults consistent?
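
To make the mismatch concrete, here is a minimal sketch. The two definitions below are hypothetical stand-ins that copy only the defaults quoted in this issue (the whitespace splitting and the `tokenize` method are assumptions for illustration, not the library's implementation). It compares the defaults with `inspect.signature` and shows how the same input can come out differently depending on which entry point is called.

```python
import inspect

# Hypothetical stand-ins: only the `lowercase` defaults come from this issue.
# The bodies are simplified assumptions, not the real library code.
def pretokenize(text, lowercase=True):
    words = text.split()
    return [w.lower() for w in words] if lowercase else words


class WordPieceTokenizer:
    def __init__(self, lowercase=False):
        self.lowercase = lowercase

    def tokenize(self, text):
        # Forward the tokenizer's own setting to the helper.
        return pretokenize(text, lowercase=self.lowercase)


def default_of(func, name):
    """Return the declared default value of parameter `name` on `func`."""
    return inspect.signature(func).parameters[name].default


helper_default = default_of(pretokenize, "lowercase")
tokenizer_default = default_of(WordPieceTokenizer.__init__, "lowercase")
defaults_match = helper_default == tokenizer_default

# Same input, two entry points, both left at their defaults.
direct_output = pretokenize("Hello World")
tokenizer_output = WordPieceTokenizer().tokenize("Hello World")

print("defaults match:", defaults_match)
print("pretokenize:", direct_output)
print("WordPieceTokenizer:", tokenizer_output)
```

A check along these lines could also serve as a small regression test once one default is chosen, so the two signatures do not drift apart again.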

Labels: stat:contributions welcome, type:Bug