remove padding mask for input embeddings #1799


Merged (1 commit, Jun 21, 2022)

Conversation

@parmeet (Contributor) commented on Jun 21, 2022

This PR removes the masking of padded tokens for the input embeddings.

In fairseq, the masking is applied to the input embeddings here, but not in the HF implementation. This causes a mismatch in the output embeddings for the padded tokens.

Ideally, the output for padded tokens should not matter. However, per the investigations from @ebsmothers, it somehow causes results to differ for the MDETR model. In order for Torch MM to take an upstream dependency on torchtext for the RoBERTa encoder, this change is necessary.
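For context, here is a minimal sketch of the behavioral difference. The names (`PAD_IDX`, `embed`, `pos_embed`) are illustrative placeholders, not the actual torchtext or fairseq module names:

```python
import torch
import torch.nn as nn

PAD_IDX = 1
embed = nn.Embedding(100, 8)
pos_embed = nn.Embedding(16, 8)  # positional embeddings make pad positions nonzero

tokens = torch.tensor([[5, 7, PAD_IDX, PAD_IDX]])  # one sequence, padded to length 4
positions = torch.arange(tokens.size(1)).unsqueeze(0)
x = embed(tokens) + pos_embed(positions)

# fairseq-style masking (the behavior this PR removes): zero out the input
# embeddings at padded positions before they enter the encoder.
padding_mask = tokens.eq(PAD_IDX)
x_fairseq = x * (1 - padding_mask.unsqueeze(-1).type_as(x))

# HF-style (and torchtext after this PR): pass x through unchanged; padded
# positions are instead excluded from attention via the attention mask.
x_hf = x
```

Since the padded positions are already excluded by the attention mask, dropping the input-embedding mask leaves the outputs at non-padded positions unchanged while matching HF's values at the padded ones.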

@ebsmothers (Contributor) left a comment:

Thanks for the fix! This looks good to me.

@parmeet parmeet merged commit a937288 into pytorch:main Jun 21, 2022
@parmeet parmeet deleted the match_hf_padding branch June 21, 2022 13:33