fixing and re-organizing pipelines #1250

Merged
merged 8 commits into from
Mar 24, 2021

Contributor

@parmeet parmeet commented Mar 8, 2021

Fixing pipelines according to new features in torchtext and removing the pytext dependency

codecov bot commented Mar 12, 2021

Codecov Report

Merging #1250 (9467f6e) into master (be3f640) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1250   +/-   ##
=======================================
  Coverage   78.80%   78.80%           
=======================================
  Files          67       67           
  Lines        3624     3624           
=======================================
  Hits         2856     2856           
  Misses        768      768           


@parmeet parmeet changed the title from "[WIP] fixing and re-organizing pipelines" to "fixing and re-organizing pipelines" Mar 15, 2021
@parmeet parmeet requested a review from cpuhrsch March 19, 2021 04:52
python pipelines.py --pipeline pytext


## Experimental PyText
Contributor

Why do we want to remove this pipeline?

Contributor Author

The GitHub repo of pytext is no longer maintained and is not in a good state as of this writing. This means we would have code in torchtext that keeps breaking going forward. I wasn't sure whether we still want to maintain code snippets that may break sporadically.

cc: @hudeven

Contributor

Let's remove this in a separate PR, since we're not entirely clear on it and the other changes in this diff can go ahead, plus we might need it for comparison relatively soon.

 if __name__ == "__main__":
     parser = argparse.ArgumentParser(description='Data procesing pipelines')
     parser.add_argument('--pipeline', type=str, default='sentencepiece',
                         help='The name of pipeline')
     parser.add_argument('--dataset', type=str, default='AG_NEWS',
                         help='Dataset for performance benchmark')
-    parser.add_argument('--spm-filename', type=str, default='m_user.model',
+    parser.add_argument('--spm-filename', type=str, default='text_unigram_25000',
Contributor

Why this change?

Contributor Author

This is for ease of use in default mode. If the user does not have access to an spm model, they can simply run the code in default mode, which will download one of the pre-trained spm models (text_unigram_25000). This change does not affect previous behavior: if the user does specify m_user.model (or any name other than those of the pre-trained spm models), the code will use the user's model.
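A minimal sketch of the fallback described above, not the code from this PR. The function `resolve_spm_path`, the `PRETRAINED` registry, its placeholder URL, and `fake_download` are all hypothetical names introduced for illustration; the real pipeline would presumably plug in torchtext's own registry of pre-trained models and its download helper.

```python
from typing import Callable, Mapping


def resolve_spm_path(
    spm_filename: str,
    pretrained: Mapping[str, str],
    download: Callable[[str], str],
) -> str:
    """Return a local path to a SentencePiece model file.

    If `spm_filename` is the name of a known pre-trained model, fetch it
    through `download` and return the resulting path. Otherwise treat the
    value as the path to a user-supplied model and return it unchanged.
    """
    if spm_filename in pretrained:
        return download(pretrained[spm_filename])
    return spm_filename


# Hypothetical registry and downloader, for illustration only.
PRETRAINED = {
    "text_unigram_25000": "https://example.invalid/text_unigram_25000.model",
}


def fake_download(url: str) -> str:
    # Stand-in for a real download helper: derive a file name from the URL.
    return url.rsplit("/", 1)[-1]


# Default mode: the name matches a pre-trained model, so it is "downloaded".
default_path = resolve_spm_path("text_unigram_25000", PRETRAINED, fake_download)

# User mode: the name is not in the registry, so it is used as given.
user_path = resolve_spm_path("m_user.model", PRETRAINED, fake_download)
```

Passing the registry and the downloader as arguments keeps the sketch independent of any particular library version and makes both branches easy to check without network access.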

Contributor

@cpuhrsch cpuhrsch left a comment

See the comment before landing

@parmeet parmeet merged commit eb5e39d into pytorch:master Mar 24, 2021
@parmeet parmeet deleted the pipelines branch March 24, 2021 23:17
4 participants