hudeven/text

This is a temporary repo for hack week: Data APIs for NLP.

Get started

  • install HuggingFace datasets. We copied it into this repo to jump-start the work; eventually, we will build our own.

pip install -e stl_text/dataframes/datasets

  • install the PyTorch and torchtext nightlies, since some of the tasks depend on prototype work in the torchtext library.

to install the CPU version on Linux:

pip install --upgrade --pre torch torchtext -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

to install the CUDA 10.1 version on Linux:

pip install --pre torch torchtext -f https://download.pytorch.org/whl/nightly/cu101/torch_nightly.html;

More detailed instructions are available here.

  • install this package

pip install -e .

  • run an example

python examples/hf_dataset_quick_tour.py
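
If the example fails at import time, a quick check of which packages are visible to the current Python environment can narrow things down. The following is a minimal sketch, not part of this repo: the helper names `installed_versions` and `looks_like_nightly` are hypothetical, the distribution names `torch`, `torchtext`, and `datasets` are assumed to match what the commands above install, and the `.dev` marker is assumed to identify a nightly build.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def looks_like_nightly(version):
    """Assumption: nightly builds carry a '.dev' segment in their version string."""
    return version is not None and ".dev" in version


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your environment differs.
    for name, version in installed_versions(["torch", "torchtext", "datasets"]).items():
        if version is None:
            print(f"{name}: not installed")
        else:
            tag = "nightly" if looks_like_nightly(version) else "release"
            print(f"{name}: {version} ({tag})")
```

This reads package metadata only and does not import torch, so it runs even when the installation is broken.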

About

Data loaders and abstractions for text and NLP
