
Commit dbf4d43

Zachary Brown (804-888-6825) authored and committed
added README
1 parent 18ce9ad commit dbf4d43

1 file changed: +24 -0 lines changed


README.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# deep_learning_modern_nlp
Resources and notebook for ["Deep Learning and Modern NLP" Workshop](https://pydata.org/miami2019/schedule/presentation/19/) for PyData Miami 2019

## Environment Setup

This tutorial requires an existing installation of [Anaconda 3](https://www.anaconda.com/download/#macos) (tested with Python 3.6). From the root directory of the repo, run:

```
conda env create -f environment.yml
source activate deep-learning-nlp
```
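
Before opening the notebooks, it can help to confirm that the environment is actually active. The following is a minimal sketch and not part of the repository: the function name `check_environment` is hypothetical, the environment name `deep-learning-nlp` is taken from the activation command above, and the check assumes conda has set the `CONDA_DEFAULT_ENV` variable, which it does on activation.

```python
import os
import sys

EXPECTED_ENV = "deep-learning-nlp"  # name used in the activation command above
TESTED_PYTHON = (3, 6)              # version the README reports testing with


def check_environment(environ=None, version_info=None):
    """Return a list of warning strings; an empty list means both checks passed.

    `environ` and `version_info` default to the live process values and are
    parameters only so the check can be exercised without a real conda setup.
    """
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info

    warnings = []
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != EXPECTED_ENV:
        warnings.append(
            f"active conda env is {active!r}, expected {EXPECTED_ENV!r}"
        )
    major_minor = tuple(version_info[:2])
    if major_minor != TESTED_PYTHON:
        warnings.append(
            f"running Python {major_minor[0]}.{major_minor[1]}, "
            f"tutorial was tested with {TESTED_PYTHON[0]}.{TESTED_PYTHON[1]}"
        )
    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```

A version mismatch is reported as a warning rather than an error, since the README only states which version was tested, not which versions are required.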

## Datasets used in Tutorials
Data for these tutorials come from several sources and were prepared in advance as pickled pandas DataFrames. The original sources are linked below.

* Perceptron
    * [Stanford Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/)
    * [Stack Overflow Q&A](https://cloud.google.com/blog/products/gcp/google-bigquery-public-datasets-now-include-stack-overflow-q-a)
* LSTM Classification
    * [R8 of Reuters 21578](https://www.cs.umb.edu/~smimarog/textmining/datasets/)
* Part of Speech Tagging
    * Sample of the [Penn Treebank](https://corochann.com/penn-tree-bank-ptb-dataset-introduction-1456.html) dataset from the [NLTK Corpora](http://www.nltk.org/nltk_data/)
* Machine Translation
    * [European Parliament Proceedings Parallel Corpus 1996-2011](http://www.statmt.org/europarl/)
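
Because the prepared data ships as pickled DataFrames, loading one would typically go through `pandas.read_pickle`. The sketch below is illustrative only: this README does not list the file names or columns, so `reviews.pkl`, the `text` and `label` columns, and the helper names `load_dataset` and `demo_round_trip` are all hypothetical stand-ins.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_dataset(path):
    """Load a pickled DataFrame from `path`.

    Unpickling can execute arbitrary code, so only load files
    that come from a source you trust.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"dataset not found: {path}")
    frame = pd.read_pickle(path)
    if not isinstance(frame, pd.DataFrame):
        raise TypeError(f"expected a DataFrame, got {type(frame).__name__}")
    return frame


def demo_round_trip():
    """Write a tiny made-up DataFrame to a temp file and load it back."""
    # Hypothetical stand-in data; the real datasets have their own schemas.
    reviews = pd.DataFrame(
        {"text": ["great film", "dull plot"], "label": [1, 0]}
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "reviews.pkl"  # hypothetical file name
        reviews.to_pickle(path)
        return load_dataset(path)


if __name__ == "__main__":
    print(demo_round_trip().head())
```

Pickles are tied fairly closely to the pandas version that wrote them, which is one reason to load the data from inside the environment created from `environment.yml` rather than a system-wide install.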
