*Please read the [contribution guidelines](contributing.md) before contributing.*
## Contributing
Please feel free to submit [pull requests](https://github.com/keonkim/awesome-nlp/pulls), or email Martin Park ([email protected]) / Keon Kim ([email protected]) to add links.

## Table of Contents
- [Tutorials and Courses](#tutorials-and-courses)
- [videos](#videos)
- [Deep Learning for NLP](#deep-learning-for-nlp)
- [Packages](#packages)
- [Implementations](#implementations)
- [Libraries](#libraries)
- [Node.js](#user-content-node-js)
- [Java](#user-content-java)
- [Clojure](#user-content-clojure)
- [Ruby](#user-content-ruby)
- [Services](#services)
- [Articles](#articles)
- [Review Articles](#review-articles)
- [Word Vectors](#word-vectors)
- [Thought Vectors](#thought-vectors)
- [Machine Translation](#machine-translation)
- [General Natural Language Processing](#general-natural-language-processing)
## Tutorials and Courses
* TensorFlow tutorial on [Seq2Seq](https://www.tensorflow.org/tutorials/seq2seq/index.html) models
* Natural Language Understanding with Distributed Representation [Lecture Note](https://github.com/nyu-dl/NLP_DL_Lecture_Note) by Cho
* [Michael Collins](http://www.cs.columbia.edu/~mcollins/) - one of the best NLP teachers. Check out the material on the courses he is teaching.

### videos
* [Intro to Natural Language Processing](https://www.coursera.org/learn/natural-language-processing) on Coursera by U of Michigan
* [Intro to Artificial Intelligence](https://www.udacity.com/course/intro-to-artificial-intelligence--cs271) course on Udacity which also covers NLP
* [Deep Learning for Natural Language Processing (2015 classes)](https://www.youtube.com/playlist?list=PLmImxx8Char8dxWB9LRqdpCTmewaml96q) by Richard Socher
* [Deep Learning for Natural Language Processing (2016 classes)](https://www.youtube.com/playlist?list=PLmImxx8Char9Ig0ZHSyTqGsdhb9weEGam) by Richard Socher. Updated to make use of TensorFlow. Note that some lectures are missing (lecture 9, and lectures 12 onwards).
* [Natural Language Processing](https://www.coursera.org/learn/nlangp) - course on Coursera that ran only in 2013; the videos are not available at the moment. Mike Collins is a great professor and his notes and lectures are very good.
* [Statistical Machine Translation](http://mt-class.org) - a machine translation course with great assignments and slides.
* [Natural Language Processing SFU](http://www.cs.sfu.ca/~anoop/teaching/CMPT-413-Spring-2014/) - course by [Prof Anoop Sarkar](https://www.cs.sfu.ca/~anoop/) on Natural Language Processing. Good notes and some good lectures on YouTube about HMMs.
* [Udacity Deep Learning](https://classroom.udacity.com/courses/ud730) - Deep Learning course on Udacity (using TensorFlow) which covers a section on using deep learning for NLP tasks (covering Word2Vec, RNNs, and LSTMs).
* [NLTK with Python 3 for Natural Language Processing](https://www.youtube.com/playlist?list=PLQVvvaa0QuDf2JswnfiGkliBInZnIC4HL) by Harrison Kinsley (sentdex). Good tutorials with NLTK code implementation.
## Deep Learning for NLP
[Stanford CS 224D: Deep Learning for NLP class](http://cs224d.stanford.edu/syllabus.html)
Class by [Richard Socher](https://scholar.google.com/citations?user=FaOcyfMAAAAJ&hl=en). 2016 content was updated to make use of TensorFlow. Lecture slides and reading materials for the 2016 class are [here](http://cs224d.stanford.edu/syllabus.html); videos for the 2016 class are [here](https://www.youtube.com/playlist?list=PLmImxx8Char9Ig0ZHSyTqGsdhb9weEGam). Note that some 2016 lecture videos are missing (lecture 9, and lectures 12 onwards). All videos for the 2015 class are [here](https://www.youtube.com/playlist?list=PLmImxx8Char8dxWB9LRqdpCTmewaml96q).

[Udacity Deep Learning](https://classroom.udacity.com/courses/ud730)
Deep Learning course on Udacity (using TensorFlow) which covers a section on using deep learning for NLP tasks. This section covers how to implement Word2Vec, RNNs, and LSTMs.

[A Primer on Neural Network Models for Natural Language Processing](http://u.cs.biu.ac.il/~yogo/nnlp.pdf)
Yoav Goldberg. October 2015. No new info; a 75-page summary of the state of the art.
## Packages

### Implementations
* [Pre-trained word embeddings for WSJ corpus](https://github.com/ai-ku/wvec) by Koc AI-Lab
* [Word2vec](https://code.google.com/archive/p/word2vec) by Mikolov
* [HLBL language model](http://metaoptimize.com/projects/wordreprs/) by Turian
* [Real-valued vector "embeddings"](http://www.cis.upenn.edu/~ungar/eigenwords/) by Dhillon
* [Improving Word Representations Via Global Context And Multiple Word Prototypes](http://www.socher.org/index.php/Main/ImprovingWordRepresentationsViaGlobalContextAndMultipleWordPrototypes) by Huang
* [Twitter-text](https://github.com/twitter/twitter-text) - A JavaScript implementation of Twitter's text processing library
* [Knwl.js](https://github.com/loadfive/Knwl.js) - A Natural Language Processor in JS
* [Retext](https://github.com/wooorm/retext) - Extensible system for analyzing and manipulating natural language
* [NLP Compromise](https://github.com/nlp-compromise/nlp_compromise) - Natural Language processing in the browser
* [Natural](https://github.com/NaturalNode/natural) - general natural language facilities for node
* [python-frog](https://github.com/proycon/python-frog) - Python binding to Frog, an NLP suite for Dutch (POS tagging, lemmatisation, dependency parsing, NER)
* [python-zpar](https://github.com/EducationalTestingService/python-zpar) - Python bindings for [ZPar](https://github.com/frcchang/zpar), a statistical part-of-speech tagger, constituency parser, and dependency parser for English.
* [colibri-core](https://github.com/proycon/colibri-core) - Python binding to a C++ library for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
* [spaCy](https://github.com/spacy-io/spaCy) - Industrial strength NLP with Python and Cython.
* [PyStanfordDependencies](https://github.com/dmcc/PyStanfordDependencies) - Python interface for converting Penn Treebank trees to Stanford Dependencies.

* <a id="c++">**C++** - C++ Libraries</a>
* [CRFsuite](http://www.chokkan.org/software/crfsuite/) - CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.
* [BLLIP Parser](https://github.com/BLLIP/bllip-parser) - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
* [colibri-core](https://github.com/proycon/colibri-core) - C++ library, command line tools, and Python binding for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
* [ucto](https://github.com/LanguageMachines/ucto) - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
* [libfolia](https://github.com/LanguageMachines/libfolia) - C++ library for the [FoLiA format](http://proycon.github.io/folia/)
* [frog](https://github.com/LanguageMachines/frog) - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
* [MeTA](https://github.com/meta-toolkit/meta) - [MeTA: ModErn Text Analysis](https://meta-toolkit.org/) is a C++ Data Sciences Toolkit that facilitates mining big text data.
* <a id="ruby">**Ruby**</a>
* Kevin Dias's [A collection of Natural Language Processing (NLP) Ruby libraries, tools and software](https://github.com/diasks2/ruby-nlp)

### Services
* [Wit-ai](https://github.com/wit-ai/wit) - Natural Language Interface for apps and devices.
## Articles
133
148
@@ -141,7 +156,7 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
141
156
*[Online named entity recognition method for microtexts in social networking services: A case study of twitter](http://arxiv.org/pdf/1301.2857.pdf)
142
157
143
158
144
### Word Vectors
Resources about word vectors, aka word embeddings, and distributed representations for words.
Word vectors are numeric representations of words that are often used as input to deep learning systems. This process is sometimes called pretraining.
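The intuition can be sketched with a toy distributional model: build each word's vector from counts of its neighboring words, then compare vectors with cosine similarity. This is only an illustration under assumed inputs (the corpus, window size, and raw counts are hypothetical simplifications); the resources below learn dense embeddings instead of counting.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus, purely for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

def vector(word):
    """Crude distributional word vector: counts of neighbors in a +/-1 window."""
    ctx = Counter()
    for line in corpus:
        words = line.split()
        for i, w in enumerate(words):
            if w != word:
                continue
            for j in (i - 1, i + 1):
                if 0 <= j < len(words):
                    ctx[words[j]] += 1
    return ctx

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

# "cat" and "dog" occur in similar contexts, so their vectors are close.
print(round(cosine(vector("cat"), vector("dog")), 3))  # → 0.913
```

Words that appear in similar contexts end up with similar vectors; the methods listed here achieve the same effect with learned low-dimensional embeddings rather than raw co-occurrence counts.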
* [Skip Thought Vectors](http://arxiv.org/abs/1506.06726) - word representation method
* [Adaptive skip-gram](http://arxiv.org/abs/1502.07257) - similar approach, with adaptive properties
### Thought Vectors
Thought vectors are numeric representations for sentences, paragraphs, and documents. The following papers are listed in order of date published; each one replaces the last as the state of the art in sentiment analysis.
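A naive baseline makes the idea concrete: average a sentence's word vectors to get one fixed-length vector for the whole sentence. The 3-d vectors below are made-up numbers for illustration; the papers in this section learn far better sentence representations than this averaging trick.

```python
from math import sqrt

# Hypothetical 3-d word vectors, hand-picked for illustration only.
word_vecs = {
    "the": [0.1, 0.0, 0.1],
    "movie": [0.9, 0.2, 0.1],
    "film": [0.8, 0.3, 0.2],
    "was": [0.0, 0.1, 0.0],
    "great": [0.1, 0.9, 0.1],
    "terrible": [0.1, 0.0, 0.9],
}

def sentence_vector(sentence):
    """Crude 'thought vector': the average of the word vectors in the sentence."""
    vecs = [word_vecs[w] for w in sentence.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

pos = "the movie was great"
pos2 = "the film was great"
neg = "the movie was terrible"

# Sentences with similar sentiment land closer together in vector space.
print(cosine(sentence_vector(pos), sentence_vector(pos2)) >
      cosine(sentence_vector(pos), sentence_vector(neg)))  # → True
```

Averaging throws away word order, which is exactly the weakness the recursive and recurrent models below address.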
[Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.383.1327&rep=rep1&type=pdf)
* [IXA pipeline: Efficient and Ready to Use Multilingual NLP tools](http://www.lrec-conf.org/proceedings/lrec2014/pdf/775_Paper.pdf)

### Single Exchange Dialogs
[A Neural Network Approach to Context-Sensitive Generation of Conversational Responses](http://arxiv.org/pdf/1506.06714v1.pdf)
Sordoni 2015. Generates responses to tweets.
Uses [Recurrent Neural Network Language Model (RLM) architecture
Joulin, Mikolov 2015. [Stack RNN source code](https://github.com/facebook/Stack-RNN) and [blog post](https://research.facebook.com/blog/1642778845966521/inferring-algorithmic-patterns-with-stack/)
### General Natural Language Processing
* [Neural autocoder for paragraphs and documents](http://arxiv.org/abs/1506.01057) - LSTM representation
* [LSTM over tree structures](http://arxiv.org/abs/1503.04881)
* [Sequence to Sequence Learning](http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf) - word vectors for machine translation
* [Teaching Machines to Read and Comprehend](http://arxiv.org/abs/1506.03340) - DeepMind paper
* [Efficient Estimation of Word Representations in Vector Space](http://arxiv.org/pdf/1301.3781.pdf)