
Commit 4e4b685

Author: Daniel Khashabi
Merge pull request keon#1 from keonkim/master
bring to the latest.
2 parents 01c3dd9 + 7785050

File tree: 1 file changed (+42 -27 lines)

README.md

Lines changed: 42 additions & 27 deletions
@@ -1,18 +1,20 @@
-# awesome-nlp
-A curated list of resources dedicated to Natural Language Processing
+# awesome-nlp [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
 
+> A curated list of resources dedicated to Natural Language Processing
+>
+> Maintainers - [Keon Kim](https://github.com/keonkim), [Martin Park](https://github.com/outpark)
 
-Maintainers - [Keon Kim](http://github.com/keonkim), [Martin Park](https://github.com/outpark)
+*Please read the [contribution guidelines](contributing.md) before contributing.*
 
-## Contributing
-Please feel free to [pull requests](https://github.com/keonkim/awesome-nlp/pulls), email Martin Park ([email protected])/Keon Kim ([email protected]) to add links.
+Please feel free to [pull requests](https://github.com/keonkim/awesome-nlp/pulls), or email Martin Park ([email protected])/Keon Kim ([email protected]) to add links.
 
 
 ## Table of Contents
 
 - [Tutorials and Courses](#tutorials-and-courses)
 - [videos](#videos)
-- [Codes](#codes)
+- [Deep Learning for NLP](#deep-learning-for-nlp)
+- [Packages](#packages)
 - [Implemendations](#implementations)
 - [Libraries](#libraries)
 - [Node.js](#user-content-node-js)
@@ -21,12 +23,18 @@ Please feel free to [pull requests](https://github.com/keonkim/awesome-nlp/pulls
 - [Java](#user-content-java)
 - [Clojure](#user-content-clojure)
 - [Ruby](#user-content-ruby)
+- [Services](#services)
 - [Articles](#articles)
 - [Review Articles](#review-articles)
 - [Word Vectors](#word-vectors)
+- [Thought Vectors](#thought-vectors)
+- [Machine Translation](#machine-translation)
 - [General Natural Language Processing](#general-natural-langauge-processing)
 - [Named Entity Recognition](#name-entity-recognition)
-- [Machine Translation](#machine-translation)
+- [Single Exchange Dialogs](#single-exchange-dialogs)
+- [Memory and Attention Models](#memory-and-attention-models)
+- [General Natural Language Processing](#general-natural-language-processing)
+- [Named Entity Recognition](#named-entity-recognition)
 - [Neural Network](#neural-network)
 - [Supplementary Materials](#supplementary-materials)
 - [Blogs](#blogs)
@@ -35,35 +43,39 @@ Please feel free to [pull requests](https://github.com/keonkim/awesome-nlp/pulls
 
 ## Tutorials and Courses
 
-* Tensor Flow Tutorial on [Seq2Seq](http://www.tensorflow.org/tutorials/seq2seq/index.html) Models
+* Tensor Flow Tutorial on [Seq2Seq](https://www.tensorflow.org/tutorials/seq2seq/index.html) Models
 * Natural Language Understanding with Distributed Representation [Lecture Note](https://github.com/nyu-dl/NLP_DL_Lecture_Note) by Cho
+* [Michael Collins](http://www.cs.columbia.edu/~mcollins/) - one of the best NLP teachers. Check out the material on the courses he is teaching.
 
 ### videos
 
-* [Stanford's Coursera Course](https://www.coursera.org/course/nlp) on NLP from basics
-* [Intro to Natural Language Processing](https://www.coursera.org/course/nlpintro) on Coursera by U of Michigan
+* [Intro to Natural Language Processing](https://www.coursera.org/learn/natural-language-processing) on Coursera by U of Michigan
 * [Intro to Artificial Intelligence](https://www.udacity.com/course/intro-to-artificial-intelligence--cs271) course on Udacity which also covers NLP
-* [Deep Learning for Natural Language Processing](http://cs224d.stanford.edu/) by Richard Socher
-* [Natural Language Processing](https://class.coursera.org/nlangp-001) - course on Coursera that was only done in 2013 but the videos are still up. Also Mike Collins is a great professor and his notes and lectures are very good.
+* [Deep Learning for Natural Language Processing (2015 classes)](https://www.youtube.com/playlist?list=PLmImxx8Char8dxWB9LRqdpCTmewaml96q) by Richard Socher
+* [Deep Learning for Natural Language Processing (2016 classes)](https://www.youtube.com/playlist?list=PLmImxx8Char9Ig0ZHSyTqGsdhb9weEGam) by Richard Socher. Updated to make use of Tensorflow. Note that there are some lectures missing (lecture 9, and lectures 12 onwards).
+* [Natural Language Processing](https://www.coursera.org/learn/nlangp) - course on Coursera that was only done in 2013. The videos are not available at the moment. Also Mike Collins is a great professor and his notes and lectures are very good.
 * [Statistical Machine Translation](http://mt-class.org) - a Machine Translation course with great assignments and slides.
 * [Natural Language Processing SFU](http://www.cs.sfu.ca/~anoop/teaching/CMPT-413-Spring-2014/) - course by [Prof Anoop Sarkar](https://www.cs.sfu.ca/~anoop/) on Natural Language Processing. Good notes and some good lectures on youtube about HMM.
+* [Udacity Deep Learning](https://classroom.udacity.com/courses/ud730) Deep Learning course on Udacity (using Tensorflow) which covers a section on using deep learning for NLP tasks (covering Word2Vec, RNN's and LSTMs).
+* [NLTK with Python 3 for Natural Language Processing](https://www.youtube.com/playlist?list=PLQVvvaa0QuDf2JswnfiGkliBInZnIC4HL) by Harrison Kinsley (sentdex). Good tutorials with NLTK code implementation.
 
 ## Deep Learning for NLP
-[Stanford Natural Language Processing](https://class.coursera.org/nlp/lecture/preview)
-Intro NLP course with videos. This has no deep learning. But it is a good primer for traditional nlp.
 
 [Stanford CS 224D: Deep Learning for NLP class](http://cs224d.stanford.edu/syllabus.html)
-[Richard Socher](https://scholar.google.com/citations?user=FaOcyfMAAAAJ&hl=en). (2015) Class with videos, and slides.
+Class by [Richard Socher](https://scholar.google.com/citations?user=FaOcyfMAAAAJ&hl=en). 2016 content was updated to make use of Tensorflow. Lecture slides and reading materials for 2016 class [here](http://cs224d.stanford.edu/syllabus.html). Videos for 2016 class [here](https://www.youtube.com/playlist?list=PLmImxx8Char9Ig0ZHSyTqGsdhb9weEGam). Note that there are some lecture videos missing for 2016 (lecture 9, and lectures 12 onwards). All videos for 2015 class [here](https://www.youtube.com/playlist?list=PLmImxx8Char8dxWB9LRqdpCTmewaml96q)
+
+[Udacity Deep Learning](https://classroom.udacity.com/courses/ud730)
+Deep Learning course on Udacity (using Tensorflow) which covers a section on using deep learning for NLP tasks. This section covers how to implement Word2Vec, RNN's and LSTMs.
 
 [A Primer on Neural Network Models for Natural Language Processing](http://u.cs.biu.ac.il/~yogo/nnlp.pdf)
 Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 
 
-## Codes
+## Packages
 
 ### Implementations
 * [Pre-trained word embeddings for WSJ corpus](https://github.com/ai-ku/wvec) by Koc AI-Lab
-* [Word2vec](https://code.google.com/p/word2vec/) by Mikolov
+* [Word2vec](https://code.google.com/archive/p/word2vec) by Mikolov
 * [HLBL language model](http://metaoptimize.com/projects/wordreprs/) by Turian
 * [Real-valued vector "embeddings"](http://www.cis.upenn.edu/~ungar/eigenwords/) by Dhillon
 * [Improving Word Representations Via Global Context And Multiple Word Prototypes](http://www.socher.org/index.php/Main/ImprovingWordRepresentationsViaGlobalContextAndMultipleWordPrototypes) by Huang
@@ -77,7 +89,7 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 * [Twitter-text](https://github.com/twitter/twitter-text) - A JavaScript implementation of Twitter's text processing library
 * [Knwl.js](https://github.com/loadfive/Knwl.js) - A Natural Language Processor in JS
 * [Retext](https://github.com/wooorm/retext) - Extensible system for analyzing and manipulating natural language
-* [NLP Compromise](https://github.com/spencermountain/nlp_compromise) - Natural Language processing in the browser
+* [NLP Compromise](https://github.com/nlp-compromise/nlp_compromise) - Natural Language processing in the browser
 * [Natural](https://github.com/NaturalNode/natural) - general natural language facilities for node
 
 * <a id="python">**Python** - Python NLP Libraries</a>
@@ -96,7 +108,7 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 * [python-frog](https://github.com/proycon/python-frog) - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
 * [python-zpar](https://github.com/EducationalTestingService/python-zpar) - Python bindings for [ZPar](https://github.com/frcchang/zpar), a statistical part-of-speech-tagger, constiuency parser, and dependency parser for English.
 * [colibri-core](https://github.com/proycon/colibri-core) - Python binding to C++ library for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
-* [spaCy](https://github.com/honnibal/spaCy/) - Industrial strength NLP with Python and Cython.
+* [spaCy](https://github.com/spacy-io/spaCy) - Industrial strength NLP with Python and Cython.
 * [PyStanfordDependencies](https://github.com/dmcc/PyStanfordDependencies) - Python interface for converting Penn Treebank trees to Stanford Dependencies.
 
 * <a id="c++">**C++** - C++ Libraries</a>
@@ -105,9 +117,9 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 * [CRFsuite](http://www.chokkan.org/software/crfsuite/) - CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.
 * [BLLIP Parser](https://github.com/BLLIP/bllip-parser) - BLLIP Natural Language Parser (also known as the Charniak-Johnson parser)
 * [colibri-core](https://github.com/proycon/colibri-core) - C++ library, command line tools, and Python binding for extracting and working with basic linguistic constructions such as n-grams and skipgrams in a quick and memory-efficient way.
-* [ucto](https://github.com/proycon/ucto) - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
-* [libfolia](https://github.com/proycon/libfolia) - C++ library for the [FoLiA format](http://proycon.github.io/folia/)
-* [frog](https://github.com/proycon/frog) - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
+* [ucto](https://github.com/LanguageMachines/ucto) - Unicode-aware regular-expression based tokenizer for various languages. Tool and C++ library. Supports FoLiA format.
+* [libfolia](https://github.com/LanguageMachines/libfolia) - C++ library for the [FoLiA format](http://proycon.github.io/folia/)
+* [frog](https://github.com/LanguageMachines/frog) - Memory-based NLP suite developed for Dutch: PoS tagger, lemmatiser, dependency parser, NER, shallow parser, morphological analyzer.
 * [MeTA](https://github.com/meta-toolkit/meta) - [MeTA : ModErn Text Analysis](https://meta-toolkit.org/) is a C++ Data Sciences Toolkit that facilitates mining big text data.
 * [Mecab (Japanese)](http://taku910.github.io/mecab/)
 * [Mecab (Korean)](http://eunjeon.blogspot.com/)
@@ -128,6 +140,9 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 
 * <a id="ruby">**Ruby**</a>
 * Kevin Dias's [A collection of Natural Language Processing (NLP) Ruby libraries, tools and software](https://github.com/diasks2/ruby-nlp)
+
+### Services
+* [Wit-ai](https://github.com/wit-ai/wit) - Natural Language Interface for apps and devices.
 
 ## Articles
 
@@ -141,7 +156,7 @@ Yoav Goldberg. October 2015. No new info, 75 page summary of state of the art.
 * [Online named entity recognition method for microtexts in social networking services: A case study of twitter](http://arxiv.org/pdf/1301.2857.pdf)
 
 
-### Word Vectors (part of it from [DL4NLP](https://github.com/andrewt3000/DL4NLP))
+### Word Vectors
 Resources about word vectors, aka word embeddings, and distributed representations for words.
 Word vectors are numeric representations of words that are often used as input to deep learning systems. This process is sometimes called pretraining.
 
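As a side note to the word-vector description in the hunk above, here is a minimal Python sketch of the idea: words as numeric vectors compared by cosine similarity. The 3-dimensional vectors are made up for illustration (real pretrained embeddings such as word2vec or GloVe typically have 100-300 dimensions); this is not code from any resource listed here.

```python
import numpy as np

# Toy "word vectors", invented purely for illustration; in practice these
# come from pretrained models such as word2vec or GloVe.
vectors = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.75, 0.70, 0.15]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def cosine(a, b):
    # Cosine similarity: close to 1.0 when two vectors point the same way.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["queen"]))  # high: related words
print(cosine(vectors["king"], vectors["apple"]))  # lower: unrelated words
```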
@@ -165,7 +180,7 @@ Pennington, Socher, Manning. 2014. Creates word vectors and relates word2vec to
165180
* [Skip Thought Vectors](http://arxiv.org/abs/1506.06726) - word representation method
166181
* [Adaptive skip-gram](http://arxiv.org/abs/1502.07257) - similar approach, with adaptive properties
167182

168-
### Thought Vectors (from [DL4NLP](https://github.com/andrewt3000/DL4NLP))
183+
### Thought Vectors
169184
Thought vectors are numeric representations for sentences, paragraphs, and documents. The following papers are listed in order of date published, each one replaces the last as the state of the art in sentiment analysis.
170185

171186
[Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.383.1327&rep=rep1&type=pdf)
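To make the "numeric representations for sentences" idea in the hunk above concrete, the sketch below averages toy word vectors into one fixed-length sentence vector. This is only a naive baseline for illustration, not the method of the sentiment-treebank paper or the other papers in this section.

```python
import numpy as np

# Toy word vectors, invented for illustration (see the word-vector sketch above).
vectors = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.75, 0.70, 0.15]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def sentence_vector(sentence, dim=3):
    # Average the vectors of known words: a crude fixed-length
    # "thought vector" baseline, not a trained sentence model.
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    return np.mean(known, axis=0) if known else np.zeros(dim)

print(sentence_vector("the king and the queen"))  # mean of two word vectors
print(sentence_vector("an apple"))                # just the "apple" vector
```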
@@ -198,7 +213,7 @@ Sutskever, Vinyals, Le 2014. ([nips presentation](http://research.microsoft.com
 * [IXA pipeline: Efficient and Ready to Use Multilingual NLP tools](http://www.lrec-conf.org/proceedings/lrec2014/pdf/775_Paper.pdf)
 
 
-### Single Exchange Dialogs (from [DL4NLP](https://github.com/andrewt3000/DL4NLP))
+### Single Exchange Dialogs
 [A Neural Network Approach toContext-Sensitive Generation of Conversational Responses](http://arxiv.org/pdf/1506.06714v1.pdf)
 Sordoni 2015. Generates responses to tweets.
 Uses [Recurrent Neural Network Language Model (RLM) architecture
@@ -229,8 +244,8 @@ Graves et al. 2014.
 Joulin, Mikolov 2015. [Stack RNN source code](https://github.com/facebook/Stack-RNN) and [blog post](https://research.facebook.com/blog/1642778845966521/inferring-algorithmic-patterns-with-stack/)
 
 ### General Natural Language Processing
-* [Neural autocoder for paragraphs and documents](http://arxiv.org/abs/1506.01057) - LTSM representation
-* [LTSM over tree structures](http://arxiv.org/abs/1503.04881)
+* [Neural autocoder for paragraphs and documents](http://arxiv.org/abs/1506.01057) - LSTM representation
+* [LSTM over tree structures](http://arxiv.org/abs/1503.04881)
 * [Sequence to Sequence Learning](http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf) - word vectors for machine translation
 * [Teaching Machines to Read and Comprehend](http://arxiv.org/abs/1506.03340) - DeepMind paper
 * [Efficient Estimation of Word Representations in Vector Space](http://arxiv.org/pdf/1301.3781.pdf)
