
Commit 67e07fa

Update Attention with Transformer Annotation
- Update Attention section
- Move relevant links to Dialogs and Conversational
1 parent 2a110e9 commit 67e07fa

1 file changed
README.md

Lines changed: 12 additions & 19 deletions
@@ -32,7 +32,7 @@ Please feel free to create [pull requests](https://github.com/keonkim/awesome-nl
  - [Text Embeddings](#text-embeddings)
  - [Thought Vectors](#thought-vectors)
  - [Machine Translation](#machine-translation)
- - [Single Exchange Dialogs](#single-exchange-dialogs)
+ - [Dialogs and Conversational](#dialogs-and-conversational)
  - [Memory and Attention Models](#memory-and-attention-models)
  - [Named Entity Recognition](#named-entity-recognition)
  - [Natural Language Understanding](#natural-language-understanding)
@@ -269,45 +269,38 @@ timesteps, thereby achieving strong performance in many text classification task
  * [arXiv: Sequence to Sequence Learning with Neural Networks](http://arxiv.org/pdf/1409.3215v3.pdf) Sutskever, Vinyals, Le 2014 proved the effectiveness of **LSTM** for Machine Translation. Check their [NIPS presentation](http://research.microsoft.com/apps/video/?id=239083)
  * [arXiv: Neural Machine Translation by jointly learning to align and translate](http://arxiv.org/pdf/1409.0473v6.pdf)
  Bahdanau, Cho 2014 introduced the **attention mechanism** in NLP (a short sketch of the attention equations follows this group of links)
- * [arXiv: A Convolutional encoder model for neural machine translation](https://arxiv.org/pdf/1611.02344.pdf) by Gehring et al., 2017. The paper is from Facebook AI research and its code is available [here](https://github.com/facebookresearch/fairseq).
- * [Convolutional Sequence to Sequence learning](https://arxiv.org/pdf/1705.03122.pdf) by Gehring et al., 2017. The paper is from Facebook AI research and its code is available [here](https://github.com/facebookresearch/fairseq).
- * [Convolutional over Recurrent Encoder for neural machine translation](https://ufal.mff.cuni.cz/pbml/108/art-dakwale-monz.pdf) by Dakwale and Monz from the University of Amsterdam compares CNNs with a recurrent neural network augmented with additional convolutional layers.
- * Open Source code: [OpenNMT](http://opennmt.net/) is an open source initiative for neural machine translation and neural sequence modeling. It has a [PyTorch](https://github.com/OpenNMT/OpenNMT-py), [Tensorflow](https://github.com/OpenNMT/OpenNMT-tf) and the original [LuaTorch](https://github.com/OpenNMT/OpenNMT) implementation.
+ * [arXiv: A Convolutional encoder model for neural machine translation](https://arxiv.org/pdf/1611.02344.pdf) by Gehring et al., 2017. The paper is from Facebook AI research and its code is available [here](https://github.com/facebookresearch/fairseq)
+ * [Convolutional Sequence to Sequence learning](https://arxiv.org/pdf/1705.03122.pdf) by Gehring et al., 2017. The paper is from Facebook AI research and its code is available [here](https://github.com/facebookresearch/fairseq)
+ * [Convolutional over Recurrent Encoder for neural machine translation](https://ufal.mff.cuni.cz/pbml/108/art-dakwale-monz.pdf) by Dakwale and Monz from the University of Amsterdam compares CNNs with a recurrent neural network augmented with additional convolutional layers
+ * Open Source code: [OpenNMT](http://opennmt.net/) is an open source initiative for neural machine translation and neural sequence modeling. [PyTorch](https://github.com/OpenNMT/OpenNMT-py), [Tensorflow](https://github.com/OpenNMT/OpenNMT-tf) and the original [LuaTorch](https://github.com/OpenNMT/OpenNMT) implementations are available
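For readers skimming the machine-translation links above, here is a minimal sketch of the Bahdanau-style attention they refer to. The notation follows the 2014 paper, not any of the linked codebases: the decoder scores each encoder annotation h_j against its previous state s_{i-1} and uses the weighted average as the context vector.

```latex
% Sketch of Bahdanau et al. (2014) additive attention; a(\cdot) is the learned alignment model
e_{ij} = a(s_{i-1}, h_j)                                  % alignment score for encoder position j at decoder step i
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})}  % attention weights: softmax over encoder positions
c_i = \sum_{j} \alpha_{ij} h_j                            % context vector fed to the decoder at step i
```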

- ### Single Exchange Dialogs
+ ### Dialogs and Conversational

  [Back to Top](#contents)
  * [A Neural Network Approach to Context-Sensitive Generation of Conversational Responses](http://arxiv.org/pdf/1506.06714v1.pdf)
- Sordoni 2015. Generates responses to tweets.
- * Uses [Recurrent Neural Network Language Model (RLM) architecture
- of (Mikolov et al., 2010).](http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf). Source code: [RNNLM Toolkit](http://www.fit.vutbr.cz/~imikolov/rnnlm/index.html)
+ Sordoni 2015. Generates responses to tweets.
+ * Uses the [Recurrent Neural Network Language Model (RLM) architecture of (Mikolov et al., 2010)](http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf). [Code of RNNLM Toolkit](http://www.fit.vutbr.cz/~imikolov/rnnlm/index.html)
  * RNNLM Tutorial: [Implementing RNN Language Models by Denny Britz](http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-2-implementing-a-language-model-rnn-with-python-numpy-and-theano/) (the basic RNNLM recurrence is sketched after this group of links)
  * [Neural Responding Machine for Short-Text Conversation](http://arxiv.org/pdf/1503.02364v2.pdf)
  Shang et al. 2015. Uses a Neural Responding Machine. Trained on the Weibo dataset. Achieves one-round conversations with 75% appropriate responses.
  * [arXiv: A Neural Conversation Model](http://arxiv.org/pdf/1506.05869v3.pdf) Vinyals, [Le](https://scholar.google.com/citations?user=vfT6-XIAAAAJ) 2015. Uses LSTM RNNs to generate conversational responses
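As a pointer for the RNNLM references above, this is a minimal sketch of the recurrence such a language model computes (a standard Elman-style formulation; W, U and V are illustrative parameter names, not identifiers from the linked toolkit):

```latex
h_t = \sigma(W x_t + U h_{t-1})                    % hidden state from the current word embedding x_t and the previous state
P(w_{t+1} \mid w_{1:t}) = \mathrm{softmax}(V h_t)  % distribution over the next word
```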

  ### Memory and Attention Models

  [Back to Top](#contents)
+ Some are courtesy [andrewt3000/DL4NLP](https://github.com/andrewt3000/DL4NLP)

- Most are courtesy [andrewt3000/DL4NLP](https://github.com/andrewt3000/DL4NLP)
  * Interactive tutorial on [Augmented RNNs](www.distill.pub/2016/augmented-rnns/) including Attention and Memory networks
- * [Reasoning, Attention and Memory RAM workshop at NIPS 2015. slides included](http://www.thespermwhale.com/jaseweston/ram/)
+ * [Annotated Transformer](http://nlp.seas.harvard.edu//2018/04/03/attention.html) from the [Attention is All You Need](https://arxiv.org/abs/1706.03762) work explains the Transformer implementation in line-by-line detail. Both links are highly recommended. (A minimal sketch of the scaled dot-product attention at the heart of the Transformer appears after this list.)
  * [Memory Networks](http://arxiv.org/pdf/1410.3916v10.pdf) Weston et al. 2014
  * [End-To-End Memory Networks](http://arxiv.org/pdf/1503.08895v4.pdf) Sukhbaatar et al. 2015
  Memory networks are implemented in [MemNN](https://github.com/facebook/MemNN). Attempts to solve the task of reasoning over attention and memory. (A one-hop sketch of the end-to-end memory network equations follows this list.)
- * [Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks](http://arxiv.org/pdf/1502.05698v7.pdf)
- Weston 2015. Classifies QA tasks like single factoid, yes/no, etc. Extends memory networks.
- * [Evaluating prerequisite qualities for learning end to end dialog systems](http://arxiv.org/pdf/1511.06931.pdf)
- Dodge et al. 2015. Tests Memory Networks on 4 tasks including the Reddit dialog task.
- * [Jason Weston lecture on MemNN](https://www.youtube.com/watch?v=Xumy3Yjq4zk)
+ * [Reasoning, Attention and Memory RAM workshop at NIPS 2015. slides included](http://www.thespermwhale.com/jaseweston/ram/)
  * [Neural Turing Machines](http://arxiv.org/pdf/1410.5401v2.pdf), Graves et al. 2014
  * [Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets](http://arxiv.org/pdf/1503.01007v4.pdf), Joulin, Mikolov 2015
  * [Stack RNN source code](https://github.com/facebook/Stack-RNN) and [blog post](https://research.facebook.com/blog/1642778845966521/inferring-algorithmic-patterns-with-stack/)
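To go with the Annotated Transformer entry above, here is a small NumPy sketch of the scaled dot-product attention at the core of the Transformer. The function name and tensor shapes are illustrative only, not taken from the linked implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al. (2017)."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled to keep the softmax well-behaved
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    # Row-wise softmax over key positions (subtract the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is an attention-weighted average of the value vectors
    return weights @ V

# Toy example: batch of 2 sequences, 5 positions, model dimension 8 (shapes are illustrative)
Q = np.random.randn(2, 5, 8)
K = np.random.randn(2, 5, 8)
V = np.random.randn(2, 5, 8)
out = scaled_dot_product_attention(Q, K, V)  # -> shape (2, 5, 8)
```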
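For the memory-network entries above, here is a minimal sketch of a single hop of an end-to-end memory network. Notation roughly follows Sukhbaatar et al. 2015 (u is the query embedding, m_i and c_i the input and output memory embeddings, W a final projection); it is illustrative, not excerpted from the MemNN code.

```latex
p_i = \mathrm{softmax}(u^{\top} m_i)            % attention over memories, given query embedding u
o = \sum_i p_i c_i                              % response vector: attention-weighted sum of output memories
\hat{a} = \mathrm{softmax}\big(W (o + u)\big)   % predicted answer after one hop
```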
  ### Natural Language Understanding

  [Back to Top](#contents)
