 Packages marked by :v: are popular and used in production grade systems by atleast one maintainer of this repository or people they respect
 
-* [TextBlob](http://textblob.readthedocs.org/) - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of [Natural Language Toolkit (NLTK)](http://www.nltk.org/) and [Pattern](https://github.com/clips/pattern), and plays nicely with both :v:
-* [spaCy](https://github.com/spacy-io/spaCy) - Industrial strength NLP with Python and Cython :v:
-* [textacy](https://github.com/chartbeat-labs/textacy) - Higher level NLP built on spaCy :v:
-* [gensim](https://radimrehurek.com/gensim/index.html) - Python library to conduct unsupervised semantic modelling from plain text :v:
-* [scattertext](https://github.com/JasonKessler/scattertext) - Python library to produce d3 visualizations of how language differs between corpora :v:
-* [AllenNLP](https://github.com/allenai/allennlp) - An NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.
-* [Rosetta](https://github.com/columbia-applied-data-science/rosetta) - Text processing tools and wrappers (e.g. Vowpal Wabbit)
-* [PyNLPl](https://github.com/proycon/pynlpl) - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for [FoLiA](http://proycon.github.io/folia/), but also ARPA language models, Moses phrasetables, GIZA++ alignments.
-* [jPTDP](https://github.com/datquocnguyen/jPTDP) - A toolkit for joint part-of-speech (POS) tagging and dependency parsing. jPTDP provides pre-trained models for 40+ languages.
-* [BigARTM](https://github.com/bigartm/bigartm) - a fast library for topic modelling
-
-* Language Specific Tools
-* Chinese: [YAlign](https://github.com/machinalis/yalign) - A sentence aligner, a friendly tool for extracting parallel sentences from comparable corpora
-* Chinese: [SnowNLP](https://github.com/isnowfy/snownlp) - A library for processing Chinese text
-* Chinese: [jieba](https://github.com/fxsjy/jieba#jieba-1) - Chinese Words Segmentation Utilities.
-* Russian: [pymorphy2](https://github.com/kmike/pymorphy2) - a good pos-tagger for Russian
-* Thai: [PyThaiNLP](https://github.com/wannaphongcom/pythainlp) - Thai NLP in Python Package
-* Ancient Languages: [CLTK](https://github.com/cltk/cltk): The Classical Language Toolkit is a Python library and collection of texts for doing NLP in ancient languages
-* Dutch: [python-frog](https://github.com/proycon/python-frog) - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
-
+* [TextBlob](http://textblob.readthedocs.org/) - Providing a consistent API for diving into common natural language processing (NLP) tasks. Stands on the giant shoulders of [Natural Language Toolkit (NLTK)](http://www.nltk.org/) and [Pattern](https://github.com/clips/pattern), and plays nicely with both :v:
+* [spaCy](https://github.com/spacy-io/spaCy) - Industrial strength NLP with Python and Cython :v:
+* [textacy](https://github.com/chartbeat-labs/textacy) - Higher level NLP built on spaCy :v:
+* [gensim](https://radimrehurek.com/gensim/index.html) - Python library to conduct unsupervised semantic modelling from plain text :v:
+* [scattertext](https://github.com/JasonKessler/scattertext) - Python library to produce d3 visualizations of how language differs between corpora :v:
+* [AllenNLP](https://github.com/allenai/allennlp) - An NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.
+* [Rosetta](https://github.com/columbia-applied-data-science/rosetta) - Text processing tools and wrappers (e.g. Vowpal Wabbit)
+* [PyNLPl](https://github.com/proycon/pynlpl) - Python Natural Language Processing Library. General purpose NLP library for Python. Also contains some specific modules for parsing common NLP formats, most notably for [FoLiA](http://proycon.github.io/folia/), but also ARPA language models, Moses phrasetables, GIZA++ alignments.
+* [jPTDP](https://github.com/datquocnguyen/jPTDP) - A toolkit for joint part-of-speech (POS) tagging and dependency parsing. jPTDP provides pre-trained models for 40+ languages.
+* [BigARTM](https://github.com/bigartm/bigartm) - a fast library for topic modelling
+
 
 * <a id="c++">**C++** - C++ Libraries</a> | [Back to Top](#contents)
 * [MIT Information Extraction Toolkit](https://github.com/mit-nlp/MITIE) - C, C++, and Python tools for named entity recognition and relation extraction
@@ -453,6 +445,12 @@ Dodge et. al 2015. Tests Memory Networks on 4 tasks including reddit dialog task
 * [Spanish Billion words corpus with Word2Vec embeddings](http://crscardellino.me/SBWCE/)
 
+### Other Languages
+* Russian: [pymorphy2](https://github.com/kmike/pymorphy2) - a good pos-tagger for Russian
+* Thai: [PyThaiNLP](https://github.com/wannaphongcom/pythainlp) - Thai NLP in Python Package
+* Ancient Languages: [CLTK](https://github.com/cltk/cltk): The Classical Language Toolkit is a Python library and collection of texts for doing NLP in ancient languages
+* Dutch: [python-frog](https://github.com/proycon/python-frog) - Python binding to Frog, an NLP suite for Dutch. (pos tagging, lemmatisation, dependency parsing, NER)
+
 ## Credits
 Awesome NLP was seeded with curated content from the lot of repositories, some of which are listed below | [Back to Top](#contents)