About
This module enables you to automatically scrape Eurostat's "Statistics Explained" pages and index their contents. It builds a graph of the inter-relationships between the pages while extracting semantic content ("concepts") from them. The interconnected concepts are then used to automatically train a text classifier.
| documentation | available at: https://gjacopo.github.io/esscrape/ |
| since | 2018 |
| license | EUPL |
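
The page graph described above can be illustrated with a minimal sketch. It is not the module's actual implementation: it uses only the Python standard library (`html.parser`, `urllib.parse`) rather than Scrapy or NetworkX, and the names `LinkExtractor`, `build_link_graph`, `BASE` and `PAGES`, as well as the example URLs and HTML snippets, are hypothetical stand-ins for scraped pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href targets of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_link_graph(pages):
    """Map each page URL to the sorted list of other known pages it links to.

    `pages` is a dict {url: html}; links pointing outside this dict
    (and self-links) are ignored.
    """
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        # Resolve relative links against the URL of the page that contains them
        targets = {urljoin(url, href) for href in parser.links}
        graph[url] = sorted(t for t in targets if t in pages and t != url)
    return graph


# Hypothetical, hard-coded pages standing in for scraped content
BASE = "https://example.org/wiki/"
PAGES = {
    BASE + "GDP": '<p>See <a href="Inflation">inflation</a> '
                  'and <a href="https://other.org/x">an external page</a>.</p>',
    BASE + "Inflation": '<p>Related: <a href="GDP">GDP</a>, '
                        '<a href="Unemployment">unemployment</a>.</p>',
    BASE + "Unemployment": "<p>No outgoing links here.</p>",
}

graph = build_link_graph(PAGES)
for page, neighbours in graph.items():
    print(page, "->", neighbours)
```

The resulting adjacency mapping is a directed graph of pages. In a fuller pipeline, a structure like this could be loaded into a graph library or graph database, with the extracted concepts attached to each node before training a classifier.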
- Keras, the Python Deep Learning library.
- Various algorithms for short text categorization: PyShortTextCategorization.
- Source code for large-scale hierarchical text classification with recursively regularized Deep Graph-CNN: Deepgraphcnn.
- Convolutional Neural Networks for sentence classification: CNN_sentence.
- Tool word2vec for computing continuous distributed representations of words, with pre-trained word and phrase vectors; see also mirror repository.
- Implementation of Graph Convolutional Networks in TensorFlow.
- Text matching toolkit MatchZoo for designing, comparing, and sharing deep text matching models.
- Britz D. blog on implementing a Convolutional Neural Network for text classification in TensorFlow and source code cnn-text-classification-tf.
- Britz D. blog for understanding Convolutional Neural Networks for NLP.
- Kipf T.N. blog on Graph Convolutional Network.
- Framework Scrapy for extracting data from online websites.
- Natural language toolkit nltk to work with human language data.
- Package NetworkX for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
- Module py2neo for the neo4j graph database, though the bolt driver neo4j-python-driver does the job.
- Peng H., Li J., He Y., Liu Y., Bao M., Song Y., and Yang Q. (2018): Large-scale hierarchical text classification with recursively regularized Deep Graph-CNN, Proc. WWW.
- Yu J., Lu Y., Qin Z., Liu Y., Tan J., Guo L., and Zhang W. (2018): Modeling text with Graph Convolutional Network for cross-modal information retrieval, arXiv:1802.00985.
- Wang T., Wu D.J., Coates A., and Ng A.Y. (2018): End-to-end text recognition with Convolutional Neural Networks.
- Schlichtkrull M., Kipf T.N., Bloem P., van den Berg R., Titov I., and Welling M. (2018): Modeling relational data with Graph Convolutional Networks, Proc. ESWC, arXiv:1703.06103.
- Zhang Z., Robinson D., and Tepper J. (2018): Detecting hate speech on Twitter using a Convolution-GRU based Deep Neural Network, Proc. ESWC.
- Liu B., Zhang T., Niu D., Lin J., Lai K., and Xu Y. (2018): Matching long text documents via Graph Convolutional Networks, arXiv:1802.07459.
- Kipf T.N. and Welling M. (2017): Semi-supervised classification with Graph Convolutional Networks, Proc. ICLR, arXiv:1609.02907.
- Fan Y., Pang L., Hou J.P., Guo J., Lan Y., and Cheng X. (2017): MatchZoo: A toolkit for deep text matching, Proc. SIGIR, arXiv:1707.07270.
- Mitra B., Diaz F., and Craswell N. (2017): Learning to match using local and distributed representations of text for web search, Proc. ICWWW, arXiv:1610.08136.
- Defferrard M., Bresson X. and Vandergheynst P. (2016): Convolutional Neural Networks on graphs with fast localized spectral filtering, Proc. NIPS, arXiv:1606.09375.
- Zhang X., Zhao J., and LeCun Y. (2015): Character-level Convolutional Networks for text classification, Proc. NIPS, arXiv:1509.01626.
- Johnson R. and Zhang T. (2015): Semi-supervised Convolutional Neural Networks for text categorization via region embedding, arXiv:1504.01255.
- Qiu X. and Huang X. (2015): Convolutional Neural Tensor Network architecture for community-based question answering, Proc. IJCAI.
- Wang P., Xu J., Xu B., Liu C., Zhang H., Wang F., and Hao H. (2015): Semantic clustering and Convolutional Neural Network for short text categorization, doi:10.3115/v1/P15-2058.
- Kim Y. (2014): Convolutional Neural Networks for sentence classification, arXiv:1408.5882.