neural_topic_models

NOTE: This repo has been deprecated! It has been replaced with: www.github.com/dallascard/scholar

The accompanying paper has also been updated to reflect the version published at ACL 2018: "Neural Models for Documents with Metadata" by Dallas Card, Chenhao Tan, and Noah A. Smith

The paper can be found at: https://arxiv.org/abs/1705.09296

Legacy documentation:

Requirements:

The following requirements can be installed via pip or conda (see the example command below the list):

  • python3
  • numpy
  • scipy
  • theano
  • scikit-learn
  • spacy
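
For example, the Python packages above (everything except python3 itself) can typically be installed in one step with pip; exact versions are not pinned in this repo, so this is only a sketch:

pip install numpy scipy theano scikit-learn spacy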

To run:

If you've never used spaCy before, start by running:

python -m spacy download en
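
As an optional sanity check (assuming spaCy's standard spacy.load interface for the "en" model), you can confirm the download succeeded; if this runs without error, spaCy is set up:

python -c "import spacy; spacy.load('en')"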

Then, run the following commands from the theano_code directory:

Download 20 newsgroups data:

python download_20ng.py

Preprocess data:

python preprocess_data.py ../data/20ng/20ng_all/train.json ../data/20ng/20ng_all/test.json ../data/20ng/20ng_all/ --malletstop

To test code:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --max_epochs 1 [options]

Options:

To recreate the NVDM (Miao et al., 2016), run:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --encoder_layers 2 --generator_layers 0 --train_bias

To recreate the (e1 g0) model, run:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --encoder_layers 1 --generator_layers 0 --train_bias

To recreate the (e1s g4s) model, run:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --encoder_layers 1 --encoder_shortcut --generator_layers 4 --generator_shortcut --train_bias

To train a model with 50% sparsity:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --encoder_layers 1 --encoder_shortcut --generator_layers 4 --generator_shortcut --train_bias --l1_penalty 0.01 --sparsity_target 0.5

To train a model using topics, labels, and interactions:

python run_ngtm.py ../data/20ng/20ng_all train output --test test --encoder_layers 1 --encoder_shortcut --generator_layers 4 --generator_shortcut --train_bias --n_topics 5 --n_classes 20 --use_interactions

Citation:

If you find this code useful, please cite:

@article{card.2017,
  author = {Dallas Card and Chenhao Tan and Noah A. Smith},
  title = {A Neural Framework for Generalized Topic Models},
  year = {2017},
  journal = {arXiv preprint arXiv:1705.09296},
}
