lightning-language-models

This repository contains custom implementations of various language models in PyTorch Lightning. The project is primarily educational: by abstracting the training and inference logic into standardized base models and scripts, we can focus fully on the model architecture itself.

Due to computational constraints, this repository is restricted to character-level language models. However, if you change the tokenization logic in the train.py script, you can use any of the models for word-level language modeling as well.
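
To make the difference between the two tokenization levels concrete, here is a minimal sketch in plain Python. It is not taken from train.py: the helper names (build_vocab, char_tokenize, word_tokenize) are hypothetical, and the actual tokenization logic in this repository may look different.

```python
from typing import Dict, List, Tuple


def build_vocab(tokens: List[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Map each distinct token to an integer id (sorted for reproducibility)."""
    itos = dict(enumerate(sorted(set(tokens))))
    stoi = {tok: idx for idx, tok in itos.items()}
    return stoi, itos


def char_tokenize(text: str) -> List[str]:
    """Character-level: every character, including spaces, is a token."""
    return list(text)


def word_tokenize(text: str) -> List[str]:
    """Word-level: split on whitespace (a real tokenizer would also handle punctuation)."""
    return text.split()


text = "to be or not to be"

# Character-level vocabulary: one entry per distinct character.
char_stoi, char_itos = build_vocab(char_tokenize(text))
char_ids = [char_stoi[ch] for ch in char_tokenize(text)]
decoded_chars = "".join(char_itos[i] for i in char_ids)

# Word-level vocabulary: one entry per distinct word.
word_stoi, word_itos = build_vocab(word_tokenize(text))
word_ids = [word_stoi[w] for w in word_tokenize(text)]
decoded_words = " ".join(word_itos[i] for i in word_ids)

print(len(char_stoi), len(word_stoi))
```

Only the tokenize function differs between the two variants; the vocabulary and the encode/decode steps stay the same. On a real corpus a word-level vocabulary is usually far larger than a character-level one, which is one reason character-level models are cheaper to train.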

As the training corpus, we use the works of Shakespeare, which can be found in the data directory. Feel free to use any other text corpus by changing the corpus argument of the train.py script.

Models

You can find implementations of the following models in the models directory:

Bigram (models/bigram.py)
Vanilla RNN (models/rnn.py)
LSTM (models/lstm.py)
GRU (models/gru.py)
Transformer (models/transformer.py)

Be aware that these model classes only implement the forward pass of each model, which makes the different architectures easier to understand. Shared model logic is inherited from the base class in models/base.py.

Setup

If you are using conda, run the following commands to create a new environment and install the code dependencies:

conda create -n lightning_language_models python=3.9
conda activate lightning_language_models
pip install -r requirements.txt

Training

To train a model, run the train.py script with the desired model name as an argument. For example, to train an LSTM model on a custom dataset, run:

python train.py --model lstm --corpus data/custom_dataset.txt

Generating Text

To generate text from a trained model, run the generate.py script with the desired model name as an argument. For example, to generate text from an LSTM model, run:

python generate.py --model lstm
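
For intuition about what character-level generation involves, the sketch below samples text one character at a time. It is an illustration only and does not reproduce generate.py: instead of a trained neural model it uses a simple count-based bigram table as a stand-in, and the function names (fit_bigram_counts, generate) are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict


def fit_bigram_counts(text: str) -> Dict[str, Counter]:
    """Count how often each character follows each other character."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts: Dict[str, Counter], start: str, length: int, seed: int = 0) -> str:
    """Sample up to `length` characters, each conditioned on the previous one."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # the last character never had a successor in the corpus
            break
        chars = list(followers.keys())
        weights = list(followers.values())
        # Draw the next character in proportion to how often it followed the previous one.
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


corpus = "to be or not to be that is the question"
counts = fit_bigram_counts(corpus)
sample = generate(counts, start="t", length=20)
print(sample)
```

A neural model would replace the count table with predicted next-character probabilities, but the loop has the same shape: predict a distribution, sample from it, append the result, and repeat.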

TensorBoard

If you want to use TensorBoard to monitor your training progress, run the following command in a separate terminal:

tensorboard --logdir logs

Once TensorBoard is running, you can access it by opening the following URL in your browser: http://localhost:6006
