This repository offers a from-scratch implementation of the Transformer model introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017). The Transformer has become foundational across Natural Language Processing (NLP) because its attention mechanism parallelizes well and scales to large models and datasets.
The implementation is deliberately minimalist and educational, written in Python and PyTorch. It is designed for readers who want to understand the inner workings of Transformers and experiment with the architecture.
- Encoder and decoder modules built on multi-head self-attention.
- Positional encoding to capture token order (see the sketch after this list).
- Layer normalization and residual connections for stable training.
- Configurable hyperparameters for experimentation.
- A training script and an inference notebook for model evaluation.
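To make the positional-encoding feature concrete, here is a minimal sketch of the standard sinusoidal scheme from the paper. It is illustrative only; the exact implementation in this repository's `model.py` may differ:

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) tensor of sinusoidal position encodings."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    # Frequencies decay geometrically across even dimensions: 10000^(-2i/d_model).
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe

# Example: encodings for a 10-token sequence in a 512-dimensional model.
pe = sinusoidal_positional_encoding(10, 512)
print(pe.shape)  # torch.Size([10, 512])
```

These encodings are added to the token embeddings so that attention, which is otherwise order-agnostic, can distinguish positions.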
- Clone the repository:

  ```bash
  git clone https://github.com/ZXEcoder/transformers.git
  ```

- Navigate to the project directory:

  ```bash
  cd transformers
  ```

- Install dependencies. Ensure you have Python 3.8+ installed, then install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
The `config.py` file contains all hyperparameters and configuration for the model, the training process, and dataset paths. Adjust these parameters as needed before training or inference.
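For orientation, a configuration file in this style often amounts to a flat set of settings like the following. The key names below are illustrative, not the actual contents of this repository's `config.py`:

```python
# Hypothetical example of the kind of settings config.py typically holds;
# consult the actual file for the real names and values.
config = {
    "d_model": 512,        # embedding / hidden size
    "num_heads": 8,        # attention heads per layer
    "num_layers": 6,       # encoder and decoder depth
    "seq_len": 350,        # maximum sequence length
    "batch_size": 8,
    "lr": 1e-4,
    "num_epochs": 20,
    "dataset_path": "data/",     # point this at your dataset
    "model_folder": "weights/",  # where checkpoints are written
}
```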
To train the Transformer model:

```bash
python train.py
```

This script initiates training using the configuration specified in `config.py`. Ensure your dataset is prepared and its path is set correctly in the configuration file.
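As a rough mental model, a training script in this style typically follows the loop below. The function signatures and paths are illustrative assumptions, not the repository's actual API; `train.py` is the authoritative reference:

```python
# Illustrative outline only; see train.py for the repository's actual code.
import os
import torch
import torch.nn as nn

def train(model, dataloader, config, device):
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    # Padding tokens should not contribute to the loss (pad index 0 assumed here).
    loss_fn = nn.CrossEntropyLoss(ignore_index=0)
    os.makedirs("weights", exist_ok=True)
    for epoch in range(config["num_epochs"]):
        for src, tgt_in, tgt_out in dataloader:  # shifted decoder inputs/targets
            src, tgt_in, tgt_out = src.to(device), tgt_in.to(device), tgt_out.to(device)
            logits = model(src, tgt_in)  # assumed shape: (batch, seq, vocab)
            loss = loss_fn(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        torch.save(model.state_dict(), f"weights/epoch_{epoch}.pt")
```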
For running inference and evaluating the model, use the provided Jupyter notebook:

```bash
jupyter notebook inference.ipynb
```
This notebook demonstrates how to load a trained model and perform inference on sample inputs.
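The general pattern for autoregressive inference with an encoder-decoder Transformer is greedy decoding: feed the source once, then extend the target one token at a time. The sketch below assumes a `model(src, tgt)` signature and hypothetical token IDs; the notebook shows the actual workflow:

```python
# Illustrative greedy-decoding sketch; the notebook is the authoritative reference.
import torch

@torch.no_grad()
def greedy_decode(model, src, sos_id, eos_id, max_len, device):
    model.eval()
    src = src.to(device)
    # Start the target sequence with the start-of-sequence token.
    tgt = torch.tensor([[sos_id]], device=device)
    for _ in range(max_len):
        logits = model(src, tgt)  # assumed shape: (1, tgt_len, vocab)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        tgt = torch.cat([tgt, next_id], dim=1)
        if next_id.item() == eos_id:  # stop once end-of-sequence is produced
            break
    return tgt.squeeze(0)

# A trained checkpoint would be loaded first, e.g.:
# model.load_state_dict(torch.load("weights/epoch_19.pt", map_location=device))
```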
```
transformers/
│-- config.py        # Configuration settings
│-- dataset.py       # Dataset loading and preprocessing
│-- model.py         # Transformer model implementation
│-- train.py         # Training script
│-- inference.ipynb  # Inference and evaluation notebook
│-- LICENSE          # License information
│-- README.md        # Project documentation
```
Contributions are welcome! If you have suggestions or improvements, please open an issue or submit a pull request.
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.