# Transformers from Scratch

This repository offers a ground-up implementation of the Transformer model, as introduced in the seminal paper "Attention Is All You Need". The Transformer architecture has become foundational in various Natural Language Processing (NLP) tasks due to its efficiency and scalability.

## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Contributing](#contributing)
- [License](#license)

## Overview

This project provides a minimalist and educational implementation of the Transformer model using Python and PyTorch. It's designed for those who wish to understand the inner workings of Transformers and experiment with the architecture.

## Features

- Encoder and decoder modules with multi-head self-attention mechanisms.
- Positional encoding to capture sequence information (a minimal sketch follows this list).
- Layer normalization and residual connections for stable training.
- Configurable hyperparameters for experimentation.
- Training and inference scripts for model evaluation.
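To make the positional-encoding feature concrete, here is a minimal sketch of the sinusoidal scheme from "Attention Is All You Need". It is illustrative rather than a copy of this repository's code: the class name, dropout placement, and tensor layout in model.py may differ.

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Sinusoidal positional encoding (illustrative sketch, not this repo's exact code)."""

    def __init__(self, d_model: int, max_len: int = 5000, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # pe[pos, 2i]   = sin(pos / 10000^(2i / d_model))
        # pe[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
        position = torch.arange(max_len).unsqueeze(1)                     # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe.unsqueeze(0))                       # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); add the encoding for the first seq_len positions
        x = x + self.pe[:, : x.size(1)]
        return self.dropout(x)
```

Because the encodings are fixed rather than learned, they are registered as a buffer: they move with the model across devices but receive no gradients.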

## Installation

1. **Clone the repository:**

   ```bash
   git clone https://github.com/ZXEcoder/transformers.git
   ```

2. **Navigate to the project directory:**

   ```bash
   cd transformers
   ```

3. **Install dependencies:**

   Ensure you have Python 3.8+ installed, then install the required packages:

   ```bash
   pip install -r requirements.txt
   ```

## Usage

### Configuration

The config.py file contains all the hyperparameters and configurations for the model, training process, and dataset paths. Adjust these parameters as needed before training or inference.
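The exact keys live in config.py; the snippet below only illustrates the kind of settings such a file typically exposes, with made-up names and values:

```python
# Illustrative only -- the real names and values are defined in config.py.
def get_config():
    return {
        "d_model": 512,       # embedding / hidden dimension
        "num_heads": 8,       # attention heads per layer
        "num_layers": 6,      # encoder and decoder depth
        "seq_len": 350,       # maximum sequence length
        "batch_size": 8,
        "num_epochs": 20,
        "lr": 1e-4,
        "dataset_path": "path/to/dataset",  # set this before training
        "model_folder": "weights",          # where checkpoints are saved
    }
```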

### Training

To train the Transformer model, run:

```bash
python train.py
```

This script will initiate the training process using the configurations specified in config.py. Ensure your dataset is prepared and its path is correctly set in the configuration file.
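As a rough picture of what such a script does, here is a condensed teacher-forcing training loop. The call signature of the model, the masking, and the checkpointing in the actual train.py may differ:

```python
import torch
import torch.nn as nn

def train(model, dataloader, config, device):
    """Sketch of one possible training loop; not this repository's exact code."""
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    # Assumes token id 0 is padding, so padded positions don't contribute to the loss.
    loss_fn = nn.CrossEntropyLoss(ignore_index=0)
    for epoch in range(config["num_epochs"]):
        model.train()
        for src, tgt in dataloader:                # assumed: (batch, seq_len) token ids
            src, tgt = src.to(device), tgt.to(device)
            # Teacher forcing: predict tgt[:, 1:] from the shifted input tgt[:, :-1].
            logits = model(src, tgt[:, :-1])       # (batch, seq_len - 1, vocab)
            loss = loss_fn(logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        torch.save(model.state_dict(), f"checkpoint_epoch{epoch}.pt")
```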

### Inference

To run inference and evaluate the model, use the provided Jupyter notebook:

```bash
jupyter notebook inference.ipynb
```

This notebook demonstrates how to load a trained model and perform inference on sample inputs.
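The core of such an evaluation is usually greedy decoding: feed the decoder its own previous outputs one token at a time. A sketch follows; `model.encode` and `model.decode` are hypothetical method names, and the checkpoint path is illustrative:

```python
import torch

def greedy_decode(model, src, src_mask, sos_id, eos_id, max_len, device):
    """Greedy autoregressive decoding (sketch; the notebook's actual API may differ)."""
    model.eval()
    ys = torch.tensor([[sos_id]], device=device)        # start with the <sos> token
    with torch.no_grad():
        memory = model.encode(src, src_mask)            # hypothetical encoder call
        for _ in range(max_len - 1):
            logits = model.decode(memory, src_mask, ys) # hypothetical decoder call -> (1, len, vocab)
            next_token = logits[:, -1].argmax(dim=-1, keepdim=True)
            ys = torch.cat([ys, next_token], dim=1)
            if next_token.item() == eos_id:             # stop once <eos> is produced
                break
    return ys.squeeze(0)

# Restoring trained weights before decoding (filename illustrative):
# model.load_state_dict(torch.load("weights/checkpoint_epoch19.pt", map_location=device))
```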

## Project Structure

```text
transformers/
├── config.py          # Configuration settings
├── dataset.py         # Dataset loading and preprocessing
├── model.py           # Transformer model implementation
├── train.py           # Training script
├── inference.ipynb    # Inference and evaluation notebook
├── LICENSE            # License information
└── README.md          # Project documentation
```

## Contributing

Contributions are welcome! If you have suggestions or improvements, please open an issue or submit a pull request.

## License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

