A PyTorch implementation of scalable memory modules for Large Language Models, featuring product key memory and efficient attention mechanisms.
This project implements memory-augmented transformer architectures designed to enhance the context handling and retrieval capabilities of language models. It features:
- Product Key Memory implementation for efficient memory access
- Memory-augmented transformer layers
- Character AI system with persistent memory
- Flexible training framework for memory-enhanced models
- SQuAD dataset integration for question-answering tasks
- **Product Key Memory**: Efficient memory access using product keys
- **Memory-Augmented Transformer**: Enhanced transformer architecture with integrated memory
- **Character Assistant**: Framework for creating AI characters with persistent memory
- **Training Framework**: Flexible system for training memory-enhanced models
- Scalable memory management for large language models
- Efficient key-value storage and retrieval
- Attention-based memory access
- Character personality persistence
- Integration with popular LLM architectures
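The product-key trick is what makes the memory scalable: to select among N = n² slots, the query is split in half, each half is scored against a small table of n sub-keys, and the Cartesian product of the two top-k sets yields the candidate slots, so only 2n sub-keys are ever scored. A minimal pure-Python sketch of that lookup (an illustration of the scheme, not this repository's `ProductKeyMemory` class):

```python
from itertools import product

def dot(a, b):
    # plain dot product over two equal-length vectors
    return sum(x * y for x, y in zip(a, b))

def topk(scores, k):
    # indices of the k highest scores
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def product_key_lookup(query, sub_keys_1, sub_keys_2, k):
    """Select the top-k slots out of len(sub_keys_1) * len(sub_keys_2)
    while scoring only the two small sub-key tables."""
    half = len(query) // 2
    q1, q2 = query[:half], query[half:]

    # score each query half against its own sub-key table
    s1 = [dot(q1, key) for key in sub_keys_1]
    s2 = [dot(q2, key) for key in sub_keys_2]

    top1 = topk(s1, k)
    top2 = topk(s2, k)

    # combine the k*k candidate pairs; the full slot index is i * n + j
    n = len(sub_keys_2)
    candidates = [(s1[i] + s2[j], i * n + j) for i, j in product(top1, top2)]
    candidates.sort(reverse=True)
    return candidates[:k]  # list of (score, slot_index)
```

In the real layer the selected slot indices gather rows from a large learned value table, and the scores become softmax weights over those values.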
```bash
# Clone the repository
git clone https://github.com/peytontolbert/llm_memory_modules_at_scale.git
cd llm_memory_modules_at_scale

# Install dependencies
pip install -r requirements.txt

# Install the package in development mode
pip install -e .
```
```python
from src.models.memory import ProductKeyMemory
from src.models.transformer import MemoryAugmentedTransformerLayer

# Initialize memory layer
memory = ProductKeyMemory(
    query_dim=512,
    key_dim=64,
    value_dim=64,
    num_heads=4,
)

# Create memory-augmented transformer
transformer = MemoryAugmentedTransformerLayer(
    d_model=512,
    nhead=8,
    memory_size=1024,
)
```
```python
from examples.character_assistant import CharacterAssistant

# Create a Sherlock Holmes AI
assistant = CharacterAssistant(
    character_name="Sherlock Holmes",
    base_model="gpt2-medium",
    memory_size=1024,
    num_heads=8,
)

# Generate a response
response = assistant.generate_response("Tell me about your methods.")
```
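Persistence here means past exchanges influence later responses. One simple way to picture it is a store of prior turns that is searched for the entries most relevant to a new prompt (a toy sketch only; the repository's `CharacterAssistant` integrates memory at the model level rather than via keyword matching):

```python
class PersistentCharacterMemory:
    """Toy persistent memory: stores past exchanges and retrieves the
    ones most relevant to a new prompt by word overlap. Illustrative
    only -- a hypothetical stand-in for learned memory slots."""

    def __init__(self, character_name):
        self.character_name = character_name
        self.history = []  # list of (prompt, response) pairs

    def remember(self, prompt, response):
        self.history.append((prompt, response))

    def recall(self, prompt, top_k=2):
        # rank stored exchanges by shared words with the new prompt
        words = set(prompt.lower().split())
        scored = [
            (len(words & set(p.lower().split())), (p, r))
            for p, r in self.history
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [pair for score, pair in scored[:top_k] if score > 0]
```

The learned version replaces word overlap with attention scores over memory keys, but the retrieve-then-condition shape is the same.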
```
├── src/
│   ├── models/
│   │   ├── memory.py              # Memory implementations
│   │   └── transformer.py         # Transformer architectures
│   ├── training/
│   │   └── trainer.py             # Training utilities
│   └── utils/
│       └── metrics.py             # Evaluation metrics
├── examples/
│   ├── character_assistant.py     # Character AI implementation
│   └── train_qa.py                # Training examples
├── tests/
│   ├── test_memory.py
│   └── test_transformer.py
└── requirements.txt
```
To train a model on the SQuAD dataset:
```bash
python examples/train_qa.py \
    --model_type memory_transformer \
    --batch_size 32 \
    --learning_rate 1e-4 \
    --num_epochs 10
```
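The flags above map onto a standard `argparse` setup. A sketch of what the script's argument parsing might look like (the defaults shown are assumptions drawn only from the command above, not the script itself):

```python
import argparse

def build_parser():
    # hypothetical parser mirroring the documented train_qa.py flags
    parser = argparse.ArgumentParser(description="Train a QA model on SQuAD")
    parser.add_argument("--model_type", default="memory_transformer",
                        help="which architecture to train")
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    parser.add_argument("--num_epochs", type=int, default=10)
    return parser
```

Any flag omitted on the command line falls back to its default, so shorter invocations remain valid.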
Run the test suite:
```bash
pytest tests/
```
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
If you use this code in your research, please cite: