PaLM-rlhf-pytorch is a PyTorch implementation of the Pathways Language Model (PaLM) architecture combined with Reinforcement Learning from Human Feedback (RLHF). It is designed for fine-tuning large language models so that their outputs align with human preferences, following the same general approach OpenAI used to train models such as ChatGPT.

Features

  • Implements RLHF for fine-tuning large-scale language models
  • Uses Proximal Policy Optimization (PPO) to keep reinforcement learning updates stable
  • Optimized for training on distributed hardware like GPUs and TPUs
  • Supports both pretraining and reward model fine-tuning
  • Built on PyTorch with modular and extensible components
  • Designed for experimenting with human-aligned AI training
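
The PPO feature listed above rests on a clipped surrogate objective, which limits how far a single update can move the policy away from the one that generated the samples. The following is a minimal, dependency-free sketch of that objective for illustration only. It is not taken from the project's code: the function name `ppo_clipped_loss`, its parameters, and the default `clip_eps=0.2` are assumptions, and a real RLHF setup would compute these quantities on tensors of per-token log-probabilities.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Hypothetical helper: mean PPO clipped surrogate loss (to be minimized).

    new_logprobs / old_logprobs: log-probabilities of the sampled actions
    under the current policy and the policy that generated the samples.
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because the objective is maximized while a loss is minimized.
    return -total / len(advantages)


if __name__ == "__main__":
    # The ratio here is 2, so with a positive advantage the clipped term wins.
    print(ppo_clipped_loss([math.log(2.0)], [0.0], [1.0]))
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_eps`. With a negative advantage, the unclipped term is kept whenever it is worse, so the penalty for a harmful change is never softened.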

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Reinforcement Learning Frameworks

Registered

2025-03-13