Stars
📝 Study
4 repositories
nanoRLHF: a from-scratch journey into how LLMs and RLHF really work.
Robust recipes to align language models with human and AI preferences




