add scripts to collect rollouts for hyperparameter sweep on roc_story #2
@thomfoster Nice that we have the same folder name. Would you merge from the main first, and then clear things after running
pipeline: "PromptPipeline" # prompt pipeline to load
orchestrator: "PPOOrchestrator" # orchestrator to load

rollout_logging_dir: "~/Algorithm-Distillation-RLHF/algorithm_distillation/sentiment-data/rollouts"
This line was causing some trouble for me. Is there a way to use relative dir instead of absolute?
Tried and didn't work. Maybe it's a problem of trlx, like we should use pathlib.Path instead of os.path here:
https://github.com/CarperAI/trlx/blob/1a3461d592a567409e11e3c47b5b355bf219449f/trlx/model/accelerate_ppo_model.py#L115
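For reference, here is a minimal sketch of how a configured directory such as rollout_logging_dir could be normalized so that both "~" paths and relative paths work. This is not the code in trlx; the helper name resolve_logging_dir and its base_dir parameter are hypothetical, and it assumes relative paths should be interpreted against a chosen base directory (the current working directory by default).

```python
import os
from pathlib import Path


def resolve_logging_dir(raw, base_dir=None):
    """Return an absolute Path for a configured logging directory.

    Hypothetical helper, not part of trlx. `raw` may start with "~",
    contain environment variables, or be relative to `base_dir`.
    """
    # Expand environment variables such as $HOME, then a leading "~".
    path = Path(os.path.expandvars(raw)).expanduser()
    if not path.is_absolute():
        # Assumption: relative paths are anchored at base_dir,
        # falling back to the current working directory.
        base = Path(base_dir) if base_dir is not None else Path.cwd()
        path = base / path
    # Normalize ".." segments and symlinks; the path need not exist yet.
    return path.resolve()


if __name__ == "__main__":
    print(resolve_logging_dir("~/Algorithm-Distillation-RLHF/algorithm_distillation/sentiment-data/rollouts"))
    print(resolve_logging_dir("rollouts"))
```

One thing to keep in mind with a sweep: the working directory of a worker process may differ from the directory the command was launched in, so passing an explicit base_dir (for example, the directory containing the config file) is likely more predictable than relying on the current working directory.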
…sentiment_rollouts.py to reflect that this is the script to generate data, not the class that uses it
… just roc stories
In this trlx pull request, I add to trlx the ability to configure local rollout logging during a hyperparameter sweep.
Here, we provide scripts that use those changes to collect rollout data for short completions of the roc stories dataset, scored according to sentiment.
From the sentiment-data directory, run
python -m trlx.sweep --config ppo_sweep.yml ppo_roc_story_sentiments.py --num-cpus 2 --num-gpus 1
to start collecting rollouts.