thomfoster
Collaborator

In this trlx pull request, I add the ability to configure local rollout logging in trlx during a hyperparameter sweep.

Here, we provide scripts that use those changes to collect rollout data for short completions of the ROCStories dataset, scored according to sentiment.

From the sentiment-data directory, run python -m trlx.sweep --config ppo_sweep.yml ppo_roc_story_sentiments.py --num-cpus 2 --num-gpus 1 to start collecting rollouts.

@honglu2875
Collaborator

honglu2875 commented Dec 13, 2022

@thomfoster Nice that we have the same folder name. Could you merge from main first, and then clean things up after running make commit-checks? I will also try to run your script on my end and see if it works.

pipeline: "PromptPipeline" # prompt pipeline to load
orchestrator: "PPOOrchestrator" # orchestrator to load

rollout_logging_dir: "~/Algorithm-Distillation-RLHF/algorithm_distillation/sentiment-data/rollouts"

This line was causing some trouble for me. Is there a way to use a relative directory instead of an absolute one?
I tried that and it didn't work. Maybe it's a problem in trlx; perhaps we should use pathlib.Path instead of os.path here:
https://github.com/CarperAI/trlx/blob/1a3461d592a567409e11e3c47b5b355bf219449f/trlx/model/accelerate_ppo_model.py#L115
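One possible way to make both "~" and relative paths work is sketched below. Note that neither os.path nor pathlib expands "~" on its own; an explicit expanduser call is needed in either case. The helper names resolve_rollout_dir and make_run_dir, the base_dir parameter, and the run_id argument are hypothetical and not part of trlx, and this sketch makes no claim about what the linked trlx line currently does.

```python
from pathlib import Path


def resolve_rollout_dir(raw_dir: str, base_dir: str = ".") -> Path:
    """Hypothetical helper (not part of trlx): normalize a logging directory.

    Expands a leading "~" and resolves relative paths against base_dir,
    so the config value can be absolute, home-relative, or relative.
    """
    path = Path(raw_dir).expanduser()
    if not path.is_absolute():
        # Assumption: relative paths are interpreted relative to base_dir,
        # e.g. the directory the sweep is launched from.
        path = Path(base_dir).expanduser() / path
    return path.resolve()


def make_run_dir(raw_dir: str, run_id: str, base_dir: str = ".") -> Path:
    """Hypothetical helper: create and return a per-run subdirectory."""
    run_dir = resolve_rollout_dir(raw_dir, base_dir) / run_id
    # parents=True also creates the rollout directory itself if it is missing.
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir
```

With something like this, rollout_logging_dir could be set to a relative value such as "rollouts" and resolved against the sentiment-data directory, assuming the sweep is started from there.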
