
Conversation

@awjuliani awjuliani commented Feb 2, 2018

  • Add ability to seed learning (numpy, tensorflow, and Unity) with --seed flag.
  • Add maxStepReached flag to Agents and Academy.
  • Change the way value bootstrapping works in PPO to take advantage of timeouts.
  • Default size of GridWorld changed to 5x5 in order to validate the bootstrapping changes.

python/learn.py Outdated
fast_simulation = not bool(options['--slow'])

env = UnityEnvironment(file_name=env_name, worker_id=worker_id, curriculum=curriculum_file)
if seed != -1:

Can you explain a bit why this check exists? Why is the default seed for the pseudo-random generator explicitly set to -1?
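For context, a minimal sketch of the sentinel pattern this check implements (the helper name `maybe_seed` is hypothetical; the PR does this inline in `learn.py`, and would also seed numpy and TensorFlow at the same point):

```python
import random

NO_SEED = -1  # -1 marks "no seed supplied", since 0 is a valid seed


def maybe_seed(seed):
    """Seed the RNGs only when the user passed an explicit seed."""
    if seed == NO_SEED:
        return False  # leave the generators in their default state
    random.seed(seed)
    # np.random.seed(seed) and tf.set_random_seed(seed) would go here too
    return True
```

Using -1 rather than `None` as the sentinel fits docopt-style CLI parsing, where every option arrives as a string and needs an in-band "unset" value after conversion to int.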

python/learn.py Outdated
--lesson=<n> Start learning from this lesson [default: 0].
--load Whether to load the model or randomly initialize [default: False].
--run-path=<path> The sub-directory name for model and summary statistics [default: ppo].
--run-id=<path> The sub-directory name for model and summary statistics [default: ppo].
@eshvk eshvk Feb 2, 2018


If this is branching from dev-0.3, I think this is a typo and should be --run-path; model_path below uses --run-path anyway.

beta: 2.5e-3
buffer_size: 5000
batch_size: 32
beta: 5.0e-3

I would prefer adding a single-line comment articulating the change in default parameters, perhaps referencing a trial run.


self.variable_scope = trainer_parameters['graph_scope']
with tf.variable_scope(self.variable_scope):
tf.set_random_seed(seed)

So in the master learn.py you check if seed != -1 before setting the seed for TensorFlow. However, PPOTrainer is then called with the seed, which may still be -1, which means here we could end up calling tf.set_random_seed(-1)?

List<float> concatenatedRewards = new List<float>(32);
List<float> concatenatedMemories = new List<float>(1024);
List<bool> concatenatedDones = new List<bool>(32);
List<bool> concatenatedMaxes = new List<bool>(32);

These are set as sane defaults. Perhaps we can define a public const int DEFAULT_NUM_AGENTS = 32 beforehand?

info = info[self.brain_name]
for l in range(len(info.agents)):
agent_actions = self.training_buffer[info.agents[l]]['actions']
if ((info.local_done[l] or len(agent_actions) > self.trainer_parameters['time_horizon'])

Don't you need to also check info.max_reached[l] in this line?
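The condition being suggested can be sketched as a small predicate (hypothetical helper name; argument names mirror the snippet above): cut the agent's trajectory slice when the agent is done, the max-step timeout fired, or the horizon is exceeded.

```python
def should_cut_trajectory(local_done, max_reached, num_actions, time_horizon):
    """True when this agent's experience slice should be closed out."""
    return local_done or max_reached or num_actions > time_horizon
```

Including max_reached matters for this PR specifically, since the new bootstrapping path treats a timeout differently from a terminal state.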

"""The ImitationTrainer is an implementation of the imitation learning."""
def __init__(self, sess, env, brain_name, trainer_parameters, training):
def __init__(self, sess, env, brain_name, trainer_parameters, training, seed):
"""

You need to make use of this seed.
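A minimal sketch of consuming the new constructor argument (assumptions: the real trainer takes a TF session and env which are omitted here, and `random.seed` stands in for `tf.set_random_seed`):

```python
import random


class ImitationTrainer:
    """Sketch only: shows where the seed argument would be consumed."""

    def __init__(self, brain_name, trainer_parameters, training, seed):
        self.brain_name = brain_name
        self.trainer_parameters = trainer_parameters
        self.training = training
        self.seed = seed
        if seed != -1:
            # tf.set_random_seed(seed) would go here in the real trainer
            random.seed(seed)
```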

[launch_string,
'--port', str(self.port)])
'--port', str(self.port),
'--seed', str(seed)])

Do not send the seed if the seed is -1 or None
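The suggested fix could look like this (hypothetical helper name; the PR builds the argument list inline in the subprocess launch):

```python
def build_launch_args(launch_string, port, seed=None):
    """Append --seed only when an explicit seed was provided (not -1/None)."""
    args = [launch_string, '--port', str(port)]
    if seed is not None and seed != -1:
        args += ['--seed', str(seed)]
    return args
```

This keeps the executable's default (unseeded) behavior intact when no seed is requested, instead of forwarding the -1 sentinel to Unity.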

@awjuliani awjuliani merged commit c87f700 into development-0.3 Feb 2, 2018
@awjuliani awjuliani deleted the dev-bootstrapping branch February 2, 2018 21:01
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 20, 2021