Remove env creation logic from TrainerController #1562

Merged: 4 commits merged into develop, Jan 24, 2019

Conversation

@harperj (Contributor) commented Jan 3, 2019

Currently TrainerController includes logic related to creating the
UnityEnvironment, which causes poor separation of concerns between
the learn.py application script, TrainerController and UnityEnvironment:

  • TrainerController must know about the proper way to instantiate the
    UnityEnvironment, which may differ from application to application.
    This also makes mocking or subclassing UnityEnvironment more
    difficult.
  • Many arguments are passed by learn.py to TrainerController and passed
    along to UnityEnvironment.

This change moves environment construction logic into learn.py, as part
of the greater refactor to separate trainer logic from actor / environment.
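
To make the intended split concrete, here is a minimal sketch of the pattern the PR moves toward. The helper name init_environment and the one-argument TrainerController are simplified stand-ins, not the exact code in this PR, and the constructor arguments reflect the mlagents.envs API of roughly this era:

    from mlagents.envs import UnityEnvironment  # import path circa ML-Agents 0.6

    def init_environment(env_path, worker_id, seed):
        # learn.py now owns application-specific construction details
        # (executable path, worker id, seed), so they never reach the trainer.
        return UnityEnvironment(
            file_name=env_path,
            worker_id=worker_id,
            seed=seed,
        )

    class TrainerController:
        # Simplified stand-in: the controller receives a ready environment
        # instead of knowing how to build one, which also makes it easy to
        # hand it a mock or a UnityEnvironment subclass in tests.
        def __init__(self, env):
            self.env = env

    env = init_environment("3DBall", worker_id=0, seed=42)  # hypothetical values
    tc = TrainerController(env)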

@harperj requested review from eshvk and vincentpierre January 3, 2019 23:00
@harperj (Author) commented Jan 3, 2019

Apologies for the size of the PR; though most of it was moving code, it still grew a bit bigger than I'd like.

@@ -222,7 +223,7 @@ def __str__(self):
                for k in self._resetParameters])) + '\n' + \
            '\n'.join([str(self._brains[b]) for b in self._brains])

    def reset(self, config=None, train_mode=True) -> AllBrainInfo:
Contributor:

Does this mean it will no longer be possible to switch between training and inference configurations at reset?

Contributor Author:

It does. Is this a case that's important for us to support? It seems to me like an uncommon enough case that creating a new environment would be fine, but I'm not sure I fully understand it.

Contributor:

I am adding @awjuliani to this conversation because he is the one who asked for this feature. I personally think the user should be able to change the configuration (training vs inference) at each step and not only at reset.

Contributor:

Is it possible to not remove this feature?

Contributor:

I agree with Vince. No reason to remove this feature, which is useful for those using ML-Agents in the context of Jupyter notebooks or other interactive tutorials.

Contributor Author:

Sure thing, will fix
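
For reference, one way to keep the switch is to make the reset-time argument optional and fall back to the mode chosen at construction. A sketch of that idea, as it would sit inside UnityEnvironment (an assumption, not necessarily the fix that landed):

    def reset(self, config=None, train_mode=None) -> AllBrainInfo:
        # Sketch: None means "keep the mode chosen at construction"; an explicit
        # True/False restores the per-reset switch used from notebooks.
        if train_mode is not None:
            self.train_mode = train_mode
        if self._loaded:
            outputs = self.communicator.exchange(
                self._generate_reset_input(self.train_mode, config)
            )
            # The real method converts `outputs` into an AllBrainInfo here.
            return outputs
        raise UnityEnvironmentException("No Unity environment is loaded.")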

if trainer_parameters_dict[brain_name]['trainer'] == 'offline_bc':
    self.trainers[brain_name] = OfflineBCTrainer(
        self.env.brains[brain_name],
Contributor:

Make sure BC works; I think there might be a bug here. self.env.brains contains all the brains, not just the external ones.

Contributor Author:

This is getting brain_name from self.external_brains and just looking it up in self.env.brains, so I think it should be fine.

I've double-checked and it works as expected for PushBlockIL.
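
A standalone toy of the point being made here, with hypothetical brain names loosely modeled on the imitation-learning scenes:

    # env_brains stands in for self.env.brains: it maps *every* brain name
    # (internal and external) to its parameters.
    env_brains = {
        "StudentBrain": "external brain parameters",   # external (trainable)
        "TeacherBrain": "internal brain parameters",   # internal (player-controlled)
    }
    external_brains = ["StudentBrain"]  # the loop iterates only these names

    trainers = {}
    for brain_name in external_brains:
        # brain_name always comes from external_brains, so internal brains
        # are never used as lookup keys even though env_brains contains them.
        trainers[brain_name] = env_brains[brain_name]

    assert "TeacherBrain" not in trainers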

@Unity-Technologies deleted a comment from cratkid Jan 7, 2019
@awjuliani (Contributor) left a comment:

Looks good to me. All that needs changing is to re-implement the inference/training mode switch in reset().

@@ -244,7 +245,7 @@ def reset(self, config=None, train_mode=True) -> AllBrainInfo:

     if self._loaded:
         outputs = self.communicator.exchange(
-            self._generate_reset_input(train_mode, config)
+            self._generate_reset_input(self.train_mode, config)
Contributor:

Can we just change the signature of this method to use train_mode, which is a member of the UnityEnvironment class? I.e., self._generate_reset_input(config)?
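
An illustration of the suggestion, using only names that appear in the diff above:

    # Current: the caller passes the member explicitly.
    outputs = self.communicator.exchange(
        self._generate_reset_input(self.train_mode, config)
    )

    # Suggested: _generate_reset_input reads self.train_mode internally,
    # so every call site shrinks to:
    outputs = self.communicator.exchange(
        self._generate_reset_input(config)
    )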



-def run_training(sub_id, run_seed, run_options, process_queue):
+def run_training(sub_id: int, run_seed: int, run_options, process_queue):
Contributor:

+1 on type hints :)
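
For illustration, the same signature with the remaining parameters annotated as well; the types for run_options and process_queue are assumptions (a docopt-style options dict and a multiprocessing queue), not part of the diff above:

    import multiprocessing
    from typing import Any, Dict

    def run_training(sub_id: int, run_seed: int,
                     run_options: Dict[str, Any],
                     process_queue: multiprocessing.Queue) -> None:
        ...  # body unchanged; only the annotations are the point here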

Additional commits in this PR:
* Move load_config before environment creation
* Extract curriculum loading logic into its own method
@eshvk (Contributor) commented Jan 24, 2019

:shipit:

@harperj force-pushed the develop-jh-distributed branch from bfde780 to 9a323f5 January 24, 2019 01:43
@harperj merged commit 553c6b7 into develop Jan 24, 2019
mantasp added a commit that referenced this pull request Jan 28, 2019
…agents into develop-barracuda

* 'develop-barracuda' of github.com:Unity-Technologies/ml-agents:
  deleted dead meta file and added a note on the OpenGLCore Graphics API
  Barracuda : Updating the documentation (#1607)
  Remove env creation logic from TrainerController (#1562)
  Fix In editor Docker training (#1582)
  Only using multiprocess when --num-runs>1 (#1583)
  Replace AddVectorObs(float[]) and AddVectorObs(List<float>) with a more generic AddVectorObs(IEnumerable<float>) (#1540)
  fixed the windows ctrl-c bug (#1558)
  Improve Gym wrapper compatibility and add Dopamine documentation (#1541)
  Fix typo in documentation (#1516)
  Update curricula brain names for 0.6
  Addressing #1537
  Fix for divide-by-zero error with Discrete Actions  (#1520)
  Documentation tweaks and updates (#1479)
mantasp added a commit that referenced this pull request Jan 29, 2019
* develop-barracuda:
  Backup and restore fixedDeltaTime and maximumDeltaTime on Academy init / shutdown
  Restore global gravity value when Academy gets destroyed
  deleted dead meta file and added a note on the OpenGLCore Graphics API
  Barracuda : Updating the documentation (#1607)
  Remove env creation logic from TrainerController (#1562)
  Fix In editor Docker training (#1582)
  Only using multiprocess when --num-runs>1 (#1583)
  Replace AddVectorObs(float[]) and AddVectorObs(List<float>) with a more generic AddVectorObs(IEnumerable<float>) (#1540)
  fixed the windows ctrl-c bug (#1558)
  Improve Gym wrapper compatibility and add Dopamine documentation (#1541)
  Fix typo in documentation (#1516)
  Update curricula brain names for 0.6
  Addressing #1537
  Fix for divide-by-zero error with Discrete Actions  (#1520)
  Documentation tweaks and updates (#1479)
harperj pushed a commit that referenced this pull request Feb 20, 2019
@awjuliani deleted the develop-jh-distributed branch July 23, 2019 20:17
LeSphax pushed a commit to LeSphax/ml-agents-1 that referenced this pull request May 3, 2020
@github-actions bot locked as resolved and limited conversation to collaborators May 18, 2021