Release 0.9.2 #2469
Conversation
* WIP still needs tests and merging from multiprocess
* cleanup gauges
* add TODO for subprocesses
Merge latest fixes from release into develop
* discrete action coverage
* undo change
* rename test
* move test file
* Revert "move test file". This reverts commit 2e72b2d.
* move files post merge
fix mock_brain
Add MultiGpuPPOPolicy class and command line options to run multi-GPU training
Return a list instead of an np array from make_mini_batch() to reduce time spent copying data
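A minimal sketch of the idea, with a hypothetical `buffer` layout: a list slice only copies references to the existing rows, whereas converting each mini-batch to an ndarray up front copies every element.

```python
def make_mini_batch(buffer, start, end):
    """Hypothetical sketch: `buffer` maps field names (e.g. "actions")
    to lists of per-step entries. Returning the list slice avoids the
    element-by-element copy that np.array(buffer[key][start:end])
    would perform; conversion to an ndarray is deferred until a
    consumer actually needs one."""
    return {key: values[start:end] for key, values in buffer.items()}

batch = make_mini_batch({"actions": [[0], [1], [2], [3]]}, 0, 2)
print(batch)  # {'actions': [[0], [1]]}
```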
This change moves trainer initialization outside of TrainerController, reducing the number of constructor arguments TrainerController takes and making it possible to initialize trainers in cases where a TrainerController isn't needed.
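A hedged sketch of the shape of this refactor; `initialize_trainers`, `PPOTrainer`, and the constructor signatures below are illustrative stand-ins, not the actual ml-agents API.

```python
class PPOTrainer:
    """Stand-in for a concrete trainer type (hypothetical)."""
    def __init__(self, config, brain_name):
        self.config = config
        self.brain_name = brain_name

def initialize_trainers(trainer_config, brain_names):
    """Build one trainer per brain up front, outside the controller."""
    return {name: PPOTrainer(trainer_config.get(name, {}), name)
            for name in brain_names}

class TrainerController:
    def __init__(self, trainers, output_path):
        # Receives ready-made trainers instead of everything needed to
        # build them, so the constructor takes fewer arguments.
        self.trainers = trainers
        self.output_path = output_path

trainers = initialize_trainers({"CrawlerDynamic": {}}, ["CrawlerDynamic"])
controller = TrainerController(trainers, "./results")
```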
- Move common functions to trainer.py and model.py from ppo/trainer.py, ppo/policy.py, and ppo/model.py
- Introduce RLTrainer class and move most of add_experiences and some common reward signal code there. PPO and SAC will inherit from this, not so much BC Trainer.
- Add methods to Buffer to enable sampling, truncating, and saving/loading (sketched below).
- Add scoping to create encoders in model.py
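A hedged sketch of the Buffer additions mentioned above; the class and method names here are simplified assumptions, not the real Buffer interface.

```python
import pickle
import random

class Buffer:
    """Simplified stand-in for the trainer's experience buffer."""
    def __init__(self):
        self.fields = {}  # field name -> list of per-step entries

    def sample_mini_batch(self, batch_size):
        """Draw a random mini-batch using one index set shared across fields."""
        length = len(next(iter(self.fields.values())))
        idx = random.sample(range(length), min(batch_size, length))
        return {k: [v[i] for i in idx] for k, v in self.fields.items()}

    def truncate(self, max_length):
        """Drop the oldest entries so the buffer holds at most max_length."""
        for k, v in self.fields.items():
            self.fields[k] = v[-max_length:]

    def save(self, path):
        with open(path, "wb") as f:
            pickle.dump(self.fields, f)

    def load(self, path):
        with open(path, "rb") as f:
            self.fields = pickle.load(f)
```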
Hotfix v0.9.1 - develop
Variable "model" is undefined.
…into RunSwimFlyRich-master

# Conflicts:
#	UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerDynamicLearning.nn
#	UnitySDK/Assets/ML-Agents/Examples/Crawler/TFModels/CrawlerStaticLearning.nn
* Adds evaluate_batch to reward signals. Evaluates on minibatch rather than on BrainInfo.
* Changes the way reward signal results are reported in rl_trainer so that we get the pure, unprocessed environment reward separate from the reward signals.
* Moves end_episode to rl_trainer
* Fixed bug with BCModule with RNN
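A hedged sketch of what an evaluate_batch-style reward signal might look like; the class name, signature, and the "environment_rewards" key are assumptions for illustration, not the exact ml-agents interface.

```python
class ExtrinsicRewardSignal:
    """Illustrative reward signal that scores a whole mini-batch at once
    instead of one BrainInfo step at a time."""
    def __init__(self, strength=1.0):
        self.strength = strength

    def evaluate_batch(self, mini_batch):
        # Operates on the mini-batch dict directly, so the evaluation
        # can be vectorized over all steps at once.
        return [self.strength * r for r in mini_batch["environment_rewards"]]

signal = ExtrinsicRewardSignal(strength=0.5)
print(signal.evaluate_batch({"environment_rewards": [1.0, 0.0, 2.0]}))
# [0.5, 0.0, 1.0]
```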
In order for downstream packages to make use of the latest pre-release features, we can publish pre-release versions of our packages. For package versions ending in `devN`, pip will not install that version by default. This change manually updates our package version to a development version, with the idea that we can publish development versions manually for now, with the potential for future automated / nightly dev releases.
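For example (the exact version string here is illustrative), a PEP 440 development version is declared in setup.py, and pip only installs it when explicitly asked:

```python
# setup.py (excerpt). `pip install mlagents` skips pre-releases such
# as this by default; `pip install --pre mlagents` or pinning
# `pip install mlagents==0.9.2.dev0` opts in.
from setuptools import setup

setup(
    name="mlagents",
    version="0.9.2.dev0",  # the "devN" suffix marks a development release
)
```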
…r-prefabs fixed broken crawler prefabs
…errors-crawler Fix NaN training errors for crawler
More flexibility on the h5py version
…gpu-doc Added the doc for multi-gpu
…-fix Fixed the flake8
Vector3 velocityRelativeToLookRotationToTarget = targetDirMatrix.inverse.MultiplyVector(rb.velocity);
AddVectorObs(velocityRelativeToLookRotationToTarget);
Vector3 angularVelocityRelativeToLookRotationToTarget = targetDirMatrix.inverse.MultiplyVector(rb.angularVelocity);
Line too long. Also, we should have a description of what targetDirMatrix.inverse does, because it is not obvious.
// Update pos to target
dirToTarget = target.position - body.position;
lookRotation = Quaternion.LookRotation(dirToTarget);
targetDirMatrix = Matrix4x4.TRS(Vector3.zero, lookRotation, Vector3.one);
What does TRS do? Can we have some explanation?
Matrix4x4.TRS creates a 4x4 transformation matrix from Translation, Rotation, and Scale components. Here the translation is Vector3.zero and the scale is Vector3.one, so the matrix encodes only the look rotation toward the target; its inverse maps world-space vectors into that target-facing frame.
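A rough numpy analogue of what the snippet above computes (illustrative only, not the Unity API): build a rotation-only matrix whose forward axis points at the target, then use its inverse to express a world-space vector in that frame.

```python
import numpy as np

def look_rotation_matrix(forward):
    """Rotation matrix whose z-axis points along `forward`; a rough
    analogue of Quaternion.LookRotation (assumes `forward` is not
    parallel to the world up axis)."""
    f = forward / np.linalg.norm(forward)
    up = np.array([0.0, 1.0, 0.0])
    r = np.cross(up, f)
    r = r / np.linalg.norm(r)
    u = np.cross(f, r)
    return np.column_stack([r, u, f])  # columns are the frame's axes

dir_to_target = np.array([3.0, 0.0, 4.0])
R = look_rotation_matrix(dir_to_target)
velocity_world = np.array([1.0, 0.0, 0.0])
# A rotation matrix's inverse is its transpose; this mirrors
# targetDirMatrix.inverse.MultiplyVector(rb.velocity).
velocity_local = R.T @ velocity_world  # velocity in the target-facing frame
```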