Release v0.8 #1931
Merged
* Add GetTotalStepCount to the Academy. This allows the RecordVideos plugin to record based on the current academy step.
* Add timeout wait parameter
* Remove unnecessary function
Removing this function breaks some tests, and the only way around that right now is a larger refactor or hacky fixes to the tests. For now, I'd suggest we revert this small part of the change and keep a refactor in mind for the future.
* Move `take_action` into the Policy class. This refactor is part of Actor-Trainer separation. Since policies will be distributed across actors in separate processes that share a single trainer, taking an action should be the responsibility of the policy. This change also makes a few smaller adjustments:
  - Combines `take_action` logic between trainers, making it more generic
  - Adds an `ActionInfo` data class to be more explicit about the data returned by the policy; for now it is used only by TrainerController and the policy
  - Moves trainer stats logic out of `take_action` and into `add_experiences`
  - Renames `take_action` to `get_action`
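As a rough illustration of the `ActionInfo`/`get_action` refactor described above, the shape of the policy-side code might look like the sketch below. The `Policy` base class, its `evaluate` method, and the field names are illustrative assumptions, not the actual mlagents implementation:

```python
from typing import Any, Dict, NamedTuple


class ActionInfo(NamedTuple):
    """Bundles everything a policy returns for one decision step."""
    action: Any                 # actions chosen for each agent
    value: Any                  # value estimates, later consumed by the trainer
    outputs: Dict[str, Any]     # raw network outputs (log-probs, entropy, ...)


class Policy:
    """Base policy: owns the decision step formerly spread across trainers."""

    def get_action(self, brain_info) -> ActionInfo:
        # One generic decision step instead of per-trainer take_action logic.
        run_out = self.evaluate(brain_info)
        return ActionInfo(
            action=run_out.get("action"),
            value=run_out.get("value"),
            outputs=run_out,
        )

    def evaluate(self, brain_info) -> Dict[str, Any]:
        # Subclasses run their model's forward pass here.
        raise NotImplementedError
```

With this split, a trainer only consumes the `ActionInfo` for stats and experience collection, which is what lets the stats logic move into `add_experiences`.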
* As discussed in #1706: fixed typos in Installation-Windows.md that made the instructions unclear, and added a warning about overwriting drivers
* Edits, formatting
* Incorporated feedback from @vincentpierre
Release v0.7 into develop
Fix up the link that points to instructions on how to install TensorFlow using Anaconda.
* Fix for Brains not reinitialising when the scene is reloaded. This was a bug caused by the conversion of Brains to ScriptableObjects. ScriptableObjects persist in memory between scene changes, which means that after a scene change the Brains would still be initialised and the agentInfos list would contain invalid references to the Agents from the previous scene. The fix is to have the Academy notify the Brains when it is destroyed, allowing the Brains to clean themselves up and transition back to an uninitialised state. After the new scene is loaded, the Brain's LazyInitialise reconnects the Brain to the new Academy as expected.
* Fix typos
* Use abstract class for RayPerception
* Created RayPerception2D (#1721)
* Incorporate RayPerception2D
* Fix typo
* Make abstract class
* Add tests
* Garbage collection optimisations:
  - Changed a few IEnumerable instances to IReadOnlyList. This avoids some unnecessary GC allocations caused by casting the Lists to IEnumerables.
  - Moved the cdf allocation outside of the loop to avoid unnecessary GC allocation.
  - Changed GeneratorImpl to use plain float and int arrays instead of Array during generation. This avoids SetValue boxing on the arrays, which eliminates a large number of GC allocations.
* Convert InferenceBrain to use IReadOnlyList to avoid garbage creation.
* Fixed the test break on pytest > 4.0 and added pytest coverage
* Added the pytest-cov package
* Added the logic to upload the coverage.yml report to Codacy
* Removed the warning message during pytest
* Added the Codacy badge to show what it looks like
* Added a space
* Removed the space
* Removed the duplicate pytest
* Removed the extra spaces
* Added the test coverage badge
* Pointed the badge at the test branch
* Moved the Python test coverage to CircleCI
* Removed the badge
* Added the badge
* Fixed the link
* Added the gym_unity test to CircleCI
* Fixed the gym_unity installation
* Changed the test-reports location from the ml-agents subfolder to the root folder, so that it also covers gym_unity's pytest
* API for sending custom protobuf messages to and from Unity
* Rename custom_output to custom_outputs
* Move custom protos to their own files
* Add SetCustomOutput method
* Add docstrings
* Various adjustments
* Rename CustomParameters -> CustomResetParameters
* Rename CustomOutput -> CustomObservation
* Add CustomAction
* Add CustomActionResult
* Remove custom action result
* Remove custom action result from Python API
* Start new documentation
* Add some docstrings
* Expand documentation
* Typos
* Tweak doc; also eliminate GetCustomObservation
* Fix typo
* Clarify docs
* Remove trailing whitespace
… Observation (#1824)
* Added RenderTexture support for visual observations
* Cleaned up the new ObservationToTexture function
* Added check for the width/height of the RenderTexture
* Added check to hide the HelpBox unless both Cameras and RenderTextures are used
* Added documentation for visual observations using RenderTextures
* Added GridWorldRenderTexture example scene
* Adjusted image size of doc images
* Added GridWorld example reference
* Fixed missing reference in the GridWorldRenderTexture scene and resaved the agent prefab
* Fix prefab instantiation and render timing in GridWorldRenderTexture
* Added screenshot and reworded documentation
* Unchecked control box
* Rename renderTexture
* Make RenderTexture scene the default for GridWorld
Co-authored-by: Mads Johansen <[email protected]>
* Reorganize project
* Fix all tests
* Address comments
* Delete init file
* Update requirements
* Tick version
* Add timeout wait parameter (mlagents_envs) (#1699)
* Add timeout wait param
* Remove unnecessary function
* Add new meta files for communicator objects
* Fix all tests
* Update CircleCI
* Reorganize mlagents_envs tests
* WIP: test removing CircleCI cache
* Move gym tests
* Namespaced packages
* Update installation instructions for separate packages
* Remove unused package from setup script
* Add Readme for ml-agents-envs
* Clarify docs and re-comment compiler in make.bat
* Add more doc to installation
* Add back fix for HoloLens
* Recompile Protobufs
* Change mlagents_envs to mlagents.envs in trainer_controller
* Remove extraneous files, fix Windows bat script
* Support Python 3.7 for the envs package
* Update to documentation
* Update Custom-Protos.md
* add "Control" check instruction (#1719)
Added per-Brain logging of time to update the policy, time elapsed during training, time to collect experiences, buffer length, and average return per policy.
This commit adds support for running Unity environments in parallel. An abstract base class was created for UnityEnvironment, from which a new SubprocessUnityEnvironment inherits. SubprocessUnityEnvironment communicates through a pipe in order to send commands that are run in parallel by its workers. A few significant changes were needed as a side effect:
* UnityEnvironments are created via a factory method (a closure) rather than being created directly by the main process.
* In mlagents-learn, "worker-id" has been replaced by "base-port" and "num-envs", and worker_ids are automatically assigned across runs.
* BrainInfo objects now convert all fields to numpy arrays or lists to avoid serialization issues.
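The pipe-per-worker pattern described above can be sketched with the standard `multiprocessing` module. The `worker` function and `SubprocessEnv` class below are simplified, hypothetical stand-ins for the real SubprocessUnityEnvironment, showing only the command loop and the dispatch-then-gather step:

```python
import multiprocessing as mp
from multiprocessing.connection import Connection


def worker(conn: Connection, env_factory):
    """Runs in a child process: builds its own environment, serves commands."""
    env = env_factory()
    try:
        while True:
            cmd, payload = conn.recv()
            if cmd == "reset":
                conn.send(env.reset())
            elif cmd == "step":
                conn.send(env.step(payload))
            elif cmd == "close":
                break
    finally:
        conn.close()


class SubprocessEnv:
    """Parent-side handle: one pipe per worker, commands run in parallel."""

    def __init__(self, env_factories):
        self.conns, self.procs = [], []
        for factory in env_factories:
            parent_conn, child_conn = mp.Pipe()
            proc = mp.Process(target=worker, args=(child_conn, factory))
            proc.start()
            self.conns.append(parent_conn)
            self.procs.append(proc)

    def step(self, actions):
        for conn, action in zip(self.conns, actions):
            conn.send(("step", action))              # dispatch to all workers
        return [conn.recv() for conn in self.conns]  # then gather results

    def close(self):
        for conn in self.conns:
            conn.send(("close", None))
        for proc in self.procs:
            proc.join()
```

Because each worker calls `env_factory()` itself, the environment is constructed inside the child process; only the factory has to cross the process boundary.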
* Fixes missing tag change, plus code cleanup
* Fix bug in agent position setting
* Ticked API version for PyPI for mlagents
* Ticked API version for PyPI for mlagents_envs
* Ticked communication number for the API
* Ticked API version for unity-gym
* Ticked the API for pytest
* Create Using-TensorFlow-Sharp-in-Unity.md
* Update Using-TensorFlow-Sharp-in-Unity.md
* Update Using-TensorFlow-Sharp-in-Unity.md
We need to document the meaning of the two new flags added for multi-environment training. We may also want to add more specific instructions for people wanting to speed up training in the future.
Sends close command when closing workers.
SubprocessUnityEnvironment sends an environment factory function to each worker, which the worker uses to create a UnityEnvironment to interact with. We use Python's standard multiprocessing library, which pickles all data sent to the subprocess. The built-in pickle library doesn't pickle function objects on Windows machines (tested with Python 3.6 on Windows 10 Pro). This PR adds cloudpickle as a dependency in order to serialize the environment factory. Other implementations of subprocess environments do the same: https://github.com/openai/baselines/blob/master/baselines/common/vec_env/subproc_vec_env.py
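The limitation can be demonstrated with only the standard library: a factory returned as a closure cannot be serialized by the built-in pickler, which is exactly what motivates the cloudpickle dependency. The `make_env_factory` helper below is a hypothetical stand-in for the real factory; the cloudpickle call is shown commented out so the sketch stays stdlib-only:

```python
import pickle


def make_env_factory(file_name: str, worker_id: int):
    """Returns a closure that builds an environment, as sent to each worker."""
    def factory():
        # Stand-in for constructing the real UnityEnvironment.
        return ("UnityEnvironment", file_name, worker_id)
    return factory


factory = make_env_factory("3DBall", worker_id=0)

try:
    pickle.dumps(factory)            # stdlib pickle rejects nested functions
    stdlib_pickle_ok = True
except (pickle.PicklingError, AttributeError):
    stdlib_pickle_ok = False

# cloudpickle serializes closures by value rather than by name, so
#   import cloudpickle
#   payload = cloudpickle.dumps(factory)
# succeeds where pickle.dumps fails, which is why the PR adds it.
```

The same by-name limitation is why lambdas and locally defined functions cannot be sent to workers with plain pickle, regardless of platform.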
When using the SubprocessUnityEnvironment, parallel writes are made to UnitySDK.log. This causes file access violation issues in Windows/C#. This change modifies the access and sharing mode for our writes to UnitySDK.log to fix the issue.
On Windows, interrupts for subprocesses work differently from OSX/Linux: child subprocesses and their pipes may close while the parent process is still running during a keyboard (Ctrl+C) interrupt. To handle this, this change adds handling for EOFError and BrokenPipeError exceptions when interacting with subprocess environments. Additional handling is also added to ensure that, for parallel runs using the "num-runs" option, the threads for each run are joined and KeyboardInterrupts are handled. These changes made the "_win_handler" we previously used to manage interrupts on Windows unnecessary, so it has been removed.
When SubprocessUnityEnvironment was added, a change to the way environments use the "train_mode" flag was included that was intended for a separate change set; it broke the CLI '--slow' flag. This change reverts it, so that the slow/fast simulation option works correctly. As a minor additional change, the remaining tests from the top-level 'tests' folders have been moved into the new test folders.
* Update title caps
* Rename Custom-Protos.md to Creating-Custom-Protobuf-Messages.md
* Updated with custom protobuf messages
* Cleanup against our doc guidelines
* Minor text revision
* Create Training-Concurrent-Unity-Instances
* Rename Training-Concurrent-Unity-Instances to Training-Concurrent-Unity-Instances.md
* Update to the right format for --num-envs
* Added link to concurrent Unity instances
* Update and rename Training-Concurrent-Unity-Instances.md to Training-Using-Concurrent-Unity-Instances.md
* Added considerations section
* Update Training-Using-Concurrent-Unity-Instances.md
* Cleaned up language to match doc
* Minor updates
* Retroactive migration from 0.6 to 0.7
* Updated from 0.7 to 0.8 migration
* Minor typo
* Minor fix
* Accidentally duplicated step
* Updated with new features list
Fix '--slow' flag after environment updates
… Updated all the scenes' models and the bouncer's expected reward
… Updated the 3DBallHard model
xiaomaogy previously approved these changes (Apr 12, 2019)
I guess we can turn off the code check on the C# side.
Glanced over all of the files in the PR; seems all good.
Migration doc fixes