
Commit 606f253

Merge branch 'develop-barracuda' into develop-barracuda-test

* develop-barracuda:
  Backup and restore fixedDeltaTime and maximumDeltaTime on Academy init / shutdown
  Restore global gravity value when Academy gets destroyed
  deleted dead meta file and added a note on the OpenGLCore Graphics API
  Barracuda : Updating the documentation (#1607)
  Remove env creation logic from TrainerController (#1562)
  Fix In editor Docker training (#1582)
  Only using multiprocess when --num-runs>1 (#1583)
  Replace AddVectorObs(float[]) and AddVectorObs(List<float>) with a more generic AddVectorObs(IEnumerable<float>) (#1540)
  fixed the windows ctrl-c bug (#1558)
  Improve Gym wrapper compatibility and add Dopamine documentation (#1541)
  Fix typo in documentation (#1516)
  Update curricula brain names for 0.6 Addressing #1537
  Fix for divide-by-zero error with Discrete Actions (#1520)
  Documentation tweaks and updates (#1479)
2 parents 91ab439 + cea1a1f · commit 606f253

36 files changed: +1138 -555 lines
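The first two commits in this list describe a backup/restore pattern for Unity's global physics settings. The Academy.cs change itself is not among the diffs shown below, so the following is only a minimal sketch of that pattern, with an illustrative class name and Unity lifecycle methods standing in for Academy init/shutdown:

```csharp
using UnityEngine;

public class PhysicsSettingsBackupExample : MonoBehaviour  // illustrative, not the actual Academy.cs
{
    float originalFixedDeltaTime;
    float originalMaximumDeltaTime;
    Vector3 originalGravity;

    void Awake()      // standing in for Academy initialization
    {
        // Cache the global physics settings before the Academy adjusts them.
        originalFixedDeltaTime = Time.fixedDeltaTime;
        originalMaximumDeltaTime = Time.maximumDeltaTime;
        originalGravity = Physics.gravity;
    }

    void OnDestroy()  // standing in for Academy shutdown
    {
        // Put the original values back so the rest of the game is unaffected.
        Time.fixedDeltaTime = originalFixedDeltaTime;
        Time.maximumDeltaTime = originalMaximumDeltaTime;
        Physics.gravity = originalGravity;
    }
}
```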

UnitySDK/Assets/ML-Agents/Plugins/Barracuda.Core/Tools.meta

Lines changed: 0 additions & 8 deletions
This file was deleted.

UnitySDK/Assets/ML-Agents/Scripts/Agent.cs

Lines changed: 3 additions & 13 deletions
@@ -755,21 +755,11 @@ protected void AddVectorObs(Vector2 observation)
     }
 
     /// <summary>
-    /// Adds a float array observation to the vector observations of the agent.
-    /// Increases the size of the agents vector observation by size of array.
+    /// Adds a collection of float observations to the vector observations of the agent.
+    /// Increases the size of the agents vector observation by size of the collection.
     /// </summary>
     /// <param name="observation">Observation.</param>
-    protected void AddVectorObs(float[] observation)
-    {
-        info.vectorObservation.AddRange(observation);
-    }
-
-    /// <summary>
-    /// Adds a float list observation to the vector observations of the agent.
-    /// Increases the size of the agents vector observation by size of list.
-    /// </summary>
-    /// <param name="observation">Observation.</param>
-    protected void AddVectorObs(List<float> observation)
+    protected void AddVectorObs(IEnumerable<float> observation)
     {
         info.vectorObservation.AddRange(observation);
     }
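The net effect is that a single overload now accepts any sequence of floats. A minimal sketch of calling code under that API (the agent subclass and observation values are hypothetical, not part of this commit):

```csharp
using System.Collections.Generic;
using System.Linq;
using MLAgents;  // Agent's namespace as of ML-Agents 0.5+

public class SensorAgent : Agent  // hypothetical agent subclass
{
    public override void CollectObservations()
    {
        float[] distances = { 0.5f, 1.2f, 3.4f };
        var velocities = new List<float> { 0.0f, -9.8f };

        AddVectorObs(distances);                        // float[] implements IEnumerable<float>
        AddVectorObs(velocities);                       // so does List<float>
        AddVectorObs(distances.Select(d => d * 0.1f));  // and any LINQ projection
    }
}
```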

UnitySDK/Assets/ML-Agents/Scripts/InferenceBrain/ModelParamLoader.cs

Lines changed: 2 additions & 2 deletions
@@ -411,8 +411,8 @@ private string CheckVisualObsShape(Tensor tensor, int visObsIndex)
         var widthBp = resolutionBp.width;
         var heightBp = resolutionBp.height;
         var pixelBp = resolutionBp.blackAndWhite ? 1 : 3;
-        var widthT = tensor.Shape[1];
-        var heightT = tensor.Shape[2];
+        var heightT = tensor.Shape[1];
+        var widthT = tensor.Shape[2];
         var pixelT = tensor.Shape[3];
         if ((widthBp != widthT) || (heightBp != heightT) || (pixelBp != pixelT))
         {
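The swap reflects the NHWC layout (batch, height, width, channels) used for visual observation tensors: index 1 holds the height and index 2 the width. An illustrative check with made-up values, not the actual loader code:

```csharp
public static class VisualObsShapeExample
{
    public static void Demo()
    {
        // A hypothetical 84x84 RGB visual observation, laid out NHWC:
        // shape[0] = batch, shape[1] = height, shape[2] = width, shape[3] = channels.
        long[] shape = { 1, 84, 84, 3 };
        long heightT = shape[1];  // height comes before width in NHWC
        long widthT  = shape[2];
        long pixelT  = shape[3];  // 1 for black-and-white, 3 for RGB
        System.Diagnostics.Debug.Assert(heightT == 84 && widthT == 84 && pixelT == 3);
    }
}
```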

config/curricula/push-block/PushBlockBrain.json

Lines changed: 0 additions & 12 deletions
This file was deleted.

docs/Background-TensorFlow.md

Lines changed: 1 addition & 16 deletions
@@ -16,7 +16,7 @@ to TensorFlow-related tools that we leverage within the ML-Agents toolkit.
 performing computations using data flow graphs, the underlying representation of
 deep learning models. It facilitates training and inference on CPUs and GPUs in
 a desktop, server, or mobile device. Within the ML-Agents toolkit, when you
-train the behavior of an agent, the output is a TensorFlow model (.bytes) file
+train the behavior of an agent, the output is a TensorFlow model (.nn) file
 that you can then embed within a Learning Brain. Unless you implement a new
 algorithm, the use of TensorFlow is mostly abstracted away and behind the
 scenes.
@@ -36,18 +36,3 @@ documentation, but, in the meantime, if you are unfamiliar with TensorBoard we
 recommend our guide on [using Tensorboard with ML-Agents](Using-Tensorboard.md) or
 this [tutorial](https://github.com/dandelionmane/tf-dev-summit-tensorboard-tutorial).
 
-## TensorflowSharp
-
-One of the drawbacks of TensorFlow is that it does not provide a native C# API.
-This means that the Learning Brain is not natively supported since Unity scripts
-are written in C#. Consequently, to enable the Learning Brain, we leverage a
-third-party library
-[TensorFlowSharp](https://github.com/migueldeicaza/TensorFlowSharp) which
-provides .NET bindings to TensorFlow. Thus, when a Unity environment that
-contains a Learning Brain is built, inference is performed via TensorFlowSharp.
-We provide an additional in-depth overview of how to leverage
-[TensorFlowSharp within Unity](Using-TensorFlow-Sharp-in-Unity.md)
-which will become more
-relevant once you install and start training behaviors within the ML-Agents
-toolkit. Given the reliance on TensorFlowSharp, the Learning Brain is currently
-marked as experimental.

docs/Basic-Guide.md

Lines changed: 11 additions & 29 deletions
@@ -25,32 +25,12 @@ Unity settings.
    Equivalent or .NET 4.x Equivalent)**
 6. Go to **File** > **Save Project**
 
-## Setting up TensorFlowSharp
-
-We provide pre-trained models (`.bytes` files) for all the agents
-in all our demo environments. To be able to run those models, you'll
-first need to set-up TensorFlowSharp support. Consequently, you need to install
-the TensorFlowSharp plugin to be able to run these models within the Unity
-Editor.
-
-1. Download the [TensorFlowSharp Plugin](https://s3.amazonaws.com/unity-ml-agents/0.5/TFSharpPlugin.unitypackage)
-2. Import it into Unity by double clicking the downloaded file. You can check
-   if it was successfully imported by checking the
-   TensorFlow files in the Project window under **Assets** > **ML-Agents** >
-   **Plugins** > **Computer**.
-3. Go to **Edit** > **Project Settings** > **Player** and add `ENABLE_TENSORFLOW`
-   to the `Scripting Define Symbols` for each type of device you want to use
-   (**`PC, Mac and Linux Standalone`**, **`iOS`** or **`Android`**).
-
-![Project Settings](images/project-settings.png)
-
-**Note**: If you don't see anything under **Assets**, drag the
-`UnitySDK/Assets/ML-Agents` folder under **Assets** within Project window.
-
-![Imported TensorFlowsharp](images/imported-tensorflowsharp.png)
-
 ## Running a Pre-trained Model
-We've included pre-trained models for the 3D Ball example.
+
+We include pre-trained models for our agents (`.nn` files) and we use the
+[Unity Inference Engine](Unity-Inference-Engine.md) to run these models
+inside Unity. In this section, we will use the pre-trained model for the
+3D Ball example.
 
 1. In the **Project** window, go to the `Assets/ML-Agents/Examples/3DBall/Scenes` folder
    and open the `3DBall` scene file.
@@ -74,7 +54,9 @@ We've included pre-trained models for the 3D Ball example.
    folder.
 7. Drag the `3DBallLearning` model file from the `Assets/ML-Agents/Examples/3DBall/TFModels`
    folder to the **Model** field of the **3DBallLearning** Brain in the **Inspector** window. __Note__ : All of the brains should now have `3DBallLearning` as the TensorFlow model in the `Model` property
-8. Click the **Play** button and you will see the platforms balance the balls
+8. Select the **InferenceDevice** to use for this model (CPU or GPU).
+   _Note: CPU is faster for the majority of ML-Agents toolkit generated models_
+9. Click the **Play** button and you will see the platforms balance the balls
    using the pretrained model.
 
 ![Running a pretrained model](images/running-a-pretrained-model.gif)
@@ -93,7 +75,7 @@ More information and documentation is provided in the
 
 ## Training the Brain with Reinforcement Learning
 
-### Setting up the enviornment for training
+### Setting up the environment for training
 
 To set up the environment for training, you will need to specify which agents are contributing
 to the training and which Brain is being trained. You can only perform training with
@@ -240,7 +222,7 @@ INFO:mlagents.trainers: first-run-0: 3DBallLearning: Step: 10000. Mean Reward: 2
 ### After training
 
 You can press Ctrl+C to stop the training, and your trained model will be at
-`models/<run-identifier>/<brain_name>.bytes` where
+`models/<run-identifier>/<brain_name>.nn` where
 `<brain_name>` is the name of the Brain corresponding to the model.
 (**Note:** There is a known bug on Windows that causes the saving of the model to
 fail when you early terminate the training, it's recommended to wait until Step
@@ -254,7 +236,7 @@ the steps described
    `UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/`.
 2. Open the Unity Editor, and select the **3DBall** scene as described above.
 3. Select the **3DBallLearning** Learning Brain from the Scene hierarchy.
-5. Drag the `<brain_name>.bytes` file from the Project window of
+5. Drag the `<brain_name>.nn` file from the Project window of
    the Editor to the **Model** placeholder in the **3DBallLearning**
    inspector window.
 6. Press the :arrow_forward: button at the top of the Editor.

docs/Getting-Started-with-Balance-Ball.md

Lines changed: 5 additions & 11 deletions
@@ -224,6 +224,10 @@ The `--train` flag tells the ML-Agents toolkit to run in training mode.
 follow the instructions in
 [Using an Executable](Learning-Environment-Executable.md).
 
+**Note**: Re-running this command will start training from scratch again. To resume
+a previous training run, append the `--load` flag and give the same `--run-id` as the
+run you want to resume.
+
 ### Observing Training Progress
 
 Once you start training using `mlagents-learn` in the way described in the
@@ -269,19 +273,9 @@ Once the training process completes, and the training process saves the model
 use it with Agents having a **Learning Brain**.
 __Note:__ Do not just close the Unity Window once the `Saved Model` message appears.
 Either wait for the training process to close the window or press Ctrl+C at the
-command-line prompt. If you close the window manually, the `.bytes` file
+command-line prompt. If you close the window manually, the `.nn` file
 containing the trained model is not exported into the ml-agents folder.
 
-### Setting up TensorFlowSharp
-
-Because TensorFlowSharp support is still experimental, it is disabled by
-default. Please note that the `Learning` Brain inference can only be used with
-TensorFlowSharp.
-
-To set up the TensorFlowSharp Support, follow [Setting up ML-Agents Toolkit
-within Unity](Basic-Guide.md#setting-up-ml-agents-within-unity) section. of the
-Basic Guide page.
-
 ### Embedding the trained model into Unity
 
 To embed the trained model into Unity, follow the later part of [Training the

docs/Learning-Environment-Create-New.md

Lines changed: 40 additions & 1 deletion
@@ -154,7 +154,8 @@ public class RollerAcademy : Academy { }
 
 The default settings for the Academy properties are also fine for this
 environment, so we don't need to change anything for the RollerAcademy component
-in the Inspector window.
+in the Inspector window. You may not have the RollerBrain in the Broadcast Hub yet,
+more on that later.
 
 ![The Academy properties](images/mlagents-NewTutAcademy.png)
 
@@ -547,6 +548,44 @@ you pass to the `mlagents-learn` command for each training run. If you use
 the same id value, the statistics for multiple runs are combined and become
 difficult to interpret.
 
+## Optional: Multiple Training Areas within the Same Scene
+
+In many of the [example environments](Learning-Environment-Examples.md), many copies of
+the training area are instantiated in the scene. This generally speeds up training,
+allowing the environment to gather many experiences in parallel. This can be achieved
+simply by instantiating many Agents which share the same Brain. Use the following steps to
+parallelize your RollerBall environment.
+
+### Instantiating Multiple Training Areas
+
+1. Right-click on your Project Hierarchy and create a new empty GameObject.
+   Name it TrainingArea.
+2. Reset the TrainingArea’s Transform so that it is at (0,0,0) with Rotation (0,0,0)
+   and Scale (1,1,1).
+3. Drag the Floor, Target, and RollerAgent GameObjects in the Hierarchy into the
+   TrainingArea GameObject.
+4. Drag the TrainingArea GameObject, along with its attached GameObjects, into your
+   Assets browser, turning it into a prefab.
+5. You can now instantiate copies of the TrainingArea prefab. Drag them into your scene,
+   positioning them so that they do not overlap.
+
+### Editing the Scripts
+
+You will notice that in the previous section, we wrote our scripts assuming that our
+TrainingArea was at (0,0,0), performing checks such as `this.transform.position.y < 0`
+to determine whether our agent has fallen off the platform. We will need to change
+this if we are to use multiple TrainingAreas throughout the scene.
+
+A quick way to adapt our current code is to use
+localPosition rather than position, so that our position reference is in reference
+to the prefab TrainingArea's location, and not global coordinates.
+
+1. Replace all references of `this.transform.position` in RollerAgent.cs with `this.transform.localPosition`.
+2. Replace all references of `Target.position` in RollerAgent.cs with `Target.localPosition`.
+
+This is only one way to achieve this objective. Refer to the
+[example environments](Learning-Environment-Examples.md) for other ways we can achieve relative positioning.
+
 ## Review: Scene Layout
 
 This section briefly reviews how to organize your scene when using Agents in
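For reference, a sketch of what the relevant part of RollerAgent.cs could look like after the localPosition edits described in the diff above (the reset values follow the tutorial's earlier steps; treat them as illustrative):

```csharp
using MLAgents;
using UnityEngine;

public class RollerAgent : Agent
{
    public Transform Target;
    Rigidbody rBody;

    void Start()
    {
        rBody = GetComponent<Rigidbody>();
    }

    public override void AgentReset()
    {
        if (transform.localPosition.y < 0)
        {
            // The agent fell off the platform: zero its momentum and
            // respawn it relative to this TrainingArea.
            rBody.angularVelocity = Vector3.zero;
            rBody.velocity = Vector3.zero;
            transform.localPosition = new Vector3(0, 0.5f, 0);
        }

        // Move the target to a new random spot within this TrainingArea.
        Target.localPosition = new Vector3(Random.value * 8 - 4,
                                           0.5f,
                                           Random.value * 8 - 4);
    }
}
```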

docs/Learning-Environment-Design-Agents.md

Lines changed: 4 additions & 0 deletions
@@ -475,6 +475,10 @@ if ((ball.transform.position.y - gameObject.transform.position.y) < -2f ||
 The `Ball3DAgent` also assigns a negative penalty when the ball falls off the
 platform.
 
+Note that all of these environments make use of the `Done()` method, which manually
+terminates an episode when a termination condition is reached. This can be
+called independently of the `Max Step` property.
+
 ## Agent Properties
 
 ![Agent Inspector](images/agent.png)
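As a rough sketch of the pattern this note describes (the agent subclass and failure threshold are illustrative, not from this diff), `Done()` can be called from `AgentAction()` to end the episode regardless of `Max Step`:

```csharp
using MLAgents;
using UnityEngine;

public class FallDetectionAgent : Agent  // hypothetical agent subclass
{
    public override void AgentAction(float[] vectorAction, string textAction)
    {
        // ... apply the actions and assign per-step rewards here ...

        if (transform.position.y < -2f)  // illustrative failure condition
        {
            SetReward(-1f);  // penalize the failure
            Done();          // manually terminate the episode, independent of Max Step
        }
    }
}
```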

docs/Learning-Environment-Design-Learning-Brains.md

Lines changed: 3 additions & 5 deletions
@@ -43,8 +43,6 @@ model.
 To use a graph model:
 
 1. Select the **Learning Brain** asset in the **Project** window of the Unity Editor.
-   **Note:** In order to use the **Learning** Brain with inference, you need to have
-   TensorFlowSharp enabled. Refer to [this section](Basic-Guide.md#setting-up-ml-agents-within-unity) for more information.
 2. Import the `model_name` file produced by the PPO training
    program. (Where `model_name` is the name of the model file, which is
    constructed from the name of your Unity environment executable and the run-id
@@ -54,7 +52,7 @@ To use a graph model:
    [import assets into Unity](https://docs.unity3d.com/Manual/ImportingAssets.html)
    in various ways. The easiest way is to simply drag the file into the
    **Project** window and drop it into an appropriate folder.
-3. Once the `model_name.bytes` file is imported, drag it from the **Project**
+3. Once the `model_name.nn` file is imported, drag it from the **Project**
    window to the **Model** field of the Brain component.
 
 If you are using a model produced by the ML-Agents `mlagents-learn` command, use
@@ -65,9 +63,9 @@ the default values for the other Learning Brain parameters.
 The default values of the TensorFlow graph parameters work with the model
 produced by the PPO and BC training code in the ML-Agents SDK. To use a default
 ML-Agents model, the only parameter that you need to set is the `Model`,
-which must be set to the `.bytes` file containing the trained model itself.
+which must be set to the `.nn` file containing the trained model itself.
 
-* `Model` : This must be the `.bytes` file corresponding to the pre-trained
+* `Model` : This must be the `.nn` file corresponding to the pre-trained
   TensorFlow graph. (You must first drag this file into your Project window
   and then from the Resources folder into the inspector)

docs/Learning-Environment-Design.md

Lines changed: 8 additions & 7 deletions
@@ -158,15 +158,16 @@ Brain assigned to this Agent must be set.
 
 You must also determine how an Agent finishes its task or times out. You can
 manually set an Agent to done in your `AgentAction()` function when the Agent
-has finished (or irrevocably failed) its task. You can also set the Agent's `Max
-Steps` property to a positive value and the Agent will consider itself done
-after it has taken that many steps. When the Academy reaches its own `Max Steps`
-count, it starts the next episode. If you set an Agent's `ResetOnDone` property
-to true, then the Agent can attempt its task several times in one episode. (Use
-the `Agent.AgentReset()` function to prepare the Agent to start again.)
+has finished (or irrevocably failed) its task by calling the `Done()` function.
+You can also set the Agent's `Max Steps` property to a positive value and the
+Agent will consider itself done after it has taken that many steps. When the
+Academy reaches its own `Max Steps` count, it starts the next episode. If you
+set an Agent's `ResetOnDone` property to true, then the Agent can attempt its
+task several times in one episode. (Use the `Agent.AgentReset()` function to
+prepare the Agent to start again.)
 
 See [Agents](Learning-Environment-Design-Agents.md) for detailed information
-about programing your own Agents.
+about programming your own Agents.
 
 ## Environments

docs/Learning-Environment-Executable.md

Lines changed: 2 additions & 2 deletions
@@ -201,7 +201,7 @@ INFO:mlagents.trainers: first-run-0: Ball3DLearning: Step: 10000. Mean Reward: 2
 ```
 
 You can press Ctrl+C to stop the training, and your trained model will be at
-`models/<run-identifier>/<brain_name>.bytes`, which corresponds
+`models/<run-identifier>/<brain_name>.nn`, which corresponds
 to your model's latest checkpoint. (**Note:** There is a known bug on Windows
 that causes the saving of the model to fail when you early terminate the
 training, it's recommended to wait until Step has reached the max_steps
@@ -212,7 +212,7 @@ into your Learning Brain by following the steps below:
    `UnitySDK/Assets/ML-Agents/Examples/3DBall/TFModels/`.
 2. Open the Unity Editor, and select the **3DBall** scene as described above.
 3. Select the **Ball3DLearning** object from the Project window.
-5. Drag the `<brain_name>.bytes` file from the Project window of
+5. Drag the `<brain_name>.nn` file from the Project window of
    the Editor to the **Model** placeholder in the **Ball3DLearning**
    inspector window.
 6. Remove the **Ball3DLearning** from the Academy's `Broadcast Hub`

docs/ML-Agents-Overview.md

Lines changed: 0 additions & 4 deletions
@@ -246,10 +246,6 @@ training the Python API uses the observations it receives to learn a TensorFlow
 model. This model is then embedded within the Learning Brain during inference to
 generate the optimal actions for all Agents linked to that Brain.
 
-**Note that our Learning Brain is currently experimental as it is limited to TensorFlow
-models and leverages the third-party
-[TensorFlowSharp](https://github.com/migueldeicaza/TensorFlowSharp) library.**
-
 The
 [Getting Started with the 3D Balance Ball Example](Getting-Started-with-Balance-Ball.md)
 tutorial covers this training mode with the **3D Balance Ball** sample environment.

docs/Readme.md

Lines changed: 3 additions & 2 deletions
@@ -42,7 +42,8 @@
 * [Using TensorBoard to Observe Training](Using-Tensorboard.md)
 
 ## Inference
-* [TensorFlowSharp in Unity (Experimental)](Using-TensorFlow-Sharp-in-Unity.md)
+
+* [Unity Inference Engine](Unity-Inference-Engine.md)
 
 ## Help
 
@@ -55,4 +56,4 @@
 
 * [API Reference](API-Reference.md)
 * [How to use the Python API](Python-API.md)
-* [Wrapping Learning Environment as a Gym](../gym-unity/README.md)
+* [Wrapping Learning Environment as a Gym (+Baselines/Dopamine Integration)](../gym-unity/README.md)
