Commit 2e945ba

Release v0.8 docs (#1924)
* update title caps
* Rename Custom-Protos.md to Creating-Custom-Protobuf-Messages.md
* Updated with custom protobuf messages
* Cleanup against to our doc guidelines
* Minor text revision
* Create Training-Concurrent-Unity-Instances
* Rename Training-Concurrent-Unity-Instances to Training-Concurrent-Unity-Instances.md
* update to right format for --num-envs
* added link to concurrent unity instances
* Update and rename Training-Concurrent-Unity-Instances.md to Training-Using-Concurrent-Unity-Instances.md
* Added considerations section
* Update Training-Using-Concurrent-Unity-Instances.md
* cleaned up language to match doc
* minor updates
* retroactive migration from 0.6 to 0.7
* Updated from 0.7 to 0.8 migration
* Minor typo
* minor fix
* accidentally duplicated step
* updated with new features list
1 parent 15fcf95 commit 2e945ba

7 files changed

+83
-41
lines changed

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,8 @@ developer communities.
3434
* Visualizing network outputs within the environment
3535
* Simplified set-up with Docker
3636
* Wrap learning environments as a gym
37+
* Utilizes the Unity Inference Engine
38+
* Train using concurrent Unity environment instances
3739

3840
## Documentation
3941

docs/Custom-Protos.md renamed to docs/Creating-Custom-Protobuf-Messages.md

Lines changed: 34 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -1,28 +1,30 @@
1-
# Creating custom protobuf messages
1+
# Creating Custom Protobuf Messages
22

33
Unity and Python communicate by sending protobuf messages to and from each other. You can create custom protobuf messages if you want to exchange structured data beyond what is included by default.
44

5-
Assume the ml-agents repository is checked out to a folder named $MLAGENTS_ROOT. Whenever you change the fields of a custom message, you must run `$MLAGENTS_ROOT/protobuf-definitions/make.bat` to create C# and Python files corresponding to the new message. Follow the directions in [this file](../protobuf-definitions/README.md) for guidance. After running it, reinstall the Python package by running `pip install $MLAGENTS_ROOT/ml-agents` and make sure your Unity project is using the newly-generated version of `$MLAGENTS_ROOT/UnitySDK`.
5+
## Implementing a Custom Message
66

7-
## Custom message types
7+
Assume the ml-agents repository is checked out to a folder named $MLAGENTS_ROOT. Whenever you change the fields of a custom message, you must run `$MLAGENTS_ROOT/protobuf-definitions/make.bat` to create C# and Python files corresponding to the new message. Follow the directions in [this file](../protobuf-definitions/README.md) for guidance. After running `$MLAGENTS_ROOT/protobuf-definitions/make.bat`, reinstall the Python package by running `pip install $MLAGENTS_ROOT/ml-agents` and make sure your Unity project is using the newly-generated version of `$MLAGENTS_ROOT/UnitySDK`.
88

9-
There are three custom message types currently supported, described below. In each case, `env` is an instance of a `UnityEnvironment` in Python. `CustomAction` is described most thoroughly; usage of the other custom messages follows a similar template.
9+
## Custom Message Types
1010

11-
### Custom actions
11+
There are three custom message types currently supported - Custom Actions, Custom Reset Parameters, and Custom Observations. In each case, `env` is an instance of a `UnityEnvironment` in Python.
1212

13-
By default, the Python API sends actions to Unity in the form of a floating-point list per agent and an optional string-valued text action.
13+
### Custom Actions
1414

15-
You can define a custom action type to replace or augment this by adding fields to the `CustomAction` message, which you can do by editing the file `protobuf-definitions/proto/mlagents/envs/communicator_objects/custom_action.proto`.
15+
By default, the Python API sends actions to Unity in the form of a floating point list and an optional string-valued text action for each agent.
1616

17-
Instances of custom actions are set via the `custom_action` parameter of `env.step`. An agent receives a custom action by defining a method with the signature
17+
You can define a custom action type, to either replace or augment the default, by adding fields to the `CustomAction` message, which you can do by editing the file `protobuf-definitions/proto/mlagents/envs/communicator_objects/custom_action.proto`.
18+
19+
Instances of custom actions are set via the `custom_action` parameter of `env.step`. An agent receives a custom action by defining a method with the signature:
1820

1921
```csharp
2022
public virtual void AgentAction(float[] vectorAction, string textAction, CommunicatorObjects.CustomAction customAction)
2123
```
2224

23-
Here is an example of creating a custom action that instructs an agent to choose a cardinal direction to walk in and how far to walk.
25+
Below is an example of creating a custom action that instructs an agent to choose a cardinal direction to walk in and how far to walk.
2426

25-
`custom_action.proto` will look like
27+
The `custom_action.proto` file looks like:
2628

2729
```protobuf
2830
syntax = "proto3";
@@ -42,7 +44,7 @@ message CustomAction {
4244
}
4345
```
4446

45-
In your Python file, create an instance of a custom action:
47+
The Python instance of the custom action looks like:
4648

4749
```python
4850
from mlagents.envs.communicator_objects import CustomAction
@@ -52,7 +54,7 @@ action = CustomAction(direction=CustomAction.NORTH, walkAmount=2.0)
5254
env.step(custom_action=action)
5355
```
5456

55-
Then in your agent,
57+
And the agent code looks like:
5658

5759
```csharp
5860
...
@@ -72,17 +74,17 @@ class MyAgent : Agent {
7274
}
7375
```
7476

75-
Note that the protobuffer compiler automatically configures the capitalization scheme of the C# version of the custom field names you defined in the `CustomAction` message to match C# conventions - "NORTH" becomes "North", "walkAmount" becomes "WalkAmount", etc.
77+
Keep in mind that the protobuffer compiler automatically configures the capitalization scheme of the C# version of the custom field names you defined in the `CustomAction` message to match C# conventions - "NORTH" becomes "North", "walkAmount" becomes "WalkAmount", etc.
7678

77-
### Custom reset parameters
79+
### Custom Reset Parameters
7880

79-
By default, you can configure an environment `env ` in the Python API by specifying a `config` parameter that is a dictionary mapping strings to floats.
81+
By default, you can configure an environment `env` in the Python API by specifying a `config` parameter that is a dictionary mapping strings to floats.
8082

81-
You can also configure an environment using a custom protobuf message. To do so, add fields to the `CustomResetParameters` protobuf message in `custom_reset_parameters.proto`, analogously to `CustomAction` above. Then pass an instance of the message to `env.reset` via the `custom_reset_parameters` keyword parameter.
83+
You can also configure the environment reset using a custom protobuf message. To do this, add fields to the `CustomResetParameters` protobuf message in `custom_reset_parameters.proto`, analogously to `CustomAction` above. Then pass an instance of the message to `env.reset` via the `custom_reset_parameters` keyword parameter.
8284

8385
In Unity, you can then read the `customResetParameters` field of your academy to access the values set in your Python script.
8486

85-
In this example, an academy is setting the initial position of a box based on custom reset parameters that looks like
87+
In this example, the academy is setting the initial position of a box based on custom reset parameters. The `custom_reset_parameters.proto` would look like:
8688

8789
```protobuf
8890
message CustomResetParameters {
@@ -101,7 +103,18 @@ message CustomResetParameters {
101103
}
102104
```
103105

104-
In your academy, you'd have something like
106+
The Python instance of the custom reset parameter looks like:
107+
108+
```python
109+
from mlagents.envs.communicator_objects import CustomResetParameters
110+
env = ...
111+
pos = CustomResetParameters.Position(x=1, y=1, z=2)
112+
color = CustomResetParameters.Color(r=.5, g=.1, b=1.0)
113+
params = CustomResetParameters(initialPos=pos, color=color)
114+
env.reset(custom_reset_parameters=params)
115+
```
116+
117+
The academy looks like:
105118

106119
```csharp
107120
public class MyAcademy : Academy
@@ -122,18 +135,7 @@ public class MyAcademy : Academy
122135
}
123136
```
124137

125-
Then in Python, when setting up your scene, you might write
126-
127-
```python
128-
from mlagents.envs.communicator_objects import CustomResetParameters
129-
env = ...
130-
pos = CustomResetParameters.Position(x=1, y=1, z=2)
131-
color = CustomResetParameters.Color(r=.5, g=.1, b=1.0)
132-
params = CustomResetParameters(initialPos=pos, color=color)
133-
env.reset(custom_reset_parameters=params)
134-
```
135-
136-
### Custom observations
138+
### Custom Observations
137139

138140
By default, Unity returns observations to Python in the form of a floating-point vector.
139141

@@ -143,8 +145,7 @@ Then in your agent, create an instance of a custom observation via `new Communic
143145

144146
In Python, the custom observation can be accessed by calling `env.step` or `env.reset` and accessing the `custom_observations` property of the return value. It will contain a list with one `CustomObservation` instance per agent.
145147

146-
For example, if you have added a field called `customField` to the `CustomObservation` message, you would program your agent like
147-
148+
For example, if you have added a field called `customField` to the `CustomObservation` message, the agent code looks like:
148149

149150
```csharp
150151
class MyAgent : Agent {
@@ -156,7 +157,7 @@ class MyAgent : Agent {
156157
}
157158
```
158159

159-
Then in Python, the custom field would be accessed like
160+
In Python, the custom field would be accessed like:
160161

161162
```python
162163
...

docs/Installation.md

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -82,8 +82,7 @@ parameters you can use with `mlagents-learn`.
8282

8383
If you intend to make modifications to `ml-agents` or `ml-agents-envs`, you should install
8484
the packages from the cloned repo rather than from PyPi. To do this, you will need to install
85-
`ml-agents` and `ml-agents-envs` separately. Do this by running (starting from the repo's main
86-
directory):
85+
`ml-agents` and `ml-agents-envs` separately. From the repo's root directory, run:
8786

8887
```sh
8988
cd ml-agents-envs
@@ -98,7 +97,6 @@ reflected when you run `mlagents-learn`. It is important to install these packag
9897
`mlagents` package depends on `mlagents_envs`, and installing it in the other
9998
order will download `mlagents_envs` from PyPi.
10099

101-
102100
## Docker-based Installation
103101

104102
If you'd like to use Docker for ML-Agents, please follow

docs/Migrating.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,22 @@
11
# Migrating
22

3+
## Migrating from ML-Agents toolkit v0.7 to v0.8
4+
5+
### Important Changes
6+
* We have split the Python packages into two separate packages, `ml-agents` and `ml-agents-envs`.
7+
8+
#### Steps to Migrate
9+
* If you are installing via PyPI, there is no change.
10+
* If you intend to make modifications to `ml-agents` or `ml-agents-envs`, see the Installing for Development section of the [Installation documentation](Installation.md).
11+
12+
## Migrating from ML-Agents toolkit v0.6 to v0.7
13+
14+
### Important Changes
15+
* We no longer support TensorFlowSharp (TFS) and now use the [Unity Inference Engine](Unity-Inference-Engine.md).
16+
17+
#### Steps to Migrate
18+
* Make sure to remove the `ENABLE_TENSORFLOW` flag in your Unity Project settings
19+
320
## Migrating from ML-Agents toolkit v0.5 to v0.6
421

522
### Important Changes

docs/Readme.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@
2929
* [Learning Environment Best Practices](Learning-Environment-Best-Practices.md)
3030
* [Using the Monitor](Feature-Monitor.md)
3131
* [Using an Executable Environment](Learning-Environment-Executable.md)
32+
* [Creating Custom Protobuf Messages](Creating-Custom-Protobuf-Messages.md)
3233

3334
## Training
3435

@@ -39,6 +40,7 @@
3940
* [Training with LSTM](Feature-Memory.md)
4041
* [Training on the Cloud with Amazon Web Services](Training-on-Amazon-Web-Service.md)
4142
* [Training on the Cloud with Microsoft Azure](Training-on-Microsoft-Azure.md)
43+
* [Training Using Concurrent Unity Instances](Training-Using-Concurrent-Unity-Instances.md)
4244
* [Using TensorBoard to Observe Training](Using-Tensorboard.md)
4345

4446
## Inference

docs/Training-ML-Agents.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -134,12 +134,9 @@ environment, you can set the following command line options when invoking
134134
[Academy Properties](Learning-Environment-Design-Academy.md#academy-properties).
135135
* `--train` – Specifies whether to train model or only run in inference mode.
136136
When training, **always** use the `--train` option.
137-
* `--num-envs` - Specifies the number of parallel environments to collect
137+
* `--num-envs=<n>` - Specifies the number of concurrent Unity environment instances to collect
138138
experiences from when training. Defaults to 1.
139-
* `--base-port` - Specifies the starting port for environment workers. Each Unity
140-
environment will use the port `(base_port + worker_id)`, where the worker ID
141-
are sequential IDs given to each environment from 0 to `num_envs - 1`.
142-
Defaults to 5005.
139+
* `--base-port` - Specifies the starting port. Each concurrent Unity environment instance is assigned a port sequentially, starting from `base-port`: an instance uses the port `(base_port + worker_id)`, where `worker_id` is a sequential ID given to each instance, from 0 to `num_envs - 1`. Defaults to 5005.
143140
* `--docker-target-name=<dt>` – The Docker Volume on which to store curriculum,
144141
executable and model files. See [Using Docker](Using-Docker.md).
145142
* `--no-graphics` - Specify this option to run the Unity executable in
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
# Training Using Concurrent Unity Instances
2+
3+
As part of release v0.8, we enabled developers to run concurrent instances of the Unity executable during training. In some scenarios, this can speed up training.
4+
5+
## How to Run Concurrent Unity Instances During Training
6+
7+
Please refer to the general instructions on [Training ML-Agents](Training-ML-Agents.md). In order to run concurrent Unity instances during training, set the number of environment instances using the command line option `--num-envs=<n>` when you invoke `mlagents-learn`. Optionally, you can also set the `--base-port`, which is the starting port used for the concurrent Unity instances.
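
As a rough illustration of the options above, the Python sketch below assembles an `mlagents-learn` command line and lists the port each instance would use, following the `(base_port + worker_id)` rule described in Training ML-Agents. The helper names, the executable path `envs/MyEnvironment`, and the choice of 4 instances are assumptions made for this example. The `--env` option is not covered in this text and is included on the assumption that concurrent instances are launched from a built executable. The sketch only prints the command; it does not start training.

```python
from typing import List


def build_learn_command(
    config_path: str, env_path: str, num_envs: int, base_port: int
) -> List[str]:
    """Assemble an mlagents-learn command line (hypothetical helper)."""
    return [
        "mlagents-learn",
        config_path,
        f"--env={env_path}",  # assumed: path to a built Unity executable
        f"--num-envs={num_envs}",
        f"--base-port={base_port}",
        "--train",
    ]


def instance_ports(num_envs: int, base_port: int = 5005) -> List[int]:
    """Port per instance: base_port + worker_id, for worker_id in 0..num_envs-1."""
    return [base_port + worker_id for worker_id in range(num_envs)]


command = build_learn_command(
    "config/trainer_config.yaml", "envs/MyEnvironment", num_envs=4, base_port=5005
)
print(" ".join(command))
print(instance_ports(4))
```

Listing the ports up front can help confirm that nothing else on the machine is already bound to any port in that range before launching.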
8+
9+
## Considerations
10+
11+
### Buffer Size
12+
13+
If you are having trouble getting an agent to train, even with multiple concurrent Unity instances, you could increase `buffer_size` in the `config/trainer_config.yaml` file. A common practice is to multiply `buffer_size` by `num-envs`.
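
The scaling rule above is simple arithmetic; a minimal sketch follows. The function name and the starting `buffer_size` of 10240 are illustrative assumptions, not values taken from any particular trainer configuration.

```python
def scaled_buffer_size(buffer_size: int, num_envs: int) -> int:
    """Multiply buffer_size by the number of concurrent Unity instances."""
    if buffer_size < 1 or num_envs < 1:
        raise ValueError("buffer_size and num_envs must be positive")
    return buffer_size * num_envs


# Hypothetical example: buffer_size of 10240 with 4 concurrent instances.
print(scaled_buffer_size(10240, 4))  # 40960
```

The resulting value would then be written back to the `buffer_size` entry in `config/trainer_config.yaml` by hand.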
14+
15+
### Resource Constraints
16+
17+
Invoking concurrent Unity instances is constrained by the resources on the machine. Please use discretion when setting `--num-envs=<n>`.
18+
19+
### Using num-runs and num-envs
20+
21+
If you set `--num-runs=<n>` greater than 1 and are also invoking concurrent Unity instances using `--num-envs=<n>`, then the number of concurrent Unity instances is equal to `num-runs` times `num-envs`.
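
To make the multiplication concrete, here is a small sketch that computes the total number of Unity instances for a given combination of the two options. The function name and the sample values are assumptions chosen for illustration.

```python
def total_unity_instances(num_runs: int, num_envs: int) -> int:
    """Total concurrent Unity instances: num_runs times num_envs."""
    if num_runs < 1 or num_envs < 1:
        raise ValueError("num_runs and num_envs must be positive")
    return num_runs * num_envs


# Hypothetical example: --num-runs=2 combined with --num-envs=4.
print(total_unity_instances(2, 4))  # 8
```

Checking this total against the available CPU and memory before launching is one way to apply the resource considerations described above.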
22+
23+
### Result Variation Using Concurrent Unity Instances
24+
25+
If you keep all the hyperparameters the same but change `--num-envs=<n>`, the training results and the resulting model will likely change.
