This guide walks you through setting up our Walker Soccer project for training multi-agent soccer with Unity ML-Agents. It reflects the final configuration: 269 observations across both stages and MA-POCA in Stage 2.
- Unity Editor: 6000.0.40f1 (the project was built with this version; see the package-installation steps below).
- Python: 3.10.12 recommended (ML-Agents requires Python 3.9–3.10).
- Git: For cloning the ML-Agents repository.
- Conda (optional but recommended): For managing Python environments.
You should have the Project folder containing:
```
Project/
├── Assets/
│   └── ML-Agents/
│       └── CustomWalkingSoccerTwos/
│           ├── Scripts/
│           ├── Prefabs/
│           ├── Scenes/
│           ├── WalkerSoccerStage1_Locomotion.yaml
│           ├── WalkerSoccerStage2_Soccer.yaml
│           └── Docs/
├── Builds/
├── Library/
├── Packages/
└── ProjectSettings/
```
Follow the official Unity ML-Agents installation instructions:
Unity ML-Agents Installation Guide
- Open the Unity project (the `Project` folder) in Unity Editor (version 6000.0.40f1).
- In Unity, go to Window > Package Manager.
- Click the + button and select Add package by name.
- Enter: `com.unity.ml-agents`
- Unity will install the ML-Agents package (version 4.0 or latest).
The Python library is required to train agents from the command line.
Option A: Conda (recommended)

1. Create a new conda environment with Python 3.10.12:

   ```
   conda create -n mlagents python=3.10.12
   conda activate mlagents
   ```

2. Install PyTorch (GPU-accelerated version for faster training):

   ```
   pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu129
   ```

   Note: This installs CUDA 12.9 compatible PyTorch. If you don't have an NVIDIA GPU, use the CPU version:

   ```
   pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cpu
   ```

3. Clone the ML-Agents repository (if you haven't already):

   ```
   git clone https://github.com/Unity-Technologies/ml-agents.git C:\Documents\GitHub\ml-agents
   ```

4. Install the ML-Agents packages in editable mode:

   ```
   cd C:\Documents\GitHub\ml-agents
   pip install -e ./ml-agents-envs
   pip install -e ./ml-agents
   ```

5. Verify the installation:

   ```
   mlagents-learn --help
   ```

   You should see the ML-Agents command-line options.
Option B: venv

1. Create a virtual environment:

   ```
   python -m venv mlagents-env

   # On Windows:
   mlagents-env\Scripts\activate

   # On macOS/Linux:
   source mlagents-env/bin/activate
   ```

2. Follow steps 2–5 from Option A above.
You can train directly in the Unity Editor or build a standalone executable for faster training.
- Open the project in Unity Editor.
- Go to File > Build Settings.
- Select your platform (Windows, macOS, Linux).
- Click Build and save to `Project/Builds/WalkerStage2_20_V2.exe` (or your preferred name).
Then run training against the executable:

```
mlagents-learn Assets/ML-Agents/CustomWalkingSoccerTwos/WalkerSoccerStage2_Soccer.yaml --env="C:\Documents\GitHub\ml-agents\Project\Builds\WalkerStage2_20_V2.exe" --num-envs=8 --run-id=MyTrainingRun
```

To train in the Unity Editor instead:

- Open the scene: `Assets/ML-Agents/CustomWalkingSoccerTwos/Scenes/WalkerSoccerStage2.unity`
- Run the training command (without the `--env` flag):

  ```
  mlagents-learn Assets/ML-Agents/CustomWalkingSoccerTwos/WalkerSoccerStage2_Soccer.yaml --run-id=MyTrainingRun
  ```
- Press Play in Unity Editor when prompted.
Stage 1 trains basic walking skills with a unified 269‑observation space (soccer-specific dims are zero-filled for transfer consistency). If you have a pre-trained Stage 1 model, skip to Stage 2.
Train Stage 1 with:

```
mlagents-learn Assets/ML-Agents/CustomWalkingSoccerTwos/WalkerSoccerStage1_Locomotion.yaml --run-id=WalkerStage1
```

Transfer the locomotion skills and learn soccer tactics (Stage 2 uses MA-POCA with 2v2 teams and the same 269-observation space, now populated with real soccer context):
```
mlagents-learn Assets/ML-Agents/CustomWalkingSoccerTwos/WalkerSoccerStage2_Soccer.yaml --initialize-from=WalkerStage1 --run-id=WalkerStage2
```
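For reference, "MA-POCA" corresponds to the `poca` trainer type in an ML-Agents trainer YAML. The block below is only a minimal sketch of what such a configuration looks like; the behavior name and all values are illustrative assumptions, so check `WalkerSoccerStage2_Soccer.yaml` for the project's actual settings.

```yaml
behaviors:
  WalkerSoccer:            # behavior name is an assumption; use the name from the project's YAML
    trainer_type: poca     # MA-POCA multi-agent trainer in ML-Agents
    hyperparameters:
      batch_size: 2048     # illustrative values only
      buffer_size: 20480
      learning_rate: 3.0e-4
    max_steps: 5.0e7
    time_horizon: 1000
    summary_freq: 10000
```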
To resume an interrupted training session:

```
mlagents-learn Assets/ML-Agents/CustomWalkingSoccerTwos/WalkerSoccerStage2_Soccer.yaml --run-id=WalkerStage2 --resume
```

Training progress is saved in the results folder. You can monitor it with TensorBoard:
```
tensorboard --logdir=results
```

Open your browser and navigate to http://localhost:6006 to view the training graphs.
The best results we obtained can be viewed with:
```
tensorboard --logdir_spec Stage1_best:results\WalkerStage1_48_269obs,Stage2_best:results\WalkerStage2_20_269obs
```

Problem: `--initialize-from` fails to load a `.pt` checkpoint (PyTorch `weights_only` load error).

Solution: Use ONNX exports for initialization instead of `.pt` checkpoints, or patch `torch_model_saver.py` to add `weights_only=False`.
Problem: Training is slow.

Solution:

- Build a standalone executable instead of training in-editor.
- Use `--num-envs=4` or higher to run multiple environments in parallel.
- Ensure GPU acceleration is enabled (check your PyTorch CUDA installation).
Problem: Agents are not learning the intended behavior.

Solution:

- Check the curriculum thresholds in the YAML file (e.g., `ball_touch`, `ball_spawn_radius`, `locomotion_scale`); see the sketch after this list.
- Review the reward/penalty balance in `WalkerSoccerAgent.cs` (goal-aware, role-aware, spacing/marking, corner/own-goal deterrents).
- Ensure the anti-dive and anti-bunching parameters are tuned correctly, and avoid compounding strong penalties.
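For orientation, ML-Agents curricula are defined under `environment_parameters` in the trainer YAML. The block below is a minimal sketch using one of the parameters named above; the lesson names, threshold, and values are illustrative assumptions, not the project's actual settings.

```yaml
environment_parameters:
  ball_spawn_radius:                 # curriculum parameter named above
    curriculum:
      - name: CloseBall              # lesson names and values are illustrative
        completion_criteria:
          measure: reward            # advance when mean reward passes the threshold
          behavior: WalkerSoccer     # behavior name is an assumption
          threshold: 0.5
          min_lesson_length: 100
          signal_smoothing: true
        value: 2.0
      - name: FarBall                # final lesson needs no completion criteria
        value: 6.0
```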
Further ideas for improving results:

- Review the training hyperparameters in the YAML files.
- Experiment with reward shaping in `WalkerSoccerAgent.cs`.
- Enable self-play in the Stage 2 YAML once rewards stabilize (see the sketch after this list).
- Experiment with 2v2 and bigger teams.
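As a starting point, self-play in ML-Agents is enabled by adding a `self_play` block under the behavior in the Stage 2 YAML. The sketch below shows the standard keys; the behavior name is an assumption and the values are the kind of settings you would tune, not recommendations from this project.

```yaml
behaviors:
  WalkerSoccer:                          # behavior name is an assumption
    trainer_type: poca
    self_play:
      save_steps: 50000                  # how often opponent snapshots are saved
      team_change: 200000                # steps between switching the learning team
      swap_steps: 2000                   # steps between swapping the opponent snapshot
      window: 10                         # number of past snapshots kept as opponents
      play_against_latest_model_ratio: 0.5
      initial_elo: 1200.0
```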
For more details, see the ML-Agents documentation. Also see Docs/PROJECT_HISTORY.md and Docs/PPO_AND_POCA.md for our project’s design and training notes.