Refactor reward signals into separate class #2144

Merged 144 commits on Jul 3, 2019
Commits
eb4abf2
New version of GAIL
awjuliani Oct 9, 2018
d0852ac
Move Curiosity to separate class
awjuliani Oct 12, 2018
4b15b80
Curiosity fully working under new system
awjuliani Oct 12, 2018
ad9381b
Begin implementing GAIL
awjuliani Oct 12, 2018
8bf8302
fix discrete curiosity
vincentpierre Oct 12, 2018
d3e244e
Add expert demonstration
awjuliani Oct 13, 2018
a5b95f7
Remove notebook
awjuliani Oct 13, 2018
dc2fcaa
Record intrinsic rewards properly
awjuliani Oct 13, 2018
49cff40
Add gail model updating
awjuliani Oct 13, 2018
48d3769
Code cleanup
awjuliani Oct 15, 2018
6eeb565
Nested structure for intrinsic rewards
awjuliani Oct 15, 2018
8ca7728
Rename files
awjuliani Oct 15, 2018
226b5c7
Update models so files
awjuliani Oct 15, 2018
3386aa7
fix typo
awjuliani Oct 15, 2018
6799756
Add reward strength parameter
awjuliani Oct 15, 2018
468c407
Use dictionary of reward signals
awjuliani Oct 17, 2018
519e2d3
Remove reward manager
awjuliani Oct 17, 2018
7df1a69
Extrinsic reward just another type
awjuliani Oct 17, 2018
99237cd
Clean up imports
awjuliani Oct 17, 2018
9fa51c1
All reward signals use strength to scale output
awjuliani Oct 17, 2018
7f24677
produce scaled and unscaled reward
awjuliani Oct 18, 2018
4a714d0
Remove unused dictionary
awjuliani Oct 18, 2018
3e2671d
Current trainer config
awjuliani Oct 18, 2018
77211d8
Add discrete control and pyramid experimentation
awjuliani Oct 19, 2018
2334de8
Minor changes to GAIL
awjuliani Oct 20, 2018
439387e
Add relevant strength parameters
awjuliani Oct 21, 2018
ba793a3
Replace string
awjuliani Oct 21, 2018
a52ba0b
Add support for visual observations w/ GAIL
awjuliani Oct 31, 2018
5b2ef22
Finish implementing visual obs for GAIL
awjuliani Nov 1, 2018
13542b4
Include demo files
awjuliani Nov 1, 2018
ae7a8b0
Fix for RNN w/ GAIL
awjuliani Nov 1, 2018
bf89082
Keep track of reward streams separately
awjuliani Nov 2, 2018
360482b
Bootstrap value estimates separately
awjuliani Nov 2, 2018
c78639d
Add value head
awjuliani Nov 14, 2018
3b2485d
Use sepaprate value streams for each reward
awjuliani Nov 15, 2018
40bc9ba
Add VAIL
awjuliani Nov 15, 2018
c6e1504
Use adaptive B
awjuliani Nov 16, 2018
60d9ff7
Comments improvements
vincentpierre Jan 10, 2019
49ec682
Added comments and refactored a pievce of the code
vincentpierre Jan 10, 2019
d9847e0
Added Comments
vincentpierre Jan 10, 2019
dc7620b
Fix on Curriosity
vincentpierre Jan 11, 2019
28e0bd5
Fixed typo
vincentpierre Jan 11, 2019
0257d2b
Added a forgotten comment
vincentpierre Jan 11, 2019
fd55c00
Stabilized Vail learning. Still no learning for Walker
vincentpierre Jan 14, 2019
2343b3f
Fixing typo on curiosity when using visual input
vincentpierre Jan 17, 2019
c74ad19
Added some comments
vincentpierre Jan 17, 2019
2dd7c61
modified the hyperparameters
vincentpierre Jan 17, 2019
42429a5
Fixed some of the tests, will need to refactor the reward signals in …
vincentpierre Jan 19, 2019
ec0e106
Putting the has_updated fags inside each reward signal
vincentpierre Jan 22, 2019
6ae1c2f
Added comments for the GAIL update method
vincentpierre Jan 22, 2019
ef65bc2
initial commit
vincentpierre Jan 24, 2019
8cbdbf4
No more normalization after pre-training
vincentpierre Jan 24, 2019
3f35d45
Fixed large bug in Vail
vincentpierre Jan 30, 2019
3be9be7
BUG FIX VAIL : The noise dimension was wrong and the discriminator sc…
vincentpierre Feb 1, 2019
9e9b4ff
implemented discrete control pretraining
vincentpierre Feb 2, 2019
d537a6b
bug fixing
vincentpierre Feb 3, 2019
713263c
Bug fix, still not tested for recurrent
vincentpierre Feb 6, 2019
ca5b948
Fixing beta in GAIL so it will change properly
vincentpierre Mar 6, 2019
671629e
Allow for not specifying an extrinsic reward
Apr 19, 2019
a31c8a5
Rough implementation of annealed BC
Apr 24, 2019
93cb4ff
Fixes for rebase onto v0.8
Apr 24, 2019
6534291
Moved BC trainer out of reward_signals and code cleanup
Apr 25, 2019
700b478
Rename folder to "components"
Apr 25, 2019
71eedf5
Fix renaming in Curiosity
Apr 25, 2019
83b4603
Remove demo_aided as a required param
May 2, 2019
9e4b4e2
Make old BC compatible
May 2, 2019
f814432
Fix visual obs for curiosity
May 3, 2019
e10194f
Tweaks all around
May 9, 2019
fdcfb30
Add reward normalization and bug fix
May 9, 2019
cb5e927
Load multiple .demo files. Fix bug with csv nans
May 30, 2019
2c5c853
Remove reward normalization
May 30, 2019
e66a343
Rename demo_aided to pretraining
May 30, 2019
0a98289
Fix bc configs
May 30, 2019
cd6e498
Increase small val to prevent NaNs
May 30, 2019
d23f6f3
Fix init in components
May 31, 2019
d93e36e
Merge remote-tracking branch 'origin/develop' into develop-irl-ervin
May 31, 2019
1bf68c7
Fix PPO tests
May 31, 2019
9da6e6c
Refactor components into common location
May 31, 2019
4a57a32
Minor code cleanup
Jun 3, 2019
11cc6f9
Preliminary RNN support
Jun 5, 2019
e66a6f7
Revert regression with NaNs for LSTMs
Jun 6, 2019
bea2bc7
Better LSTM support for BC
Jun 6, 2019
6302a55
Code cleanup and black reformat
Jun 6, 2019
d1cded9
Remove demo_helper and reformat signal
Jun 6, 2019
2b98f3b
Tests for GAIL and curiosity
Jun 6, 2019
440146b
Fix Black again...
Jun 6, 2019
98f9160
Tests for BCModule and visual tests for RewardSignals
Jun 6, 2019
5c923cb
Refactor to new structure and use class generator
Jun 7, 2019
e7ce888
Generalize reward_signal interface and stats
Jun 8, 2019
858194f
Fix incorrect environment reward reporting
Jun 10, 2019
28bceba
Rename reward signals for consistency. clean up comments
Jun 10, 2019
248cae4
Default trainer config (for cloud testing)
Jun 10, 2019
744df94
Remove "curiosity_enc_size" from the regular params
Jun 10, 2019
31dabfc
Fix PushBlock config
Jun 10, 2019
a557f84
Revert Pyramids environment
Jun 10, 2019
d4dbddb
Fix indexing issue with add_experiences
Jun 11, 2019
ddb673b
Fix tests
Jun 11, 2019
975e05b
Change to BCModule
Jun 11, 2019
a83fd5d
Merge branch 'develop' into develop-irl-ervin
Jun 12, 2019
fae7646
Remove the bools for reward signals
Jun 12, 2019
5cf98ac
Make update take in a mini buffer rather than the
Jun 13, 2019
d1afc9b
Always reference reward signals name and not index
Jun 13, 2019
80f2c75
More code cleanup
Jun 13, 2019
394b25a
Clean up reward_signal abstract class
Jun 13, 2019
a9724a3
Fix issue with recording values
Jun 13, 2019
66fef61
Add use_actions to GAIL
Jun 17, 2019
0e3be1d
Add documentation for Reward Signals
Jun 17, 2019
015f50d
Add documentation for GAIL
Jun 17, 2019
7c3059b
Remove unused variables in BCModel
Jun 17, 2019
16c3c06
Remove Entropy Reward Signal
Jun 17, 2019
1fbfa5d
Change tests to use safe_load
Jun 17, 2019
f9a3808
Don't use mutable default
Jun 17, 2019
ce551bf
Set defaults in parent __init__ (Reward Signals)
Jun 17, 2019
3e7ea5b
Remove unneccesary lines
Jun 17, 2019
a40d8be
Remove new features
Jun 17, 2019
abc66cc
Add learning rate option to Curiosity
Jun 17, 2019
1aa0fc5
Correct docs for Reward Signals
Jun 17, 2019
3bccf7f
Revert trainer configs to develop ver
Jun 17, 2019
aab7165
Clean up BC files
Jun 17, 2019
bbbb2e9
Revert BC model
Jun 17, 2019
b5ca952
Revert some changes to trainer
Jun 17, 2019
31cf875
Some more trainer_config cleanup
Jun 17, 2019
53a472d
Make new trainer compatible with old BC
Jun 17, 2019
29e93f2
Merge branch 'develop' into develop-rewardsignalsrefactor
Jun 17, 2019
133a258
Fix black formats
Jun 17, 2019
bae045d
Fixes to typos and unneccessary enumerate()
Jun 18, 2019
b03de8f
Use NamedTuple and more code cleanup
Jun 18, 2019
3499d60
Recursive printing of hyperparams
Jun 18, 2019
411db72
Black format
Jun 18, 2019
84478bf
Doc fixes
Jun 19, 2019
a5f148c
Fixed comment for evaluate
Jun 19, 2019
3733a68
More doc tweaks
Jun 19, 2019
32815f5
Make PPO prints more generic
Jun 19, 2019
70f7407
fix crawler dynamic hyperparams
Jun 20, 2019
0d02b24
Clean up doc formatting
Jun 20, 2019
bba9d7d
Change setup.py so all packages are installed
Jun 20, 2019
de2b5d5
Tweak pyramids hyperparams
Jun 21, 2019
adc9915
More tweaks to Pyramids
Jun 21, 2019
6a1d8d1
curiosity doc section
Jul 1, 2019
554b1c2
Merge remote-tracking branch 'origin/develop' into develop-rewardsign…
Jul 1, 2019
52d3974
get mypy passing
Jul 1, 2019
471b489
Tweak Pyramids hyperparameters
Jul 1, 2019
ed5e84e
Merge branch 'develop-rewardsignalsrefactor' of github.com:Unity-Tech…
Jul 1, 2019
87d77b4
Call static function rather than class function
Jul 3, 2019
Change setup.py so all packages are installed
Ervin Teng committed Jun 20, 2019
commit bba9d7d06b1c6c6ae1f523d32f28fdcf2ddfa81e
7 changes: 5 additions & 2 deletions ml-agents/setup.py
@@ -1,4 +1,4 @@
-from setuptools import setup, find_packages
+from setuptools import setup, find_namespace_packages
 from os import path
 from io import open
 
@@ -23,7 +23,10 @@
         "License :: OSI Approved :: Apache Software License",
         "Programming Language :: Python :: 3.6",
     ],
-    packages=["mlagents.trainers"],  # Required
+    # find_namespace_packages will recurse through the directories and find all the packages
+    packages=find_namespace_packages(
+        exclude=["*.tests", "*.tests.*", "tests.*", "tests"]
+    ),
     zip_safe=False,
     install_requires=[
         "mlagents_envs==0.8.1",
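
To illustrate what the change to setup.py does, here is a minimal sketch that runs both setuptools finders against a throwaway directory tree. The layout (a `mlagents` directory without an `__init__.py`, with `trainers`, `trainers/ppo` and `trainers/tests` beneath it) and the helper names `make_layout` and `discover` are assumptions made for the example, not taken from the repository. Only the exclude patterns are copied from the diff.

```python
import tempfile
from pathlib import Path

from setuptools import find_namespace_packages, find_packages

# Same exclude patterns as in the setup.py diff.
TEST_EXCLUDES = ["*.tests", "*.tests.*", "tests.*", "tests"]


def make_layout(root: Path) -> None:
    """Create a hypothetical source tree; `mlagents` itself gets no __init__.py."""
    for pkg in ("mlagents/trainers", "mlagents/trainers/ppo", "mlagents/trainers/tests"):
        pkg_dir = root / pkg
        pkg_dir.mkdir(parents=True)
        (pkg_dir / "__init__.py").touch()


def discover(root: Path) -> dict:
    """Return the package names each finder reports for the tree under `root`."""
    where = str(root)
    return {
        # find_packages only descends into directories that contain __init__.py
        "regular": sorted(find_packages(where=where, exclude=TEST_EXCLUDES)),
        # find_namespace_packages also accepts directories without __init__.py
        "namespace": sorted(find_namespace_packages(where=where, exclude=TEST_EXCLUDES)),
    }


with tempfile.TemporaryDirectory() as tmp:
    make_layout(Path(tmp))
    found = discover(Path(tmp))

print("find_packages:          ", found["regular"])
print("find_namespace_packages:", found["namespace"])
```

In this sketch the namespace finder picks up the nested subpackage and skips the test package, whereas a hard-coded `packages=["mlagents.trainers"]` would list only that one package and leave out anything nested under it.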