Skip to content

A couple fixes for recording demonstrations #1999

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 3, 2019
Merged

Conversation

ervteng
Copy link
Contributor

@ervteng ervteng commented Apr 30, 2019

  • During sanitation of demo filename, cut the string if it is too long. A long name may exceed the 32-bytes reserved for metadata, and cause corruption.
  • Store lastActions[] in Agent.cs so that we can write it to the demonstration on Done(). Previously, the last action recorded was always 0.
  • Fix for on demand actions to prevent the Agent from taking an additional action after Done() is called. This is problematic for some games where the reset causes the scene to be in a state where taking an action causes undesirable behavior.

@ervteng ervteng requested a review from vincentpierre April 30, 2019 05:40
@@ -1011,6 +1021,8 @@ void ResetIfDone()
// as it is done
_AgentReset();
hasAlreadyReset = true;
requestDecision = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we make sure this does not break stuff ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any of our environments (aside from Snoopy Pop, where this does work) use On Demand Decisions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bouncer uses ODD

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Training Bouncer, seems to work (1.0 reward after 50k steps and climbing)

@@ -14,6 +14,7 @@ public class DemonstrationRecorder : MonoBehaviour
private Agent recordingAgent;
private string filePath;
private DemonstrationStore demoStore;
private int maxNameLength = 16;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a const and capitalize variable name

@@ -584,7 +586,15 @@ void SendInfoToBrain()
}

info.memories = action.memories;
info.storedVectorActions = action.vectorActions;
if(done)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see how this solves the problem, could you add some comments explaining what problem it solves and how ?

Copy link
Contributor Author

@ervteng ervteng Apr 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a short comment - my initial description on the pull request itself was wrong. Since Reset() is called before SendInfoToBrain(), when Reset happens, action.vectorActions is zeroed out, then SendInfoToBrain() is called and the 0's are stored in storedVectorActions. This is OK for training since the Python code doesn't use the storedVectorActions, but when recording a demo, it saves the 0's to the file. (now that I think about it, we might need to do this for the textActions as well.)

I don't neccessarily like this fix as it isn't very clean. But it seems like the only thing possible without reordering the Academy step loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not zero out action.vectorActions at reset ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we do this? Maybe we can initialize them to 0 on creation but not zero them out? I guess it's problematic since 0 isn't null, it has an actual meaning when it comes to actions

@Unity-Technologies Unity-Technologies deleted a comment Apr 30, 2019
@Unity-Technologies Unity-Technologies deleted a comment May 1, 2019
@ervteng ervteng requested a review from vincentpierre May 1, 2019 18:44
@Unity-Technologies Unity-Technologies deleted a comment May 1, 2019
@ervteng ervteng merged commit e1d2b69 into develop May 3, 2019
@ervteng ervteng deleted the develop-demofixes branch July 9, 2019 22:01
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants