Fix the CSV file reward lagging way behind the actual rewards #2120

Merged
xiaomaogy merged 2 commits into develop from develop-fix-csvwriting
Jun 11, 2019

Conversation

ervteng
Contributor

@ervteng ervteng commented Jun 11, 2019

We aren't clearing the list self.cumulative_returns_since_policy_update when we update the policy. This list is used to compute the mean reward written to the CSV file, so it keeps growing throughout training and the mean ends up averaging over every episode since training started.

This PR clears it when we update the policy.

Before this change, the CSV file's mean rewards lagged far behind the actual rewards because this buffer was never cleared.
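
To illustrate the effect described above, here is a minimal sketch, not the project's actual trainer code. Only the attribute name cumulative_returns_since_policy_update comes from this PR; the RewardTracker class, its methods, and the simulate helper are hypothetical names made up for the example. It compares the mean reward logged at each policy update with and without clearing the list.

```python
from statistics import mean
from typing import List


class RewardTracker:
    """Hypothetical stand-in for the trainer; not the project's real class."""

    def __init__(self, clear_on_update: bool) -> None:
        self.clear_on_update = clear_on_update
        # Attribute name taken from the PR description.
        self.cumulative_returns_since_policy_update: List[float] = []

    def episode_finished(self, episode_return: float) -> None:
        # Record the return of a finished episode.
        self.cumulative_returns_since_policy_update.append(episode_return)

    def update_policy(self) -> float:
        # Mean reward that would be written to the CSV file.
        mean_reward = mean(self.cumulative_returns_since_policy_update)
        if self.clear_on_update:
            # The fix: start a fresh buffer after each policy update.
            self.cumulative_returns_since_policy_update.clear()
        return mean_reward


def simulate(clear_on_update: bool) -> List[float]:
    """Run 3 policy updates with 2 episodes each; returns rise by 10 per update."""
    tracker = RewardTracker(clear_on_update)
    logged = []
    for update in range(3):
        for _ in range(2):
            tracker.episode_finished(float(update * 10))
        logged.append(tracker.update_policy())
    return logged


buggy = simulate(clear_on_update=False)  # list is never cleared
fixed = simulate(clear_on_update=True)   # list is cleared on every update
print("never cleared:", buggy)
print("cleared on update:", fixed)
```

In this toy run the actual per-update rewards are 0, 10, and 20. With clearing, the logged means match them; without clearing, the logged means are 0, 5, and 10, and the gap widens as the list grows.
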
@ervteng ervteng changed the base branch from master to develop June 11, 2019 01:37
@ervteng ervteng requested a review from xiaomaogy June 11, 2019 01:38
@xiaomaogy xiaomaogy merged commit c5226f6 into develop Jun 11, 2019
@xiaomaogy xiaomaogy deleted the develop-fix-csvwriting branch June 11, 2019 17:56
sankalp04 pushed a commit that referenced this pull request Jun 21, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 18, 2021