Commit c4da78e

Merge pull request dennybritz#39 from yenchenlin/fix-value-prediction
Fix value prediction in A3C
2 parents f117e5d + b271647 commit c4da78e

1 file changed: +1 -1 lines changed

PolicyGradient/a3c/worker.py

Lines changed: 1 addition & 1 deletion
@@ -164,7 +164,7 @@ def update(self, transitions, sess):
     # If we episode was not done we bootstrap the value from the last state
     reward = 0.0
     if not transitions[-1].done:
-      reward = self._value_net_predict(transitions[-1].state, sess)
+      reward = self._value_net_predict(transitions[-1].next_state, sess)
 
     # Accumulate minibatch exmaples
     states = []
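
The effect of this one-line change can be illustrated with a small, self-contained sketch of bootstrapped n-step returns. This is not the code from worker.py: the `Transition` tuple, the `n_step_returns` helper, the `value_fn` callable, and the `discount` parameter are hypothetical names introduced here for illustration, and the value function is a toy stand-in for the value network. The point it shows is that when the rollout ends before the episode does, the tail of the return should be estimated from the state reached after the last transition (`next_state`), not from the state the last action was taken in.

```python
from collections import namedtuple

# Hypothetical stand-in for the transition record; field names are assumed
# from the diff (state, next_state, done) plus action and reward.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


def n_step_returns(transitions, value_fn, discount=0.99):
    """Return one discounted return per transition, in rollout order."""
    # If the rollout stopped mid-episode, estimate the remaining return
    # from the state that FOLLOWS the last transition.
    running = 0.0
    if not transitions[-1].done:
        running = value_fn(transitions[-1].next_state)

    # Walk backwards so each step adds its reward to the discounted tail.
    returns = []
    for t in reversed(transitions):
        running = t.reward + discount * running
        returns.append(running)
    returns.reverse()
    return returns


# Toy usage: states are integers and the assumed value function is 10 * state.
rollout = [
    Transition(state=0, action=0, reward=1.0, next_state=1, done=False),
    Transition(state=1, action=0, reward=1.0, next_state=2, done=False),
]
example_returns = n_step_returns(rollout, lambda s: 10.0 * s, discount=0.5)
```

In this toy rollout the bootstrap value is taken from state 2. Bootstrapping from `transitions[-1].state` (state 1) instead would count the last reward on top of a value estimate that already covers that step, which is the inconsistency the commit removes.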
