Skip to content

Speed-up to O(1) from O(N) of the computation of each return in REINFORCE #1083

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 17, 2022

Conversation

Chris1nexus
Copy link
Contributor

Fix

The function finish_episode() in the reinforce.py script computes returns by insertion at the beginning of a list.
This is an expensive computation in python as shown by the official docs, since it requires O(N) time to shift all its successor elements.
The same can be achieved but with O(1) time complexity with the python deque data structure, natively supported by python.
To verify the above claims, i tested insertion of 100k elements at the beginning of a list and a deque.
The results are:

  • list insertion at the beginning -> 1 second
  • deque insertion at the beginning -> 0.004 seconds

Result

Hence, this is a 250 x speed-up, which becomes relevant when the number of episodes becomes large, as this computation is done at the end of every episode.

Python docs Reference: https://docs.python.org/3/tutorial/datastructures.html#using-lists-as-queues

Test

As per the contribution guidelines, the following tests have been run and successfully completed.
./run_python_examples.sh "install_deps,reinforcement_learning,clean"

@netlify
Copy link

netlify bot commented Oct 17, 2022

Deploy Preview for pytorch-examples-preview canceled.

Name Link
🔨 Latest commit f29a6a8
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-examples-preview/deploys/634cc7c7727ad00008c25ab5

Copy link
Contributor

@hudeven hudeven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The improvement is impressive!

@Chris1nexus
Copy link
Contributor Author

Thanks for the quick review!

YinZhengxun pushed a commit to YinZhengxun/mt-exercise-02 that referenced this pull request Mar 30, 2025
…ORCE (pytorch#1083)

Replace list with deque to obtain O(1) time complexity of insertion at the beginning of the list of returns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants