@@ -26,21 +26,21 @@ All code is written in Python 3 and uses RL environments from [OpenAI Gym](https
 
 ### List of Implemented Algorithms
 
-- [Dynamic Programming Policy Evaluation](DP/Policy Evaluation Solution.ipynb)
-- [Dynamic Programming Policy Iteration](DP/Policy Iteration Solution.ipynb)
-- [Dynamic Programming Value Iteration](DP/Value Iteration Solution.ipynb)
-- [Monte Carlo Prediction](MC/MC Prediction Solution.ipynb)
-- [Monte Carlo Control with Epsilon-Greedy Policies](MC/MC Control with Epsilon-Greedy Policies Solution.ipynb)
-- [Monte Carlo Off-Policy Control with Importance Sampling](MC/Off-Policy MC Control with Weighted Importance Sampling Solution.ipynb)
-- [SARSA (On Policy TD Learning)](TD/SARSA Solution.ipynb)
-- [Q-Learning (Off Policy TD Learning)](TD/Q-Learning Solution.ipynb)
-- [Q-Learning with Linear Function Approximation](FA/Q-Learning with Value Function Approximation Solution.ipynb)
-- [Deep Q-Learning for Atari Games](DQN/Deep Q Learning Solution.ipynb)
-- [Double Deep-Q Learning for Atari Games](DQN/Double DQN Solution.ipynb)
+- [Dynamic Programming Policy Evaluation](DP/Policy%20Evaluation%20Solution.ipynb)
+- [Dynamic Programming Policy Iteration](DP/Policy%20Iteration%20Solution.ipynb)
+- [Dynamic Programming Value Iteration](DP/Value%20Iteration%20Solution.ipynb)
+- [Monte Carlo Prediction](MC/MC%20Prediction%20Solution.ipynb)
+- [Monte Carlo Control with Epsilon-Greedy Policies](MC/MC%20Control%20with%20Epsilon-Greedy%20Policies%20Solution.ipynb)
+- [Monte Carlo Off-Policy Control with Importance Sampling](MC/Off-Policy%20MC%20Control%20with%20Weighted%20Importance%20Sampling%20Solution.ipynb)
+- [SARSA (On Policy TD Learning)](TD/SARSA%20Solution.ipynb)
+- [Q-Learning (Off Policy TD Learning)](TD/Q-Learning%20Solution.ipynb)
+- [Q-Learning with Linear Function Approximation](FA/Q-Learning%20with%20Value%20Function%20Approximation%20Solution.ipynb)
+- [Deep Q-Learning for Atari Games](DQN/Deep%20Q%20Learning%20Solution.ipynb)
+- [Double Deep-Q Learning for Atari Games](DQN/Double%20DQN%20Solution.ipynb)
 - Deep Q-Learning with Prioritized Experience Replay (WIP)
-- [Policy Gradient: REINFORCE with Baseline](PolicyGradient/CliffWalk REINFORCE with Baseline Solution.ipynb)
-- [Policy Gradient: Actor Critic with Baseline](PolicyGradient/CliffWalk Actor Critic Solution.ipynb)
-- [Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces](PolicyGradient/Continuous MountainCar Actor Critic Solution.ipynb)
+- [Policy Gradient: REINFORCE with Baseline](PolicyGradient/CliffWalk%20REINFORCE%20with%20Baseline%20Solution.ipynb)
+- [Policy Gradient: Actor Critic with Baseline](PolicyGradient/CliffWalk%20Actor%20Critic%20Solution.ipynb)
+- [Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces](PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb)
 - Deterministic Policy Gradients for Continuous Action Spaces (WIP)
 - Deep Deterministic Policy Gradients (DDPG) (WIP)
 - [Asynchronous Advantage Actor Critic (A3C)](PolicyGradient/a3c)
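
The change in this hunk replaces literal spaces in relative link targets with `%20`. A minimal sketch of automating that rewrite with Python's standard library is shown here. The function name `encode_link_targets` and the regular expression are illustrative assumptions, not part of this change, and the pattern only handles simple inline links whose targets contain no parentheses.

```python
import re
from urllib.parse import quote

# Simple inline Markdown link: [text](target).
# Assumption: the target itself contains no ")" characters.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def encode_link_targets(markdown: str) -> str:
    """Percent-encode unsafe characters (such as spaces) in relative link targets."""

    def _encode(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        # Leave absolute URLs (e.g. https://...) untouched.
        if "://" in target:
            return match.group(0)
        # Keeping "%" safe avoids double-encoding targets that are already encoded;
        # "/" and "#" are kept so paths and fragments survive.
        return f"[{text}]({quote(target, safe='/%#')})"

    return LINK_RE.sub(_encode, markdown)


if __name__ == "__main__":
    old_line = "- [SARSA (On Policy TD Learning)](TD/SARSA Solution.ipynb)"
    print(encode_link_targets(old_line))
```

Applied to the removed lines of the hunk, this should produce the added lines. Because `%` is treated as safe, running it a second time leaves already-encoded targets unchanged.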