
fanghao1024/MaMingRL


This is a reinforcement learning library.

The library will contain these RL algorithms:

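| Algorithm | Algorithm version | Environment | Status | Position |
| --- | --- | --- | --- | --- |
| Monte Carlo (MC) evaluation | on-policy version | Ten and a Half (十点半) | | /utils/MC_method |
| | off-policy version | | | /utils/MC_method |
| | weighted importance sampling, off-policy version | | | /utils/MC_method |
| Temporal difference (TD) | 1-step SARSA | CliffWalking | | /utils/TD_method |
| | Q-learning | | | /utils/TD_method |
| | n-step SARSA | | | /utils/TD_method |
| Eligibility traces | forward TD(λ) | Tentatively the windy gridworld from Chapter 6 of 《强化学习》 (Reinforcement Learning, Tsinghua University Press) | | |
| | backward TD(λ) | | | |
| | forward SARSA(λ) | | | |
| | backward SARSA(λ) | | | |
| | forward Watkins's Q(λ) | | | |
| | backward Watkins's Q(λ) | | | |
| | Peng's Q(λ) | | | |
| Linear value function approximation | polynomial basis, incremental method, MC parameter approximation, SARSA implementation | | | |
| | polynomial basis, incremental method, MC parameter approximation, Q implementation | | | |
| | polynomial basis, incremental method, TD parameter approximation, SARSA implementation | | | |
| | polynomial basis, incremental method, TD parameter approximation, Q implementation | | | |
| | polynomial basis, incremental method, forward TD(λ) parameter approximation, SARSA implementation | | | |
| | polynomial basis, incremental method, backward TD(λ) parameter approximation, Q implementation | | | |
| | polynomial basis, batch method, MC parameter approximation, SARSA implementation | | | |
| | polynomial basis, batch method, MC parameter approximation, Q implementation | | | |
| | polynomial basis, batch method, TD parameter approximation, SARSA implementation | | | |
| | polynomial basis, batch method, TD parameter approximation, Q implementation | | | |
| | polynomial basis, batch method, forward TD(λ) parameter approximation, SARSA implementation | | | |
| | polynomial basis, batch method, backward TD(λ) parameter approximation, Q implementation | | | |
| | Fourier basis functions | | | |
| | radial basis functions | | | |
| Nonlinear value function approximation | DQN | Pendulum-v1 | | /Solver/DQNSolver |
| | Double DQN | | | /Solver/DQNSolver |
| | Dueling DQN | | | /Solver/DQNSolver |
| Policy gradient | REINFORCE | CartPole | | /Solver/reinforce |
| | REINFORCE with baseline | | | |
| | TRPO | CartPole | | /Solver/TRPOSolver |
| | PPO | | | |
| Actor-Critic | on-policy AC (critic uses TD(0), SARSA, TD(λ), Q-learning, n-step) | Pendulum | | |
| | off-policy AC (critic uses TD(0), SARSA, TD(λ), Q-learning, n-step) | | | |
| | off-policy AC (critic uses TD with gradient correction term) | | | |
| | A2C | | | |
| Asynchronous methods | asynchronous Q-learning | | | |
| | asynchronous SARSA | | | |
| | asynchronous n-step Q-learning | | | |
| | A3C | | | |
| Deterministic policy gradient (DPG) | on-policy deterministic AC | Pendulum | | |
| | off-policy deterministic AC | | | |
| | DDPG | | | |
| Learning and planning | Dyna-Q | CliffWalking | | /utils/dynaQ |
| | Dyna-Q+ | | | |
| | Dyna-Q with prioritized sweeping | | | |
| | Dyna-2 | | | |
| Exploration vs. exploitation | epsilon-greedy | multi-armed bandit | | utils/explore_discrete_action |
| | linearly decaying epsilon-greedy | | | utils/explore_discrete_action |
| | UCB1 | | | utils/explore_discrete_action |
| | Thompson Sampling | | | Solver/BanditSolver |
| Dynamic programming (DP) | policy iteration | FrozenLake | | Solver/DPIteration |
| | value iteration | | | Solver/DPIteration |
| Game-playing reinforcement learning | AlphaGo Zero | Gomoku (五子棋) | | |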
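To show how the 1-step SARSA and Q-learning entries above differ, here is a minimal tabular sketch. It is not taken from /utils/TD_method. The function names (`make_q_table`, `sarsa_update`, `q_learning_update`) and the default values of `alpha` and `gamma` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def make_q_table(n_actions):
    """Hypothetical helper: maps each state to a list of action values, all 0.0."""
    return defaultdict(lambda: [0.0] * n_actions)


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=1.0, done=False):
    """On-policy 1-step SARSA: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=1.0, done=False):
    """Off-policy Q-learning: bootstrap from the greedy (max-value) action in s_next."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]
```

The only difference is the bootstrap term: SARSA uses the value of the next action the behavior policy really takes, while Q-learning uses the maximum over next actions. On a task such as CliffWalking this is commonly why, under epsilon-greedy exploration, SARSA tends to learn a more cautious path than Q-learning.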

About

RL library which will contain some basic reinforcement learning algorithms
