
Lesson 4: Model-Free Prediction | Pre-Demo-Field #54


Closed
Duan-JM opened this issue Mar 10, 2019 · 0 comments


Duan-JM commented Mar 10, 2019

https://vdeamov.github.io/2019/03/11/%E7%AC%AC%E5%9B%9B%E8%AF%BE-%E6%97%A0%E6%A8%A1%E5%9E%8B%E7%9A%84%E9%A2%84%E6%B5%8B/

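---
layout: post
title: "Lesson 4: Model-Free Prediction"
date: 2019-03-11
categories: ReinforceLearning
tags: ["ReinforceLearning", "强化学习"]
---

Lesson 4: Model-Free Prediction

In this lesson the lecturer mainly covers prediction; control is added in Lesson 5. The prediction part centers on two closely related algorithms: Monte-Carlo (MC) and Temporal-Difference (TD). The main difference is when each can update its value estimates. MC has to wait until the episode reaches a terminal state, because only then is the full return (the accumulated reward) known. TD updates online after every step, using the immediate reward plus the current estimate of the next state's value.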
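To make the timing difference concrete, here is a minimal sketch of both prediction methods. The post does not specify an environment, so everything here is an assumption made for illustration: the 5-state random walk, the function names (`generate_episode`, `mc_prediction`, `td0_prediction`), and the step size `alpha`. The MC variant shown is every-visit with a constant step size, and the TD variant is TD(0).

```python
import random

# Assumed toy environment (not from the post): a random walk over states 0..6.
# States 1..5 are non-terminal; 0 and 6 are terminal. The only non-zero reward
# is 1.0 for stepping into state 6. The policy moves left or right uniformly.
TERMINALS = (0, 6)


def generate_episode(rng, start=3):
    """Return the (state, reward, next_state) transitions of one episode."""
    state, transitions = start, []
    while state not in TERMINALS:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == 6 else 0.0
        transitions.append((state, reward, next_state))
        state = next_state
    return transitions


def mc_prediction(episodes, alpha=0.01, gamma=1.0):
    """Every-visit, constant-alpha Monte-Carlo prediction.

    The update needs the full return G, so it can only run once the
    episode has reached a terminal state.
    """
    V = [0.0] * 7
    for episode in episodes:
        G = 0.0
        # Walk backwards so G accumulates the return that followed each state.
        for state, reward, _ in reversed(episode):
            G = reward + gamma * G
            V[state] += alpha * (G - V[state])
    return V


def td0_prediction(episodes, alpha=0.01, gamma=1.0):
    """TD(0) prediction.

    Each update uses a single transition, so it could be applied online,
    immediately after every step, without waiting for the episode to end.
    """
    V = [0.0] * 7
    for episode in episodes:
        for state, reward, next_state in episode:
            target = reward + gamma * V[next_state]  # bootstrapped target
            V[state] += alpha * (target - V[state])
    return V


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [generate_episode(rng) for _ in range(5000)]
    print("MC :", [round(v, 2) for v in mc_prediction(episodes)])
    print("TD :", [round(v, 2) for v in td0_prediction(episodes)])
```

For this walk the true value of non-terminal state `i` is `i / 6`, so both estimates should land near those values. The sketch replays recorded episodes for both methods only to keep the comparison fair; the TD(0) update itself never looks past the next state.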
