Commented code for solving CartPole-v1 with Policy Gradient and a simple Neural Network policy
- Fuma
- Apr 6, 2021
A commented, Colab-compatible implementation of a neural network policy that solves OpenAI Gym's CartPole-v1 using Policy Gradient.
Colab Notebook: https://gist.github.com/FumaNet/d4b88e3ee8e5b3b456afbb23666ed023
To understand the maths behind the Vanilla Policy Gradient (a.k.a. REINFORCE) algorithm: https://fumanet.wixsite.com/website/post/policy-optimization-algorithm-the-maths-behind-reinforce-and-its-gradient-calculation
To get more info about the environment: https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py
Original code available at https://github.com/pytorch/examples/blob/master/reinforcement_learning/reinforce.py
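
As a rough illustration of the core REINFORCE computation, here is a minimal sketch that turns one episode's rewards into discounted returns and combines them with per-step log-probabilities into a loss. It uses only the Python standard library. The function names, the default discount factor of 0.99, and the normalization step are assumptions made for illustration and are not copied from the notebook.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute the return-to-go G_t = r_t + gamma * G_{t+1} for each step.

    `gamma` is an assumed discount factor, chosen for illustration.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each step can reuse the next step's return.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit standard deviation.

    This is a common variance-reduction trick; `eps` avoids division by zero.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return [(v - mean) / (std + eps) for v in values]


def reinforce_loss(log_probs, returns):
    """Return -sum_t log pi(a_t | s_t) * G_t for one episode.

    In an autograd framework, differentiating this value with respect to the
    policy parameters (treating the returns as constants) gives the REINFORCE
    gradient estimate.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    # CartPole gives a reward of 1 per step, so a 3-step episode looks like this.
    rewards = [1.0, 1.0, 1.0]
    returns = normalize(discounted_returns(rewards))
    # Hypothetical log-probabilities of the actions that were sampled.
    log_probs = [math.log(0.6), math.log(0.5), math.log(0.7)]
    print(reinforce_loss(log_probs, returns))
```

In a full implementation the log-probabilities would come from the policy network's output for the sampled actions, and an optimizer would minimize this loss after each episode.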