Commented code for solving CartPole-v1 with Policy Gradient and a simple Neural Network policy

Fuma
Apr 6, 2021
1 min read

A commented, Colab-compatible implementation of a Neural Network Policy that solves OpenAI Gym's CartPole-v1 using Policy Gradient.

Colab Notebook: https://gist.github.com/FumaNet/d4b88e3ee8e5b3b456afbb23666ed023

To understand the maths behind Vanilla Policy Gradient (also known as the REINFORCE algorithm): https://fumanet.wixsite.com/website/post/policy-optimization-algorithm-the-maths-behind-reinforce-and-its-gradient-calculation

To get more info about the environment: https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py

Original code available at https://github.com/pytorch/examples/blob/master/reinforcement_learning/reinforce.py
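
For readers who want a feel for the core of REINFORCE before opening the notebook, here is a minimal sketch of the return computation and the surrogate loss in plain Python. It is not the notebook's code: the function names (discounted_returns, normalize, reinforce_loss), the default gamma of 0.99, and the use of the population standard deviation are assumptions made for illustration, and a real implementation would use PyTorch tensors so the loss can be backpropagated.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    The name and the default gamma are illustrative assumptions.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    # eps avoids a division by zero when all values are identical.
    return [(v - mean) / (std + eps) for v in values]


def reinforce_loss(log_probs, returns):
    """Surrogate loss: minus the sum over t of log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    # CartPole gives a reward of 1 for every step the pole stays up.
    episode_rewards = [1.0, 1.0, 1.0]
    returns = discounted_returns(episode_rewards, gamma=0.5)
    print(returns)
    print(normalize(returns))
```

Early actions receive larger returns than late ones because they are credited with every reward that follows them. Normalizing the returns is a common variance-reduction trick and does not change which actions are favoured within an episode.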
