A PyTorch replication of the model-based reinforcement learning algorithm MBPO

Last update: Jan 05, 2023

Overview

This is a PyTorch re-implementation of the model-based RL algorithm MBPO, as described in the paper When to Trust Your Model: Model-Based Policy Optimization.

This code builds on the code from an earlier NeurIPS reproducibility challenge paper, which reproduced the results with a TensorFlow ensemble model but reported a significant drop in performance with a PyTorch ensemble model. This code re-implements the ensemble dynamics model in PyTorch and closes the gap.

Reproduced results

The comparison was done on two tasks only; other tasks were not tested. On those two tasks, the PyTorch implementation achieves performance similar to the official TensorFlow code.

Dependencies

MuJoCo 1.5 & MuJoCo 2.0

Usage

python main_mbpo.py --env_name 'Walker2d-v2' --num_epoch 300 --model_type 'pytorch'

python main_mbpo.py --env_name 'Hopper-v2' --num_epoch 300 --model_type 'pytorch'
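To run both commands in sequence, the invocations above can be generated from a small script. The following is only a sketch: it assumes main_mbpo.py is in the current working directory and that `python` on the PATH is the interpreter with the dependencies installed. The helper `build_command` and the `dry_run` switch are hypothetical names introduced for illustration; only the flags shown in the commands above are used.

```python
import shlex
import subprocess

# The two environments used in the usage commands above.
ENV_NAMES = ["Walker2d-v2", "Hopper-v2"]


def build_command(env_name, num_epoch=300, model_type="pytorch"):
    """Return the argv list for one training run (hypothetical helper)."""
    return [
        "python", "main_mbpo.py",
        "--env_name", env_name,
        "--num_epoch", str(num_epoch),
        "--model_type", model_type,
    ]


def main(dry_run=True):
    for env_name in ENV_NAMES:
        cmd = build_command(env_name)
        # Print the shell-quoted command so it can be inspected or copied.
        print(shlex.join(cmd))
        if not dry_run:
            # Assumption: main_mbpo.py is in the current directory.
            # check=True stops the loop if a run exits with an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(dry_run=True)
```

With `dry_run=True` the script only prints the commands; set it to `False` to launch the runs one after another.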

Reference

Official TensorFlow implementation: https://github.com/JannerM/mbpo
Code for the reproducibility challenge paper: https://github.com/jxu43/replication-mbpo

Owner

Xingyu Lin
