Bag of Tricks for Natural Policy Gradient Reinforcement Learning [ArXiv]

Setup

Python 3.8.0
pip install -r req.txt
MuJoCo 200 license

Main Files

main.py: main run file for model training
models.py: neural networks for policy and critic models
optim.py: second-order approximations for realizing the natural gradient
utils.py: helper functions
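
For readers unfamiliar with what a "second-order approximation" of the natural gradient looks like, here is a minimal sketch of one of the simplest variants, a diagonal approximation of the empirical Fisher matrix. It is an illustration only and is not taken from optim.py: the function name, its arguments, and the damping term are all assumptions made for this example.

```python
from typing import List, Sequence


def diagonal_natural_gradient(
    policy_grad: Sequence[float],
    score_grads: Sequence[Sequence[float]],
    damping: float = 1e-3,
) -> List[float]:
    """Approximate F^{-1} g using only the diagonal of the empirical Fisher.

    policy_grad: the ordinary policy gradient g (hypothetical input).
    score_grads: one gradient of log pi(a|s) per sampled state-action pair.
    damping: added to the diagonal so near-zero entries do not blow up.
    """
    if not score_grads:
        raise ValueError("need at least one score gradient")
    n = len(score_grads)
    dim = len(policy_grad)
    # Diagonal of the empirical Fisher: F_jj = (1/n) * sum_i score_i[j]^2
    fisher_diag = [sum(s[j] ** 2 for s in score_grads) / n for j in range(dim)]
    # Under the diagonal approximation, F^{-1} g is an elementwise division.
    return [policy_grad[j] / (fisher_diag[j] + damping) for j in range(dim)]


if __name__ == "__main__":
    g = [10.0, 4.0]
    scores = [[1.0, 2.0], [3.0, 0.0]]
    print(diagonal_natural_gradient(g, scores, damping=0.0))
```

The diagonal form avoids building or inverting the full Fisher matrix, at the cost of ignoring correlations between parameters.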

Reproducing Experiments

scripts/: bash training scripts formatted for Compute Canada/SLURM jobs
visualize/json: training hyperparameters for each experiment
visualize/csv: training results in .csv format
visualize/performance.py: view results and create the .csv results (run after training)
- best run with VS Code IPython cells

Experiment Example

To run the baseline experiments:

Tune hparams: bash scripts/hparams/baseline.sh
- runs will be saved in runs/hparams_baseline/...
Extract best hparams from runs: python baseline_hparams.py
- the best hparams will be saved in visualize/json/baseline.json
Run training with hparams: bash scripts/baseline/diagonal.sh
- runs will be saved in runs/5e6_baseline/...
Run speed tests: bash scripts/speed/baseline.sh
- runs will be saved in runs/baseline_speed/...
View results: run the following cell interactively (IPython) in visualize/performance.py

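# %%
import pathlib

# Directories written by the training and speed-test scripts above
runs_path = pathlib.Path("../runs/5e6_baseline/")
speed_runs_path = pathlib.Path("../runs/baseline_speed/")
name = "baseline"
# analyze() and mean_df() are assumed to be defined earlier in visualize/performance.py
baseline_data = analyze(runs_path, speed_runs_path)
baseline_df = mean_df(*baseline_data, name, save=True)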

Second-order Approximation References

Implementations

Other

Code formatted with Black
Experiment runs format: runs/{experiment_name}/{env_name}/{approximation}_runs/{tensorboard folder}/...
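
When post-processing results outside of visualize/performance.py, it can help to split a run path back into the fields of the layout above. The following is a small sketch under that assumption; `RunInfo` and `parse_run_path` are hypothetical names that do not exist in this repository, and the environment name and tensorboard folder in the usage example are made up for illustration.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class RunInfo(NamedTuple):
    experiment_name: str
    env_name: str
    approximation: str
    tensorboard_folder: str


def parse_run_path(path: str) -> RunInfo:
    """Split runs/{experiment_name}/{env_name}/{approximation}_runs/{tensorboard folder}/..."""
    parts = PurePosixPath(path).parts
    idx = parts.index("runs")  # raises ValueError if "runs" is not in the path
    fields = parts[idx + 1 : idx + 5]
    if len(fields) < 4:
        raise ValueError(f"path is too short for the runs layout: {path}")
    experiment, env, approx_dir, tb_folder = fields
    suffix = "_runs"
    if not approx_dir.endswith(suffix):
        raise ValueError(f"expected '{{approximation}}_runs', got: {approx_dir}")
    # Strip the "_runs" suffix to recover the approximation name.
    return RunInfo(experiment, env, approx_dir[: -len(suffix)], tb_folder)


if __name__ == "__main__":
    # Hypothetical path; only the directory layout comes from the README.
    print(parse_run_path("../runs/5e6_baseline/HalfCheetah-v2/diagonal_runs/seed_0/events"))
```

Returning a named tuple keeps the fields self-describing when grouping runs by environment or approximation.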

Owner

Brennan Gebotys
