Deeprl - Standard DQN and dueling network for simple games

Last update: Apr 12, 2020

Overview

DeepRL

This code implements standard deep Q-learning (DQN) and the dueling network architecture, both with experience replay (a memory buffer), for playing simple games.

The DQN algorithm implemented here is from Google DeepMind's paper Playing Atari with Deep Reinforcement Learning [link].

The dueling network is from the paper Dueling Network Architectures for Deep Reinforcement Learning [link].
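
As an illustration of the dueling idea, the network's two streams (a scalar state value and one advantage per action) are combined into Q-values by subtracting the mean advantage, as in the dueling paper. The snippet below is a minimal sketch in plain Python, not the Lua/Torch code of this repository; the function name `dueling_q_values` is hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + (A(s, a) - mean_a A(s, a))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Example: one state value and three per-action advantages.
q_values = dueling_q_values(1.0, [0.5, -0.5, 0.0])
print(q_values)
```

Subtracting the mean keeps the value and advantage streams identifiable: adding a constant to every advantage leaves the resulting Q-values unchanged.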

Requirement

DeepRL is implemented with Torch and packages from its ecosystem. The code runs well on my Mac Pro with a CPU (I haven't tested it on Linux or on a GPU). Install Torch7 first, then install the following packages with luarocks:

luarocks install nn
luarocks install image
luarocks install qt
luarocks install optim

Running

Run the code with the following command from the project directory:

qlua main.lua

The results look like this:

DQN: I got an accuracy of 93.2% (932 successes out of 1000 epochs).

Dueling: I got an accuracy of 99.2% (992 successes out of 1000 epochs).

Code

envir.lua defines the environment for the reinforcement learning stage: it receives an action and produces the next state and a reward for the agent.

agent.lua implements the agent, which receives the state and reward and produces an action chosen by the policy network.
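
The environment/agent split described above can be sketched as a simple interaction loop. This is an illustrative Python sketch under stated assumptions, not the repository's Lua code: `CorridorEnv`, `EpsilonGreedyAgent`, and `run_episode` are hypothetical names, the corridor game is a made-up stand-in for the actual game, and epsilon-greedy exploration is an assumed (though common) choice for DQN agents.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, any other action moves left; position is clamped.
        move = 1 if action == 1 else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


class EpsilonGreedyAgent:
    """Picks a random action with probability epsilon, else the highest-valued one."""

    def __init__(self, q_function, n_actions, epsilon=0.1, seed=0):
        self.q_function = q_function  # maps a state to a list of Q-values
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        q_values = self.q_function(state)
        return max(range(self.n_actions), key=lambda a: q_values[a])


def run_episode(env, agent, max_steps=50):
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


# A fixed stand-in for the policy network that always prefers "move right".
agent = EpsilonGreedyAgent(lambda state: [0.0, 1.0], n_actions=2, epsilon=0.0)
print(run_episode(CorridorEnv(), agent))
```

In the real project the lambda would be replaced by a forward pass through the trained network.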

learner.lua implements the DQN learning algorithm with experience replay.
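
To make the experience-replay idea concrete, here is a minimal Python sketch of a fixed-size memory buffer and the one-step Q-learning target used in DQN. It is not the repository's learner.lua: `ReplayBuffer`, `td_target`, the capacity, and the discount factor `gamma` are all assumptions chosen for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Sample without replacement; never ask for more than is stored.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.9):
    """One-step target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


memory = ReplayBuffer(capacity=2)
memory.push(0, 1, 0.0, 1, False)
memory.push(1, 1, 0.0, 2, False)
memory.push(2, 1, 1.0, 3, True)  # evicts the oldest transition
print(len(memory), td_target(1.0, [0.0, 2.0], done=False, gamma=0.5))
```

Training on random minibatches drawn from the buffer, rather than on consecutive steps, breaks the correlation between successive samples; the network is then regressed toward `td_target` for the action actually taken.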

MISC

I completed this code while I was an intern at Horizon Robotics. Many thanks to Andrej Karpathy's article and to other implementations: SeanNaren's code and EderSantana's gist.

LICENSE

MIT

Owner

Yao Zhou
