Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations" which was presented in NeurIPS-21.

Installation

There is requirements.txt in this repository. Except for the common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide Dockerfile in this repository, which can be used for reproducing our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}

Grid World Experiment

The source code necessary for our grid-world experiment is contained in /grid_world folder. To run the simulation, for example, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

For the monte carlo simulation while comparing our proposed method with baselines, use the shell file, run.sh.

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code necessary for our safety-gym experiment is contained in /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method utilize dynamic programming algorithms to solve Bellman Equation, so we modified engine.py to discrtize the environment. We attach modified safety-gym source code in /safety_gym_discrete/engine.py. To use the modified library, please clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py in our repo. Using the following commands to install the modified library:

cd safety_gym
pip install -e .

Note that MuJoCo licence is needed for installing Safety-Gym. To run the simulation, use the folowing commands.

cd safety_gym_discrete
python main.py sim_idx=0

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run the baseline, use the folowing commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environment that agent runs on is generated using generate_env.py. We provide 10 50*50 environments. If you want to generate other environments, you can change the world shape in safety_gym_discrete.py, and running the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}
Owner
Akifumi Wachi
Akifumi Wachi
Credit Card And SK Checker Written In Python

💳 Credit Card Checker (CC Checker) & Mass SK Checker & Generator 💳

Rimuru Tempest 53 Dec 31, 2022
This is a Crypto asset tracker that I built to aid my personal journey in cryptocurrencies.

Wallet Tracker This is a Crypto asset tracker that I built to aid my personal journey in cryptocurrencies. build docker build -t wallet-tracker . run

2 Mar 21, 2022
The probability of having the password you want in the PassMaker is +90%!!

PasswordMaker Strong listing password Introduction The probability of having the password you want in the tool is +90%!! How to Install Open the termi

MasterBurnt 4 Sep 05, 2021
A toolkit for web reconnaissance, it's fast and easy to use.

A toolkit for web reconnaissance, it's fast and easy to use. File Structure httpsuite/ main.py init.py db/ db.py init.py subdomains_db directories_db

whoami security 22 Jul 22, 2022
A python package with tools to read and postprocess the output of the channel DNS-solver (davecats/channel), as well as its associated postprocessing tools.

Python tools for davecats/channel A python package with tools to read and postprocess the output of the channel dns solver, as well as its associated

Andrea Andreolli 1 Dec 13, 2021
macOS persistence tool

PoisonApple Command-line tool to perform various persistence mechanism techniques on macOS. This tool was designed to be used by threat hunters for cy

Cyborg Security, Inc 212 Dec 29, 2022
Generate obfuscated meterpreter shells

Generator Evade AV with obfuscated payloads Installation must install dotnet prior to running the script with net45 Running ./generator.py -ip Your-I

Fawaz Al-Mutairi 219 Nov 28, 2022
Python-based proof-of-concept tool for generating payloads that utilize unsafe Java object deserialization.

Python-based proof-of-concept tool for generating payloads that utilize unsafe Java object deserialization.

Astro 9 Sep 27, 2022
A Safer PoC for CVE-2022-22965 (Spring4Shell)

Safer_PoC_CVE-2022-22965 A Safer PoC for CVE-2022-22965 (Spring4Shell) Functionality Creates a file called CVE_2022-22965_exploited.txt in the tomcat

Colin Cowie 46 Nov 12, 2022
A Radare2 based Python module for Binary Analysis and Reverse Engineering.

Zepu1chr3 A Radare2 based Python module for Binary Analysis and Reverse Engineering. Installation You can simply run this command. pip3 install zepu1c

Mehmet Ali KERİMOĞLU 5 Aug 25, 2022
CTF framework and exploit development library

pwntools - CTF toolkit Pwntools is a CTF framework and exploit development library. Written in Python, it is designed for rapid prototyping and develo

Gallopsled 9.8k Dec 31, 2022
Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use

Description Basic Recon tool for beginners. Especially those who faces issue on how to recon or what all tools to use. Will try to add atleast 10 more tools currently use 7 sources to gather domains.

Harinder Singh 7 Jan 03, 2022
Dome - Subdomain Enumeration Tool. Fast and reliable python script that makes active and/or passive scan to obtain subdomains and search for open ports.

DOME - A subdomain enumeration tool Check the Spanish Version Dome is a fast and reliable python script that makes active and/or passive scan to obtai

Vadi 329 Jan 01, 2023
Python implementation of the diceware password generating algorithm.

Diceware Password Generator - Generate High Entropy Passwords Please Note - This Program Do Not Store Passwords In Any Form And All The Passwords Are

Sameera Madushan 35 Dec 25, 2022
client attack remotely , this script was written for educational purposes only

client attack remotely , this script was written for educational purposes only, do not use against to any victim, which you do not have permission for it

9 Jun 05, 2022
Cobalt Strike Beacon configuration extractor and parser.

Cobalt Strike Configuration Extractor and Parser Overview Pure Python library and set of scripts to extract and parse configurations (configs) from Co

Stroz Friedberg 102 Dec 18, 2022
ClusterFuzz is a scalable fuzzing infrastructure that finds security and stability issues in software.

ClusterFuzz ClusterFuzz is a scalable fuzzing infrastructure that finds security and stability issues in software. Google uses ClusterFuzz to fuzz all

Google 4.9k Jan 08, 2023
Yuyu Scanner is a Web Reconnaissance & Web Analysis Scanner to find assets and information about targets.

Yuyu Scanner Yuyu Scanner is a Web Reconnaissance & Web Analysis Scanner to find assets and information about targets. installation ! run as root

Justakazh 20 Nov 24, 2022
一款辅助探测Orderby注入漏洞的BurpSuite插件,Python3编写,适用于上xray等扫描器被ban的场景

OrderbyHunter 一款辅助探测Orderby注入漏洞的BurpSuite插件,Python3编写,适用于上xray等扫描器被ban的场景 1. 支持Get/Post型请求参数的探测,被动探测,对于存在Orderby注入的请求将会在HTTP Histroy里标红 2. 自定义排序参数list

Automne 21 Aug 12, 2022
Directory Traversal in Afterlogic webmail aurora and pro

CVE-2021-26294 Exploit Directory Traversal in Afterlogic webmail aurora and pro . Description: AfterLogic Aurora and WebMail Pro products with 7.7.9 a

Ashish Kunwar 8 Nov 09, 2022