An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym

Overview

gym-idsgame An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym

gym-idsgame is a reinforcement learning environment for simulating attack and defense operations in an abstract network intrusion game. The environment extends the abstract model described in (Elderman et al. 2017). The model constitutes a two-player Markov game between an attacker agent and a defender agent that face each other in a simulated computer network. The reinforcement learning environment exposes an interface to a partially observed Markov decision process (POMDP) model of the Markov game. The interface can be used to train, simulate, and evaluate attack- and defend policies against each other. Moreover, the repository contains code to reproduce baseline results for various reinforcement learning algorithms, including:

  • Tabular Q-learning
  • Neural-fitted Q-learning using the DQN algorithm.
  • REINFORCE with baseline
  • Actor-Critic REINFORCE
  • PPO

Please use this bibtex if you make use of this code in your publications (paper: https://arxiv.org/abs/2009.08120):

@INPROCEEDINGS{Hamm2011:Finding,
AUTHOR="Kim Hammar and Rolf Stadler",
TITLE="Finding Effective Security Strategies through Reinforcement Learning and
{Self-Play}",
BOOKTITLE="International Conference on Network and Service Management (CNSM 2020)
(CNSM 2020)",
ADDRESS="Izmir, Turkey",
DAYS=1,
MONTH=nov,
YEAR=2020,
KEYWORDS="Network Security; Reinforcement Learning; Markov Security Games",
ABSTRACT="We present a method to automatically find security strategies for the use
case of intrusion prevention. Following this method, we model the
interaction between an attacker and a defender as a Markov game and let
attack and defense strategies evolve through reinforcement learning and
self-play without human intervention. Using a simple infrastructure
configuration, we demonstrate that effective security strategies can emerge
from self-play. This shows that self-play, which has been applied in other
domains with great success, can be effective in the context of network
security. Inspection of the converged policies show that the emerged
policies reflect common-sense knowledge and are similar to strategies of
humans. Moreover, we address known challenges of reinforcement learning in
this domain and present an approach that uses function approximation, an
opponent pool, and an autoregressive policy representation. Through
evaluations we show that our method is superior to two baseline methods but
that policy convergence in self-play remains a challenge."
}

Publications

Table of Contents

Design

Included Environments

A rich set of configurations of the Markov game are registered as openAI gym environments. The environments are specified and implemented in gym_idsgame/envs/idsgame_env.py see also gym_idsgame/__init__.py.

minimal_defense

This is an environment where the agent is supposed to play the attacker in the Markov game and the defender is following the defend_minimal baseline defense policy. The defend_minimal policy entails that the defender will always defend the attribute with the minimal value out of all of its neighbors.

Registered configurations:

  • idsgame-minimal_defense-v0
  • idsgame-minimal_defense-v1
  • idsgame-minimal_defense-v2
  • idsgame-minimal_defense-v3
  • idsgame-minimal_defense-v4
  • idsgame-minimal_defense-v5
  • idsgame-minimal_defense-v6
  • idsgame-minimal_defense-v7
  • idsgame-minimal_defense-v8
  • idsgame-minimal_defense-v9
  • idsgame-minimal_defense-v10
  • idsgame-minimal_defense-v11
  • idsgame-minimal_defense-v12
  • idsgame-minimal_defense-v13
  • idsgame-minimal_defense-v14
  • idsgame-minimal_defense-v15
  • idsgame-minimal_defense-v16
  • idsgame-minimal_defense-v17
  • idsgame-minimal_defense-v18
  • idsgame-minimal_defense-v19
  • idsgame-minimal_defense-v20

maximal_attack

This is an environment where the agent is supposed to play the defender and the attacker is following the attack_maximal baseline attack policy. The attack_maximal policy entails that the attacker will always attack the attribute with the maximum value out of all of its neighbors.

Registered configurations:

  • idsgame-maximal_attack-v0
  • idsgame-maximal_attack-v1
  • idsgame-maximal_attack-v2
  • idsgame-maximal_attack-v3
  • idsgame-maximal_attack-v4
  • idsgame-maximal_attack-v5
  • idsgame-maximal_attack-v6
  • idsgame-maximal_attack-v7
  • idsgame-maximal_attack-v8
  • idsgame-maximal_attack-v9
  • idsgame-maximal_attack-v10
  • idsgame-maximal_attack-v11
  • idsgame-maximal_attack-v12
  • idsgame-maximal_attack-v13
  • idsgame-maximal_attack-v14
  • idsgame-maximal_attack-v15
  • idsgame-maximal_attack-v16
  • idsgame-maximal_attack-v17
  • idsgame-maximal_attack-v18
  • idsgame-maximal_attack-v19
  • idsgame-maximal_attack-v20

random_attack

This is an environment where the agent is supposed to play as the defender and the attacker is following a random baseline attack policy.

Registered configurations:

  • idsgame-random_attack-v0
  • idsgame-random_attack-v1
  • idsgame-random_attack-v2
  • idsgame-random_attack-v3
  • idsgame-random_attack-v4
  • idsgame-random_attack-v5
  • idsgame-random_attack-v6
  • idsgame-random_attack-v7
  • idsgame-random_attack-v8
  • idsgame-random_attack-v9
  • idsgame-random_attack-v10
  • idsgame-random_attack-v11
  • idsgame-random_attack-v12
  • idsgame-random_attack-v13
  • idsgame-random_attack-v14
  • idsgame-random_attack-v15
  • idsgame-random_attack-v16
  • idsgame-random_attack-v17
  • idsgame-random_attack-v18
  • idsgame-random_attack-v19
  • idsgame-random_attack-v20

random_defense

An environment where the agent is supposed to play as the attacker and the defender is following a random baseline defense policy.

Registered configurations:

  • idsgame-random_defense-v0
  • idsgame-random_defense-v1
  • idsgame-random_defense-v2
  • idsgame-random_defense-v3
  • idsgame-random_defense-v4
  • idsgame-random_defense-v5
  • idsgame-random_defense-v6
  • idsgame-random_defense-v7
  • idsgame-random_defense-v8
  • idsgame-random_defense-v9
  • idsgame-random_defense-v10
  • idsgame-random_defense-v11
  • idsgame-random_defense-v12
  • idsgame-random_defense-v13
  • idsgame-random_defense-v14
  • idsgame-random_defense-v15
  • idsgame-random_defense-v16
  • idsgame-random_defense-v17
  • idsgame-random_defense-v18
  • idsgame-random_defense-v19
  • idsgame-random_defense-v20

two_agents

This is an environment where neither the attacker nor defender is part of the environment, i.e. it is intended for 2-agent simulations or RL training. In the experiments folder you can see examples of using this environment for training PPO-attacker vs PPO-defender, DQN-attacker vs REINFORCE-defender, etc..

Registered configurations:

  • idsgame-v0
  • idsgame-v1
  • idsgame-v2
  • idsgame-v3
  • idsgame-v4
  • idsgame-v5
  • idsgame-v6
  • idsgame-v7
  • idsgame-v8
  • idsgame-v9
  • idsgame-v10
  • idsgame-v11
  • idsgame-v12
  • idsgame-v13
  • idsgame-v14
  • idsgame-v15
  • idsgame-v16
  • idsgame-v17
  • idsgame-v18
  • idsgame-v19
  • idsgame-v20

Requirements

  • Python 3.5+
  • OpenAI Gym
  • NumPy
  • Pyglet (OpenGL 3D graphics)
  • GPU for 3D graphics acceleration (optional)
  • jsonpickle (for configuration files)
  • torch (for baseline algorithms)

Installation & Tests

# install from pip
pip install gym-idsgame==1.0.12
# local install from source
$ pip install -e gym-idsgame
# force upgrade deps
$ pip install -e gym-idsgame --upgrade

# git clone and install from source
git clone https://github.com/Limmen/gym-idsgame
cd gym-idsgame
pip3 install -e .

# run unit tests
pytest

# run it tests
cd experiments
make tests

Usage

The environment can be accessed like any other OpenAI environment with gym.make. Once the environment has been created, the API functions step(), reset(), render(), and close() can be used to train any RL algorithm of your preference.

import gym
from gym_idsgame.envs import IdsGameEnv
env_name = "idsgame-maximal_attack-v3"
env = gym.make(env_name)

The environment ships with implementation of several baseline algorithms, e.g. the tabular Q(0) algorithm, see the example code below.

import gym
from gym_idsgame.agents.training_agents.q_learning.q_agent_config import QAgentConfig
from gym_idsgame.agents.training_agents.q_learning.tabular_q_learning.tabular_q_agent import TabularQAgent
random_seed = 0
util.create_artefact_dirs(default_output_dir(), random_seed)
q_agent_config = QAgentConfig(gamma=0.999, alpha=0.0005, epsilon=1, render=False, eval_sleep=0.9,
                              min_epsilon=0.01, eval_episodes=100, train_log_frequency=100,
                              epsilon_decay=0.9999, video=True, eval_log_frequency=1,
                              video_fps=5, video_dir=default_output_dir() + "/results/videos/" + str(random_seed), num_episodes=20001,
                              eval_render=False, gifs=True, gif_dir=default_output_dir() + "/results/gifs/" + str(random_seed),
                              eval_frequency=1000, attacker=True, defender=False, video_frequency=101,
                              save_dir=default_output_dir() + "/results/data/" + str(random_seed))
env_name = "idsgame-minimal_defense-v2"
env = gym.make(env_name, save_dir=default_output_dir() + "/results/data/" + str(random_seed))
attacker_agent = TabularQAgent(env, q_agent_config)
attacker_agent.train()
train_result = attacker_agent.train_result
eval_result = attacker_agent.eval_result

Manual Play

You can run the environment in a mode of "manual control" as well:

from gym_idsgame.agents.manual_agents.manual_defense_agent import ManualDefenseAgent
random_seed = 0
env_name = "idsgame-random_attack-v2"
env = gym.make(env_name)
ManualDefenseAgent(env.idsgame_config)

Baseline Experiments

The experiments folder contains results, hyperparameters and code to reproduce reported results using this environment. For more information about each individual experiment, see this README.

Clean All Experiment Results

cd experiments # cd into experiments folder
make clean

Run All Experiment Results (Takes a long time)

cd experiments # cd into experiments folder
make all

Run All Experiments For a specific environment (Takes a long time)

cd experiments # cd into experiments folder
make v0

Run a specific experiment

cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make run

Clean a specific experiment

cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make clean

Start tensorboard for a specifc experiment

cd experiments/training/v0/random_defense/tabular_q_learning/ # cd into the experiment folder
make tensorboard

Fetch Baseline Experiment Results

By default when cloning the repo the experiment results are not included, to fetch the experiment results, install and setup git-lfs then run:

git lfs fetch --all
git lfs pull

Author & Maintainer

Kim Hammar [email protected]

Copyright and license

LICENSE

MIT

(C) 2020, Kim Hammar

Owner
Kim Hammar
PhD @KTH, ML, Distributed systems, security & stuff. Previously @logicalclocks, Allstate, Ericsson.
Kim Hammar
Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification

Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification Usage The required packages are lis

0 Feb 07, 2022
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh

Ayan Kumar Bhunia 22 Aug 27, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
From this paper "SESNet: A Semantically Enhanced Siamese Network for Remote Sensing Change Detection"

SESNet for remote sensing image change detection It is the implementation of the paper: "SESNet: A Semantically Enhanced Siamese Network for Remote Se

1 May 24, 2022
Train SN-GAN with AdaBelief

SNGAN-AdaBelief Train a state-of-the-art spectral normalization GAN with AdaBelief https://github.com/juntang-zhuang/Adabelief-Optimizer Acknowledgeme

Juntang Zhuang 10 Jun 11, 2022
Predict the latency time of the deep learning models

Deep Neural Network Prediction Step 1. Genernate random parameters and Run them sequentially : $ python3 collect_data.py -gp -ep -pp -pl pooling -num

QAQ 1 Nov 12, 2021
Elevation Mapping on GPU.

Elevation Mapping cupy Overview This is a ros package of elevation mapping on GPU. Code are written in python and uses cupy for GPU calculation. * pla

Robotic Systems Lab - Legged Robotics at ETH Zürich 183 Dec 19, 2022
✔️ Visual, reactive testing library for Julia. Time machine included.

PlutoTest.jl (alpha release) Visual, reactive testing library for Julia A macro @test that you can use to verify your code's correctness. But instead

Pluto 68 Dec 20, 2022
PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021.

IBRNet: Learning Multi-View Image-Based Rendering PyTorch implementation of paper "IBRNet: Learning Multi-View Image-Based Rendering", CVPR 2021. IBRN

Google Interns 371 Jan 03, 2023
Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

Wenjing Wang 77 Dec 08, 2022
A non-linear, non-parametric Machine Learning method capable of modeling complex datasets

Fast Symbolic Regression Symbolic Regression is a non-linear, non-parametric Machine Learning method capable of modeling complex data sets. fastsr aim

VAMSHI CHOWDARY 3 Jun 22, 2022
基于pytorch构建cyclegan示例

cyclegan-demo 基于Pytorch构建CycleGAN示例 如何运行 准备数据集 将数据集整理成4个文件,分别命名为 trainA, trainB:训练集,A、B代表两类图片 testA, testB:测试集,A、B代表两类图片 例如 D:\CODE\CYCLEGAN-DEMO\DATA

Koorye 3 Oct 18, 2022
S2s2net - Sentinel-2 Super-Resolution Segmentation Network

S2S2Net Sentinel-2 Super-Resolution Segmentation Network Getting started Install

Wei Ji 10 Nov 10, 2022
Efficiently Disentangle Causal Representations

Efficiently Disentangle Causal Representations Install dependency pip install -r requirements.txt Main experiments Causality direction prediction cd

4 Apr 01, 2022
This repository contains the exercises and its solution contained in the book "An Introduction to Statistical Learning" in python.

An-Introduction-to-Statistical-Learning This repository contains the exercises and its solution contained in the book An Introduction to Statistical L

2.1k Jan 02, 2023
CLIP + VQGAN / PixelDraw

clipit Yet Another VQGAN-CLIP Codebase This started as a fork of @nerdyrodent's VQGAN-CLIP code which was based on the notebooks of @RiversWithWings a

dribnet 276 Dec 12, 2022
EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

This repository contains data and code for our EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation. Please contact me at

9 Oct 28, 2022
A MatConvNet-based implementation of the Fully-Convolutional Networks for image segmentation

MatConvNet implementation of the FCN models for semantic segmentation This package contains an implementation of the FCN models (training and evaluati

VLFeat.org 175 Feb 18, 2022
A high-performance distributed deep learning system targeting large-scale and automated distributed training.

HETU Documentation | Examples Hetu is a high-performance distributed deep learning system targeting trillions of parameters DL model training, develop

DAIR Lab 150 Dec 21, 2022