An experimentation and research platform to investigate the interaction of automated agents in an abstract simulated network environments.

Overview

CyberBattleSim

April 8th, 2021: See the announcement on the Microsoft Security Blog.

CyberBattleSim is an experimentation research platform to investigate the interaction of automated agents operating in a simulated abstract enterprise network environment. The simulation provides a high-level abstraction of computer networks and cyber security concepts. Its Python-based Open AI Gym interface allows for the training of automated agents using reinforcement learning algorithms.

The simulation environment is parameterized by a fixed network topology and a set of vulnerabilities that agents can utilize to move laterally in the network. The goal of the attacker is to take ownership of a portion of the network by exploiting vulnerabilities that are planted in the computer nodes. While the attacker attempts to spread throughout the network, a defender agent watches the network activity and tries to detect any attack taking place and mitigate the impact on the system by evicting the attacker. We provide a basic stochastic defender that detects and mitigates ongoing attacks based on pre-defined probabilities of success. We implement mitigation by re-imaging the infected nodes, a process abstractly modeled as an operation spanning over multiple simulation steps.

To compare the performance of the agents we look at two metrics: the number of simulation steps taken to attain their goal and the cumulative rewards over simulation steps across training epochs.

Project goals

We view this project as an experimentation platform to conduct research on the interaction of automated agents in abstract simulated network environments. By open-sourcing it, we hope to encourage the research community to investigate how cyber-agents interact and evolve in such network environments.

The simulation we provide is admittedly simplistic, but this has advantages. Its highly abstract nature prohibits direct application to real-world systems thus providing a safeguard against potential nefarious use of automated agents trained with it. At the same time, its simplicity allows us to focus on specific security aspects we aim to study and quickly experiment with recent machine learning and AI algorithms.

For instance, the current implementation focuses on the lateral movement cyber-attacks techniques, with the hope of understanding how network topology and configuration affects them. With this goal in mind, we felt that modeling actual network traffic was not necessary. This is just one example of a significant limitation in our system that future contributions might want to address.

On the algorithmic side, we provide some basic agents as starting points, but we would be curious to find out how state-of-the-art reinforcement learning algorithms compare to them. We found that the large action space intrinsic to any computer system is a particular challenge for Reinforcement Learning, in contrast to other applications such as video games or robot control. Training agents that can store and retrieve credentials is another challenge faced when applying RL techniques where agents typically do not feature internal memory. These are other areas of research where the simulation could be used for benchmarking purposes.

Other areas of interest include the responsible and ethical use of autonomous cyber-security systems: How to design an enterprise network that gives an intrinsic advantage to defender agents? How to conduct safe research aimed at defending enterprises against autonomous cyber-attacks while preventing nefarious use of such technology?

Documentation

Read the Quick introduction to the project.

Build status

Type Branch Status
CI master .github/workflows/ci.yml
Docker image master .github/workflows/build-container.yml

Benchmark

See Benchmark.

Setting up a dev environment

It is strongly recommended to work under a Linux environment, either directly or via WSL on Windows. Running Python on Windows directly should work but is not supported anymore.

Start by checking out the repository:

git clone https://github.com/microsoft/CyberBattleSim.git

On Linux or WSL

The instructions were tested on a Linux Ubuntu distribution (both native and via WSL). Run the following command to set-up your dev environment and install all the required dependencies (apt and pip packages):

./init.sh

The script installs python3.8 if not present. If you are running a version of Ubuntu older than 20, it will automatically add an additional apt repository to install python3.8.

The script will create a virtual Python environment under a venv subdirectory, you can then run Python with venv/bin/python.

Note: If you prefer Python from a global installation instead of a virtual environment then you can skip the creation of the virtual environment by running the script with ./init.sh -n. This will instead install all the Python packages on a system-wide installation of Python 3.8.

Windows Subsystem for Linux

The supported dev environment on Windows is via WSL. You first need to install an Ubuntu WSL distribution on your Windows machine, and then proceed with the Linux instructions (next section).

Git authentication from WSL

To authenticate with Git, you can either use SSH-based authentication or alternatively use the credential-helper trick to automatically generate a PAT token. The latter can be done by running the following command under WSL (more info here):

git config --global credential.helper "/mnt/c/Program\ Files/Git/mingw64/libexec/git-core/git-credential-manager.exe"

Docker on WSL

To run your environment within a docker container, we recommend running docker via Windows Subsystem on Linux (WSL) using the following instructions: Installing Docker on Windows under WSL).

Windows (unsupported)

This method is not maintained anymore, please prefer instead running under a WSL subsystem Linux environment. But if you insist you want to start by installing Python 3.8 then in a Powershell prompt run the ./init.ps1 script.

Getting started quickly using Docker

The quickest method to get up and running is via the Docker container.

NOTE: For licensing reasons, we do not publicly redistribute any build artifact. In particular, the docker registry spinshot.azurecr.io referred to in the commands below is kept private to the project maintainers only.

As a workaround, you can recreate the docker image yourself using the provided Dockerfile, publish the resulting image to your own docker registry and replace the registry name in the commands below.

Running from Docker registry

commit=7c1f8c80bc53353937e3c69b0f5f799ebb2b03ee
docker login spinshot.azurecr.io
docker pull spinshot.azurecr.io/cyberbattle:$commit
docker run -it spinshot.azurecr.io/cyberbattle:$commit cyberbattle/agents/baseline/run.py

Recreating the Docker image

docker build -t cyberbattle:1.1 .
docker run -it -v "$(pwd)":/source --rm cyberbattle:1.1 cyberbattle/agents/baseline/run.py

Check your environment

Run the following command to run a simulation with a baseline RL agent:

python cyberbattle/agents/baseline/run.py --training_episode_count 1 --eval_episode_count 1 --iteration_count 10 --rewardplot_with 80  --chain_size=20 --ownership_goal 1.0

If everything is setup correctly you should get an output that looks like this:

torch cuda available=True
###### DQL
Learning with: episode_count=1,iteration_count=10,ϵ=0.9,ϵ_min=0.1, ϵ_expdecay=5000,γ=0.015, lr=0.01, replaymemory=10000,
batch=512, target_update=10
  ## Episode: 1/1 'DQL' ϵ=0.9000, γ=0.015, lr=0.01, replaymemory=10000,
batch=512, target_update=10
Episode 1|Iteration 10|reward:  139.0|Elapsed Time: 0:00:00|###################################################################|
###### Random search
Learning with: episode_count=1,iteration_count=10,ϵ=1.0,ϵ_min=0.0,
  ## Episode: 1/1 'Random search' ϵ=1.0000,
Episode 1|Iteration 10|reward:  194.0|Elapsed Time: 0:00:00|###################################################################|
simulation ended
Episode duration -- DQN=Red, Random=Green
   10.00  ┼
Cumulative rewards -- DQN=Red, Random=Green
  194.00  ┼      ╭──╴
  174.60  ┤      │
  155.20  ┤╭─────╯
  135.80  ┤│     ╭──╴
  116.40  ┤│     │
   97.00  ┤│    ╭╯
   77.60  ┤│    │
   58.20  ┤╯ ╭──╯
   38.80  ┤  │
   19.40  ┤  │
    0.00  ┼──╯

Jupyter notebooks

To quickly get familiar with the project, you can open one of the provided Jupyter notebooks to play interactively with the gym environments. Just start jupyter with jupyter notebook, or venv/bin/jupyter notebook if you are using a virtual environment setup.

How to instantiate the Gym environments?

The following code shows how to create an instance of the OpenAI Gym environment CyberBattleChain-v0, an environment based on a chain-like network structure, with 10 nodes (size=10) where the agent's goal is to either gain full ownership of the network (own_atleast_percent=1.0) or break the 80% network availability SLA (maintain_sla=0.80), while the network is being monitored and protected by the basic probalistically-modelled defender (defender_agent=ScanAndReimageCompromisedMachines):

import cyberbattle._env.cyberbattle_env

cyberbattlechain_defender =
  gym.make('CyberBattleChain-v0',
      size=10,
      attacker_goal=AttackerGoal(
          own_atleast=0,
          own_atleast_percent=1.0
      ),
      defender_constraint=DefenderConstraint(
          maintain_sla=0.80
      ),
      defender_agent=ScanAndReimageCompromisedMachines(
          probability=0.6,
          scan_capacity=2,
          scan_frequency=5))

To try other network topologies, take example on chainpattern.py to define your own set of machines and vulnerabilities, then add an entry in the module initializer to declare and register the Gym environment.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Ideas for contributions

Here are some ideas on how to contribute: enhance the simulation (event-based, refined the simulation, …), train an RL algorithm on the existing simulation, implement benchmark to evaluate and compare novelty of agents, add more network generative modes to train RL-agent on, contribute to the doc, fix bugs.

See also the wiki for more ideas.

Citing this project

@misc{msft:cyberbattlesim,
  Author = {Microsoft Defender Research Team.}
  Note = {Created by Christian Seifert, Michael Betser, William Blum, James Bono, Kate Farris, Emily Goren, Justin Grana, Kristian Holsheimer, Brandon Marken, Joshua Neil, Nicole Nichols, Jugal Parikh, Haoran Wei.},
  Publisher = {GitHub},
  Howpublished = {\url{https://github.com/microsoft/cyberbattlesim}},
  Title = {CyberBattleSim},
  Year = {2021}
}

Note on privacy

This project does not include any customer data. The provided models and network topologies are purely fictitious. Users of the provided code provide all the input to the simulation and must have the necessary permissions to use any provided data.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
Send text to girlfriend in the morning

Girlfriend Text Send text to girlfriend (or really anyone with a phone number) in the morning 1. Configure your settings in utils.py. phone_number = "

Paras Adhikary 199 Oct 25, 2022
Open-Domain Question-Answering for COVID-19 and Other Emergent Domains

Open-Domain Question-Answering for COVID-19 and Other Emergent Domains This repository contains the source code for an end-to-end open-domain question

7 Sep 27, 2022
Demonstrates iterative FGSM on Apple's NeuralHash model.

apple-neuralhash-attack Demonstrates iterative FGSM on Apple's NeuralHash model. TL;DR: It is possible to apply noise to CSAM images and make them loo

Lim Swee Kiat 11 Jun 23, 2022
Gym-TORCS is the reinforcement learning (RL) environment in TORCS domain with OpenAI-gym-like interface.

Gym-TORCS Gym-TORCS is the reinforcement learning (RL) environment in TORCS domain with OpenAI-gym-like interface. TORCS is the open-rource realistic

naoto yoshida 400 Dec 27, 2022
Official Pytorch Code for the paper TransWeather

TransWeather Official Code for the paper TransWeather, Arxiv Tech Report 2021 Paper | Website About this repo: This repo hosts the implentation code,

Jeya Maria Jose 81 Dec 30, 2022
The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop.

AICITY2021_Track2_DMT The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop. Introduction

Hao Luo 91 Dec 21, 2022
Real-time 3D multi-person detection made easy with OpenPose and the ZED

OpenPose ZED This sample show how to simply use the ZED with OpenPose, the deep learning framework that detects the skeleton from a single 2D image. T

blanktec 5 Nov 06, 2020
Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

41 Jan 03, 2023
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
A chemical analysis of lipophilicities & molecule drawings including ML

A chemical analysis of lipophilicity & molecule drawings including a bit of ML analysis. This is a simple project that includes two Jupyter files (one

Aurimas A. Nausėdas 7 Nov 22, 2022
Codes and models of NeurIPS2021 paper - DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

DominoSearch This is repository for codes and models of NeurIPS2021 paper - DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense n

11 Sep 10, 2022
Project page for End-to-end Recovery of Human Shape and Pose

End-to-end Recovery of Human Shape and Pose Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik CVPR 2018 Project Page Requirements Pyt

1.4k Dec 29, 2022
Code for the paper: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Non-Parametric Prior Actor-Critic (N-PPAC) This repository contains the code for On Pathologies in KL-Regularized Reinforcement Learning from Expert D

Cong Lu 5 May 13, 2022
StyleGAN of All Trades: Image Manipulation withOnly Pretrained StyleGAN

StyleGAN of All Trades: Image Manipulation withOnly Pretrained StyleGAN This is the PyTorch implementation of StyleGAN of All Trades: Image Manipulati

360 Dec 28, 2022
Fast and robust clustering of point clouds generated with a Velodyne sensor.

Depth Clustering This is a fast and robust algorithm to segment point clouds taken with Velodyne sensor into objects. It works with all available Velo

Photogrammetry & Robotics Bonn 957 Dec 21, 2022
Quadruped-command-tracking-controller - Quadruped command tracking controller (flat terrain)

Quadruped command tracking controller (flat terrain) Prepare Install RAISIM link

Yunho Kim 4 Oct 20, 2022
diablo2 resurrected loot filter

Only For Chinese and Traditional Chinese The filter only for Chinese and Traditional Chinese, i didn't change it for other language.Maybe you could mo

elmagnifico 249 Dec 04, 2022
Meandering In Networks of Entities to Reach Verisimilar Answers

MINERVA Meandering In Networks of Entities to Reach Verisimilar Answers Code and models for the paper Go for a Walk and Arrive at the Answer - Reasoni

Shehzaad Dhuliawala 271 Dec 13, 2022
This repository contains several jupyter notebooks to help users learn to use neon, our deep learning framework

neon_course This repository contains several jupyter notebooks to help users learn to use neon, our deep learning framework. For more information, see

Nervana 92 Jan 03, 2023
Research - dataset and code for 2016 paper Learning a Driving Simulator

the people's comma the paper Learning a Driving Simulator the comma.ai driving dataset 7 and a quarter hours of largely highway driving. Enough to tra

comma.ai 4.1k Jan 02, 2023