Malware Bypass Research using Reinforcement Learning

Overview

MalwareRL

Malware Bypass Research using Reinforcement Learning

Background

This is a malware manipulation environment using OpenAI's gym environments. The core idea is based on paper "Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning" (paper). I am extending the original repo because:

  1. It is no longer maintained
  2. It uses Python2 and an outdated version of LIEF
  3. I wanted to integrate new Malware gym environments and additional manipulations

Over the past three years there have been breakthrough open-source projects published in the security ML space. In particular, Ember (Endgame Malware BEnchmark for Research) (paper) and MalConv: Malware detection by eating a whole exe (paper) have provided security researchers the ability to develop sophisticated, reproducible models that emulate features/techniques found in NGAVs.

MalwareRL Gym Environment

MalwareRL exposes gym environments for both Ember and MalConv to allow researchers to develop Reinforcement Learning agents to bypass Malware Classifiers. Actions include a variety of non-breaking (e.g. binaries will still execute) modifications to the PE header, sections, imports and overlay and are listed below.

Action Space

ACTION_TABLE = {
    'modify_machine_type': 'modify_machine_type',
    'pad_overlay': 'pad_overlay',
    'append_benign_data_overlay': 'append_benign_data_overlay',
    'append_benign_binary_overlay': 'append_benign_binary_overlay',
    'add_bytes_to_section_cave': 'add_bytes_to_section_cave',
    'add_section_strings': 'add_section_strings',
    'add_section_benign_data': 'add_section_benign_data',
    'add_strings_to_overlay': 'add_strings_to_overlay',
    'add_imports': 'add_imports',
    'rename_section': 'rename_section',
    'remove_debug': 'remove_debug',
    'modify_optional_header': 'modify_optional_header',
    'modify_timestamp': 'modify_timestamp',
    'break_optional_header_checksum': 'break_optional_header_checksum',
    'upx_unpack': 'upx_unpack',
    'upx_pack': 'upx_pack'
}

Observation Space

The observation_space of the gym environments are an array representing the feature vector. For ember this is numpy.array == 2381 and malconv numpy.array == 1024**2. The MalConv gym presents an opportunity to try RL techniques to generalize learning across large State Spaces.

Agents

A baseline agent RandomAgent is provided to demonstrate how to interact w/ gym environments and expected output. This agent attempts to evade the classifier by randomly selecting an action. This process is repeated up to the length of a game (e.g. 50 mods). If the modifed binary scores below the classifier threshold we register it as an evasion. In a lot of ways the RandomAgent acts as a fuzzer trying a bunch of actions with no regard to minimizing the modifications of the resulting binary.

Additional agents will be developed and made available (both model and code) in the coming weeks.

Table 1: Evasion Rate against Ember Holdout Dataset*

gym agent evasion_rate avg_ep_len
ember RandomAgent 89.2% 8.2
malconv RandomAgent 88.5% 16.33


* 250 random samples

Setup

To get malware_rl up and running you will need the follow external dependencies:

  • LIEF
  • Ember, Malconv and SOREL-20M models. All of these then need to be placed into the malware_rl/envs/utils/ directory.

    The SOREL-20M model requires use of the aws-cli in order to get. When accessing the AWS S3 bucket, look in the sorel-20m-model/checkpoints/lightGBM folder and fish out any of the models in the seed folders. The model file will need to be renamed to sorel.model and placed into malware_rl/envs/utils alongside the other models.

  • UPX has been added to support pack/unpack modifications. Download the binary here and place in the malware_rl/envs/controls directory.
  • Benign binaries - a small set of "trusted" binaries (e.g. grabbed from base Windows installation) you can download some via MSFT website (example). Store these binaries in malware_rl/envs/controls/trusted
  • Run strings command on those binaries and save the output as .txt files in malware_rl/envs/controls/good_strings
  • Download a set of malware from VirusShare or VirusTotal. I just used a list of hashes from the Ember dataset

Note: The helper script download_deps.py can be used as a quickstart to get most of the key dependencies setup.

I used a conda env set for Python3.7:

conda create -n malware_rl python=3.7

Finally install the Python3 dependencies in the requirements.txt.

pip3 install -r requirements.txt

References

The are a bunch of good papers/blog posts on manipulating binaries to evade ML classifiers. I compiled a few that inspired portions of this project below. Also, I have inevitably left out other pertinent reseach, so if there is something that should be in here let me know in an Git Issue or hit me up on Twitter (@filar).

Papers

  • Demetrio, Luca, et al. "Efficient Black-box Optimization of Adversarial Windows Malware with Constrained Manipulations." arXiv preprint arXiv:2003.13526 (2020). (paper)
  • Demetrio, Luca, et al. "Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection." arXiv preprint arXiv:2008.07125 (2020). (paper)
  • Song, Wei, et al. "Automatic Generation of Adversarial Examples for Interpreting Malware Classifiers." arXiv preprint arXiv:2003.03100 (2020). (paper)
  • Suciu, Octavian, Scott E. Coull, and Jeffrey Johns. "Exploring adversarial examples in malware detection." 2019 IEEE Security and Privacy Workshops (SPW). IEEE, 2019. (paper)
  • Fleshman, William, et al. "Static malware detection & subterfuge: Quantifying the robustness of machine learning and current anti-virus." 2018 13th International Conference on Malicious and Unwanted Software (MALWARE). IEEE, 2018. (paper)
  • Pierazzi, Fabio, et al. "Intriguing properties of adversarial ML attacks in the problem space." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020. (paper/code)
  • Fang, Zhiyang, et al. "Evading anti-malware engines with deep reinforcement learning." IEEE Access 7 (2019): 48867-48879. (paper)

Blog Posts

Talks

  • 42: The answer to life the universe and everything offensive security by Will Pearce, Nick Landers (slides)
  • Bot vs. Bot: Evading Machine Learning Malware Detection by Hyrum Anderson (slides)
  • Trying to Make Meterpreter into an Adversarial Example by Andy Applebaum (slides)
Owner
Bobby Filar
Security Data Science @ Elastic
Bobby Filar
A new framework, collaborative cascade prediction based on graph neural networks (CCasGNN) to jointly utilize the structural characteristics, sequence features, and user profiles.

CCasGNN A new framework, collaborative cascade prediction based on graph neural networks (CCasGNN) to jointly utilize the structural characteristics,

5 Apr 29, 2022
Remote sensing change detection tool based on PaddlePaddle

PdRSCD PdRSCD(PaddlePaddle Remote Sensing Change Detection)是一个基于飞桨PaddlePaddle的遥感变化检测的项目,pypi包名为ppcd。目前0.2版本,最新支持图像列表输入的训练和预测,如多期影像、多源影像甚至多期多源影像。可以快速完

38 Aug 31, 2022
Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022
Source code of article "Towards Toxic and Narcotic Medication Detection with Rotated Object Detector"

Towards Toxic and Narcotic Medication Detection with Rotated Object Detector Introduction This is the source code of article: Towards Toxic and Narcot

Woody. Wang 3 Oct 29, 2022
A Python parser that takes the content of a text file and then reads it into variables.

Text-File-Parser A Python parser that takes the content of a text file and then reads into variables. Input.text File 1. What is your ***? 1. 18 -

Kelvin 0 Jul 26, 2021
Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

causal-bald | Abstract | Installation | Example | Citation | Reproducing Results DUE An implementation of the methods presented in Causal-BALD: Deep B

OATML 13 Oct 07, 2022
[3DV 2021] A Dataset-Dispersion Perspective on Reconstruction Versus Recognition in Single-View 3D Reconstruction Networks

dispersion-score Official implementation of 3DV 2021 Paper A Dataset-dispersion Perspective on Reconstruction versus Recognition in Single-view 3D Rec

Yefan 7 May 28, 2022
Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations

Seeing All the Angles: Learning Multiview Manipulation Policies for Contact-Rich Tasks from Demonstrations Trevor Ablett, Daniel (Yifan) Zhai, Jonatha

STARS Laboratory 3 Feb 01, 2022
Cryptocurrency Prediction with Artificial Intelligence (Deep Learning via LSTM Neural Networks)

Cryptocurrency Prediction with Artificial Intelligence (Deep Learning via LSTM Neural Networks)- Emirhan BULUT

Emirhan BULUT 102 Nov 18, 2022
Command-line tool for downloading and extending the RedCaps dataset.

RedCaps Downloader This repository provides the official command-line tool for downloading and extending the RedCaps dataset. Users can seamlessly dow

RedCaps dataset 33 Dec 14, 2022
PyTorch wrapper for Taichi data-oriented class

Stannum PyTorch wrapper for Taichi data-oriented class PRs are welcomed, please see TODOs. Usage from stannum import Tin import torch data_oriented =

86 Dec 23, 2022
Polynomial-time Meta-Interpretive Learning

Louise - polynomial-time Program Learning Getting help with Louise Louise's author can be reached by email at Stassa Patsantzis 64 Dec 26, 2022

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Master Docs License Apache MXNet (incubating) is a deep learning framework designed for both efficiency an

ROCm Software Platform 29 Nov 16, 2022
use tensorflow 2.0 to tell a dog and cat from a specified picture

dog_or_cat use tensorflow 2.0 to tell a dog and cat from a specified picture This is one of the classic experiments for the introduction of deep learn

你这个代码我看不懂 1 Oct 22, 2021
Anomaly detection related books, papers, videos, and toolboxes

Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify

Yue Zhao 6.7k Dec 31, 2022
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Felix Wimbauer 494 Jan 06, 2023
[ICCV-2021] An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation (ICCV 2021) Introduction This is an official pytorch implemen

rongchangxie 42 Jan 04, 2023
A simple approach to emable dense segmentation with ViT.

Vision Transformer Segmentation Network This implementation of ViT in pytorch uses a super simple and straight-forward way of generating an output of

HReynaud 5 Jan 03, 2023
Saliency - Framework-agnostic implementation for state-of-the-art saliency methods (XRAI, BlurIG, SmoothGrad, and more).

Saliency Methods 🔴 Now framework-agnostic! (Example core notebook) 🔴 🔗 For further explanation of the methods and more examples of the resulting ma

PAIR code 849 Dec 27, 2022
【Arxiv】Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

SANet Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution Dependencies numpy==1.18.5 scikit_image==0.16.2 torchvision==0.8.1 to

36 Jan 05, 2023