Various capabilities for static malware analysis.

Overview

Malchive

The malchive serves as a compendium for a variety of capabilities mainly pertaining to malware analysis, such as scripts supporting day to day binary analysis and decoder modules for various components of malicious code.

The goals behind the 'malchive' are to:

  • Allow teams to centralize efforts made in this realm and enforce communication and continuity
  • Have a shared corpus of tools for people to build on
  • Enforce clean coding practices
  • Allow others to interface with project members to develop their own capabilities
  • Promote a positive feedback loop between Threat Intel and Reverse Engineering staff
  • Make static file analysis more accessible
  • Serve as a vehicle to communicate the unique opportunity space identified via deep dive analysis

Documentation

At its core, malchive is a bunch of standalone scripts organized in a manner that the authors hope promotes the project's goals.

To view the documentation associated with this project, checkout the wiki page!

Scripts within the malchive are split up into the following core categories:

  • Utilities - These scripts may be run standalone to assist with static binary analysis or as modules supporting a broader program. Utilities always have a standalone component.
  • Helpers - These modules primarily serve to assist components in one or more of the other categories. They generally do not have a stand-alone component and instead serve the intents of those that do.
  • Binary Decoders - The purpose of scripts in this category is to retrieve, decrypt, and return embedded data (typically inside malware).
  • Active Discovery - Standalone scripts designed to emulate a small portion of a malware family's protocol for the purposes of discovering active controllers.

Installation

The malchive is a packaged distribution that is easily installed and will automatically create console stand-alone scripts.

Steps

You will need to install some dependencies for some of the required Python modules to function correctly.

  • First do a source install of YARA and make sure you compile using --dotnet
  • Next source install the YARA Python package.
  • Ensure you have sqlite3-dev installed
    • Debian: libsqlite3-dev
    • Red Hat: sqlite-devel / pip install pysqlite3

You can then clone the malchive repo and install...

  • pip install . when in the parent directory.
  • To remove, just pip uninstall malchive

Scripts

Console scripts stemming from utilities are appended with the prefix malutil, decoders are appended with maldec, and active discovery scripts are appended with maldisc. This allows for easily identifiable malchive scripts via tab autocompletion.

; running superstrings from cmd line
malutil-superstrings 1.exe -ss
0x9535 (stack) lstrlenA
0x9592 (stack) GetFileSize
0x95dd (stack) WriteFile
0x963e (stack) CreateFileA
0x96b0 (stack) SetFilePointer
0x9707 (stack) GetSystemDirectoryA

; running a decoder from cmd line
maldec-pivy test.exe_
{
    "MD5": "2973ee05b13a575c06d23891ab83e067",
    "Config": {
        "PersistActiveSetupName": "StubPath",
        "DefaultBrowserKey": "SOFTWARE\\Classes\\http\\shell\\open\\command",
        "PersistActiveSetupKeyPart": "Software\\Microsoft\\Active Setup\\Installed Components\\",
        "ServerId": "TEST - WIN_XP",
        "Callbacks": [
            {
                "callback": "192.168.1.104",
                "protocol": "Direct",
                "port": 3333
            },
            {
                "callback": "192.168.1.111",
                "protocol": "Direct",
                "port": 4444
            }
        ],
        "ProxyCfgPresent": false,
        "Password": "test$321$",
        "Mutex": ")#V0qA.I4",
        "CopyAsADS": true,
        "Melt": true,
        "InjectPersist": true,
        "Inject": true
    }
}

; cmd line use with other common utilities
echo -ne 'eJw9kLFuwzAMRIEC7ZylrVGgRSFZiUbBZmwqsMUP0VfcnuQn+rMde7KLTBIPj0ce34tHyMUJjrnw
p3apz1kicjoJrDRlQihwOXmpL4RmSR5qhEU9MqvgWo8XqGMLJd+sKNQPK0dIGjK+e5WANIT6NeOs
k2mI5NmYAmcrkbn4oLPK5gZX+hVlRoKloMV20uQknv2EPunHKQtcig1cpHY4Jodie5pRViV+rp1t
629J6Dyu4hwLR97LINqY5rYILm1hhlvinoyJZavOKTrwBHTwpZ9yPSzidUiPt8PUTkZ0FBfayWLp
a71e8U8YDrbtu0aWDj+/eBOu+jRkYabX+3hPu9LZ5fb41T+7fmRf' | base64 -d | zlib-flate -uncompress | malutil-xor - [KEY]

Interfacing

Utilities, decoders, and discovery scripts in this collection are designed to support single ad-hoc analysis as well as inclusion into other frameworks. After installation, the malchive should be part of your Python path. At this point accessing any of the scripts is straight forward.

Here are a few examples:

; accessing decoder modules
import sys
from malchive.decoders import testdecoder

p = testdecoder.GetConfig(open(sys.argv[1], 'rb').read())
print('password', p.rc4_key)
for c in p.callbacks:
    print('c2 address', c)

; accessing utilities
from malchive.utilities import xor
ret = xor.GenericXor(buff=b'testing', key=[0x51], count=0xff)
print(ret.run_crypt())

; accessing helpers
from malchive.helpers import winfunc
key = winfunc.CryptDeriveKey(b'testdatatestdata')

To understand more about a given module, see the associated wiki entry.

Contributing

Contributing to the malchive is easy, just ensure the following requirements are met:

  • When writing utilities, decoders, or discovery scripts, consider using the available templates or review existing code if you're not sure how to get started.
  • Make sure modification or contributions pass pre-commit tests.
  • Ensure the contribution is placed in one of the component folders.
  • Updated the setup file if needed with an entry.
  • Python3 is a must.

Legal

©2021 The MITRE Corporation. ALL RIGHTS RESERVED.

Approved for Public Release; Distribution Unlimited. Public Release Case Number 21-0153

Owner
MITRE Cybersecurity
MITRE Cybersecurity
Code for "Generating Disentangled Arguments with Prompts: a Simple Event Extraction Framework that Works"

GDAP The code of paper "Code for "Generating Disentangled Arguments with Prompts: a Simple Event Extraction Framework that Works"" Event Datasets Prep

45 Oct 29, 2022
CYGNUS, the Cynical AI, combines snarky responses with uncanny aggression.

New & (hopefully) Improved CYGNUS with several API updates, user updates, and online/offline operations added!!!

Simran Farrukh 0 Mar 28, 2022
Implementation of Multistream Transformers in Pytorch

Multistream Transformers Implementation of Multistream Transformers in Pytorch. This repository deviates slightly from the paper, where instead of usi

Phil Wang 47 Jul 26, 2022
Python module (C extension and plain python) implementing Aho-Corasick algorithm

pyahocorasick pyahocorasick is a fast and memory efficient library for exact or approximate multi-pattern string search meaning that you can find mult

Wojciech Muła 763 Dec 27, 2022
An assignment from my grad-level data mining course demonstrating some experience with NLP/neural networks/Pytorch

NLP-Pytorch-Assignment An assignment from my grad-level data mining course (before I started personal projects) demonstrating some experience with NLP

David Thorne 0 Feb 06, 2022
:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want

deepset 6.4k Jan 09, 2023
Materials (slides, code, assignments) for the NYU class I teach on NLP and ML Systems (Master of Engineering).

FREE_7773 Repo containing material for the NYU class (Master of Engineering) I teach on NLP, ML Sys etc. For context on what the class is trying to ac

Jacopo Tagliabue 90 Dec 19, 2022
A Facebook Messenger Chatbot using NLP

A Facebook Messenger Chatbot using NLP This project is about creating a messenger chatbot using basic NLP techniques and models like Logistic Regressi

6 Nov 20, 2022
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

STEMM: Self-learning with Speech-Text Manifold Mixup for Speech Translation This is a PyTorch implementation for the ACL 2022 main conference paper ST

ICTNLP 29 Oct 16, 2022
NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles

NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles NewsMTSC is a dataset for target-dependent sentiment classification (TSC)

Felix Hamborg 79 Dec 30, 2022
PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation

SITT The repo contains official PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation. Authors: Boyi Li Yin Cui T

Boyi Li 52 Jan 05, 2023
Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks

Prompt-learning is the latest paradigm to adapt pre-trained language models (PLMs) to downstream NLP tasks, which modifies the input text with a textual template and directly uses PLMs to conduct pre

THUNLP 2.3k Jan 08, 2023
NLP, before and after spaCy

textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig

Chartbeat Labs Projects 2k Jan 04, 2023
Stuff related to Ben Eater's 8bit breadboard computer

8bit breadboard computer simulator This is an assembler + simulator/emulator of Ben Eater's 8bit breadboard computer. For a version with its RAM upgra

Marijn van Vliet 29 Dec 29, 2022
text to speech toolkit. 好用的中文语音合成工具箱,包含语音编码器、语音合成器、声码器和可视化模块。

ttskit Text To Speech Toolkit: 语音合成工具箱。 安装 pip install -U ttskit 注意 可能需另外安装的依赖包:torch,版本要求torch=1.6.0,=1.7.1,根据自己的实际环境安装合适cuda或cpu版本的torch。 ttskit的

KDD 483 Jan 04, 2023
Just a basic Telegram AI chat bot written in Python using Pyrogram.

Nikko ChatBot Just a basic Telegram AI chat bot written in Python using Pyrogram. Requirements Python 3.7 or higher. A bot token. Installation $ https

ʀᴇxɪɴᴀᴢᴏʀ 2 Oct 21, 2022
Turkish Stop Words Türkçe Dolgu Sözcükleri

trstop Turkish Stop Words Türkçe Dolgu Sözcükleri In this repository I put Turkish stop words that is contained in the first 10 thousand words with th

Ahmet Aksoy 103 Nov 12, 2022
Model parallel transformers in JAX and Haiku

Table of contents Mesh Transformer JAX Updates Pretrained Models GPT-J-6B Links Acknowledgments License Model Details Zero-Shot Evaluations Architectu

Ben Wang 4.9k Jan 04, 2023
Yes it's true :broken_heart:

Information WARNING: No longer hosted If you would like to be on this repo's readme simply fork or star it! Forks 1 - Flowzii 2 - Errorcrafter 3 - vk-

Dropout 66 Dec 31, 2022
List of GSoC organisations with number of times they have been selected.

Welcome to GSoC Organisation Frequency And Details 👋 List of GSoC organisations with number of times they have been selected, techonologies, topics,

Shivam Kumar Jha 41 Oct 01, 2022