Searches through git repositories for high entropy strings and secrets, digging deep into commit history

Overview

truffleHog

codecov

Searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.

Join The Slack

Have questions? Feedback? Jump in slack and hang out with me

https://join.slack.com/t/trufflehog-community/shared_invite/zt-pw2qbi43-Aa86hkiimstfdKH9UCpPzQ

NEW

truffleHog previously functioned by running entropy checks on git diffs. This functionality still exists, but high signal regex checks have been added, and the ability to suppress entropy checking has also been added.

truffleHog --regex --entropy=False https://github.com/dxa4481/truffleHog.git

or

truffleHog file:///user/dxa4481/codeprojects/truffleHog/

With the --include_paths and --exclude_paths options, it is also possible to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths. To illustrate, see the example include and exclude files below:

include-patterns.txt:

src/
# lines beginning with "#" are treated as comments and are ignored
gradle/
# regexes must match the entire path, but can use python's regex syntax for
# case-insensitive matching and other advanced options
(?i).*\.(properties|conf|ini|txt|y(a)?ml)$
(.*/)?id_[rd]sa$

exclude-patterns.txt:

(.*/)?\.classpath$
.*\.jmx$
(.*/)?test/(.*/)?resources/

These filter files could then be applied by:

trufflehog --include_paths include-patterns.txt --exclude_paths exclude-patterns.txt file://path/to/my/repo.git

With these filters, issues found in files in the root-level src directory would be reported, unless they had the .classpath or .jmx extension, or if they were found in the src/test/dev/resources/ directory, for example. Additional usage information is provided when calling trufflehog with the -h or --help options.

These features help cut down on noise, and makes the tool easier to shove into a devops pipeline.

Example

Install

pip install truffleHog

Customizing

Custom regexes can be added with the following flag --rules /path/to/rules. This should be a json file of the following format:

{
    "RSA private key": "-----BEGIN EC PRIVATE KEY-----"
}

Things like subdomain enumeration, s3 bucket detection, and other useful regexes highly custom to the situation can be added.

Feel free to also contribute high signal regexes upstream that you think will benefit the community. Things like Azure keys, Twilio keys, Google Compute keys, are welcome, provided a high signal regex can be constructed.

trufflehog's base rule set sources from https://github.com/dxa4481/truffleHogRegexes/blob/master/truffleHogRegexes/regexes.json

To explicitly allow particular secrets (e.g. self-signed keys used only for local testing) you can provide an allow list --allow /path/to/allow in the following format:

{
    "local self signed test key": "-----BEGIN EC PRIVATE KEY-----\nfoobar123\n-----END EC PRIVATE KEY-----",
    "git cherry pick SHAs": "regex:Cherry picked from .*",
}

Note that values beginning with regex: will be used as regular expressions. Values without this will be literal, with some automatic conversions (e.g. flexible newlines).

How it works

This module will go through the entire commit history of each branch, and check each diff from each commit, and check for secrets. This is both by regex and by entropy. For entropy checks, truffleHog will evaluate the shannon entropy for both the base64 char set and hexidecimal char set for every blob of text greater than 20 characters comprised of those character sets in each diff. If at any point a high entropy string >20 characters is detected, it will print to the screen.

Help

usage: trufflehog [-h] [--json] [--regex] [--rules RULES] [--allow ALLOW]
                  [--entropy DO_ENTROPY] [--since_commit SINCE_COMMIT]
                  [--max_depth MAX_DEPTH]
                  git_url

Find secrets hidden in the depths of git.

positional arguments:
  git_url               URL for secret searching

optional arguments:
  -h, --help            show this help message and exit
  --json                Output in JSON
  --regex               Enable high signal regex checks
  --rules RULES         Ignore default regexes and source from json list file
  --allow ALLOW         Explicitly allow regexes from json list file
  --entropy DO_ENTROPY  Enable entropy checks
  --since_commit SINCE_COMMIT
                        Only scan from a given commit hash
  --branch BRANCH       Scans only the selected branch
  --max_depth MAX_DEPTH
                        The max commit depth to go back when searching for
                        secrets
  -i INCLUDE_PATHS_FILE, --include_paths INCLUDE_PATHS_FILE
                        File with regular expressions (one per line), at least
                        one of which must match a Git object path in order for
                        it to be scanned; lines starting with "#" are treated
                        as comments and are ignored. If empty or not provided
                        (default), all Git object paths are included unless
                        otherwise excluded via the --exclude_paths option.
  -x EXCLUDE_PATHS_FILE, --exclude_paths EXCLUDE_PATHS_FILE
                        File with regular expressions (one per line), none of
                        which may match a Git object path in order for it to
                        be scanned; lines starting with "#" are treated as
                        comments and are ignored. If empty or not provided
                        (default), no Git object paths are excluded unless
                        effectively excluded via the --include_paths option.

Running with Docker

First, enter the directory containing the git repository

cd /path/to/git

To launch the trufflehog with the docker image, run the following"

docker run --rm -v "$(pwd):/proj" dxa4481/trufflehog file:///proj

-v mounts the current working dir (pwd) to the /proj dir in the Docker container

file:///proj references that very same /proj dir in the container (which is also set as the default working dir in the Dockerfile)

Wishlist

  • A way to detect and not scan binary diffs
  • Don't rescan diffs if already looked at in another branch
  • A since commit X feature
  • Print the file affected
Comments
  • fix #8 - add `--include` and `--exclude` options

    fix #8 - add `--include` and `--exclude` options

    Fixes issue #8 by adding --include_paths and --exclude_paths options that allow the user to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths.

    If provided, the --include_paths option should point to a file with regular expressions (one per line), at least one of which must match a Git object path in order for it to be scanned. If empty or not provided (default), all Git object paths are included (unless otherwise excluded via the --exclude_paths option).

    Likewise, the --exclude_paths option, when provided, should point to a file with regular expressions, none of which may match a Git object path in order for it to be scanned. If empty or not provided (default), no Git object paths are excluded (unless effectively excluded via the --include_paths option).

    In either file, lines starting with "#" are treated as comments and are ignored.

    opened by milo-minderbinder 22
  • fix --since_commit parameter

    fix --since_commit parameter

    Hi, how can I contribute to this project? I was running truffleHog and using the --since_commit parameter, however it was buggy and did not work as expected. I made a very small change, and it worked as expected. Do you accept PRs or should I just tell you the change so you can verify it?

    opened by fahrishb 18
  • The regex functionality is not working as expected

    The regex functionality is not working as expected

    I git cloned the truffleHog repository. Changed my regexChecks.py file to look like below:

    import re
    
    regexes = {
        "Slack Token XOXP": re.compile('xoxp.*'),
        "Slack Token XOXB": re.compile('xoxb.*'),
        "Slack Token XOXO": re.compile('xoxo.*'),
        "Slack Token XOXA": re.compile('xoxa.*'),
        "AWS API Key": re.compile('AKIA.*'),
        "Private key": re.compile('-----BEGIN PRIVATE KEY-----.*')
    }
    

    I then installed the libraries required to run the tool by typing pip install -r requirements.txt. My requirements.txt file looked like below:

    GitPython==2.1.5
    gitdb2==2.0.2
    smmap2==2.0.2
    

    Finally, I ran the tool by typing - python truffleHog.py --regex --entropy=False https://github.com/secretuser1/secretrepo.git

    It printed out the Private Key, Slack Token XOXP and Slack Token XOXB. It should have also printed out the AWS key here - https://github.com/secretuser1/secretrepo/blob/master/secretfile.txt#L2 but it did not, even though the regex is present.

    Any idea why?

    opened by anshumanbh 14
  • Adding the capability for scanning a directory

    Adding the capability for scanning a directory

    This PR adds the capability for truffleHog to recursively scan a directory instead of a Git repository with all its history. This can be useful in CI pipelines or other situations where it is desirable to scan the codebase at a single point in time. Additionally, it can also be used to scan code that is not stored in Git.

    I've done some minor refactoring to the existing scanning code to reduce code duplication.

    opened by runako 14
  • ValueError: unknown reasons (During run application on EC2 RedHat)

    ValueError: unknown reasons (During run application on EC2 RedHat)

    If anyone can help I will be appreciate! Describe the bug Having an error during running the app on EC2 RedHat : ValueError: unknown reasons

    I installed on redhat ec2 instance trufflehog. By default there python 2.7 and 3.6 Trufflehog was installed from pip3. pip3 freeze shows that everything installed : gitdb==4.0.5 gitdb2==4.0.2 GitPython==3.0.6 smmap==3.0.5 truffleHog==2.2.1 truffleHogRegexes==0.0.7

    After installation and running next command ( just to check does it work or not) trufflehog --regex --entropy=False https://github.com/dxa4481/truffleHog.git ( I got next error) : Traceback (most recent call last): File "/usr/local/bin/trufflehog", line 11, in sys.exit(main()) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 93, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions, allow=allow) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 351, in find_strings diff_hash = hashlib.md5((str(prev_commit) + str(curr_commit)).encode('utf-8')).digest() ValueError: unknown reasons

    opened by RepositoryOfCode 12
  •  Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.6/bin/trufflehog", line 10, in sys.exit(main()) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 82, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 309, in find_strings project_path = clone_git_repo(git_url) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 152, in clone_git_repo Repo.clone_from(git_url, project_path) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 925, in clone_from return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 880, in _clone finalize_process(proc, stderr=stderr) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/util.py", line 341, in finalize_process proc.wait(**kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/cmd.py", line 291, in wait raise GitCommandError(self.args, status, errstr) git.exc.GitCommandError: Cmd('git') failed due to: exit code(128) cmdline: git clone -v file:///GitHub/Github-guardian/ /var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln stderr: 'Cloning into '/var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln'... fatal: '/GitHub/Github-guardian/' does not appear to be a git repository fatal: Could not read from remote repository.

    Please make sure you have the correct access rights and the repository exists. '

    opened by dgurazada 12
  • Depth limits are needed to prevent long jobs

    Depth limits are needed to prevent long jobs

    When leveraging trufflehog for repo scans, it would be helpful to introduce the concept of depth limits, to ensure that when a scan is performed, it only goes to a certain number of commits back. On a test repository that I have, there is a huge number of commits dating back to 2014, and the job is running well more than 24 hours to go deep across all of them.

    opened by dend 11
  • WindowsError: [Error 5] Access is denied

    WindowsError: [Error 5] Access is denied

    Traceback (most recent call last): File "trufflehog.py", line 106, in <module> find_strings(args.git_url) File "trufflehog.py", line 98, in find_strings shutil.rmtree(project_path) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 252, in rmtree onerror(os.remove, fullname, sys.exc_info()) File "C:\Python27\lib\shutil.py", line 250, in rmtree os.remove(fullname) WindowsError: [Error 5] Access is denied: 'temp\\[uuid]\\.git\\objects\\pack\\pack-[uuid].idx'

    When scanning some repos. (This one crashes half way through, This one crashes at startup)

    opened by Peter-Maguire 11
  • i cant see the result

    i cant see the result

    1. See error

    {"level":"debug","msg":"Cloning remote Git repo without authentication","time":"2022-04-05T16:19:28Z"} {"level":"debug","msg":"Git repo local path: /tmp/trufflehog944564607","time":"2022-04-05T16:23:19Z"}

    2022/04/05 16:44:16 [updater parent] prog exited with 1

    I can see the result if found or note even if I use --json I cant see the saved file its always clone the repo in tmp folder after finish scanning it should delete the cloned folder in the tmp

    bug 
    opened by abramas 10
  • Hardcoded thresholds of 20 in get_strings_of_set()

    Hardcoded thresholds of 20 in get_strings_of_set()

    threshold keyword variable is declared and used on the last if statement Line 39, but not in the first else statement Line 35

    def get_strings_of_set(word, char_set, threshold=20):
        count = 0
        letters = ""
        strings = []
        for char in word:
            if char in char_set:
                letters += char
                count += 1
            else:
                if count > 20:
                    strings.append(letters)
                letters = ""
                count = 0
        if count > threshold:
            strings.append(letters)
    
    opened by bandrel 10
  • gitdb update breaks trufflehog

    gitdb update breaks trufflehog

    Probably related to #198

    We install inside a docker container using:

    $ pip install truffleHog==2.0.99
    

    We run:

    $ trufflehog --regex --entropy=False .
    

    Starting today this errored with:

    Traceback (most recent call last):
       File "/usr/local/bin/trufflehog", line 5, in <module>
         from truffleHog.truffleHog import main
       File "/usr/local/lib/python3.8/site-packages/truffleHog/truffleHog.py", line 17, in <module>
         from git import Repo
       File "/usr/local/lib/python3.8/site-packages/git/__init__.py", line 38, in <module>
         from git.config import GitConfigParser  # @NoMove @IgnorePep8
       File "/usr/local/lib/python3.8/site-packages/git/config.py", line 16, in <module>
         from git.compat import (
       File "/usr/local/lib/python3.8/site-packages/git/compat.py", line 16, in <module>
         from gitdb.utils.compat import (
     ModuleNotFoundError: No module named 'gitdb.utils.compat'
    

    A quick dive down the dependency tree showed that the trufflehog dependency on gitpython-2.1.1 (here) is pulling in gitdb2-3.0.2 (here) which has removed the gitdb.utils.compat (PR)

    Our fix for now is to use (may be useful to others):

    pip install gitdb2==3.0.0 truffleHog==2.0.99
    
    opened by danieldooley 9
  • Use access-token endpoint for validity check

    Use access-token endpoint for validity check

    This PR fixes the issue https://github.com/trufflesecurity/trufflehog/issues/990, it should correctly report keys as valid even if they are missing the user_read scope.

    opened by clonsdale-canva 1
  • Buildkite token validation missing tokens without user_read scope

    Buildkite token validation missing tokens without user_read scope

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    3.21.0

    Expected Behavior

    Buildkite token is reported as valid

    Actual Behavior

    Buildkite token is not validated as the API call fails due to missing user_read scope

    Additional Context

    The logic to check if a buildkite token is valid will send out an API call to the /user endpoint https://github.com/trufflesecurity/trufflehog/blob/009756dce61948a66cf90a8b14018460c91ab4f0/pkg/detectors/buildkite/buildkite.go#L51. This will miss all tokens which do not have the read_user scope.

    Instead, we can use the access-token endpoint, which will return 200 for any valid token, and report on the scopes present / ID of the token - https://buildkite.com/docs/apis/rest-api/access-token.

    bug 
    opened by clonsdale-canva 0
  • Add max-depth limit to GitHub subcommand

    Add max-depth limit to GitHub subcommand

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    Description

    Ability to limit the depth of the commit history being scanned for GitHub users We need the ability to set a --max-depth= limit to GitHub sub command.

    Problem to be Addressed

    It is very noisy for large GitHub enterprises to detect new issues due to the inability to ignore historical commit history that one has already remediated. Results have to be saved into a spreadsheet or database and then diff'd to see what has changed.

    Description of the Preferred Solution

    The ability to set a --max-depth= limit to GitHub sub command. This would be very beneficial when attempting to scan a GitHub enterprise repositories as a group.

    Additional Context

    References

    • #0000
    enhancement 
    opened by dwilliamsstc 0
  • go install - missing dot in first path element

    go install - missing dot in first path element

    build github.com/trufflesecurity/trufflehog/v3: cannot load embed: malformed module path "embed": missing dot in first path element

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    Trace Output

    Expected Behavior

    Actual Behavior

    Steps to Reproduce

    Distributor ID: Elementary Description: elementary OS 6.1 Jólnir Release: 6.1 Codename: jolnir

    Additional Context

    References

    • #0000
    bug 
    opened by rip752 0
  • Run certain Detector Type

    Run certain Detector Type

    trufflehog version: trufflehog dev

    Currently I am running trufflehog as a pre-commit hook with all possible Detector type. Is it possible to only run few Detector types , say AWS keys, Private keys as such?

    bug 
    opened by Priyadhana 0
Releases(v3.21.0)
RedDrop is a quick and easy web server for capturing and processing encoded and encrypted payloads and tar archives.

RedDrop Exfil Server Check out the accompanying MaverisLabs Blog Post Here! RedDrop Exfil Server is a Python Flask Web Server for Penetration Testers,

53 Nov 01, 2022
Brute-Force-Connected

Brute-Force-Connected Guess the password for Connected accounts the use : Create a new file and put usernames and passwords in it Example : joker:1234

4 Jun 05, 2022
Hikvision 流媒体管理服务器敏感信息泄漏

Hikvisioninformation Hikvision 流媒体管理服务器敏感信息泄漏 Options optional arguments: -h, --help show this help message and exit -u url, --url url

Henry4E36 13 Nov 09, 2022
Experimental musig2 python code, not for production use!

musig2-py Experimental musig2 python code, not for production use! This is just for testing things out. All public keys are encoded as 32 bytes, assum

Samuel Dobson 14 Jul 08, 2022
Tool To generate Stable Undetected Payload

windowsPayload Tool To generate Stable Undetected Payload Don t Upload to Virus Total :) Follow on Social Media Platforms ScreenShots How to install +

youhacker55 117 Dec 30, 2022
A (completely native) python3 wifi brute-force attack using the 100k most common passwords (2021)

wifi-bf [LINUX ONLY] A (completely native) python3 wifi brute-force attack using the 100k most common passwords (2021) This script is purely for educa

Finn Lancaster 20 Nov 12, 2022
CVE-2022-22965 : about spring core rce

CVE-2022-22965: Spring-Core-Rce EXP 特性: 漏洞探测(不写入 webshell,简单字符串输出) 自定义写入 webshell 文件名称及路径 不会追加写入到同一文件中,每次检测写入到不同名称 webshell 文件 支持写入 冰蝎 webshell 代理支持,可

东方有鱼名为咸 53 Nov 09, 2022
wsvuls - website vulnerability scanner detect issues [ outdated server software and insecure HTTP headers.]

WSVuls Website vulnerability scanner detect issues [ outdated server software and insecure HTTP headers.] What's WSVuls? WSVuls is a simple and powerf

Anouar Ben Saad 47 Sep 22, 2022
telegram bug that discloses user's hidden phone number (still unpatched) (exploit included)

CVE-2019-15514 Type: Information Disclosure Affected Users, Versions, Devices: All Telegram Users Still not fixed/unpatched. brute.py is available exp

Gray Programmerz 66 Dec 08, 2022
edgedressing leverages a Windows "feature" in order to force a target's Edge browser to open. This browser is then directed to a URL of choice.

edgedressing One day while experimenting with airpwn-ng, I noticed unexpected GET requests on the target node. The node in question happened to be a W

stryngs 43 Dec 23, 2022
Tinyman exploit finder - Tinyman exploit finder for python

tinyman_exploit_finder There was a big tinyman exploit. You can read about it he

fish.exe 9 Dec 27, 2022
This is a Cryptographied Password Manager, a tool for storing Passwords in a Secure way

Cryptographied Password Manager This is a Cryptographied Password Manager, a tool for storing Passwords in a Secure way without using external Service

Francesco 3 Nov 23, 2022
A set of blender assets created for the $yb NFT project.

fyb-blender A set of blender assets created for the $yb NFT project. Install just as you would any other Blender Add-on (via Edit-Preferences-Add-on

Pedro Arroyo 1 May 06, 2022
Transparent proxy server that works as a poor man's VPN. Forwards over ssh. Doesn't require admin. Works with Linux and MacOS. Supports DNS tunneling.

sshuttle: where transparent proxy meets VPN meets ssh As far as I know, sshuttle is the only program that solves the following common case: Your clien

9.4k Jan 04, 2023
Workshop Material on VM-based Deobfuscation

Analysis of Virtualization-based Obfuscation This repository contains slides, samples and code of the 4h code deobfuscation workshop at r2con2021. We

Tim Blazytko 133 Dec 18, 2022
Binary check tool to identify command injection and format string vulnerabilities in blackbox binaries

Binary check tool to identify command injection and format string vulnerabilities in blackbox binaries. Using xrefs to commonly injected and format string'd files, it will scan binaries faster than F

Christopher Roberts 3 Nov 16, 2021
MassStringer, CTF Flag Finder

massStringer MassStringer, CTF Flag Finder Usage: python3 massStringer.py Enter absolute path of the directory to scan for flags Edit "flag = re.searc

SuperTsumu 4 Sep 06, 2022
Fuck - Multi Brute Force 🚶‍♂

f-mbf Fuck - Multi Brute Force 🚶‍♂ Install Script $ pkg update && pkg upgrade $ pkg install python2 $ pkg install git $ pip2 install requests $ pip2

Yumasaa 1 Dec 03, 2021
一款针对向日葵的识别码和验证码提取工具

Sunflower_get_Password 一款针对向日葵的识别码和验证码提取工具 👮🏻‍♀️ 免责声明 由于传播、利用Sunflower_get_Password工具提供的功能而造成的任何直接或者间接的后果及损失,均由使用者本人负责,本人不为此承担任何责任。 安装环境 本工具使用Python

635 Dec 20, 2022