Similar looking domain detection using python fuzzywuzzy

Overview

Similar-looking-domain-detection-using-python-fuzzywuzzy

Major cause of phishing and BEC incident is similar looking domain, if you detect it early, you can prevent incidents early, python fuzzywuzzy module let you do that and here is the process.

By statistics every day thousands of domains are registered, some are use for legit purpose and some are not. BEC incidents incresing every day and cost millions to businesses, the core of BEC is spoofed email that looks similar to your business email. Because of these similar looking domain we fall pray to BEC incidents. Sometimes we end up submitting our credential when we received any email having link that looks like genuine website. e.g. microsoft.com vs micr0soft.com.

How can we detect/prevent such incident?

In python you have module named "fuzzywuzzy" that looks for similarity in strings and gives score of how similar strings are, like 90% match, 66% match. Use that to look for simiar domains

  1. Gather data (SIEM) having domain related information e.g. Proxy logs, DNS logs, Mail logs.
  2. Have list of domains related to your business ( your owned domain, list of vendor domains with whom you carry out business )
  3. Now run fuzz module against this data and check ratio which is more than 50% ( e.g. given below )
  4. Do the analysis of domains (check whois data) which looks similar to your domain, if genuine add to list gathered in Step 2
  5. If domain is not genuine, start digging more on that domain, like any email received(mail logs), any user visited the domain(proxy logs)
  6. You can run such checks on hourly, daily basis.... thats it.

Here are coule of examples.

  1. Basic example

    from fuzzywuzzy import fuzz
    a = "microsoft.com"
    b = "micros0ft.com"
    print("Match ratio is ", str(fuzz.ratio(a, b)), "%") // fuzz.ration(a,b) function gives you match score

  2. Working code

    from fuzzywuzzy import fuzz

    dns_data = open(r'/home/user/Desktop/BEC/your_domain.txt','r') # List of genuine domains owned by you
    output = open(r'/home/user/Desktop/BEC/output.txt','w') # Output file

    for dns in dns_data:

    domain = open(r'/home/user/Desktop/BEC/domain-names-data.txt','r') # domain data gathered from proxy/dns/mail logs
    for site in domain:

    if ( fuzz.ratio(site.rstrip(),dns.rstrip()) > 80 ):

    output.write("Match ratio is: \t" + dns.rstrip() + "\t" + site.rstrip() + "\t" + str(fuzz.ratio(site.rstrip(),dns.rstrip())))
    output.write("\n")

If you have access to whois database then you can run this code against newly registered domain everyday and probably you can get the result early!!!

I have run this code against newly registered doamin on 3rd Nov. Legit domains considered are top 1000 domains. Results are amazing as to how many similar looking domains are registered everyday and no wonder we receive lot of offerers from amzon apples :) Check out Output.txt file

Feel free to share your thoughts!!!

Fuzzy box is a quick program I wrote to fuzz a URL that is in the format https:// url 20characterstring.

What is this? Fuzzy box is a quick program I wrote to fuzz a URL that is in the format https://url/20characterstring.extension. I have redacted th

Graham Helton 1 Oct 19, 2021
python package for generating typescript grpc-web stubs from protobuf files.

grpc-web-proto-compile NOTE: This package has been superseded by romnn/proto-compile, which provides the same functionality but offers a lot more flex

Roman Dahm 0 Sep 05, 2021
jsoooooooon derulo - Make sure your 'jason derulo' is featured as the first part of your json data

jsonderulo Make sure your 'jason derulo' is featured as the first part of your json data Install: # python pip install jsonderulo poetry add jsonderul

jesse 3 Sep 13, 2021
A utility that makes it easy to work with Python projects containing lots of packages, of which you only want to develop some.

Mixed development source packages on top of stable constraints using pip mxdev [mɪks dɛv] is a utility that makes it easy to work with Python projects

BlueDynamics Alliance 6 Jun 08, 2022
A set of Python scripts to surpass human limits in accomplishing simple tasks.

Human benchmark fooler Summary A set of Python scripts with Selenium designed to surpass human limits in accomplishing simple tasks available on https

Bohdan Dudchenko 3 Feb 10, 2022
Go through a random file in your favourite open source projects!

Random Source Codes Never be bored again! Staring at your screen and just scrolling the great world wide web? Would you rather read through some code

Mridul Seth 1 Nov 03, 2022
Just some scripts to export vector tiles to geojson.

Vector tiles to GeoJSON Nowadays modern web maps are usually based on vector tiles. The great thing about vector tiles is, that they are not just imag

Lilith Wittmann 77 Jul 26, 2022
Genart - Generate random art to sell as nfts

Genart - Generate random art to sell as nfts Usage git clone

Will 13 Mar 17, 2022
The Black shade analyser and comparison tool.

diff-shades The Black shade analyser and comparison tool. AKA Richard's personal take at a better black-primer (by stealing ideas from mypy-primer) :p

Richard Si 10 Apr 29, 2022
Python HTTP Agent Parser

Features Fast Detects OS and Browser. Does not aim to be a full featured agent parser Will not turn into django-httpagentparser ;) Usage import ht

Shekhar 213 Dec 06, 2022
Rabbito is a mini tool to find serialized objects in input values

Rabbito-ObjectFinder Rabbito is a mini tool to find serialized objects in input values What does Rabbito do Rabbito has the main object finding Serial

7 Dec 13, 2021
API for obtaining results from the Beery-Bukenica test of the visomotor integration development (VMI) 4th edition.

VMI API API for obtaining results from the Beery-Bukenica test of the visomotor integration development (VMI) 4th edition. Install docker-compose up -

Victor Vargas Sandoval 1 Oct 26, 2021
An extremely simple package with a single utillity class used for gracefully handling POSIX shutdown signals.

graceful-killer An extremely simple package with a single utillity class used for gracefully handling POSIX shutdown signals. Installation Use pip to

Sven Ćurković 1 Dec 09, 2021
Stubmaker is an easy-to-use tool for generating python stubs.

Stubmaker is an easy-to-use tool for generating python stubs. Requirements Stubmaker is to be run under Python 3.7.4+ No side effects during

Toloka 24 Aug 28, 2022
This repository contains some utilities for playing with PKINIT and certificates.

PKINIT tools This repository contains some utilities for playing with PKINIT and certificates. The tools are built on minikerberos and impacket. Accom

Dirk-jan 395 Dec 27, 2022
Import the module and create an object of the class LocalVariable.

LocalVariable Import the module and create an object of the class LocalVariable. Call the save method with the name and the value of a variable as arg

Sajedur Rahman Fiad 2 Dec 14, 2022
iOS Snapchat parser for chats and cached files

ParseSnapchat iOS Snapchat parser for chats and cached files Tested on Windows and Linux install required libraries: pip install -r requirements.txt c

11 Dec 05, 2022
Python code to remove empty folders from Windows/Android.

Empty Folder Cleaner is a program that deletes empty folders from your computer or device and removes clutter to improve performance. It supports only windows and android for now.

Dark Coder Cat | Vansh 4 Sep 27, 2022
Enable ++x and --x expressions in Python

By default, Python supports neither pre-increments (like ++x) nor post-increments (like x++). However, the first ones are syntactically correct since Python parses them as two subsequent +x operation

Alexander Borzunov 85 Dec 29, 2022
Personal Toolbox Package

Jammy (Jam) A personal toolbox by Qsh.zh. Usage setup For core package, run pip install jammy To access functions in bin git clone https://gitlab.com/

5 Sep 16, 2022