Complete portable pipeline for masking of Aadhaar Number adhering to Govt. Privacy Guidelines.

Overview

Aadhaar Number Masking Pipeline

Implementation of a complete pipeline that masks the Aadhaar Number in given images to adhere to Govt. of India's Privacy Guidelines for storage of Aadhaar Card images in digital repository. The following project was carried out as an internship for Muthoot Finance. We make use of the open source packages CRAFT text detector | Paper | Pretrained Model | Github Repo provided by Clova AI Research for OSD and combine a heurestic model with pytesseract OCR for masking.

Rohit Ranjan, Ram Sundaram.

Sample Results

teaser

Versions

The search for the best masking pipeline led us to experiment with several different approaches. We have documented our experiments in other branches.

Branch(->model) Speed/Performance Pipeline
main Best performing CRAFT + pytesseract + dimensional heuristics
CNN_OCR->cnn_model Fastest masking CRAFT + LeNet trained by us
CNN_OCR->cnn_model_2 Fastest masking CRAFT + LeNet trained by us
UNET_OCDR Theoretically Fastest but trained model unavailable** UNet

**We proposed and implemented a pipeline which uses a single UNet model for achieving a desirable mask. A single model would have made the inference very fast and real time use capable on mobile devices. Training meant creating a dataset since the company could not legally provide us the needed data. After several trials, we halted work on this model because with barely 150 unique datapoints available, a data hungry UNet Model is simply unsatiable for now.

Datasets

Lenet was trained on our self-created labelled dataset | labels.

Getting started

Install dependencies

Requirements

  • torch
  • opencv-python
  • tesseract-ocr
  • check requirements.txt
pip install -r requirements.txt

Test instructions

  • Clone this repository
git clone https://github.com/thefurorjuror/Aadhaar_Masker.git
  • Run on an image folder
python [folder path to the cloned repo]/masker.py --test_folder=[folder path to test images] --output_folder=[folder path to output images] --cuda=[True/False]
#Example- When one is inside the cloned repo
python masker.py --test_folder=./images/ --output_folder=./output/ --cuda=True

cuda is set to False by default.

Citation

@inproceedings{baek2019character,
  title={Character Region Awareness for Text Detection},
  author={Baek, Youngmin and Lee, Bado and Han, Dongyoon and Yun, Sangdoo and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9365--9374},
  year={2019}
}

License

Copyright (c) 2021-present Rohit Ranjan & Ram Sundaram.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Public Mirror of Team 15's Code and Reports for RBE 3002 B21

RBE3002 Team 15 Lab Repository Team 15's Repository for all code written for RBE 3002 using the Robotis TurtleBot3 Written By Matthew Haahr, Leo Morri

Matthew Haahr 3 Mar 21, 2022
Set up recurring buys in Gemini

Overview Set up recurring buys in Gemini. Given some keys (Create API Keys), allows you to configure a recurring buy using the reduced API maker and t

Ahmad Abuomar 3 Jan 06, 2022
HTTP API for TON (The Open Network)

HTTP API for The Open Network Since TON nodes uses its own ADNL binary transport protocol, a intermediate service is needed for an HTTP connection. TO

66 Dec 28, 2022
A telegram bot to read RSS feeds

Telegram bot to fetch RSS feeds This is a telegram bot that fetches RSS feeds in regular intervals and send it to you. The feed sources can be added o

Santhosh Thottingal 14 Dec 15, 2022
Light weight Scripts and Apps for checking availability of Covid Vaccines in India. Notifies when vaccine becomes avialable in your area.

vaccine-checker Light weight Scripts and Apps for checking availability of Covid Vaccines in India. Notifies when vaccine becomes avialable in your ar

Abishek V Ashok 8 Jun 16, 2021
A python library built on the API of the coderHub.sa, which helps you to fetch the challenges and more

coderHub A python library built on the API of the coderHub.sa, which helps you to fetch the challenges and more Installation • Features • Usage • Lice

TheAwiteb 5 Nov 04, 2022
Prometheus exporter for CNMC API

CNMC Prometheus exporter It needs a Prometheus Pushgateway Install requirements via pip install -r requirements.txt Export the following environment v

GISCE-TI 1 Oct 20, 2021
A python API wrapper for temp-mail.org

temp-mail Python API Wrapper for temp-mail.ru service. Temp-mail is a service which lets you use anonymous emails for free. You can view full API spec

Denis Veselov 91 Nov 19, 2022
Weather_besac is a French twitter bot that tweet the weather of the city of Besançon in Franche-Comté in France every day at 8am and 4pm.

Weather Bot Besac Weather_besac is a French twitter bot that tweet the weather of the city of Besançon in Franche-Comté in France every day at 8am and

Rgld_ 1 Nov 15, 2021
JAKYM, Just Another Konsole YouTube-Music. A command line based Youtube music player written in Python with spotify and youtube playlist support

Just Another Konsole YouTube-Music Overview I wanted to create this application so that I could use the command line to play music easily. I often pla

Mayank Jha 73 Jan 01, 2023
Telegram music & video bot direct play music

Telegram music & video bot direct play music

noinoi-X 1 Dec 28, 2021
A Telegram bot that searches for the original source of anime, manga, and art

A Telegram bot that searches for the original source of anime, manga, and art How to use the bot Just send a screenshot of the anime, manga or art or

Kira Kormak 9 Dec 28, 2022
A Discord token grabber written in Python3, with awesome obfuscation and anti-debug protection.

☣️ Plague ☣️ Plague is a Discord token grabber written in Python3, obfuscated with Kramer, protected from traffic analysers with Scarecrow and using t

Billy 125 Dec 20, 2022
cipher bot telegram

cipher-bot-telegram cipher bot telegram Telegram bot that encode/decode your messages To work correctly, you must install the latest version of python

anonim 1 Oct 10, 2021
To dynamically change the split direction in I3/Sway so as to split new windows automatically based on the width and height of the focused window

To dynamically change the split direction in I3/Sway so as to split new windows automatically based on the width and height of the focused window Insp

Ritin George 6 Mar 11, 2022
Nasdaq Cloud Data Service (NCDS) provides a modern and efficient method of delivery for realtime exchange data and other financial information. This repository provides an SDK for developing applications to access the NCDS.

Nasdaq Cloud Data Service (NCDS) Nasdaq Cloud Data Service (NCDS) provides a modern and efficient method of delivery for realtime exchange data and ot

Nasdaq 8 Dec 01, 2022
This repository contains free labs for setting up an entire workflow and DevOps environment from a real-world perspective in AWS

DevOps-The-Hard-Way-AWS This tutorial contains a full, real-world solution for setting up an environment that is using DevOps technologies and practic

Mike Levan 1.6k Jan 05, 2023
Tiktok-bot - A tiktok bot with python

Install the requirements pip install selenium pip install pyfiglet==0.7.5 How ca

Ukis 5 Aug 23, 2022
DeKrypt 24 Sep 21, 2022
Pack up to 3MB of data into a tweetable PNG polyglot file.

tweetable-polyglot-png Pack up to 3MB of data into a tweetable PNG polyglot file. See it in action here: https://twitter.com/David3141593/status/13719

David Buchanan 2.4k Dec 29, 2022