Query multiple mongoDB database collections easily

Overview

leakscoop

Perform queries across multiple MongoDB databases and collections, where the field names and the field content structure in each database may vary.

The Problem

Suppose you've got two database collections, "leak1" and "leak2"

In leak1, the schema looks like this:

FIRST_NAME: "JOHN"
LAST_NAME: "DOE"

and in leak2, the schema looks like this:

FName: "John"
LName: "Doe"

A simple program to iterate through all your collections and perform queries wouldn't work, because:

  • the field names are different. Notice that in leak1, the first name field is FIRST_NAME, while in leak2, the first name field is named FName.
  • the field values might be structured differently. In leak1, everything is captialized. In leak2, it's all title-case.

This program lets you write a configuration for each collection, specifying, in JSON, how to query each field.

It's a work in progress, but so far, it works pretty well. It'll probably be easier to understand if you take a look at the config files under ./collections/. Each JSON file under ./collections/ should be an array of objects. The program automatically processes all JSON files under that directory.

Some more info for how the configurations work can be found in the wiki.

Example Usage:

Find all records of a guy named John Doe.

python3 -m dev --firstname John --lastname Doe

Each database will be searched, and results will be put into a new file under ./results/

Find all records for someone with an address of "1234 NW Long St" python3 -m dev --address "1234 NW long st"

Adding a zipcode to the end, or a state/province might speed up the query (depending on how you index your databases)

Screenshot

(information redacted for this person's privacy)

image Configured fields will print to the console, while all the other fields in a result will be saved under ./results/.

Owner
bagel
bagel
PyMongo - the Python driver for MongoDB

PyMongo Info: See the mongo site for more information. See GitHub for the latest source. Documentation: Available at pymongo.readthedocs.io Author: Mi

mongodb 3.7k Jan 08, 2023
A selection of SQLite3 databases to practice querying from.

Dummy SQL Databases This is a collection of dummy SQLite3 databases, for learning and practicing SQL querying, generated with the VS Code extension Ge

1 Feb 26, 2022
python-beryl, a Python driver for BerylDB.

python-beryl, a Python driver for BerylDB.

BerylDB 3 Nov 24, 2021
edaSQL is a library to link SQL to Exploratory Data Analysis and further more in the Data Engineering.

edaSQL is a python library to bridge the SQL with Exploratory Data Analysis where you can connect to the Database and insert the queries. The query results can be passed to the EDA tool which can giv

Tamil Selvan 8 Dec 12, 2022
Baserow is an open source no-code database tool and Airtable alternative

Baserow is an open source no-code database tool and Airtable alternative

1.3k Jan 01, 2023
Monty, Mongo tinified. MongoDB implemented in Python !

Monty, Mongo tinified. MongoDB implemented in Python ! Inspired by TinyDB and it's extension TinyMongo. MontyDB is: A tiny version of MongoDB, against

David Lai 522 Jan 01, 2023
A fast PostgreSQL Database Client Library for Python/asyncio.

asyncpg -- A fast PostgreSQL Database Client Library for Python/asyncio asyncpg is a database interface library designed specifically for PostgreSQL a

magicstack 5.8k Dec 31, 2022
Pystackql - Python wrapper for StackQL

pystackql - Python Library for StackQL Python wrapper for StackQL Usage from pys

StackQL Studios 6 Jul 01, 2022
A simple python package that perform SQL Server Source Control and Auto Deployment.

deploydb Deploy your database objects automatically when the git branch is updated. Production-ready! ⚙️ Easy-to-use 🔨 Customizable 🔧 Installation I

Mert Güvençli 10 Dec 07, 2022
Python client for Apache Kafka

Kafka Python client Python client for the Apache Kafka distributed stream processing system. kafka-python is designed to function much like the offici

Dana Powers 5.1k Jan 08, 2023
MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl

YouGov 1.9k Jan 04, 2023
aioodbc - is a library for accessing a ODBC databases from the asyncio

aioodbc aioodbc is a Python 3.5+ module that makes it possible to access ODBC databases with asyncio. It relies on the awesome pyodbc library and pres

aio-libs 253 Dec 31, 2022
aiopg is a library for accessing a PostgreSQL database from the asyncio

aiopg aiopg is a library for accessing a PostgreSQL database from the asyncio (PEP-3156/tulip) framework. It wraps asynchronous features of the Psycop

aio-libs 1.3k Jan 03, 2023
SAP HANA Connector in pure Python

SAP HANA Database Client for Python Important Notice This public repository is read-only and no longer maintained. The active maintained alternative i

SAP Archive 299 Nov 20, 2022
A Redis client library for Twisted Python

txRedis Asynchronous Redis client for Twisted Python. Install Install via pip. Usage examples can be found in the examples/ directory of this reposito

Dorian Raymer 127 Oct 23, 2022
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s

KAYAK 1.9k Jan 04, 2023
A tutorial designed to introduce you to SQlite 3 database using python

SQLite3-python-tutorial A tutorial designed to introduce you to SQlite 3 database using python What is SQLite? SQLite is an in-process library that im

0 Dec 28, 2021
aiomysql is a library for accessing a MySQL database from the asyncio

aiomysql aiomysql is a "driver" for accessing a MySQL database from the asyncio (PEP-3156/tulip) framework. It depends on and reuses most parts of PyM

aio-libs 1.5k Jan 03, 2023
Implementing basic MongoDB CRUD (Create, Read, Update, Delete) queries, using Python.

MongoDB with Python Implementing basic MongoDB CRUD (Create, Read, Update, Delete) queries, using Python. We can connect to a MongoDB database hosted

MousamSingh 4 Dec 01, 2021
The Database Toolkit for Python

SQLAlchemy The Python SQL Toolkit and Object Relational Mapper Introduction SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that giv

SQLAlchemy 6.5k Jan 01, 2023