A collection of common regular expressions bundled with an easy to use interface.

Overview

CommonRegex

Find all times, dates, links, phone numbers, emails, ip addresses, prices, hex colors, and credit card numbers in a string. We did the hard work so you don't have to.

Pull requests welcome!

Installation

Install via pip

sudo pip install commonregex

or via setup.py

python setup.py install

Usage

>>> from commonregex import CommonRegex
>>> parsed_text = CommonRegex("""John, please get that article on www.linkedin.com to me by 5:00PM 
                               on Jan 9th 2012. 4:00 would be ideal, actually. If you have any 
                               questions, You can reach me at (519)-236-2723x341 or get in touch with
                               my associate at [email protected]""")
>>> parsed_text.times
['5:00PM', '4:00']
>>> parsed_text.dates
['Jan 9th 2012']
>>> parsed_text.links
['www.linkedin.com']
>>> parsed_text.phones
['(519)-236-2727']
>>> parsed_text.phones_with_exts
['(519)-236-2723x341']
>>> parsed_text.emails
['[email protected]']

Alternatively, you can generate a single CommonRegex instance and use it to parse multiple segments of text.

>>> parser = CommonRegex()
>>> parser.times("When are you free?  Do you want to meet up for coffee at 4:00?")
['4:00']

Finally, all regular expressions used are publicly exposed.

>>> from commonregex import email
>>> import re
>>> text = "...get in touch with my associate at [email protected]"
>>> re.sub(email, "[email protected]", text)
'...get in touch with my associate at [email protected]'
>>> from commonregex import time
>>> for m in time.finditer("Does 6:00 or 7:00 work better?"):
>>>     print m.start(), m.group()     
5 6:00 
13 7:00 

Please note that this module is currently English/US specific.

Supported Methods/Attributes

  • obj.dates, obj.dates()
  • obj.times, obj.times()
  • obj.phones, obj.phones()
  • obj.phones_with_exts, obj.phones_with_exts()
  • obj.links, obj.links()
  • obj.emails, obj.emails()
  • obj.ips, obj.ips()
  • obj.ipv6s, obj.ipv6s()
  • obj.prices, obj.prices()
  • obj.hex_colors, obj.hex_colors()
  • obj.credit_cards, obj.credit_cards()
  • obj.btc_addresses, obj.btc_addresses()
  • obj.street_addresses, obj.street_addresses()
  • obj.zip_codes, obj.zip_codes()
  • obj.po_boxes, obj.po_boxes()
  • obj.ssn_number, obj.ssn_number()

CommonRegex Ports:

CommonRegexRust

[CommonRegexJS] (https://github.com/talyssonoc/CommonRegexJS)

[CommonRegexScala] (https://github.com/everpeace/CommonRegexScala)

[CommonRegexJava] (https://github.com/talyssonoc/CommonRegexJava)

[CommonRegexCobra] (https://github.com/PurityLake/CommonRegex-Cobra)

[CommonRegexDart] (https://github.com/aufdemrand/CommonRegexDart)

[CommonRegexRuby] (https://github.com/talyssonoc/CommonRegexRuby)

[CommonRegexPHP] (https://github.com/james2doyle/CommonRegexPHP)

Analytics

Owner
Madison May
Machine Learning Architect at @IndicoDataSolutions
Madison May
A work in progress box containing various Python utilities

python-wipbox A set of modern Python libraries under development to simplify the execution of reusable routines by different projects. Table of Conten

Deepnox 2 Jan 20, 2022
Let's renew the puzzle collection. We'll produce a collection of new puzzles out of the lichess game database.

Let's renew the puzzle collection. We'll produce a collection of new puzzles out of the lichess game database.

Thibault Duplessis 96 Jan 03, 2023
Python code to divide big numbers

divide-big-num Python code to divide big numbers

VuMinhNgoc 1 Oct 15, 2021
Python @deprecat decorator to deprecate old python classes, functions or methods.

deprecat Decorator Python @deprecat decorator to deprecate old python classes, functions or methods. Installation pip install deprecat Usage To use th

12 Dec 12, 2022
Simple yet flexible natural sorting in Python.

natsort Simple yet flexible natural sorting in Python. Source Code: https://github.com/SethMMorton/natsort Downloads: https://pypi.org/project/natsort

Seth Morton 712 Dec 23, 2022
🦩 A Python tool to create comment-free Jupyter notebooks.

Pelikan Pelikan lets you convert notebooks to comment-free notebooks. In other words, It removes Python block and inline comments from source cells in

Hakan Özler 7 Nov 20, 2021
Python Yeelight YLKG07YL/YLKG08YL dimmer handler

With this class you can receive, decrypt and handle Yeelight YLKG07YL/YLKG08YL dimmer bluetooth notifications in your python code.

12 Dec 26, 2022
Create a Web Component (a Custom Element) from a python file

wyc Create a Web Component (a Custom Element) from a python file (transpile python code to javascript (es2015)). Features Use python to define your cu

7 Oct 09, 2022
Python script to launch burp scans automatically

SimpleAutoBurp Python script that takes a config.json file as config and uses Burp Suite Pro to scan a list of websites.

Adan Álvarez 26 Jul 18, 2022
A workflow management tool for numerical models on the NCI computing systems

Payu Payu is a climate model workflow management tool for supercomputing environments. Payu is currently only configured for use on computing clusters

The Payu Organization 11 Aug 25, 2022
Format Norminette Output!

Format Norminette Output!

7 Apr 19, 2022
Customized python validations.

A customized python validations.

Wilfred V. Pine 2 Apr 20, 2022
A Random Password Generator made from Python

Things you need Python Step 1 Download the python file from Releases Step 2 Go to the directory where the python file is and run it Step 3 Type the le

Kavindu Nimsara 3 May 30, 2022
A multipurpose python module

pysherlock pysherlock is a Python library for dealing with web scraping using images, it's a Python application of the rendertron headless browser API

Sachit 2 Nov 11, 2021
Runes - Simple Cookies You Can Extend (similar to Macaroons)

Runes - Simple Cookies You Can Extend (similar to Macaroons) is a paper called "Macaroons: Cookies with Context

Rusty Russell 22 Dec 11, 2022
These scripts look for non-printable unicode characters in all text files in a source tree

find-unicode-control These scripts look for non-printable unicode characters in all text files in a source tree. find_unicode_control.py should work w

Siddhesh Poyarekar 25 Aug 30, 2022
Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.

Retrying Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

Ray Holder 1.9k Dec 29, 2022
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

Lark - Parsing Library & Toolkit 3.5k Jan 05, 2023
This is Cool Utility tools that you can use in python.

This is Cool Utility tools that you can use in python. There are a few tools that you might find very useful, you can use this on pretty much any project and some utils might help you a lot and save

Senarc Studios 6 Apr 18, 2022
Go through a random file in your favourite open source projects!

Random Source Codes Never be bored again! Staring at your screen and just scrolling the great world wide web? Would you rather read through some code

Mridul Seth 1 Nov 03, 2022