Picka: A Python module for data generation and randomization.

Last update: Nov 30, 2021

Author:	Anthony Long
Version:	1.0.1 - Fixed the broken image stuff. Whoops

What is Picka?

Picka generates randomized data for testing.

Data is generated either from an included database of known good data or by building realistic, valid data with string formatting behind the scenes.
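The two approaches can be illustrated with a minimal sketch. It does not reproduce Picka's internals: the `FIRST_NAMES` list and the helpers `pick_first_name` and `make_email` are hypothetical stand-ins for the bundled data and the formatting step.

```python
import random
import string

# Hypothetical stand-in for the bundled database of known good values.
FIRST_NAMES = ["Jack", "Anna", "Maria", "Peter"]


def pick_first_name(rng=random):
    """Approach 1: pick a value from a list of known good data."""
    return rng.choice(FIRST_NAMES)


def make_email(length, extension="example.org", rng=random):
    """Approach 2: build a valid-looking value with string formatting."""
    local_part = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return "{0}@{1}".format(local_part, extension)


name = pick_first_name()
email = make_email(10)
```

Passing the random source as `rng` is a design choice for this sketch only; it lets a test seed the generator when reproducible data is wanted.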

Picka has a function for any field you would need filled in. With Selenium, something like the following would populate the "field-name-here" box for you 100 times with random names.

for x in range(100):
        self.selenium.type('field-name-here', picka.male_name())

But this is just the beginning. Other ways to implement this include using dicts:

user_information = {
        "first_name": picka.male_name(),
        "last_name": picka.last_name(),
        "email_address": picka.email(10, extension='example.org'),
        "password": picka.password_numerical(6),
}

This would provide:

{
        "first_name": "Jack",
        "last_name": "Logan",
        "email_address": "[email protected]",
        "password": "485444"
}

Don't forget: since all of the data is considered "clean" or valid, you can also use it to fill selects and other form fields with pre-defined values. For example, if you generate a state with picka.state(), the result might be "Alabama". You can use this result to directly select a state in an address drop-down box.

Examples:

Selenium

def search_for_garbage():
        selenium.open('http://yahoo.com')
        selenium.type('id=search_box', picka.random_string(10))
        selenium.submit()

def test_search_for_garbage_results():
        search_for_garbage()
        selenium.wait_for_page_to_load('30000')
        assert selenium.get_xpath_count('id=results') == 0

Webdriver

driver = webdriver.Firefox()
driver.get("http://somesite.com")
x = {
        "name": [
                "#name",
                picka.name()
        ]
}
driver.find_element_by_css_selector(x["name"][0]).send_keys(x["name"][1])

Funcargs / pytest

def pytest_generate_tests(metafunc):
        if "test_string" in metafunc.funcargnames:
                for i in range(10):
                        # The funcarg key must match the test's argument name.
                        metafunc.addcall(funcargs=dict(test_string=picka.random_string(20)))

def test_func(test_string):
        assert test_string.isalpha()
        assert len(test_string) == 20

MySQL / SQLite

first, last, age = picka.first_name(), picka.last_name(), picka.age()
cursor.execute(
   "insert into user_data (first_name, last_name, age) VALUES (?, ?, ?)",
   (first, last, age)
)
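For a version that runs end to end, here is a minimal sketch against an in-memory SQLite database. The table layout and the stand-in values are assumptions made for illustration; in a real test the three values would come from picka.first_name(), picka.last_name() and picka.age(). Note that the `?` placeholder is the style used by Python's sqlite3 module, while MySQL drivers such as MySQLdb and PyMySQL expect `%s` instead.

```python
import random
import sqlite3

# Stand-in values (assumption); in a real test these would come from
# picka.first_name(), picka.last_name() and picka.age().
first = random.choice(["Jack", "Anna"])
last = random.choice(["Logan", "Smith"])
age = random.randint(18, 90)

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "create table user_data (first_name text, last_name text, age integer)"
)
# Parameters are passed separately so the driver handles quoting.
cursor.execute(
    "insert into user_data (first_name, last_name, age) values (?, ?, ?)",
    (first, last, age),
)
conn.commit()
row = cursor.execute(
    "select first_name, last_name, age from user_data"
).fetchone()
conn.close()
```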

HTTP

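import httplib  # Python 2; the module is http.client in Python 3

def post(host, path, data):
        # Send data as the body of a POST request and return the response.
        conn = httplib.HTTPConnection(host)
        conn.request("POST", path, data)
        return conn.getresponse()

def test_post_result():
        post("www.spam.egg", "/bacon.htm", picka.random_string(10))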

Comments

No test suite

Slightly ironic: a test data generation toolkit that doesn't have a test suite.

Also, setup.py doesn't declare Python 3 support, hence the need for a test suite to validate that it works correctly.

opened by jayvdb 1

Additional Functionality for Testers to Add Their Own Data

Picka provides general data for testing. Building on that effort, testers could supply their own custom test data, so that test data is not limited to the preconfigured values. The data could then be accessed sequentially, randomly, or in its entirety.
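One way such a feature could look is sketched below. This is not part of Picka: the `CustomData` class and its method names are hypothetical, chosen only to show the three access modes mentioned in the request.

```python
import random


class CustomData(object):
    """Hypothetical container for tester-supplied values (not a Picka API)."""

    def __init__(self, values):
        self._values = list(values)
        if not self._values:
            raise ValueError("at least one value is required")
        self._position = 0

    def next_value(self):
        """Sequential access: return values in order, wrapping around."""
        value = self._values[self._position % len(self._values)]
        self._position += 1
        return value

    def pick_random(self):
        """Random access: return any one of the supplied values."""
        return random.choice(self._values)

    def all_values(self):
        """Complete access: return a copy of every supplied value."""
        return list(self._values)


colors = CustomData(["red", "green", "blue"])
in_order = [colors.next_value() for _ in range(4)]
```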

opened by bkuehlhorn 1

Fixed test file, added alternative sentence maker

Fixed usage of number in tests (it takes one arg, not two)

Added sentence_actual, which returns an actual sentence from the Sherlock text.

Added _picka._Book class to hold the text and split sentences read from Sherlock. Users can call sentence() without reading the entire file again and again.

Added test of sentence_actual to picka.tests

The sentence_actual function has some nice features:

You're much less likely to get a sentence fragment

You can specify a minimum and maximum number of words

It should be relatively efficient, because the split sentences are cached by the _Book class.
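The caching idea can be sketched as follows. This is not the actual _picka._Book implementation: the `Book` class, its regular expression, and the sample text are assumptions made for illustration.

```python
import random
import re


class Book(object):
    """Hypothetical cached text source (not Picka's _Book class)."""

    def __init__(self, text):
        self._text = text
        self._sentences = None  # filled in on first use

    @property
    def sentences(self):
        if self._sentences is None:
            # Split after ., ! or ? followed by whitespace; done only once.
            parts = re.split(r"(?<=[.!?])\s+", self._text.strip())
            self._sentences = [part for part in parts if part]
        return self._sentences

    def sentence(self, min_words=1, max_words=50):
        """Return a random sentence whose word count is within the bounds."""
        candidates = [s for s in self.sentences
                      if min_words <= len(s.split()) <= max_words]
        if not candidates:
            raise ValueError("no sentence within the requested word range")
        return random.choice(candidates)


# Made-up sample text, used here instead of reading a file.
book = Book("It was late. Holmes looked up from his papers and smiled. Why?")
picked = book.sentence(min_words=3, max_words=8)
```

Because the split result is stored on the instance, repeated calls to `sentence()` reuse the same list instead of splitting the text again.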

The sentences aren't always perfect, but I think that has to do with the source. A book other than Sherlock Holmes, preferably one with less dialog, would give more "normal" sentences.

opened by TadLeonard 1

Library does not take locale into account

The library assumes an English locale is used (e.g., English-language hardcoded month names). Ideally the library would use locale-dependent constants so that computations are done correctly (e.g., the duration of a month in month_and_day):

>>> locale.setlocale(locale.LC_ALL, 'it_IT')
'it_IT'
>>> picka.month()
'Marzo'
>>> picka.month_and_day()
'Maggio 2'
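A locale-aware version could lean on the standard library's calendar module, whose `month_name` sequence follows the current locale. The sketch below is an assumption about how this might be done, not Picka's code; the function names mirror the ones in the session above, and the `year` default is arbitrary (it only affects the length of February).

```python
import calendar
import random


def month(rng=random):
    # calendar.month_name follows the current locale; index 0 is an empty string.
    return calendar.month_name[rng.randint(1, 12)]


def month_and_day(year=2021, rng=random):
    month_number = rng.randint(1, 12)
    # monthrange returns (weekday of the first day, number of days in the month).
    days_in_month = calendar.monthrange(year, month_number)[1]
    day = rng.randint(1, days_in_month)
    return "{0} {1}".format(calendar.month_name[month_number], day)


sample_month = month()
sample_month_and_day = month_and_day()
```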
opened by svisser 0

picka.age will return ages outside of the bounds

If I call picka.age(1, 1) repeatedly I get 1 and 2 as results. I would have expected it to always return 1. Note that this situation can occur when passing variables to picka.age; I don't expect people to write this in their code themselves.

I can also get ages outside of the bounds when I call picka.age(0, 1) which resorts to using the default values and can therefore return any age within the default values.
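The expected behaviour can be written down as a short sketch. `bounded_age` is a hypothetical function, not Picka's implementation, and its default bounds are assumptions.

```python
import random


def bounded_age(min_age=1, max_age=99, rng=random):
    """Hypothetical sketch; the default bounds are assumptions."""
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    # randint includes both endpoints, so bounded_age(1, 1) is always 1,
    # and a lower bound of 0 is honoured rather than replaced by a default.
    return rng.randint(min_age, max_age)


same = set(bounded_age(1, 1) for _ in range(100))
low = set(bounded_age(0, 1) for _ in range(100))
```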

opened by svisser 0

Module name means "cunt"

I'm not sure if this is a real issue, but when I look at this module I cannot do so with a straight face. "Picka" is "cunt" in Serbian, Macedonian, Bosnian, Croatian, and I'm unsure as to whether there are other languages where this holds.

While not grounds for any specific action, I find this largely amusing and just wanted to share.

opened by geomaster 2

Releases(v0.96)

v0.96(Jan 17, 2014)

hex, rgb, image and more.
picka-0.9.6.tar.gz(8.13 MB)
picka-0.9.6.zip(8.18 MB)

Owner

Anthony

GitHub Repository http://antlong.com
