Python function to construct an ODS spreadsheet on the fly - without having to store the entire file in memory or disk

Overview

stream-write-ods CircleCI Test Coverage

Python function to construct an ODS (OpenDocument Spreadsheet) on the fly - without having to store the entire file in memory or disk.

Can be used to convert CSV, SQLite, or JSON to ODS format.

Installation

pip install stream-write-ods

Usage

In general, pass a nested iterable to stream_write_ods and it will return an interable of bytes of the ODS file, as follows.

from stream_write_ods import stream_write_ods

def get_sheets():
    def get_rows_of_sheet_1():
        yield 'Value A', 'Value B'
        yield 'Value C', 'Value D'

    yield 'Sheet 1 name', ('col_1_name', 'col_2_name'), get_rows_of_sheet_1()

    def get_rows_of_sheet_2():
        yield 'col_1_value',

    yield 'Sheet 2 name', ('col_1_name',), get_rows_of_sheet_2()

ods_chunks = stream_write_ods(get_sheets())

Usage: Convert CSV to ODS

The following recipe converts a CSV to ODS.

import codecs
import csv
from stream_write_ods import stream_write_ods

# Any iterable that yields the bytes of a CSV file
# Hard coded for the purposes of this example
csv_bytes_iter = (
    b'col_1,col_2\n',
    b'1,"value"\n',
)

def get_sheets(sheet_name, csv_reader):
    yield sheet_name, next(csv_reader), csv_reader

csv_str_iter = codecs.iterdecode(csv_bytes_iter, 'utf-8')
csv_reader = csv.reader(csv_str_iter, csv.QUOTE_NONNUMERIC)
ods_chunks = stream_write_ods(get_sheets('Sheet 1', csv_reader))

Usage: Convert JSON to ODS

Using ijson to stream-parse a JSON file, it's possible to convert JSON data to ODS on the fly:

import ijson
import itertools
from stream_write_ods import stream_write_ods

# Any iterable that yields the bytes of a JSON file
# Hard coded for the purposes of this example
json_bytes_iter = (b'''{
  "data": [
      {"id": 1, "name": "Foo"},
      {"id": 2, "name": "Bar"}
  ]
}''',)

# ijson requires a file-like object
def to_file_like_obj(bytes_iter):
    chunk = b''
    offset = 0
    it = iter(bytes_iter)

    def up_to_iter(num):
        nonlocal chunk, offset

        while num:
            if offset == len(chunk):
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(num, len(chunk) - offset)
            offset = offset + to_yield
            num -= to_yield
            yield chunk[offset - to_yield:offset]

    class FileLikeObj:
        def read(self, n):
            return b''.join(up_to_iter(n))

    return FileLikeObj()

def get_sheets(json_file):
    columns = None

    def rows():
        nonlocal columns
        for item in ijson.items(json_file, 'data.item'):
            if columns is None:
                columns = list(item.keys())
            yield tuple(item[column] for column in columns)

    # Ensure columns populated
    rows_it = rows()
    first_row = next(rows_it)

    yield 'Sheet 1', columns, itertools.chain((first_row,), rows_it)

json_file = to_file_like_obj(json_bytes_iter)
ods_chunks = stream_write_ods(get_sheets(json_file))

Usage: Convert SQLite to ODS

SQLite isn't particularly streaming-friendly since typically you need random access to the file. But it's still possible to use stream-write-ods to convert SQLite to ODS.

import contextlib
import sqlite3
import tempfile
from stream_write_ods import stream_write_ods

@contextlib.contextmanager
def get_db():
    # Hard coded in memory database for the purposes of this example
    with sqlite3.connect(':memory:') as con:
        cur = con.cursor()
        cur.execute("CREATE TABLE my_table_a (my_col text);")
        cur.execute("CREATE TABLE my_table_b (my_col text);")
        cur.execute("INSERT INTO my_table_a VALUES ('Value A')")
        cur.execute("INSERT INTO my_table_b VALUES ('Value B')")
        yield con

def quote_identifier(value):
    return '"' + value.replace('"', '""') + '"'

def get_sheets(db):
    cur_table = db.cursor()
    cur_table.execute('''
        SELECT name FROM sqlite_master
        WHERE type = "table" AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'
    ''')
    cur_data = db.cursor()
    for table, in cur_table:
        cur_data.execute(f'SELECT * FROM {quote_identifier(table)} ORDER BY rowid')
        yield table, tuple(col[0] for col in cur_data.description), cur_data

with get_db() as db:
    ods_chunks = stream_write_ods(get_sheets(db))

Types

There are 8 possible data types in an Open Document Spreadsheet: boolean, currency, date, float, percentage, string, time, and void. 4 of these can be output by stream-write-ods, chosen automatically according to the following table.

Python type ODS type
boolean boolean
date date - without time component
datetime date - with time component
int float
float float
str string
NoneType string - as #NA

Limitations

ODS spreadsheets are essentially ZIP archives containing several member files. While in general ZIP archives can be up to 16EiB (exbibyte) in size using ZIP64, LibreOffice does not support ZIP64, and so ODS files are de-facto limited to 4GiB (gibibyte). This limit applies to the size of the entire compressed archive, the compressed size of each member file, and the uncompressed size of each member file.

Owner
Department for International Trade
Department for International Trade
Tinkoff social pulse api wrapper

Tinkoff social pulse api wrapper

Semenov Artur 9 Dec 20, 2022
A simple and stupid Miinto API wrapper

miinto-api-wrapper Miinto API Wrapper is a simple python wrapper for Miinto API. Miinto is a fashion luxury marketplace. For more information see the

Giuseppe Checchia 3 Jan 09, 2022
A corona statistics and information telegram bot.

A corona statistics and information telegram bot.

Fayas Noushad 15 Oct 21, 2022
Simple web-based hcaptcha bypass

Hcaptcha-Bypass !!! If you found this useful, please click the STAR button !!! Simple web-based hcaptcha bypass Just a demonstration right now, and yo

Kieronia 4 Oct 06, 2021
Kevin L. 3 Jul 14, 2022
The Foursquare API client for Python

foursquare Python client for the foursquare API. Philosophy: Map foursquare's endpoints one-to-one Clean, simple, Pythonic calls Only handle raw data,

Mike Lewis 400 Dec 19, 2022
⚡ A really fast and powerful Discord Token Checker

discord-token-checker ⚡ A really fast and powerful Discord Token Checker How To Use? Do pip install -r requirements.txt in your command prompt Make to

vida 25 Feb 26, 2022
A Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram

Original Repo mirror-leech-telegram-bot This is a Telegram Bot written in Python for mirroring files on the Internet to your Google Drive or Telegram.

0 Jan 03, 2022
The best discord.py template with a changeable prefix

Discord.py Bot Template By noma4321#0035 With A Custom Prefix To Every Guild Function Features Has a custom prefix that is changeable for every guild

Noma4321 5 Nov 24, 2022
CryptoBar - A simple MenuBar app that shows the price of 3 cryptocurrencies

CryptoBar A very simple MenuBar app that shows the price of the following crypto

4 Jul 04, 2022
Azure Neural Speech Service TTS

Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file.

Rodney 1 Oct 11, 2021
Faster Twitch Alerts is a highly customizable, lightning-fast alternative to Twitch's slow mobile notification system

Faster Twitch Alerts What is "Faster Twitch Alerts"? Faster Twitch Alerts is a highly customizable, lightning-fast alternative to Twitch's slow mobile

6 Dec 22, 2022
Shedding a new skin on Dis-Snek's commands.

Molter - WIP Shedding a new skin on Dis-Snek's commands. Currently, its goals are to make message commands more similar to discord.py's message comman

Astrea 7 May 01, 2022
Cogs for Red-DiscordBot

matcha-cogs Cogs for Red-DiscordBot. Installation [p]repo add matcha-cogs

MatchaTeaLeaf 2 Aug 27, 2022
Trading bot - A Trading bot With Python

Trading_bot Trading bot intended for 1) Tracking current prices of tokens 2) Set

Tymur Kotkov 29 Dec 01, 2022
The AWS Lambda Serverless Blind XSS App

Ass The AWS Lambda Serverless Blind XSS App 利用VPS配置XSS平台太麻烦了,如果利用AWS的Lambda那不就是一个域名的事情么?剩下的环境配置、HTTPS证书、隐私性、VPS续费都不用管了, 所以根据xless重写了Lambda平台的XSS,利用sla

cocokey 2 Dec 27, 2021
CloudFormation Drift Remediation - Use Cloud Control API to remediate drift that was detected on a CloudFormation stack

CloudFormation Drift Remediation - Use Cloud Control API to remediate drift that was detected on a CloudFormation stack

Cloudar 36 Dec 11, 2022
A Discord bot written in Python to help with guild administration

forgotten-hydra A Discord bot written in Python to help with guild administration. External libraries Pycord-Development/pycord 1.7.3 djc/couchdb-pyth

1 May 14, 2022
ServiceX DID Finder Girder

ServiceX_DID_Finder_Girder Access datasets for ServiceX from yt Hub Finding datasets This DID finder is designed to take a collection id (https://gird

1 Dec 07, 2021