Scraping script for stats on covid19 pandemic status in Chiba prefecture, Japan

Last update: Nov 29, 2021

About

This script converts Chiba Prefecture's detailed per-region infection statistics (an Excel file) to CSV and also outputs daily per-region infection totals.
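The aggregation step described above can be illustrated with a minimal sketch. It uses only the Python standard library and works on rows that have already been converted to CSV. The column names "date" and "region" and the sample rows are assumptions made for illustration; they are not taken from the real data.csv, and this is not the code in conv.py.

```python
import csv
import io
from collections import Counter

# Hypothetical converted rows: one row per reported case.
# The column names "date" and "region" are assumptions, not the real header.
SAMPLE_CSV = """date,region
2021-10-12,Chiba
2021-10-12,Chiba
2021-10-12,Funabashi
2021-10-13,Chiba
"""


def daily_totals_by_region(csv_text):
    """Count cases for each (date, region) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["date"], row["region"]) for row in reader)


totals = daily_totals_by_region(SAMPLE_CSV)
for (day, region), count in sorted(totals.items()):
    print(day, region, count)
```

Counting per (date, region) pair keeps the result flat, so it can be written back out as one CSV row per pair.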

Requirement

A POSIX-compatible shell, e.g. GNU Bash (1)
curl (1)
python >= 3.8
pandas >= 1.1.3 (debian derivatives: python3-pandas >= 1.1.3)
xlrd >= 1.2.0 (debian derivatives: python3-xlrd >= 1.2.0)

Versions other than those listed above are not guaranteed to work.
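One way to check the Python-side requirements before running the scripts is sketched below. The minimum versions are copied from the list above; the helper names parse_version and check_requirements are hypothetical and are not part of this repository.

```python
import sys

# Minimum versions copied from the Requirement list above.
MINIMUMS = {"pandas": (1, 1, 3), "xlrd": (1, 2, 0)}


def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems (empty if all is well)."""
    if sys.version_info < (3, 8):
        return ["python >= 3.8 is required"]
    # importlib.metadata exists only on Python 3.8+, so import it after the check.
    from importlib import metadata

    problems = []
    for name, minimum in MINIMUMS.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if installed < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems.append(f"{name} is older than {wanted}")
    return problems


for problem in check_requirements():
    print("requirement not met:", problem)
```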

Usage

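Fetch and convert in one step

conv.sh all is the convenient option: it runs every stage, including fetch, in a single invocation.

To avoid putting excessive load on the server, we recommend running it manually.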

./conv.sh all

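Fetching the file

Fetches the xlsx file, published with yesterday's date, that contains the per-region infection counts. To avoid putting excessive load on the server, we recommend running it manually.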

./conv.sh fetch
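The "yesterday's date" logic can be sketched as follows. The MMDDkansensya.xlsx pattern is only inferred from the example file name used elsewhere in this README, so treat it as an assumption; the function name yesterday_filename is hypothetical, and the download URL is deliberately left out because it is not stated here.

```python
from datetime import date, timedelta


def yesterday_filename(today):
    """Build an MMDD-prefixed file name for the day before `today`.

    The 'MMDDkansensya.xlsx' pattern is an assumption inferred from the
    example file name in this README; the real naming scheme may differ.
    """
    yesterday = today - timedelta(days=1)
    return f"{yesterday:%m%d}kansensya.xlsx"


print(yesterday_filename(date(2021, 10, 14)))
```

Passing the current date in as an argument, rather than calling date.today() inside the function, keeps the helper easy to test across month and year boundaries.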

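Converting the fetched file

conv.sh target FILE writes the analysis results for the Chiba Prefecture infection data under out. Internally it calls the conv.py plugin, and that script is an implementation specific to Chiba Prefecture.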

./conv.sh target data/1013kansensya.xlsx

Testing

This script tests only the format of the converted data. If you commit changes to conv.py, please verify the format of the generated data (data.csv, data-analyzed.csv).

The data-format tests require shellspec and GNU grep (1).
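For readers who want a quick format check without shellspec, the idea can be sketched in Python as below. This is not the project's test suite. The expected header is a hypothetical placeholder, because the real columns of data.csv and data-analyzed.csv are not documented here.

```python
import csv
import io

# Hypothetical expected header; replace with the real column names.
EXPECTED_HEADER = ["date", "region", "count"]


def format_errors(csv_text, expected_header=EXPECTED_HEADER):
    """Return a list of format problems found in `csv_text`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    errors = []
    if rows[0] != expected_header:
        errors.append(f"unexpected header: {rows[0]}")
    # Data rows start on line 2 of the file.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_header):
            errors.append(
                f"line {line_no}: expected {len(expected_header)} fields, "
                f"got {len(row)}"
            )
    return errors


good = "date,region,count\n2021-10-13,Chiba,5\n"
bad = "date,region,count\n2021-10-13,Chiba\n"
print(format_errors(good))
print(format_errors(bad))
```

The check looks only at the header and the field count per row, which matches the stated scope of testing the data format rather than the accuracy of the values.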

The accuracy of the data has not yet been sufficiently verified. If you are able to help, please open an issue.

Credit

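This project uses xlsx files obtained from links on the Chiba Prefectural Government's official COVID-19 statistics page, "新型コロナウイルス感染症患者等の県内発生状況について" (On the occurrence of COVID-19 patients and related cases within the prefecture). We express our sincere respect to the government staff and healthcare workers who are devoted to infectious disease control.

The test data under fixture belongs to Chiba Prefecture's published statistics and is therefore licensed under CC-BY-4.0. All other material in this repository, outside fixture, is licensed under CC-BY-SA-4.0 by Conv4Japan Contributor.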

Owner

Conv4Japan
