Practical-statistics-for-data-scientists - Code repository for O'Reilly book

Overview

Code repository

Practical Statistics for Data Scientists:

50+ Essential Concepts Using R and Python

by Peter Bruce, Andrew Bruce, and Peter Gedeck

Online

View the notebooks online: nbviewer

Excecute the notebooks in Binder: Binder

This can take some time if the binder environment needs to be rebuilt.

Other language versions

English:
Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python
2020: ISBN 149207294X
Google books, Amazon
Japanese:
データサイエンスのための統計学入門 第2版 ―予測、分類、統計モデリング、統計的機械学習とR/Pythonプログラミング
2020: ISBN 487311926X, Shinya Ohashi (supervised), Toshiaki Kurokawa (translated)
Google books, Amazon
German:
Praktische Statistik für Data Scientists: 50+ essenzielle Konzepte mit R und Python 
2021: ISBN 3960091532, Marcus Fraaß (Übersetzer)
Google books, Amazon
Korean:
Practical Statistics for Data Scientists: 데이터 과학을 위한 통계(2판) 2021: ISBN 9791162244180, Junyong Lee (translation)
Google books, Hanbit media
Polish:
Statystyka praktyczna w data science. 50 kluczowych zagadnien w jezykach R i Python 2021: ISBN 9788328374270
Google books, Amazon, Helion

See also

Setup R and Python environments

R

Run the following commands in R to install all required packages

if (!require(vioplot)) install.packages('vioplot')
if (!require(corrplot)) install.packages('corrplot')
if (!require(gmodels)) install.packages('gmodels')
if (!require(matrixStats)) install.packages('matrixStats')

if (!require(lmPerm)) install.packages('lmPerm')
if (!require(pwr)) install.packages('pwr')

if (!require(FNN)) install.packages('FNN')
if (!require(klaR)) install.packages('klaR')
if (!require(DMwR)) install.packages('DMwR')

if (!require(xgboost)) install.packages('xgboost')

if (!require(ellipse)) install.packages('ellipse')
if (!require(mclust)) install.packages('mclust')
if (!require(ca)) install.packages('ca')

Python

We recommend to use a conda environment to run the Python code.

conda create -n sfds python
conda activate sfds
conda env update -n sfds -f environment.yml
A declarative (epi)genomics visualization library for Python

gos is a declarative (epi)genomics visualization library for Python. It is built on top of the Gosling JSON specification, providing a simplified interface for authoring interactive genomic visualiza

Gosling 107 Dec 14, 2022
Colormaps for astronomers

cmastro: colormaps for astronomers 🔭 This package contains custom colormaps that have been used in various astronomical applications, similar to cmoc

Adrian Price-Whelan 12 Oct 11, 2022
Chem: collection of mostly python code for molecular visualization, QM/MM, FEP, etc

chem: collection of mostly python code for molecular visualization, QM/MM, FEP,

5 Sep 02, 2022
CPG represent!

CoolPandasGroup CPG represent! Arianna Brandon Enne Luan Tracie Project requirements: use Pandas to clean and format datasets use Jupyter Notebook to

Enne 3 Feb 07, 2022
Define fortify and autoplot functions to allow ggplot2 to handle some popular R packages.

ggfortify This package offers fortify and autoplot functions to allow automatic ggplot2 to visualize statistical result of popular R packages. Check o

Sinhrks 504 Dec 23, 2022
Time series visualizer is a flexible extension that provides filling world map by country from real data.

Time-series-visualizer Time series visualizer is a flexible extension that provides filling world map by country from csv or json file. You can know d

Long Ng 3 Jul 09, 2021
Visualize tensors in a plain Python REPL using Sparklines

Visualize tensors in a plain Python REPL using Sparklines

Shawn Presser 43 Sep 03, 2022
A simple interpreted language for creating basic mathematical graphs.

graphr Introduction graphr is a small language written to create basic mathematical graphs. It is an interpreted language written in python and essent

2 Dec 26, 2021
Multi-class confusion matrix library in Python

Table of contents Overview Installation Usage Document Try PyCM in Your Browser Issues & Bug Reports Todo Outputs Dependencies Contribution References

Sepand Haghighi 1.3k Dec 31, 2022
Rick and Morty Data Visualization with python

Rick and Morty Data Visualization For this project I looked at data for the TV show Rick and Morty Number of Episodes at a Certain Location Here is th

7 Aug 29, 2022
A Jupyter - Leaflet.js bridge

ipyleaflet A Jupyter / Leaflet bridge enabling interactive maps in the Jupyter notebook. Usage Selecting a basemap for a leaflet map: Loading a geojso

Jupyter Widgets 1.3k Dec 27, 2022
A Python Binder that merge 2 files with any extension by creating a new python file and compiling it to exe which runs both payloads.

Update ! ANONFILE MIGHT NOT WORK ! About A Python Binder that merge 2 files with any extension by creating a new python file and compiling it to exe w

Vesper 15 Oct 12, 2022
Simple Python interface for Graphviz

Simple Python interface for Graphviz

Sebastian Bank 1.3k Dec 26, 2022
3D rendered visualization of the austrian monuments registry

Visualization of the Austrian Monuments Visualization of the monument landscape of the austrian monuments registry (Bundesdenkmalamt Denkmalverzeichni

Nikolai Janakiev 3 Oct 24, 2019
Visualise Ansible execution time across playbooks, tasks, and hosts.

ansible-trace Visualise where time is spent in your Ansible playbooks: what tasks, and what hosts, so you can find where to optimise and decrease play

Mark Hansen 81 Dec 15, 2022
EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs.

EPViz (EEG Prediction Visualizer) EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs. A lig

Jeff 2 Oct 19, 2022
Open-questions - Open questions for Bellingcat technical contributors

Open questions for Bellingcat technical contributors These are difficult, long-term projects that would contribute to open source investigations at Be

Bellingcat 234 Dec 31, 2022
demir.ai Dataset Operations

demir.ai Dataset Operations With this application, you can have the empty values (nan/null) deleted or filled before giving your dataset to machine le

Ahmet Furkan DEMIR 8 Nov 01, 2022
Automatic data visualization in atom with the nteract data-explorer

Data Explorer Interactively explore your data directly in atom with hydrogen! The nteract data-explorer provides automatic data visualization, so you

Ben Russert 65 Dec 01, 2022
Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly

Peloton Stats to Google Sheets with Data Visualization through Seaborn and Plotly Problem: 2 peloton users were looking for a way to track their metri

9 Jul 22, 2022