Statistical Rethinking: A Bayesian Course Using CmdStanPy and Plotnine

Last update: Nov 08, 2022

Intro

This repo contains the Python/Stan version of the Statistical Rethinking course that Professor Richard McElreath taught at the Max Planck Institute for Evolutionary Anthropology in Leipzig during the winter of 2019/2020. The original repo for the course, from which this repo is forked, can be found here. The course consists of 20 lectures spread over 10 weeks, with a set of assignments for each week. It is an excellent introduction to Bayesian modelling in general and to Professor McElreath's wonderful book, Statistical Rethinking.

How to use this repo

There are ten Jupyter notebooks, one for each week of the course. At the beginning of each notebook there are links to the YouTube videos of the lectures, the slides used, and the original homework questions and answers in R.

I would suggest using this repo as follows:

Go to the notebook of the week.
Watch the two lecture videos for that week. Their URLs are at the very top of each notebook.
Read the original problems presented to the students and try to solve them on your own.
Follow the exercise solutions in the notebook, which combine my code with explanations by Professor McElreath.

Installing `CmdStanPy`

The Stan code is executed through CmdStanPy, a lightweight pure-Python interface to CmdStan that provides access to the Stan compiler and all inference algorithms. It provides the function install_cmdstan(), which downloads CmdStan from GitHub and builds the CmdStan utilities. It can be called from within Python or from the command line.

import cmdstanpy
cmdstanpy.install_cmdstan()

You can find more information about the installation process here.
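Once CmdStan is installed, a quick way to confirm that everything works is to compile and sample from a very small model. The following is a minimal sketch under stated assumptions, not code from the course notebooks: the Stan program, the file name `globe.stan`, the helper names `write_stan_file` and `fit_globe`, and the data (6 water observations in 9 tosses) are all illustrative choices.

```python
from pathlib import Path

# Illustrative Stan program (an assumption, not taken from the notebooks):
# a binomial model for W "water" observations in N globe tosses,
# with a flat prior on the proportion p.
STAN_PROGRAM = """\
data {
  int<lower=0> N;               // number of tosses
  int<lower=0, upper=N> W;      // number of water observations
}
parameters {
  real<lower=0, upper=1> p;     // proportion of water
}
model {
  p ~ uniform(0, 1);
  W ~ binomial(N, p);
}
"""


def write_stan_file(directory: Path, name: str = "globe.stan") -> Path:
    """Write the Stan program into `directory` and return the file path."""
    path = directory / name
    path.write_text(STAN_PROGRAM)
    return path


def fit_globe(directory: Path, n_tosses: int = 9, n_water: int = 6, seed: int = 1):
    """Compile the model and draw posterior samples for p."""
    # Imported here so the helper above can be used without CmdStanPy installed.
    import cmdstanpy

    # cmdstan_path() raises an error when no CmdStan installation is found.
    print("Using CmdStan at:", cmdstanpy.cmdstan_path())

    model = cmdstanpy.CmdStanModel(stan_file=str(write_stan_file(directory)))
    fit = model.sample(data={"N": n_tosses, "W": n_water}, seed=seed)
    return fit.stan_variable("p")  # 1-D NumPy array of posterior draws for p
```

Calling `fit_globe(Path.cwd())` writes `globe.stan` to the current directory, compiles it, and returns the posterior draws. With a flat prior and the default data, the exact posterior is Beta(7, 4), so the mean of the draws should be close to 7/11 ≈ 0.64.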

Other useful resources

There are many useful resources for Bayesian statistical modelling. Among those centered specifically on Professor McElreath's work, I would mention:

Original repo for the course.
Original rethinking package repo

Copyright

The present work is a derivative work of Statistical Rethinking: A Bayesian Course Using python and pymc3 by Gabriel Bosque Chacon and Statistical Rethinking: A Bayesian Course Using Python and NumPyro by Andrés Suárez. I wrote the Stan code, made the plotnine figures, and made slight modifications to his comments.

Owner

Andrés Suárez
