Python & Julia port of codes in excellent R books

Overview

X4DS

This repo is a collection of

Python & Julia port of codes in the following excellent R books:

Python Stack Julia Stack
Language
Version
v3.9 v1.7
Data
Processing
  • Pandas
  • DataFrames
  • Visualization
  • Matplotlib
  • Seaborn
  • MakiE
  • AlgebraOfGraphics
  • Machine
    Learning
  • Scikit-Learn
  • MLJ
  • Probablistic
    Programming
  • PyMC
  • Turing
  • Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
      print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
      print(f'{i}: {xs[i]}')

    2.2. Matplotlib

    including seaborn

    • prefer Axes object over Figure object
    • use constrained_layout=True when draw subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2, y2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2, y2)
    • prefer axes.flatten() over plt.subplot() in cases where subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, ax = plt.subplots(2, 2, figsize=[12,8],constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
      ax.plot(x, y)
    
    # bad
    for i in range(4):
      ax = plt.subplot(2, 2, i+1)
      ax.plot(x[i], y[i])
    • prefer set() method over set_*() method
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    • Prefer despine() over ax.spines[*].set_visible()
    # good
    sns.despine()
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
    • prefer df.query over df[] or df.loc[] in simple-selection
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
    • prefer df.loc and df.iloc over df[] in multiple-selection
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]

    LaTeX Styles

    Multiple lines

    Reduce the use of begin{array}...end{array}

    • equations: begin{aligned}...end{aligned}
    $$
    \begin{aligned}
    y_1 = x^2 + 2*x \\
    y_2 = x^3 + x
    \end{aligned}
    $$
    • equations with conditions: begin{cases}...end{cases}
    $$
    \begin{cases}
    y = x^2 + 2*x & x > 0 \\
    y = x^3 + x & x ≤ 0
    \end{cases}
    $$
    • matrix: begin{matrix}...end{matrix}
    $$
    \begin{vmatrix}
      a + a^′ & b + b^′ \\ c & d
      \end{vmatrix}= \begin{vmatrix}
      a & b \\ c & d
      \end{vmatrix} + \begin{vmatrix}
      a^′ & b^′ \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{θ}{\mathrm{argmax}}\ p(x_i|θ)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$ 𝐀^⊤ $$

    $$
    𝐀^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$ \lim_{n → ∞} $$

    $$
    \lim_{n\to \infty}
    $$
    • prefer underset{}{} over \limits_

    $$ \underset{w}{\rm argmin}\ (wx +b) $$

    $$
    \underset{w}{\rm argmin}\ (wx +b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    θ_{\mathrm{MLE}}=\underset{θ}{\mathrm{argmax}}\ ∑_{i = 1}^{N}\log p(x_i|θ)
    $$

    ISLR

    References

    style <style> table { border-collapse: collapse; text-align: center; } </style>
    Owner
    Gitony
    Gitony
    Plot toolbox based on Matplotlib, simple and elegant.

    Elegant-Plot Plot toolbox based on Matplotlib, simple and elegant. 绘制效果 绘制过程 数据准备 每种图标类型的目录下有data.csv文件,依据样例数据填入自己的数据。

    3 Jul 15, 2022
    High-level geospatial data visualization library for Python.

    geoplot: geospatial data visualization geoplot is a high-level Python geospatial plotting library. It's an extension to cartopy and matplotlib which m

    Aleksey Bilogur 1k Jan 01, 2023
    Regress.me is an easy to use data visualization tool powered by Dash/Plotly.

    Regress.me Regress.me is an easy to use data visualization tool powered by Dash/Plotly. Regress.me.-.Google.Chrome.2022-05-10.15-58-59.mp4 Get Started

    Amar 14 Aug 14, 2022
    Statistical data visualization using matplotlib

    seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing

    Michael Waskom 10.2k Dec 30, 2022
    visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem

    visualize_ML visualize_ML is a python package made to visualize some of the steps involved while dealing with a Machine Learning problem. It is build

    Ayush Singh 164 Dec 12, 2022
    :bowtie: Create a dashboard with python!

    Installation | Documentation | Gitter Chat | Google Group Bowtie Introduction Bowtie is a library for writing dashboards in Python. No need to know we

    Jacques Kvam 753 Dec 22, 2022
    Lightweight data validation and adaptation Python library.

    Valideer Lightweight data validation and adaptation library for Python. At a Glance: Supports both validation (check if a value is valid) and adaptati

    Podio 258 Nov 22, 2022
    Quickly and accurately render even the largest data.

    Turn even the largest data into images, accurately Build Status Coverage Latest dev release Latest release Docs Support What is it? Datashader is a da

    HoloViz 2.9k Dec 28, 2022
    Geospatial Data Visualization using PyGMT

    Example script to visualize topographic data, earthquake data, and tomographic data on a map

    Utpal Kumar 2 Jul 30, 2022
    The official colors of the FAU as matplotlib/seaborn colormaps

    FAU - Colors The official colors of Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) as matplotlib / seaborn colormaps. We support the old colo

    Machine Learning and Data Analytics Lab FAU 9 Sep 05, 2022
    A grammar of graphics for Python

    plotnine Latest Release License DOI Build Status Coverage Documentation plotnine is an implementation of a grammar of graphics in Python, it is based

    Hassan Kibirige 3.3k Jan 01, 2023
    Apache Superset is a Data Visualization and Data Exploration Platform

    Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Rele

    The Apache Software Foundation 50k Jan 06, 2023
    The Python ensemble sampling toolkit for affine-invariant MCMC

    emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

    Dan Foreman-Mackey 1.3k Jan 04, 2023
    Define fortify and autoplot functions to allow ggplot2 to handle some popular R packages.

    ggfortify This package offers fortify and autoplot functions to allow automatic ggplot2 to visualize statistical result of popular R packages. Check o

    Sinhrks 504 Dec 23, 2022
    Matplotlib colormaps from the yt project !

    cmyt Matplotlib colormaps from the yt project ! Colormaps overview The following colormaps, as well as their respective reversed (*_r) versions are av

    The yt project 5 Sep 16, 2022
    Focus on Algorithm Design, Not on Data Wrangling

    The dataTap Python library is the primary interface for using dataTap's rich data management tools. Create datasets, stream annotations, and analyze model performance all with one library.

    Zensors 37 Nov 25, 2022
    Design your own matplotlib stylefile interactively

    Tired of playing with font sizes and other matplotlib parameters every time you start a new project or write a new plotting function? Want all you plots have the same style? Use matplotlib configurat

    yobi byte 207 Dec 08, 2022
    A simple python tool for explore your object detection dataset

    A simple tool for explore your object detection dataset. The goal of this library is to provide simple and intuitive visualizations from your dataset and automatically find the best parameters for ge

    GRADIANT - Centro Tecnolóxico de Telecomunicacións de Galicia 142 Dec 25, 2022
    A command line tool for visualizing CSV/spreadsheet-like data

    PerfPlotter Read data from CSV files using pandas and generate interactive plots using bokeh, which can then be embedded into HTML pages and served by

    Gino Mempin 0 Jun 25, 2022
    Minimalistic tool to visualize how the routes to a given target domain change over time, feat. Python 3.10 & mermaid.js

    Minimalistic tool to visualize how the routes to a given target domain change over time, feat. Python 3.10 & mermaid.js

    Péter Ferenc Gyarmati 1 Jan 17, 2022