molplotly

Plotly scatterplots which show molecule images on hovering over the datapoints!

Beautiful :)

Required packages:

➡️ See example.ipynb for an example :)

📜 Usage

import pandas as pd
import plotly.express as px

import molplotly

# load a DataFrame with smiles
df_esol = pd.read_csv('esol.csv')
df_esol['y_pred'] = df_esol['ESOL predicted log solubility in mols per litre']
df_esol['y_true'] = df_esol['measured log solubility in mols per litre']

# generate a scatter plot
fig = px.scatter(df_esol, x="y_true", y="y_pred")

# add molecules to the plotly graph - returns a Dash app
app = molplotly.add_molecules(fig=fig, 
                            df=df_esol, 
                            smiles_col='smiles', 
                            title_col='Compound ID', 
                            )

# run Dash app inline in notebook (or in an external server)
app.run_server(mode='inline', port=8011, height=1000)

Input parameters

  • fig : plotly.graph_objects.Figure object
    a plotly figure object containing datapoints plotted from df
  • df : pandas.DataFrame object
    a pandas dataframe that contains the data plotted in fig
  • smiles_col : str, optional
    name of the column in df containing the smiles plotted in fig (default 'SMILES')
  • show_img : bool, optional
    whether or not to generate the molecule image in the dash app (default True)
  • title_col : str, optional
    name of the column in df to be used as the title entry in the hover box (default None)
  • show_coords : bool, optional
    whether or not to show the coordinates of the data point in the hover box (default True)
  • caption_cols : list, optional
    list of column names in df to be included in the hover box (default None)
  • condition_col : str, optional
    name of the column in df that is used to color the datapoints in df - necessary when there is discrete conditional coloring (default None)
  • wrap : bool, optional
    whether or not to wrap the title text to multiple lines if the length of the text is too long (default True)
  • wraplen : int, optional
    the threshold length of the title text before wrapping begins - adjust when changing the width of the hover box (default 20)
  • width : int, optional
    the width in pixels of the hover box (default 150)
  • fontfamily : str, optional
    the font family used in the hover box (default 'Arial')
  • fontsize : int, optional
    the font size used in the hover box - the font of the title line is fontsize+2 (default 12)
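
The wrap and wraplen parameters above can be illustrated with a small standalone sketch. It assumes the title is broken at word boundaries and the pieces are joined with HTML line breaks. The helper name wrap_title is hypothetical and is not part of molplotly's API.

```python
import textwrap


def wrap_title(title: str, wrap: bool = True, wraplen: int = 20) -> str:
    """Hypothetical helper: split a long title into lines of at most
    `wraplen` characters and join them with <br> for an HTML hover box."""
    if not wrap:
        return title
    return "<br>".join(textwrap.wrap(title, width=wraplen))


# A title longer than the default threshold is split over several lines.
print(wrap_title("Acetylsalicylic acid (aspirin)"))
# With wrap=False the title is passed through unchanged.
print(wrap_title("Acetylsalicylic acid (aspirin)", wrap=False))
```

If the hover box is made wider via width, a larger wraplen keeps the title from wrapping earlier than necessary.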

Output parameters

By default, a JupyterDash app is returned, which can be run inline in a Jupyter notebook or deployed on a server via app.run_server().

Acknowledgements

Features to-add:

  1. Individual styles for each caption (fonts, colors etc)
  2. Some way to save the plot
  3. Highlight points by clicking on them
  4. SVG image generation
Comments
  • Error with dash 2.3.0

    Hello,

    First of all thanks for this great package.

    Today, I got the following error while importing molplotly whereas I previously had no issue: ImportError: cannot import name 'Input' from 'dash'

    As my version of dash was a bit old (1.6, I believe), I upgraded it to version 2.3.0 via pip. The import error disappeared, but I got another error when trying to run the server with app.run_server(mode='inline', port=8003, height=800): AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

    I have managed to get around it by downgrading dash to version 2.0.0 as recommended here https://stackoverflow.com/questions/70908709/jupyterdash-app-run-server-error-using-jupyter-notebook, but there may be something to look into...

    Thanks again

    opened by remseven 4
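
A quick way to see which dash version is installed before running the app is sketched below. It uses only the standard library. The comparison against 2.0.0 reflects what the report above found to work at the time, not a statement about current releases. The helper names are hypothetical.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '2.3.0' into (2, 3, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Assumption taken from the issue report: dash 2.0.0 worked, 2.3.0 did not.
REPORTED_WORKING = (2, 0, 0)

version = installed_version("dash")
if version is None:
    print("dash is not installed")
elif parse_version(version) == REPORTED_WORKING:
    print(f"dash {version} matches the version reported to work")
else:
    print(f"dash {version} differs from the version reported to work (2.0.0)")
```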
  • Pip install error

    Great idea, this package! However, I ran into an issue during installation: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1822: character maps to <undefined>. This is certainly due to unusual characters in the readme, since I could install the package locally by editing the readme file. Cheers!

    opened by ArnaudGaudry 4
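
The error above is typical of reading a UTF-8 file with a legacy Windows code page. The sketch below reproduces it with a temporary file and shows the usual remedy of passing an explicit encoding. The helper name read_long_description is hypothetical, and whether the package's setup script was fixed this way is an assumption.

```python
import tempfile
from pathlib import Path


def read_long_description(path):
    """Read a README for packaging; the explicit encoding avoids relying
    on the platform default (often cp1252 on Windows)."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    # The arrow emoji plus variation selector encodes to bytes ending in 0x8f.
    readme.write_text("\u27a1\ufe0f See example.ipynb\n", encoding="utf-8")

    try:
        readme.read_text(encoding="cp1252")
        legacy_ok = True
    except UnicodeDecodeError:
        # Byte 0x8f has no mapping in cp1252, as in the reported error.
        legacy_ok = False

    text = read_long_description(readme)

print("cp1252 read succeeded:", legacy_ok)
print("utf-8 read succeeded:", "example.ipynb" in text)
```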
  • Dependency versions for pip install

    Is there some reason for the very specific version requirements for the dependencies? For the pip install I would loosen this up unless there are any particular known issues.

    opened by kjelljorner 3
  • Setup testing + CI

    Best not to wait too long to start testing. Here's a basic scaffold to get started.

    Probably shouldn't be merged as is but y'all can push more commits here to get this in shape.

    opened by janosh 3
  • Plotting in a running Dash app

    Hi! I have a small dash app that I use to explore the molecules present in different samples. It is possible to select the samples of interest and then display a plotly scatter plot generated using the structures. I tried to add the molplotly layer on the scatter plot but no molecules are displayed. Any experience on that? Thanks!

    opened by ArnaudGaudry 2
  • Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Hi there, thanks for providing a great and easy to use tool!

    This issue is reproducible with the first example in the documentation:

    df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]
    fig_scatter = px.scatter(df_esol,
                             x="y_true",
                             y="y_pred",
                             color='delY',
                             symbol='Minimum Degree', # <- addition
                             title='ESOL Regression (default plotly)',
                             labels={'y_pred': 'Predicted Solubility',
                                     'y_true': 'Measured Solubility',
                                     'delY': 'ΔY'},
                             width=1200,
                             height=800)
    
    # This adds a dashed line for what a perfect model _should_ predict
    y = df_esol["y_true"].values
    fig_scatter.add_shape(
        type="line", line=dict(dash='dash'),
        x0=y.min(), y0=y.min(),
        x1=y.max(), y1=y.max()
    )
    
    fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')
    
    app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                          df=df_esol,
                                          smiles_col='smiles',
                                          title_col='Compound ID',
                                          color_col='delY' # <- addition
                                          )
    
    # change the arguments here to run the dash app on an external server and/or change the size of the app!
    app_scatter.run_server(mode='inline', port=8001, height=1000)
    
    

    This returns

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
        hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
    )
        111             df_curve = df[df[color_col] ==
        112                           curve_dict[curve_num]].reset_index(drop=True)
    --> 113             df_row = df_curve.iloc[num]
            df_row = undefined
            df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
            num = 960
        114         else:
        115             df_row = df.iloc[num]
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960
    )
        929 
        930             maybe_callable = com.apply_if_callable(key, self.obj)
    --> 931             return self._getitem_axis(maybe_callable, axis=axis)
            self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            maybe_callable = 960
            axis = 0
        932 
        933     def _is_scalar_access(self, key: tuple):
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1564 
       1565             # validate the location
    -> 1566             self._validate_integer(key, axis)
            self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            key = 960
            axis = 0
       1567 
       1568             return self.obj._ixs(key, axis=axis)
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1498         len_axis = len(self.obj._get_axis(axis))
       1499         if key >= len_axis or key < -len_axis:
    -> 1500             raise IndexError("single positional indexer is out-of-bounds")
            global IndexError = undefined
       1501 
       1502     # -------------------------------------------------------------------
    
    IndexError: single positional indexer is out-of-bounds
    

    Using either only marker or color alone causes no issues with the hoverbox. Also, using Minimum Degree as color_col for add_molecules when both color and symbol are defined gives no issues.

    opened by chertianser 2
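
The traceback above is consistent with the following reading: when a continuous color and a discrete symbol are combined, the points are split into one curve per symbol value, and the hover event reports a point number within that curve. Filtering the data by the color column then selects the wrong rows. The plain-Python sketch below illustrates that reading with made-up rows. It is not the library's code, and the lookup helper is hypothetical.

```python
# Made-up rows standing in for the DataFrame in the report.
rows = [
    {"smiles": "CCO", "delY": -0.48, "Minimum Degree": 1},
    {"smiles": "c1ccccc1", "delY": 0.10, "Minimum Degree": 2},
    {"smiles": "CC(=O)O", "delY": 0.31, "Minimum Degree": 1},
    {"smiles": "C1CCCCC1", "delY": -0.02, "Minimum Degree": 2},
]


def lookup(rows, group_col, curve_value, point_number):
    """Hypothetical lookup: keep the rows belonging to one curve, then
    index into them with the point number from the hover event."""
    matching = [row for row in rows if row[group_col] == curve_value]
    return matching[point_number]


# Grouping by the column that actually defines the curves finds the row.
print(lookup(rows, "Minimum Degree", 1, 1)["smiles"])

# Grouping by the continuous color column matches no rows for this curve,
# so the positional index is out of bounds.
try:
    lookup(rows, "delY", 1, 1)
except IndexError as err:
    print("IndexError:", err)
```

This matches the observation in the report that passing Minimum Degree as the grouping column avoids the error.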
  • Removed support for python 3.7

    Thanks for the great library, really useful!

    You pinned the pandas version to ~=1.4.1, which effectively cuts off users on Python 3.7, since pandas 1.4 requires Python 3.8 or newer. See the pandas release notes: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html

    Is this intentional? Which novel features from pandas >1.4.0 are strictly necessary to keep the package running? I'll create a PR with a relaxed pandas requirement that works fine for me in a python3.7 env.

    opened by jannisborn 1
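
To make the effect of the pin concrete: under PEP 440, ~=1.4.1 means "at least 1.4.1, and still within 1.4.*". The sketch below implements that rule for plain numeric versions only. The function name is hypothetical, and real tooling handles many more cases.

```python
def satisfies_compatible_release(version, spec):
    """Check `version` against `~=spec` for plain numeric versions,
    e.g. ~=1.4.1 is read as >=1.4.1 and ==1.4.*."""
    v = tuple(int(part) for part in version.split("."))
    s = tuple(int(part) for part in spec.split("."))
    if len(s) < 2:
        raise ValueError("~= needs at least two version components")
    prefix = s[:-1]  # every component except the last must match exactly
    return v >= s and v[: len(prefix)] == prefix


# An older pandas such as 1.3.5 is rejected by the pin; 1.4.3 is accepted.
for candidate in ("1.3.5", "1.4.1", "1.4.3", "1.5.0"):
    print(candidate, satisfies_compatible_release(candidate, "1.4.1"))
```

A lower bound without an upper cap (for example >=1.3) would admit both series, which is the kind of relaxation the issue asks for.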
  • Input params as table

    I find a list of parameters easier to parse in a table than as a list. Here's what that would look like. If you think it's an improvement, you could consider joining the type and default columns to save horizontal space.

    [Two screenshots in the original: the parameters rendered as a table (after) and as a list (before).]

    opened by janosh 1
  • Rokas slider

    Implemented support for specifying multiple smiles in the smiles_col argument for molplotly.add_molecules:

    • When a single str argument is passed to smiles_col, the function behaves as before.
    • When a list is passed, a slider is created under the plot, which allows the user to decide which column to use to render molecules.

    Also changed the ports in example.ipynb to start from 8700 and go up. On my system, 8000 and 8001 were already reserved.

    opened by RokasEl 1
  • Error when using with Plotly Subplots

    When trying to use molplotly to generate hover structures with a series of scatterplots generated using make_subplots (generated using different columns of a dataframe for the same RDKit molecule row), molplotly.add_molecules returns

    ValueError: More than one plotly curve in figure - color_col and/or marker_col needs to be specified.

    As these plots are generated using different columns, rather than faceting data in a single column based on values in another, there is no common color or marker column. Is there a way to generate molecular structures for these subplots?

    opened by matthewtoholland 3
  • Matrix distance to scatter plot

    Hello everyone,

    My name is Judith, and I would like to use your beautiful scripts for my PhD studies. I have a distance matrix of RMSD values between each pair of poses, but I don't see how to turn it into a scatter plot of 2 clusters. I tried with pandas but I'm really stuck: I can't select the rows and columns needed to generate the scatter plot.

    Best Regards, Judith

    opened by JudKil 7
  • Saving interactive plots

    Thanks for the great package!

    It would be fantastic if the interactive plots could be exported/saved. I understand that this is non-trivial in plotly, but other libraries like mpl3d also allow exporting as interactive HTML or SVG. See here for an exemplary plot. TMAP and Faerun also support this natively. I think it will be a heavily sought-after feature for real usability of this package.

    Possible solutions:

    • separate integration building on top of mpl3d (seems overkill, might be the last resort)
    • Building upon this gist to export the Dash as HTML: https://gist.github.com/exzhawk/33e5dcfc8859e3b6ff4e5269b1ba0ba4
    • Faerun-style solution, see here: https://github.com/reymond-group/faerun-python
    opened by jannisborn 2
Releases(v1.1.5)
  • v1.1.5(Nov 24, 2022)

    Added ability to plot 3D coordinates from RDKit Mol objects as highlighted in issue #20, as well as making facet plots as raised in issue #21.

  • v1.1.4(Jun 14, 2022)

    Loosened package dependencies to address issue #18. Minimum version requirements for dash, jupyter-dash, and werkzeug are specified but everything else e.g. pandas is loosened.

  • v1.1.3(Jun 1, 2022)

  • v1.1.2(Apr 6, 2022)

  • v1.1.1(Mar 1, 2022)

  • v1.1.0(Mar 1, 2022)

    Added features, formatting, and bug fixes :)

    • Simultaneous plotting of multiple smiles columns (pull request #1) can now be done by passing in a list of smiles columns into smiles_col (see examples/multiple_smiles_columns.ipynb for a tutorial).
    • Adjusting of hover box transparency (issue #3) can now be controlled with alpha and mol_alpha arguments (see entry in examples/simple_usage_and_formatting.ipynb for example usage).
    • Usage examples split into multiple notebooks and organised in examples folder.
    • Fixed bug (issue #6) resulting from specifying both color_col and marker_col.
  • v1.0.1(Feb 11, 2022)

    It seems that Dash got updated to 2.1.0 without me realising, and of course that broke jupyter-dash 😅 This is a hotfix to the requirements, specifying that dash 2.0.0 is required.

  • v1.0.0(Feb 11, 2022)
