plotly scatterplots which show molecule images on hover!

Overview

molplotly

Plotly scatterplots which show molecule images on hovering over the datapoints!

Beautiful :)

Required packages:

➡️ See example.ipynb for an example :)

📜 Usage

import pandas as pd
import plotly.express as px

import molplotly

# load a DataFrame with smiles
df_esol = pd.read_csv('esol.csv')
df_esol['y_pred'] = df_esol['ESOL predicted log solubility in mols per litre']
df_esol['y_true'] = df_esol['measured log solubility in mols per litre']

# generate a scatter plot
fig = px.scatter(df_esol, x="y_true", y="y_pred")

# add molecules to the plotly graph - returns a Dash app
app = molplotly.add_molecules(fig=fig, 
                            df=df_esol, 
                            smiles_col='smiles', 
                            title_col='Compound ID', 
                            )

# run Dash app inline in notebook (or in an external server)
app.run_server(mode='inline', port=8011, height=1000)

Input parameters

  • fig : plotly.graph_objects.Figure object
    a plotly figure object containing datapoints plotted from df
  • df : pandas.DataFrame object
    a pandas dataframe that contains the data plotted in fig
  • smiles_col : str, optional
    name of the column in df containing the smiles plotted in fig (default 'SMILES')
  • show_img : bool, optional
    whether or not to generate the molecule image in the dash app (default True)
  • title_col : str, optional
    name of the column in df to be used as the title entry in the hover box (default None)
  • show_coords : bool, optional
    whether or not to show the coordinates of the data point in the hover box (default True)
  • caption_cols : list, optional
    list of column names in df to be included in the hover box (default None)
  • condition_col : str, optional
    name of the column in df that is used to color the datapoints in fig; required when the figure uses discrete conditional coloring (default None)
  • wrap : bool, optional
    whether or not to wrap the title text to multiple lines if the length of the text is too long (default True)
  • wraplen : int, optional
    the threshold length of the title text before wrapping begins - adjust when changing the width of the hover box (default 20)
  • width : int, optional
    the width in pixels of the hover box (default 150)
  • fontfamily : str, optional
    the font family used in the hover box (default 'Arial')
  • fontsize : int, optional
    the font size used in the hover box - the font of the title line is fontsize+2 (default 12)
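
The wrap and wraplen options can be pictured with a small standard-library sketch. This is not molplotly's implementation: the helper name wrap_title and the `<br>` join are assumptions, used only to illustrate how a title longer than wraplen characters could be broken over several lines.

```python
import textwrap


def wrap_title(title: str, wrap: bool = True, wraplen: int = 20) -> str:
    """Hypothetical helper: break a long title into lines of at most wraplen characters."""
    if not wrap or len(title) <= wraplen:
        return title
    # textwrap.wrap splits on whitespace (and breaks over-long words);
    # the pieces are joined with an HTML line break, as a hover box might need.
    return "<br>".join(textwrap.wrap(title, width=wraplen))


print(wrap_title("Benzene"))
print(wrap_title("a fairly long compound name that does not fit", wraplen=20))
```

If the hover box is made wider via width, a larger wraplen would keep the title from wrapping earlier than necessary.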

Output parameters

By default, a JupyterDash app is returned, which can be run inline in a Jupyter notebook or deployed on a server via app.run_server().

Acknowledgements

Features to-add:

  1. Individual styles for each caption (fonts, colors etc)
  2. Some way to save the plot
  3. Highlight points by clicking on them
  4. SVG image generation
Comments
  • Error with dash 2.3.0

    Hello,

    First of all thanks for this great package.

    Today, I got the following error while importing molplotly whereas I previously had no issue: ImportError: cannot import name 'Input' from 'dash'

    As my version of dash was a bit old (1.6, I believe), I upgraded it to version 2.3.0 via pip. The import error disappeared, but I got another error when trying to run the server with app.run_server(mode='inline', port=8003, height=800): AttributeError: ('Read-only: can only be set in the Dash constructor or during init_app()', 'requests_pathname_prefix')

    I have managed to get around by downgrading dash to version 2.0.0 as recommended here https://stackoverflow.com/questions/70908709/jupyterdash-app-run-server-error-using-jupyter-notebook, but there may be something to look into...
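
A quick way to check which dash version is installed before calling run_server, sketched with the standard library only. The helper names are hypothetical, and the 2.0.0 threshold reflects only what is reported in this thread and in the v1.0.1 release note (dash 2.0.0 worked with jupyter-dash at the time, 2.1.0 and 2.3.0 did not); later releases may behave differently.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '2.3.0' into (2, 3, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def newer_than_reported_good(installed: str, reported_good: str = "2.0.0") -> bool:
    """True if the installed version is newer than the one reported to work."""
    return version_tuple(installed) > version_tuple(reported_good)


try:
    installed = metadata.version("dash")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("dash is not installed")
elif newer_than_reported_good(installed):
    print(f"dash {installed} is newer than 2.0.0, the version reported to work here")
else:
    print(f"dash {installed} is not newer than 2.0.0")
```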

    Thanks again

    opened by remseven 4
  • Pip install error

    Great idea, this package! However, I ran into an issue during installation: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1822: character maps to <undefined>. This is certainly due to unusual characters in the readme, since I could install the package locally after editing the readme file. Cheers!
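
This kind of error is what Python raises when a UTF-8 file containing characters such as emoji is read with a legacy Windows code page. A minimal reproduction follows, assuming the readme is read without an explicit encoding during installation (the actual setup code is not shown here); the file name and text are illustrative.

```python
import os
import tempfile

# U+27A1 followed by the variation selector U+FE0F renders as an arrow emoji.
# In UTF-8 the selector ends with byte 0x8f, which cp1252 cannot decode.
text = "\u27a1\ufe0f See example.ipynb for an example"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)

    try:
        # Simulates a platform whose default encoding is cp1252.
        with open(path, encoding="cp1252") as fh:
            fh.read()
        decode_failed = False
    except UnicodeDecodeError as exc:
        decode_failed = True
        print(exc)

    # Passing the encoding explicitly makes the read independent of the platform default.
    with open(path, encoding="utf-8") as fh:
        recovered = fh.read()

print(decode_failed, recovered == text)
```

Opening the readme with encoding="utf-8" wherever it is read at install time would be one way to avoid the problem without editing the file.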

    opened by ArnaudGaudry 4
  • Dependency versions for pip install

    Is there some reason for the very specific version requirements for the dependencies? For the pip install I would loosen this up unless there are any particular known issues.

    opened by kjelljorner 3
  • Setup testing + CI

    Best not to wait too long to start testing. Here's a basic scaffold to get started.

    Probably shouldn't be merged as is but y'all can push more commits here to get this in shape.

    opened by janosh 3
  • Plotting in a running Dash app

    Hi! I have a small dash app that I use to explore the molecules present in different samples. It is possible to select the samples of interest and then display a plotly scatter plot generated using the structures. I tried to add the molplotly layer on the scatter plot but no molecules are displayed. Any experience on that? Thanks!

    opened by ArnaudGaudry 2
  • Defining color and markers simultaneously in px.scatter causes issues with hoverbox

    Hi there, thanks for providing a great and easy to use tool!

    This issue is reproducible with the first example in the documentation:

    df_esol['delY'] = df_esol["y_pred"] - df_esol["y_true"]
    fig_scatter = px.scatter(df_esol,
                             x="y_true",
                             y="y_pred",
                             color='delY',
                             symbol='Minimum Degree', # <- addition
                             title='ESOL Regression (default plotly)',
                             labels={'y_pred': 'Predicted Solubility',
                                     'y_true': 'Measured Solubility',
                                     'delY': 'ΔY'},
                             width=1200,
                             height=800)
    
    # This adds a dashed line for what a perfect model _should_ predict
    y = df_esol["y_true"].values
    fig_scatter.add_shape(
        type="line", line=dict(dash='dash'),
        x0=y.min(), y0=y.min(),
        x1=y.max(), y1=y.max()
    )
    
    fig_scatter.update_layout(title='ESOL Regression (with add_molecules!)')
    
    app_scatter = molplotly.add_molecules(fig=fig_scatter,
                                          df=df_esol,
                                          smiles_col='smiles',
                                          title_col='Compound ID',
                                          color_col='delY' # <- addition
                                          )
    
    # change the arguments here to run the dash app on an external server and/or change the size of the app!
    app_scatter.run_server(mode='inline', port=8001, height=1000)
    
    

    This returns

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/molplotly/main.py in display_hover(
        hoverData={'points': [{'bbox': {'x0': 948.39, 'x1': 950.39, 'y0': 177.7, 'y1': 179.7}, 'curveNumber': 0, 'marker.color': -0.48000000000000004, 'pointIndex': 960, 'pointNumber': 960, 'x': 0.79, 'y': 0.31}]}
    )
        111             df_curve = df[df[color_col] ==
        112                           curve_dict[curve_num]].reset_index(drop=True)
    --> 113             df_row = df_curve.iloc[num]
            df_row = undefined
            df_curve.iloc = <pandas.core.indexing._iLocIndexer object at 0x7f7e3d16c950>
            num = 960
        114         else:
        115             df_row = df.iloc[num]
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960
    )
        929 
        930             maybe_callable = com.apply_if_callable(key, self.obj)
    --> 931             return self._getitem_axis(maybe_callable, axis=axis)
            self._getitem_axis = <bound method _iLocIndexer._getitem_axis of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            maybe_callable = 960
            axis = 0
        932 
        933     def _is_scalar_access(self, key: tuple):
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1564 
       1565             # validate the location
    -> 1566             self._validate_integer(key, axis)
            self._validate_integer = <bound method _iLocIndexer._validate_integer of <pandas.core.indexing._iLocIndexer object at 0x7f7e3d490c50>>
            key = 960
            axis = 0
       1567 
       1568             return self.obj._ixs(key, axis=axis)
    
    ~/anaconda3/envs/ml/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(
        self=<pandas.core.indexing._iLocIndexer object>,
        key=960,
        axis=0
    )
       1498         len_axis = len(self.obj._get_axis(axis))
       1499         if key >= len_axis or key < -len_axis:
    -> 1500             raise IndexError("single positional indexer is out-of-bounds")
            global IndexError = undefined
       1501 
       1502     # -------------------------------------------------------------------
    
    IndexError: single positional indexer is out-of-bounds
    

    Using either only marker or color alone causes no issues with the hoverbox. Also, using Minimum Degree as color_col for add_molecules when both color and symbol are defined gives no issues.
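
The traceback suggests why the lookup fails: the rows are first filtered by color_col and then indexed with the point number reported by plotly. A plain-Python sketch of that pattern (lists instead of pandas; the data and the contents of curve_dict are made up for illustration) shows how a continuous column such as delY can leave too few rows for the index to be valid.

```python
# Toy stand-in for the DataFrame: one dict per row (values are illustrative only).
rows = [{"smiles": f"mol{i}", "delY": round(0.01 * i, 2)} for i in range(1000)]

# Assumed mapping from plotly curve number to a single value of color_col.
curve_dict = {0: rows[0]["delY"]}

# Shape of the hover event, reduced to the two fields used for the lookup.
hover_point = {"curveNumber": 0, "pointNumber": 960}

# Same pattern as in the traceback: filter by color_col, then index by point number.
target = curve_dict[hover_point["curveNumber"]]
rows_for_curve = [r for r in rows if r["delY"] == target]

try:
    row = rows_for_curve[hover_point["pointNumber"]]
except IndexError:
    row = None
    print(
        f"only {len(rows_for_curve)} row(s) match, "
        f"but the point number is {hover_point['pointNumber']}"
    )
```

With a discrete column, each curve would correspond to one category and the per-curve point number would stay within bounds, which is consistent with the observation that Minimum Degree works as color_col.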

    opened by chertianser 2
  • Removed support for python 3.7

    Thanks for the great library, really useful!

    You pinned the pandas version to ~=1.4.1, which effectively cuts off users on Python 3.7, since pandas 1.4 requires Python 3.8 or newer. See the pandas release notes: https://pandas.pydata.org/docs/whatsnew/v1.4.0.html

    Is this intentional? Which novel features from pandas >1.4.0 are strictly necessary to keep the package running? I'll create a PR with a relaxed pandas requirements that works fine for me in a python3.7 env.

    opened by jannisborn 1
  • Input params as table

    I find list of parameters easier to parse in a table than as list. Here's what that would look like. If you think it's an improvement, could consider joining type and default columns to save horizontal space.

    [Screenshots: parameter table (after) and parameter list (before)]

    opened by janosh 1
  • Rokas slider

    Implemented support for specifying multiple smiles in the smiles_col argument for molplotly.add_molecules:

    • When a single str argument is passed to smiles_col, the function behaves as before.
    • When a list is passed, a slider is created under the plot, which allows the user to decide which column to use to render molecules.
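
The two cases can be handled by normalising the argument up front. This is a minimal sketch, not the code from this pull request; normalize_smiles_cols is a hypothetical helper name and the column names are examples.

```python
from typing import List, Union


def normalize_smiles_cols(smiles_col: Union[str, List[str]]) -> List[str]:
    """Return a list of column names whether a single name or a list was passed."""
    if isinstance(smiles_col, str):
        return [smiles_col]
    return list(smiles_col)


cols = normalize_smiles_cols(["smiles", "product_smiles"])
# A slider is only needed when there is more than one column to choose from.
show_slider = len(cols) > 1
print(cols, show_slider)
```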

    Also changed the ports in example.ipynb to start from 8700 and go up. On my system, 8000 and 8001 were already reserved.

    opened by RokasEl 1
  • Error when using with Plotly Subplots

    When trying to use molplotly to generate hover structures with a series of scatterplots generated using make_subplots (generated using different columns of a dataframe for the same RDKit molecule row), molplotly.add_molecules returns

    ValueError: More than one plotly curve in figure - color_col and/or marker_col needs to be specified.

    As these plots are generated using different columns, rather than faceting data in a single column based on values in another, there is no common color or marker column. Is there a way to generate molecular structures for these subplots?

    opened by matthewtoholland 3
  • Matrix distance to scatter plot

    Hello everyone,

    My name is Judith, and I would like to use your beautiful scripts for my PhD studies. I have a distance matrix of RMSD values between each pair of poses, but I don't see how to turn it into a scatter plot of 2 clusters. I tried with pandas but I'm stuck: I can't work out which rows and columns to select to generate the scatter plot.

    Best Regards, Judith

    opened by JudKil 7
  • Saving interactive plots

    Thanks for the great package!

    It would be fantastic if the interactive plots could be exported/saved. I understand that this is non-trivial in plotly, but other libraries like mpl3d also allow exporting as interactive HTML or SVG. See here for an exemplary plot. Also, TMAP and Faerun support this natively. I think it will be a heavily sought-after feature for real usability of this package.

    Possible solutions:

    • separate integration building on top of mpl3d (seems overkill, might be the last resort)
    • Building upon this gist to export the Dash as HTML: https://gist.github.com/exzhawk/33e5dcfc8859e3b6ff4e5269b1ba0ba4
    • Faerun-style solution, see here: https://github.com/reymond-group/faerun-python
    opened by jannisborn 2
Releases (v1.1.5)
  • v1.1.5 (Nov 24, 2022)

    Added ability to plot 3D coordinates from RDKit Mol objects as highlighted in issue #20, as well as making facet plots as raised in issue #21.

  • v1.1.4 (Jun 14, 2022)

    Loosened package dependencies to address issue #18. Minimum version requirements for dash, jupyter-dash, and werkzeug are specified but everything else e.g. pandas is loosened.

  • v1.1.3 (Jun 1, 2022)

  • v1.1.2 (Apr 6, 2022)

  • v1.1.1 (Mar 1, 2022)

  • v1.1.0 (Mar 1, 2022)

    Added features, formatting, and bug fixes :)

    • Simultaneous plotting of multiple smiles columns (pull request #1) can now be done by passing in a list of smiles columns into smiles_col (see examples/multiple_smiles_columns.ipynb for a tutorial).
    • Adjusting of hover box transparency (issue #3) can now be controlled with alpha and mol_alpha arguments (see entry in examples/simple_usage_and_formatting.ipynb for example usage).
    • Usage examples split into multiple notebooks and organised in examples folder.
    • Fixed bug (issue #6) resulting from specifying both color_col and marker_col.
  • v1.0.1 (Feb 11, 2022)

    It seems that Dash got updated to 2.1.0 without me realising, and of course that broke jupyter-dash 😅. This is a hotfix to the requirements, specifying that dash 2.0.0 is required.

  • v1.0.0 (Feb 11, 2022)
