DeepStruc is a Conditional Variational Autoencoder that predicts the structure of a mono-metallic nanoparticle from a Pair Distribution Function (PDF).

Overview

ChemRxiv | [Paper] XXX

DeepStruc

Welcome to DeepStruc, a Deep Generative Model (DGM) that learns the relation between PDF and atomic structure and thereby solves a structure from a PDF!

  1. DeepStruc
  2. Getting started (with Colab)
  3. Getting started (own computer)
    1. Install requirements
    2. Simulate data
    3. Train model
    4. Predict
  4. Authors
  5. Cite
  6. Acknowledgments
  7. License

We here apply DeepStruc to the structural analysis of a model system of mono-metallic nanoparticles (MMNPs) with seven different structure types and demonstrate the method on both simulated and experimental PDFs. DeepStruc can reconstruct simulated data with an average mean absolute error (MAE) of the atom xyz-coordinates of 0.093 ± 0.058 Å after fitting a contraction/extraction factor, an atomic displacement parameter (ADP) and a scale parameter. We demonstrate the generative capability of DeepStruc on a dataset of face-centered cubic (fcc), hexagonal close-packed (hcp) and stacking-faulted structures, where DeepStruc recognizes the stacking-faulted structures as an interpolation between fcc and hcp and constructs new structural models based on a PDF. In this example the MAE is 0.030 ± 0.019 Å.
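To make the reported metric concrete, here is a minimal sketch of an MAE over atom xyz-coordinates. It assumes the two structures are already aligned and that atoms are listed in the same order; it does not reproduce the fitting of the contraction/extraction factor, ADP or scale parameter described above. The function name `mean_absolute_error` is illustrative and not taken from the DeepStruc code.

```python
def mean_absolute_error(true_xyz, pred_xyz):
    """Average absolute difference over every x, y and z coordinate of every atom."""
    if len(true_xyz) != len(pred_xyz):
        raise ValueError("both structures must contain the same number of atoms")
    # Flatten to one absolute difference per coordinate (assumes matching atom order).
    diffs = [
        abs(t - p)
        for atom_true, atom_pred in zip(true_xyz, pred_xyz)
        for t, p in zip(atom_true, atom_pred)
    ]
    return sum(diffs) / len(diffs)


# Toy two-atom example with coordinates in Å (made-up numbers, not DeepStruc output).
true_xyz = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred_xyz = [(0.1, 0.0, 0.0), (2.0, 0.2, 0.0)]
mae = mean_absolute_error(true_xyz, pred_xyz)
print(f"MAE = {mae:.3f} Å")
```

For the toy structures the six coordinate differences sum to 0.3 Å, giving an MAE of 0.05 Å.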

The MMNPs are provided as a graph-based input to the encoder of DeepStruc. We compare DeepStruc with a similar DGM without the graph-based encoder. DeepStruc is able to reconstruct the structures using a latent space of smaller dimension and thus has a better generative capability. We also compare DeepStruc with a brute-force modelling approach and a tree-based classification algorithm. The ML models are significantly faster than the brute-force approach, and DeepStruc can furthermore create a latent space from which synthetic structures can be sampled, which the tree-based method cannot. The baseline models can be found in other repositories: brute-force, MetalFinder and CVAE.
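As a rough picture of what a graph-based representation of a nanoparticle can look like, the sketch below turns a list of atom positions into a distance-weighted adjacency matrix. This is only an illustration under stated assumptions: this README does not specify the graph encoding DeepStruc actually uses, and both `build_distance_graph` and the `cutoff` value are hypothetical.

```python
import math


def build_distance_graph(xyz, cutoff=3.0):
    """Return a symmetric adjacency matrix whose entries are interatomic distances.

    Atoms are nodes; two atoms share an edge when they are at most `cutoff` Å apart.
    Entries of 0.0 mean "no edge". The cutoff is an illustrative assumption.
    """
    n = len(xyz)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(xyz[i], xyz[j])
            if d <= cutoff:
                adjacency[i][j] = adjacency[j][i] = d
    return adjacency


# Three atoms on a line, 2.5 Å apart (toy coordinates).
atoms = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
adjacency = build_distance_graph(atoms)
for row in adjacency:
    print(row)
```

In this toy case neighbouring atoms are connected with weight 2.5, while the two end atoms (5.0 Å apart) are not connected.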

Getting started (with Colab)

Using DeepStruc on your own PDFs is straightforward and does not require anything installed or downloaded to your computer. Follow the instructions in our Colab notebook and try to play around.

Getting started (own computer)

Follow these steps if you want to train DeepStruc and predict with it locally on your own computer.

Install requirements

See the install folder.

Simulate data

See the data folder.

Train model

To train your own DeepStruc model simply run:

python train.py

The possible arguments are listed below; run the script with the '--help' argument for additional information.
If you are interested in changing the architecture of the model, go to train.py and change the model_arch dictionary.

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data_dir | Directory containing graph training, validation and test data. | str | -d ./data/graphs |
| -s or --save_dir | Directory where models will be saved. This is also used for loading a learner. | str | -s bst_model |
| -r or --resume_model | If 'True' the save_dir model is loaded and training is continued. | bool | -r True |
| -e or --epochs | Maximum number of epochs. | int | -e 100 |
| -b or --batch_size | Number of graphs in each batch. | int | -b 20 |
| -l or --learning_rate | Learning rate. | float | -l 1e-4 |
| -B or --beta | Initial beta value for scaling KLD. | float | -B 0.1 |
| -i or --beta_increase | Increment of beta when the threshold is met. | float | -i 0.1 |
| -x or --beta_max | Highest value beta can increase to. | float | -x 5 |
| -t or --reconstruction_th | Reconstruction threshold required before beta is increased. | float | -t 0.001 |
| -n or --num_files | Total number of files loaded. Files will be split 60/20/20. If 'None' then all files are loaded. | int | -n 500 |
| -c or --compute | Train model on CPU or GPU. Choices: 'cpu', 'gpu16', 'gpu32' and 'gpu64'. | str | -c gpu32 |
| -L or --latent_dim | Number of latent space dimensions. | int | -L 3 |
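The four beta-related arguments describe a schedule for the weight on the KL divergence (KLD) term of the loss. The sketch below shows one way such a schedule could work. It is an assumption about the behaviour, not the code in train.py: it assumes "the threshold is met" means the reconstruction loss has dropped below --reconstruction_th, and the function names are hypothetical.

```python
def update_beta(beta, reconstruction_loss, threshold=0.001, increase=0.1, beta_max=5.0):
    """Raise beta by `increase` when the reconstruction loss is below `threshold`.

    Defaults mirror the example values in the argument table; beta never exceeds beta_max.
    """
    if reconstruction_loss < threshold:
        beta = min(beta + increase, beta_max)
    return beta


def total_loss(reconstruction_loss, kld, beta):
    """Standard beta-weighted VAE objective: reconstruction term plus beta times KLD."""
    return reconstruction_loss + beta * kld


# Toy sequence of per-epoch reconstruction losses (made-up numbers).
beta = 0.1
losses = [0.01, 0.002, 0.0009, 0.0008, 0.003]
history = []
for loss in losses:
    beta = update_beta(beta, loss)
    history.append(round(beta, 3))
print(history)
```

Under these assumptions beta stays at its initial value until the reconstruction is good enough, then grows stepwise, so the model first learns to reconstruct structures and is only later pushed towards a well-regularized latent space.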

Predict

To predict an MMNP from a PDF using DeepStruc or your own model, run:

python predict.py

The possible arguments are listed below; run the script with the '--help' argument for additional information.

| Arg | Description | Type | Example |
| --- | --- | --- | --- |
| -h or --help | Prints help message. | | |
| -d or --data | Path to data or data directory. If pointing to a data directory, all datasets must have the same format. | str | -d data/experimental_PDFs/JQ_S1.gr |
| -m or --model | Path to model. If 'None' a GUI will open. | str | -m ./models/DeepStruc |
| -n or --num_samples | Number of samples/structures generated for each unique PDF. | int | -n 10 |
| -s or --sigma | Sample to '-s' sigma in the normal distribution. | float | -s 7 |
| -p or --plot_sampling | Plots sampled structures on top of DeepStruc training data. Model must be DeepStruc. | bool | -p True |
| -g or --save_path | Path to directory where predictions will be saved. | str | -g ./best_preds |
| -i or --index_plot | Highlights a specific reconstruction in the latent space. --data must be a specific file and not a directory, and '--plot_sampling True' must be set. | int | -i 4 |
| -P or --plot_data | If True then the first loaded PDF is plotted and shown after normalization. | bool | -P True |
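One possible reading of the --num_samples and --sigma options is that, for each PDF, the model predicts a normal distribution in the latent space and several latent points are drawn from a version of that distribution widened by the sigma factor. The sketch below illustrates that reading only; how predict.py actually applies sigma is an assumption here, and `sample_latent` and its parameters are hypothetical names.

```python
import random


def sample_latent(mean, std, num_samples=10, sigma=7.0, seed=None):
    """Draw latent vectors from a normal distribution centred on `mean`.

    Each dimension uses a standard deviation of sigma * std (assumed interpretation
    of the --sigma option). Returns a list of `num_samples` latent vectors.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        z = [rng.gauss(m, sigma * s) for m, s in zip(mean, std)]
        samples.append(z)
    return samples


# Toy 3-dimensional latent space (matching the '-L 3' example used for training).
samples = sample_latent(mean=[0.0, 0.0, 0.0], std=[1.0, 1.0, 1.0], num_samples=10, sigma=7.0, seed=0)
print(len(samples), "latent vectors of dimension", len(samples[0]))
```

Each sampled latent vector would then be passed through the decoder to generate one candidate structure, so a larger sigma explores structures further from the most likely prediction.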

Authors

Andy S. Anker¹
Emil T. S. Kjær¹
Marcus N. Weng¹
Simon J. L. Billinge²,³
Raghavendra Selvan⁴,⁵
Kirsten M. Ø. Jensen¹

¹ Department of Chemistry and Nano-Science Center, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
² Department of Applied Physics and Applied Mathematics Science, Columbia University, New York, NY 10027, USA.
³ Condensed Matter Physics and Materials Science Department, Brookhaven National Laboratory, Upton, NY 11973, USA.
⁴ Department of Computer Science, University of Copenhagen, 2100 Copenhagen Ø, Denmark.
⁵ Department of Neuroscience, University of Copenhagen, 2200 Copenhagen N, Denmark.

Should you have any questions, suggestions for improvement or bug reports, please contact us on GitHub or by email: [email protected] or [email protected].

Cite

If you use our code or our results, please consider citing our papers. Thanks in advance!

@article{kjær2022DeepStruc,
title={DeepStruc: Towards structure solution from pair distribution function data using deep generative models},
author={Emil T. S. Kjær and Andy S. Anker and Marcus N. Weng and Simon J. L. Billinge and Raghavendra Selvan and Kirsten M. Ø. Jensen},
year={2022}}
@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

Acknowledgments

Our code is developed based on the following publication:

@article{anker2020characterising,
title={Characterising the atomic structure of mono-metallic nanoparticles from x-ray scattering data using conditional generative models},
author={Anker, Andy Sode and Kjær, Emil TS and Dam, Erik B and Billinge, Simon JL and Jensen, Kirsten MØ and Selvan, Raghavendra},
year={2020}}

License

This project is licensed under the Apache License Version 2.0, January 2004 - see the LICENSE file for details.

Owner
Emil Thyge Skaaning Kjær
Ph.D student in nanoscience at the University of Copenhagen.