MPLP: Metapath-Based Label Propagation for Heterogenous Graphs

Related tags

Deep Learningmplp
Overview

MPLP: Metapath-Based Label Propagation for Heterogenous Graphs

Results on MAG240M

Here, we demonstrate the following performance on the MAG240M dataset from [email protected] 2021.

Model Test Acc Validation Acc Parameters Hardware
Our Model 0.7447 0.7669 ± 0.0003 (ensemble 0.7696) 743,449 Tesla V100 (21GB)

Reproducing results

0. Requirements

Here just list python3 packages we used in this competition:

numpy==1.19.2
torch==1.5.1+cu101
dgl-cu101==0.6.0.post1
ogb==1.3.1
sklearn==0.23.2
tqdm==4.46.1

1. Prepare Graph and Features

The preprocess code modifed from dgl baseline. We created graph with 6 different edge types instead of 5.

# Time cost: 3hours,30mins

python3 $MAG_CODE_PATH/preprocess.py
        --rootdir $MAG_INPUT_PATH \
        --author-output-path $MAG_PREP_PATH/author.npy \
        --inst-output-path $MAG_PREP_PATH/inst.npy \
        --graph-output-path $MAG_PREP_PATH \
        --graph-as-homogeneous \
        --full-output-path $MAG_PREP_PATH/full_feat.npy

The graphs and features will be saved in MAG_PREP_PATH , where the MAG_PREP_PATH is specified in run.sh.

Calculate features

The meta-path based features are generated by this script. Details can be found in our technical report.

# Time cost: 2hours,20mins (only generate label related features)

python3 $MAG_CODE_PATH/feature.py
        $MAG_INPUT_PATH \
        $MAG_PREP_PATH/dgl_graph_full_heterogeneous_csr.bin \
        $MAG_FEAT_PATH \
        --seed=42

Train RGAT model and prepare RGAT features

The RGAT model is modifed from dgl baseline. The validation accuracy is 0.701 , as same as described in the dgl baseline github.

# Time cost: 33hours,40mins (20mins for each epoch)

python3 $MAG_CODE_PATH/rgat.py
        --rootdir $MAG_INPUT_PATH \
        --graph-path $MAG_PREP_PATH/dgl_graph_full_homogeneous_csc.bin \
        --full-feature-path $MAG_PREP_PATH/full_feat.npy \
        --output-path $MAG_RGAT_PATH/ \
        --epochs=100 \
        --model-path $MAG_RGAT_PATH/model.pt \
        --submission-path $MAG_RGAT_PATH/

You will get embeddings as input features of the following MPLP models.

2. Train MPLP models

The train process splits to two steps:

  1. train the model with full train samples at a large learning rate (here we use StepLR(lr=0.01, step_size=100, gamma=0.25))
  2. then fine tune the model with latest train samples (eg, paper with year >= 2018) with a small learning rate (0.000625)

You can train the MPLP model by running the following commands:

# Time cost: 2hours,40mins for each seed

for seed in $(seq 0 7);
do
    python3 $MAG_CODE_PATH/mplp.py \
            $MAG_INPUT_PATH \
            $MAG_MPLP_PATH/data/ \
            $MAG_MPLP_PATH/output/seed${seed} \
            --gpu \
            --seed=${seed} \
            --batch_size=10240 \
            --epochs=200 \
            --num_layers=2 \
            --learning_rate=0.01 \
            --dropout=0.5 \
            --num_splits=5
done

3. Ensemble MPLP results

While having all the results with k-fold cross validation training under 8 different seeds, you can average the results by running code below:

python3 $MAG_CODE_PATH/ensemble.py $MAG_MPLP_PATH/output/ $MAG_SUBM_PATH
Owner
Qiuying Peng
Astrophysics -> Data Science
Qiuying Peng
BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation

BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation Installing The Dependencies $ conda create --name beametrics python

7 Jul 04, 2022
ECLARE: Extreme Classification with Label Graph Correlations

ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal

Extreme Classification 35 Nov 06, 2022
The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

WSRGlow The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio sa

Kexun Zhang 96 Jan 03, 2023
Multi-modal co-attention for drug-target interaction annotation and Its Application to SARS-CoV-2

CoaDTI Multi-modal co-attention for drug-target interaction annotation and Its Application to SARS-CoV-2 Abstract Environment The test was conducted i

Layne_Huang 7 Nov 14, 2022
Code for our NeurIPS 2021 paper Mining the Benefits of Two-stage and One-stage HOI Detection

CDN Code for our NeurIPS 2021 paper "Mining the Benefits of Two-stage and One-stage HOI Detection". Contributed by Aixi Zhang*, Yue Liao*, Si Liu, Mia

71 Dec 14, 2022
Tutorial to set up TensorFlow Object Detection API on the Raspberry Pi

A tutorial showing how to set up TensorFlow's Object Detection API on the Raspberry Pi

Evan 1.1k Dec 26, 2022
Code for our ALiBi method for transformer language models.

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation This repository contains the code and models for our paper Tra

Ofir Press 211 Dec 31, 2022
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 60 Jan 04, 2023
This is the repository for paper NEEDLE: Towards Non-invertible Backdoor Attack to Deep Learning Models.

This is the repository for paper NEEDLE: Towards Non-invertible Backdoor Attack to Deep Learning Models.

1 Oct 25, 2021
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
Code for NeurIPS 2021 paper 'Spatio-Temporal Variational Gaussian Processes'

Spatio-Temporal Variational GPs This repository is the official implementation of the methods in the publication: O. Hamelijnck, W.J. Wilkinson, N.A.

AaltoML 26 Sep 16, 2022
PyTorch framework, for reproducing experiments from the paper Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks. Code, based on the PyTorch framework, for reprodu

Asaf 3 Dec 27, 2022
Repository for MDPGT

MD-PGT Repository for implementing and reproducing the results for the paper MDPGT: Momentum-based Decentralized Policy Gradient Tracking. Available E

Xian Yeow Lee 2 Dec 30, 2021
(AAAI2020)Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing This repository contains pytorch source code for AAAI2020 oral paper: Grapy-ML

54 Aug 04, 2022
DeepStruc is a Conditional Variational Autoencoder which can predict the mono-metallic nanoparticle from a Pair Distribution Function.

ChemRxiv | [Paper] XXX DeepStruc Welcome to DeepStruc, a Deep Generative Model (DGM) that learns the relation between PDF and atomic structure and the

Emil Thyge Skaaning Kjær 13 Aug 01, 2022
This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression problems

Doctoral dissertation of Zheng Zhao This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression pro

Zheng Zhao 21 Nov 14, 2022
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

Pytorch implementation of paper Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

Hrishikesh Kamath 31 Nov 20, 2022
Approaches to modeling terrain and maps in python

topography 🌎 Contains different approaches to modeling terrain and topographic-style maps in python Features Inverse Distance Weighting (IDW) A given

John Gutierrez 1 Aug 10, 2022
Implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

SemCo The official pytorch implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

42 Nov 14, 2022