Scalable Graph Neural Networks for Heterogeneous Graphs

Overview

Neighbor Averaging over Relation Subgraphs (NARS)

NARS is an algorithm for node classification on heterogeneous graphs. It extends scalable neighbor-averaging techniques, previously used in homogeneous settings such as SIGN, to heterogeneous scenarios by generating neighbor-averaged features on subgraphs induced by sampled relation subsets.
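
At its core, this mirrors SIGN's precomputation, run once per sampled relation subset. Below is a minimal sketch of that per-subset step (illustrative only; the function and variable names are ours, not the repository's):

# Illustrative sketch (our naming, not the repository's code): SIGN-style
# neighbor averaging on the subgraph induced by one sampled relation subset.
import numpy as np
import scipy.sparse as sp

def neighbor_averaged_features(adj, feats, num_hops):
    """Return [X, A_hat @ X, A_hat^2 @ X, ...] for one relation subgraph.

    adj   : scipy.sparse adjacency matrix of the relation-induced subgraph
    feats : (num_nodes, dim) node features (TransE embeddings for
            featureless node types, raw features for paper nodes)
    """
    deg = np.asarray(adj.sum(axis=1)).ravel()
    deg[deg == 0] = 1.0                   # guard isolated nodes
    a_hat = sp.diags(1.0 / deg) @ adj     # row-normalize: mean over neighbors
    hops = [feats]
    for _ in range(num_hops):
        hops.append(a_hat @ hops[-1])     # one more hop of averaging
    return hops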

For more details, please check out our paper:

Scalable Graph Neural Networks for Heterogeneous Graphs

Setup

Dependencies

  • torch==1.5.1+cu101
  • dgl-cu101==0.4.3.post2
  • ogb==1.2.1
  • dglke==0.1.0
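
Assuming a CUDA 10.1 machine, one possible way to install these is via pip (adjust the wheel index for your setup):

pip install torch==1.5.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install dgl-cu101==0.4.3.post2 ogb==1.2.1 dglke==0.1.0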

Docker

We have prepared a dockerfile for building a container with a clean environment and all required dependencies. Please check out the instructions in docker.
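
For example (the image tag and build context below are our assumptions; see the docker directory for the exact steps):

docker build -t nars ./docker
docker run --gpus all -it nars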

Data Preparation

Download and pre-process OAG dataset (optional)

If you plan to evaluate on the OAG dataset, you need to follow the instructions in oag_dataset to download and pre-process the dataset.

Generate input for featureless node types

In the academic graph datasets (ACM, MAG, OAG), only paper nodes are associated with input features. NARS featurizes the other node types with TransE relational graph embeddings pre-trained on the graph structure.

Please follow the instructions in graph_embed to generate embeddings for each dataset.
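
As a rough illustration of what this produces (the file names below are placeholders we made up, not the actual outputs of graph_embed):

# Hypothetical sketch: back-fill featureless node types with pre-trained
# TransE embeddings; only paper nodes keep their raw input features.
import numpy as np
import torch

def build_node_features(dataset, node_types):
    feats = {}
    feats["paper"] = torch.from_numpy(np.load(f"{dataset}_paper_feat.npy"))
    for ntype in node_types:              # e.g. author, field, institution
        if ntype == "paper":
            continue
        emb = np.load(f"TransE_{dataset}/{ntype}.npy")   # placeholder path
        feats[ntype] = torch.from_numpy(emb)
    return feats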

Sample relation subsets

NARS samples Relation Subsets (see our paper for details). Please follow the instructions in sample_relation_subsets to generate these subsets.

Alternatively, you may skip this step and use the example subsets that have been added to this repository.
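
For intuition, relation-subset sampling amounts to drawing distinct non-empty subsets of the graph's relation types. A simplified sketch (see the paper and sample_relation_subsets for the exact strategy used):

import random

def sample_relation_subsets(relations, num_subsets, seed=0):
    """Draw `num_subsets` distinct non-empty subsets of `relations`."""
    assert num_subsets <= 2 ** len(relations) - 1
    rng = random.Random(seed)
    subsets = set()
    while len(subsets) < num_subsets:
        k = rng.randint(1, len(relations))
        subsets.add(tuple(sorted(rng.sample(relations, k))))
    return sorted(subsets)

# e.g. for OGBN-MAG's relation types:
# sample_relation_subsets(["cites", "writes", "affiliated_with", "has_topic"], 8)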

Run NARS Experiments

NARS is evaluated on three academic graph datasets to predict the publishing venues and fields of papers.

ACM

python3 train.py --dataset acm --use-emb TransE_acm --R 2 \
    --use-relation-subsets sample_relation_subsets/examples/acm \
    --num-hidden 64 --lr 0.003 --dropout 0.7 --eval-every 1 \
    --num-epochs 100 --input-dropout

OGBN-MAG

python3 train.py --dataset mag --use-emb TransE_mag --R 5 \
    --use-relation-subsets sample_relation_subsets/examples/mag \
    --eval-batch-size 50000 --num-hidden 512 --lr 0.001 --batch-size 50000 \
    --dropout 0.5 --num-epochs 1000

OAG (venue prediction)

python3 train.py --dataset oag_venue --use-emb TransE_oag_venue --R 3 \
    --use-relation-subsets sample_relation_subsets/examples/oag_venue \
    --eval-batch-size 25000 --num-hidden 256 --lr 0.001 --batch-size 1000 \
    --data-dir oag_dataset --dropout 0.5 --num-epochs 200

OAG (L1-field prediction)

python3 train.py --dataset oag_L1 --use-emb TransE_oag_L1 --R 3 \
    --use-relation-subsets sample_relation_subsets/examples/oag_L1 \
    --eval-batch-size 25000 --num-hidden 256 --lr 0.001 --batch-size 1000 \
    --data-dir oag_dataset --dropout 0.5 --num-epochs 200

Results

Here is a summary of model performance using example relation subsets:

For the ACM and OGBN-MAG datasets, the task is to predict the paper publishing venue.

Dataset    # Params  Test Accuracy
ACM        0.40M     0.9305±0.0043
OGBN-MAG   4.13M     0.5240±0.0016

For the OAG dataset, there are two different node prediction tasks: predicting venue (single-label) and L1-field (multi-label). Following Heterogeneous Graph Transformer, we evaluate using the NDCG and MRR metrics.

Task      # Params  NDCG           MRR
Venue     2.24M     0.5214±0.0010  0.3434±0.0012
L1-field  1.41M     0.8642±0.0022  0.8542±0.0019
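
For reference, the two metrics follow their standard definitions. The sketch below is ours, not the repository's or HGT's evaluation code:

import numpy as np

def mrr(scores, label):
    """Reciprocal rank of the true class (single-label venue prediction)."""
    rank = int(np.where(np.argsort(-scores) == label)[0][0]) + 1
    return 1.0 / rank

def ndcg(scores, labels, k=None):
    """NDCG for multi-label L1-field prediction; `labels` is a 0/1 vector."""
    order = np.argsort(-scores)
    gains = labels[order][:k]                 # relevance in predicted order
    discounts = 1.0 / np.log2(np.arange(2, gains.size + 2))
    dcg = float((gains * discounts).sum())
    ideal = np.sort(labels)[::-1][:k]         # best possible ordering
    idcg = float((ideal * discounts[:ideal.size]).sum())
    return dcg / idcg if idcg > 0 else 0.0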

Run with limited GPU memory

The above commands were tested on a Tesla V100 (32 GB) and a Tesla T4 (15 GB). If your GPU doesn't have enough memory to handle large graphs, try the following:

  • add --cpu-process to the command to move the pre-processing logic to the CPU
  • reduce the evaluation batch size with --eval-batch-size; the evaluation result won't be affected since the model is fixed
  • reduce the training batch size with --batch-size

Run NARS with Reduced CPU Memory Footprint

As mentioned in our paper, using a large number of relation subsets may consume too much CPU memory. To reduce the CPU memory footprint, we implemented an optimization in train_partial.py, which trains only part of the feature aggregation weights at a time.

Using the OGBN-MAG dataset as an example, the following command randomly picks 3 out of the 8 sampled relation subsets every 10 epochs and trains only their aggregation weights.

python3 train_partial.py --dataset mag --use-emb TransE_mag --R 5 \
    --use-relation-subsets sample_relation_subsets/examples/mag \
    --eval-batch-size 50000 --num-hidden 512 --lr 0.001 --batch-size 50000 \
    --dropout 0.5 --num-epochs 1000 --sample-size 3 --resample-every 10
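
Conceptually, the subset rotation follows a schedule like this (a sketch with our own names, not the actual code in train_partial.py):

import random

def subset_schedule(all_subsets, sample_size, resample_every, num_epochs, seed=0):
    """Yield, per epoch, the relation subsets whose aggregation weights are
    trained; the remaining subsets' features need not stay in CPU memory."""
    rng = random.Random(seed)
    active = rng.sample(all_subsets, sample_size)
    for epoch in range(num_epochs):
        if epoch > 0 and epoch % resample_every == 0:
            active = rng.sample(all_subsets, sample_size)
        yield epoch, active

# e.g. subset_schedule(list(range(8)), sample_size=3, resample_every=10, num_epochs=1000)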

Citation

Please cite our paper with:

@article{yu2020scalable,
    title={Scalable Graph Neural Networks for Heterogeneous Graphs},
    author={Yu, Lingfan and Shen, Jiajun and Li, Jinyang and Lerer, Adam},
    journal={arXiv preprint arXiv:2011.09679},
    year={2020}
}

License

NARS is CC-BY-NC licensed, as found in the LICENSE file.
