On Size-Oriented Long-Tailed Graph Classification of Graph Neural Networks

Related tags

Deep LearningSOLT-GNN
Overview

On Size-Oriented Long-Tailed Graph Classification of Graph Neural Networks

We provide the code (in PyTorch) and datasets for our paper "On Size-Oriented Long-Tailed Graph Classification of Graph Neural Networks" (SOLT-GNN for short), which is published in WWW-2022.

1. Descriptions

The repository is organised as follows:

  • dataset/: the original data and sampled subgraphs of the five benchmark datasets.
  • main.py: the main entry of tail graph classificaiton for SOLT-GIN.
  • gin.py: base GIN model.
  • PatternMemory.py: the module of pattern memory.
  • utils.py: contains tool functions for loading the data and data split.
  • subgraph_sample.py: contains codes for subgraph sampling.

2. Requirements

  • Python-3.8.5
  • Pytorch-1.8.1
  • Networkx-2.4
  • numpy-1.18.1

3. Running experiments

Experimental environment

Our experimental environment is Ubuntu 20.04.1 LTS (GNU/Linux 5.8.0-55-generic x86_64), and we train our model using NVIDIA GeForce RTX 1080 GPU with CUDA 11.0.

How to run

(1) First run subgraph_sample.py to complete the step of subgraph sampling before running the main.py. Note that, the sampled subgraph data may occupy some storage space.

  • python subgraph_sample.py

(2) Tail graph classification:

  • python main.py --dataset PTC --K 72 --alpha 0.3 --mu1 1.5 --mu2 1.5
  • python main.py --dataset PROTEINS --K 251 --alpha 0.15 --mu1 2 --mu2 2
  • python main.py --dataset DD --K 228 --alpha 0.1 --mu1 0.5 --mu2 0.5
  • python main.py --dataset FRANK --K 922 --alpha 0.1 --mu1 2 --mu2 0
  • python main.py --dataset IMDBBINARY --K 205 --alpha 0.15 --mu1 1 --mu2 1

Note

  • We repeat the experiments for five times and average the results for report (with standard deviation). Note that, for the five runs, we employ seeds {0, 1, 2, 3, 4} for parameters initialization, respectively.
  • The change of experimental environment (including the Requirements) may result in performance fluctuation for both the baselines and our SOLT-GNN. To reproduce the results in the paper, please set the experimental environment as illustrated above as much as possible. The utilized parameter settings are illustrated in the python commands. Note that, for the possible case of SOLT-GNN performing a bit worse which originates from environment change, the readers can further tune the parameters, including $\mu_1$, $\mu_2$, $\alpha$ and $d_m$. In particular, for these four hyper-parameters, we recommend the authors to tune them in {0.1, 0.5, 1, 1.5, 2}, {0.1, 0.5, 1, 1.5, 2}, {0.05, 0.1, 0.15, 0.2, 0.25, 0.3}, {16, 32, 64, 128}, respectively. As the performance of SOLT-GIN highly relates to GIN, so the tuning of hyper-parameters for GIN is encouraged. When tuning the hyper-parameters for SOLT-GNN, please first fix the configuration of GIN for efficiency.
  • To run the model on your own datasets, please refer to the following part (4. Input Data Format) for the dataset format.
  • The implementation of SOLT-GNN is based on the official implementation of GIN (https://github.com/weihua916/powerful-gnns).
  • To tune the other hyper-parameters, please refer to main.py for more details.
    • In particular, for the number of head graphs (marked as K in the paper) in each dataset, which decides the division of the heads/tails, the readers can tune K to explore the effect of different head/tail divisions.
    • Parameters $n_n$ and $n_g$ are the number of triplets for node- and subgraph-levels we used in the training, respectively. Performance improvement might be achieved by appropriately increasing the training triplets.

4. Input Data Format

In order to run SOLT-GNN on your own datasets, here we provide the input data format for SOLT-GNN as follows.

Each dataset XXX only contains one file, named as XXX.txt. Note that, in each dataset, we have a number of graphs. In particular, for each XXX.txt,

  • The first line only has one column, which is the number of graphs (marked as N) contained in this dataset; and the following part of this XXX.txt file is the data of each graph, including a total of N graphs.
  • In the data of each graph, the first line has two columns, which denote the number of nodes (marked as n) in this graph and the label of this graph, respectively. Following this line, there are n lines, with the i-th line corresponding to the information of node i in this graph (index i starts from 0). In each of these n lines (n nodes), the first column is the node label, the second column is the number of its neighbors (marked as m), and the following m columns correspond to the indeces (ids) of its neighbors.
    • Therefore, each graph has n+1 lines.

5. Cite

@inproceedings{liu2022onsize,
  title={On Size-Oriented Long-Tailed Graph Classification of Graph Neural Networks},
  author={Liu, Zemin and Mao, Qiheng and Liu, Chenghao and Fang, Yuan and Sun, Jianling},
  booktitle={Proceedings of the ACM Web Conference 2022},
  year={2022}
}

6. Contact

If you have any questions on the code and data, please contact Qiheng Mao ([email protected]).

Owner
Zemin Liu
My email address : liuzemin [AT] zju [DOT] edu [DOT] cn, liu [DOT] zemin [AT] hotmail [DOT] com
Zemin Liu
This is an official implementation for "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"

DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation This repo is the official implementation of "DeciWatch: A Simple Baseline for

117 Dec 24, 2022
A PyTorch implementation of " EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks."

EfficientNet A PyTorch implementation of EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. [arxiv] [Official TF Repo] Implemen

AhnDW 298 Dec 10, 2022
prior-based-losses-for-medical-image-segmentation

Repository for papers: Benchmark: Effect of Prior-based Losses on Segmentation Performance: A Benchmark Midl: A Surprisingly Effective Perimeter-based

Rosana EL JURDI 9 Sep 07, 2022
this is a lite easy to use virtual keyboard project for anyone to use

virtual_Keyboard this is a lite easy to use virtual keyboard project for anyone to use motivation I made this for this year's recruitment for RobEn AA

Mohamed Emad 3 Oct 23, 2021
TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification [NeurIPS 2021] Abstract Multiple instance learn

132 Dec 30, 2022
On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021))

PTvsBT On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) Citation Please cite a

Sunbow Liu 10 Nov 25, 2022
Code for the paper "Asymptotics of ℓ2 Regularized Network Embeddings"

README Code for the paper Asymptotics of L2 Regularized Network Embeddings. Requirements Requires Stellargraph 1.2.1, Tensorflow 2.6.0, scikit-learm 0

Andrew Davison 0 Jan 06, 2022
Justmagic - Use a function as a method with this mystic script, like in Nim

justmagic Use a function as a method with this mystic script, like in Nim. Just

witer33 8 Oct 08, 2022
Minecraft agent to farm resources using reinforcement learning

BarnyardBot CS 175 group project using Malmo download BarnyardBot.py into the python examples directory and run 'python BarnyardBot.py' in the console

0 Jul 26, 2022
Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning Reference Abeßer, J. & Müller, M. Towards Audio Domain Adapt

Jakob Abeßer 2 Jul 06, 2022
Neurolab is a simple and powerful Neural Network Library for Python

Neurolab Neurolab is a simple and powerful Neural Network Library for Python. Contains based neural networks, train algorithms and flexible framework

152 Dec 06, 2022
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

Xuan Hieu Duong 7 Jan 12, 2022
An Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering

PC-SOS-SDP: an Exact Solver for Semi-supervised Minimum Sum-of-Squares Clustering PC-SOS-SDP is an exact algorithm based on the branch-and-bound techn

Antonio M. Sudoso 1 Nov 13, 2022
Deep Learning Algorithms for Hedging with Frictions

Deep Learning Algorithms for Hedging with Frictions This repository contains the Forward-Backward Stochastic Differential Equation (FBSDE) solver and

Xiaofei Shi 3 Dec 22, 2022
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models.

Attack-Probabilistic-Models This is the source code for Adversarial Attacks on Probabilistic Autoregressive Forecasting Models. This repository contai

SRI Lab, ETH Zurich 25 Sep 14, 2022
FinEAS: Financial Embedding Analysis of Sentiment 📈

FinEAS: Financial Embedding Analysis of Sentiment 📈 (SentenceBERT for Financial News Sentiment Regression) This repository contains the code for gene

LHF Labs 31 Dec 13, 2022
FishNet: One Stage to Detect, Segmentation and Pose Estimation

FishNet FishNet: One Stage to Detect, Segmentation and Pose Estimation Introduction In this project, we combine target detection, instance segmentatio

1 Oct 05, 2022
Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

ood-text-emnlp Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them" Files fine_tune.py is used to finetune the GPT-2 mo

Udit Arora 19 Oct 28, 2022
This repository contains a CBIR system that uses swin transformer to extract image's feature.

Swin-transformer based CBIR This repository contains a CBIR(content-based image retrieval) system. Here we use Swin-transformer to extract query image

JsHou 12 Nov 17, 2022
Adjust Decision Boundary for Class Imbalanced Learning

Adjusting Decision Boundary for Class Imbalanced Learning This repository is the official PyTorch implementation of WVN-RS, introduced in Adjusting De

Peyton Byungju Kim 16 Jan 04, 2023