HeCo

This repo contains the source code of the KDD 2021 paper "Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning".
Paper Link: https://arxiv.org/abs/2105.09111

Environment Settings

python==3.8.5
scipy==1.5.4
torch==1.7.0
numpy==1.19.2
scikit_learn==0.24.2

GPU: GeForce RTX 2080 Ti
CPU: Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz
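
To compare an existing environment against the package versions listed above, a small check along the following lines may help. This is only a sketch: the helper name find_mismatches is hypothetical and not part of this repo, and "scikit-learn" is assumed to be the installed distribution name for the scikit_learn entry. A CUDA build of torch may report a suffixed version such as "1.7.0+cu110", which this strict comparison would flag as a mismatch.

```python
from importlib import metadata

# Versions pinned in the list above; "scikit-learn" is the assumed
# distribution name for the "scikit_learn" entry.
PINNED = {
    "scipy": "1.5.4",
    "torch": "1.7.0",
    "numpy": "1.19.2",
    "scikit-learn": "0.24.2",
}


def find_mismatches(pinned):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the pinned one. `installed` is None
    when the package is not installed at all."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Report every package that deviates from the pinned version.
for name, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```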

Usage

First, go into ./code, and then use the following command to run our model:

python main.py acm --gpu=0

Here, "acm" can be replaced by "dblp", "aminer" or "freebase".
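
To run all four datasets in sequence, the command above can be wrapped in a short script. The sketch below assumes it is started from the repository root and that "python" resolves to the interpreter of the environment described above; build_commands and run_all are hypothetical helper names, not part of this repo.

```python
import subprocess

# Dataset names accepted by main.py, as listed above.
DATASETS = ["acm", "dblp", "aminer", "freebase"]


def build_commands(datasets, gpu=0):
    """Build one `python main.py <dataset> --gpu=<id>` command per dataset."""
    return [["python", "main.py", name, f"--gpu={gpu}"] for name in datasets]


def run_all(datasets=DATASETS, gpu=0, code_dir="./code"):
    """Run the commands one after another inside code_dir.

    check=True raises CalledProcessError on the first failing run,
    so later datasets are not started after a failure.
    """
    for cmd in build_commands(datasets, gpu):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=code_dir, check=True)
```

Calling run_all() then trains on each dataset in turn, and run_all(["acm", "dblp"], gpu=1) restricts the run to two datasets on a different GPU.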

Some tips on parameters

We suggest carefully selecting "pos_num" (found in ./data/pos.py), which sets the threshold on the number of positives for every node. This is very important to the final results. Of course, more effective ways to select positives are welcome.
In ./code/utils/params.py, besides "lr" and "patience", carefully tuning dropout and tau is worthwhile.
In our experiments, we assign original features only to nodes of the target type and one-hot features to nodes of other types. This is because most of the datasets used provide features only for target nodes in their original versions. We therefore believe that if high-quality features for other node types are provided, the overall results will improve a lot. The AMiner dataset is an example: it has no original features, so nodes of every type are assigned one-hot features. In other words, every node has features of the same quality, and in this case HeCo is far ahead of the other baselines. So if you have high-quality features for other node types, we strongly suggest trying them!
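
To illustrate what the "pos_num" threshold controls, here is a minimal sketch of threshold-based positive selection. It assumes, following the description in the paper, that candidate positives of a node are ranked by the number of meta-paths connecting them to that node and that at most pos_num of them are kept. The function select_positives and the toy count matrix are hypothetical and are not taken from ./data/pos.py.

```python
def select_positives(conn_counts, pos_num):
    """Pick up to pos_num positives per node.

    conn_counts[i][j] is the (assumed) number of meta-paths connecting
    node i and node j. Nodes with a zero count are never positives.
    """
    positives = []
    for counts in conn_counts:
        # Candidates are nodes connected by at least one meta-path.
        candidates = [j for j, c in enumerate(counts) if c > 0]
        # Most strongly connected first; the sort is stable, so ties
        # keep the lower node index first.
        candidates.sort(key=lambda j: -counts[j])
        # Apply the threshold: keep at most pos_num candidates.
        positives.append(candidates[:pos_num])
    return positives


# Toy example with four nodes (made-up counts).
counts = [
    [3, 1, 0, 2],
    [1, 2, 0, 0],
    [0, 0, 1, 0],
    [2, 0, 0, 4],
]
print(select_positives(counts, pos_num=2))
# prints [[0, 3], [1, 0], [2], [3, 0]]
```

A larger pos_num admits more weakly connected nodes as positives, while a smaller one keeps only the strongest connections, which is why the value is worth tuning per dataset.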

Cite

Contact

If you have any questions, please feel free to contact me at [email protected]


Owner

Nian Liu
