TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplication)

Last update: Oct 26, 2022

Overview

Parameterization of Hypercomplex Multiplications (PHM)

This repository contains the TensorFlow implementation of PHM (Parameterization of Hypercomplex Multiplication) layers and PHM-Transformers in the paper Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters at ICLR 2021.

Installation

One may install the following libraries before running our code:

tensorflow-gpu (1.14.0)
tensor2tensor (1.14.0)

Usage

Usage follows the original tensor2tensor workflow (t2t-datagen, t2t-trainer, t2t-avg-all, then t2t-decoder), so it helps to be familiar with tensor2tensor before running our code. Setting --t2t_usr_dir=./Parameterization-of-Hypercomplex-Multiplications allows tensor2tensor to register PHM-Transformers.

Training

For example, to evaluate PHM-Transformer (n=4) on the En-Vi machine translation task (with data generated by t2t-datagen --problem=translate_envi_iwslt32k), one may set the following flags when training:

t2t-trainer \
--problem=translate_envi_iwslt32k \
--model=light_transformer \
--hparams_set=light_transformer_base_single_gpu \
--hparams="light_mode='random',hidden_size=512,factor=4" \
--train_steps=50000

where light_transformer with light_mode='random' is the alias of the PHM-Transformer in our implementation.
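To give a rough sense of what n=4 means for model size, the following sketch counts the weights of a single fully-connected layer against a PHM layer of the same shape. It assumes the formulation from the paper, in which a PHM weight is built from n small n x n matrices plus n blocks of size (d_in/n) x (d_out/n), and it assumes that factor=4 in the flags above corresponds to n=4. The function names are hypothetical and are not part of this repository; biases are ignored.

```python
def fc_params(d_in, d_out):
    """Weight count of an ordinary fully-connected layer (bias ignored)."""
    return d_in * d_out


def phm_params(d_in, d_out, n):
    """Weight count of a PHM layer under the paper's formulation (assumed here):
    n matrices of size n x n, plus n blocks of size (d_in/n) x (d_out/n)."""
    if d_in % n != 0 or d_out % n != 0:
        raise ValueError("d_in and d_out must both be divisible by n")
    return n ** 3 + n * (d_in // n) * (d_out // n)


# Illustrative setting: hidden_size=512 and n=4, as in the training command.
fc = fc_params(512, 512)
phm = phm_params(512, 512, 4)
ratio = phm / fc  # close to 1/n when n**3 is small relative to d_in * d_out
print(fc, phm, round(ratio, 4))
```

The n ** 3 term is negligible at this size, which is why the count comes out at roughly 1/n of the fully-connected layer.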

Aggregating Checkpoints

After training, the latest 8 checkpoints are averaged:

t2t-avg-all --model_dir $TRAIN_DIR --output_dir $AVG_DIR --n 8

where $TRAIN_DIR and $AVG_DIR need to be specified by users.

Testing

To decode the target sequence, one has to additionally set the decode_hparams as follows:

t2t-decoder \
--decode_hparams="beam_size=5,alpha=0.6"

Then t2t-bleu is invoked to calculate the BLEU score.

PHM Implementations

PHM is implemented with operations in make_random_mul and random_ffn, which are mathematically equivalent to a sum of Kronecker products.
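The sum-of-Kronecker-products view can be illustrated with a small NumPy sketch. This is not the code in make_random_mul or random_ffn; it is an independent illustration, and the function names, the array shapes, and the x @ W orientation are all assumptions. It builds the weight once with explicit Kronecker products and once with a single einsum, and the two results should agree.

```python
import numpy as np


def phm_weight(A, S):
    """Build W = sum_i kron(A[i], S[i]).

    A: shape (n, n, n), the n small "multiplication rule" matrices.
    S: shape (n, p, q), the n weight blocks, with p = d_in / n and q = d_out / n.
    Returns W of shape (n * p, n * q).
    """
    return sum(np.kron(A[i], S[i]) for i in range(A.shape[0]))


def phm_weight_einsum(A, S):
    """Same weight without explicit Kronecker products.

    kron(A_i, S_i)[a * p + c, b * q + d] = A_i[a, b] * S_i[c, d],
    so summing over i and reshaping (a, c, b, d) gives the same matrix.
    """
    n = A.shape[0]
    _, p, q = S.shape
    return np.einsum("iab,icd->acbd", A, S).reshape(n * p, n * q)


# Tiny example with n=4, d_in=8, d_out=12 (illustrative sizes only).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4, 4))
S = rng.standard_normal((4, 2, 3))
W = phm_weight(A, S)

x = rng.standard_normal((5, 8))  # batch of 5 inputs
y = x @ W                        # PHM layer output, bias omitted
print(W.shape, y.shape, np.allclose(W, phm_weight_einsum(A, S)))
```

Only A and S are stored as parameters; W is assembled from them on the fly, which is where the parameter saving comes from.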

Among works that use PHM, some have offered alternative PHM implementations:

Citation

If you find this repository helpful, please cite our paper:

@inproceedings{zhang2021beyond,
  title={Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters},
  author={Zhang, Aston and Tay, Yi and Zhang, Shuai and Chan, Alvin and Luu, Anh Tuan and Hui, Siu Cheung and Fu, Jie},
  booktitle={International Conference on Learning Representations},
  year={2021}
}

Owner

Aston Zhang
