This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

Last update: Dec 26, 2022

Overview

MultiModal-InfoMax

🔥 If you are interested in other multimodal work from our DeCLaRe Lab, you are welcome to visit the clustered repository.

Introduction

MultiModal-InfoMax (MMIM) synthesizes fusion results from multi-modality input through two-level mutual information (MI) maximization. We use the BA (Barber-Agakov) lower bound and contrastive predictive coding as the target functions to be maximized. To facilitate computing the BA lower bound and the training process, we design an entropy estimation module with an associated memory of history data.
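As a rough illustration of the two objectives named above, the following sketch evaluates the BA lower bound, H(Y) + E[log q(y|x)] <= I(X;Y), on a toy discrete distribution and computes an InfoNCE-style contrastive loss from a score matrix. It is not taken from this repository's code: the function names (`mutual_information`, `ba_lower_bound`, `info_nce`), the toy joint table, and the score matrices are all hypothetical and chosen only to make the quantities concrete.

```python
import math


def mutual_information(joint):
    """Exact I(X;Y) in nats for a discrete joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def ba_lower_bound(joint, q):
    """BA bound H(Y) + E_{p(x,y)}[log q(y|x)], where q[x][y] approximates p(y|x)."""
    py = [sum(col) for col in zip(*joint)]
    entropy_y = -sum(p * math.log(p) for p in py if p > 0)
    expected_log_q = sum(
        p * math.log(q[i][j])
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )
    return entropy_y + expected_log_q


def info_nce(scores):
    """InfoNCE loss for a square score matrix; scores[i][i] is the positive pair."""
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_sum - row[i]
    return total / len(scores)


# Toy joint distribution p(x, y) over two binary variables (hypothetical values).
joint = [[0.4, 0.1], [0.1, 0.4]]
true_posterior = [[p / sum(row) for p in row] for row in joint]  # exact p(y|x)
rough_guess = [[0.6, 0.4], [0.4, 0.6]]  # an imperfect variational q(y|x)

mi = mutual_information(joint)
tight = ba_lower_bound(joint, true_posterior)  # equals the true MI
loose = ba_lower_bound(joint, rough_guess)     # strictly below the true MI

# Contrastive loss: well-separated positives give a lower loss than uniform scores.
aligned_loss = info_nce([[2.0, 0.0], [0.0, 2.0]])
uniform_loss = info_nce([[0.0, 0.0], [0.0, 0.0]])
```

The bound is tight only when q matches the true posterior p(y|x), which is why a better approximation q raises the bound toward the true MI. For the contrastive term, a lower loss corresponds to a larger MI estimate, since log N minus the InfoNCE loss lower-bounds the MI for N candidates.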

Usage

Download the CMU-MOSI and CMU-MOSEI datasets from Google Drive or Baidu Disk (extraction code: g3m2) and place them under the folder Multimodal-Infomax/datasets.
Set up the environment (requires conda):

conda env create -f environment.yml
conda activate MMIM

Start training

python main.py --dataset mosi --contrast

Citation

Please cite our paper if you find our work useful for your research:

@article{han2021improving,
  title={Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis},
  author={Han, Wei and Chen, Hui and Poria, Soujanya},
  journal={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2021}
}

Contact

Should you have any questions, feel free to contact me at [email protected]

Owner

Deep Cognition and Language Research (DeCLaRe) Lab
