Code for Paper "Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models"

Last update: Jun 06, 2022

Overview

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Abstract

Many applications of generative models rely on the marginalization of their high-dimensional output probability distributions. Normalization functions that yield sparse probability distributions can make exact marginalization more computationally tractable. However, sparse normalization functions usually require alternative loss functions for training because the log-likelihood can be undefined for sparse probability distributions. Furthermore, many sparse normalization functions collapse the multimodality of distributions. In this work, we present ev-softmax, a sparse normalization function that preserves the multimodality of probability distributions. We derive its properties, including its gradient in closed form, and introduce a continuous family of approximations to ev-softmax that have full support and can thus be trained with probabilistic loss functions such as negative log-likelihood and Kullback-Leibler divergence. We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive models. Our method outperforms existing dense and sparse normalization techniques in distributional accuracy and classification performance. We demonstrate that ev-softmax successfully reduces the dimensionality of output probability distributions while maintaining multimodality.

Setup

Required packages are listed in requirements.txt.

Running

The implementation for the ev-softmax function and its loss function can be found in evsoftmax.py.
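
For orientation, the idea can be sketched in a few lines of pure Python. This is an illustrative sketch only, not the code in evsoftmax.py. It assumes a reading of ev-softmax in which entries below the mean of the input are zeroed out and the remaining entries are renormalized as in softmax, and a reading of the full-support approximation in which a small constant is added to the 0/1 mask. The function name ev_softmax_sketch and the parameter eps are hypothetical and chosen for this example.

```python
import math


def ev_softmax_sketch(logits, eps=0.0):
    """Illustrative ev-softmax over a list of floats.

    Assumption: an entry is kept when it is at least the mean of the input.
    Assumption: eps > 0 adds a small weight to every entry so that the
    result has full support (all probabilities strictly positive).
    """
    mean = sum(logits) / len(logits)
    top = max(logits)  # subtracted inside exp() for numerical stability
    weights = []
    for v in logits:
        # The maximum is always kept; the explicit check also guards
        # against rounding error in the computed mean.
        keep = 1.0 if (v >= mean or v == top) else 0.0
        weights.append((keep + eps) * math.exp(v - top))
    total = sum(weights)
    return [w / total for w in weights]


# Usage: the mean of these inputs is 0.0, so the last entry is dropped.
sparse = ev_softmax_sketch([2.0, 1.0, -3.0])
# With eps > 0 every entry receives some probability mass.
dense = ev_softmax_sketch([2.0, 1.0, -3.0], eps=0.1)
```

With eps=0.0 the output can contain exact zeros, which is why a log-likelihood loss can be undefined on it; with eps > 0 every entry is positive, so a loss such as negative log-likelihood stays finite. Refer to evsoftmax.py for the actual function and loss used in the experiments.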

The MNIST CVAE and VQ-VAE experiments can be run using run_mnist_cvae.sh and run_vqvae.sh, respectively. Instructions for the SSVAE experiment can be found in mnist_ssvae/README.md, and scripts used for preprocessing, training, and evaluating can be found in mnist_ssvae/scripts. Instructions for the translation experiment can be found in translation/README.md, and scripts used for preprocessing, training, and evaluating can be found in translation/scripts/iwslt.

Owner

Stanford Intelligent Systems Laboratory