Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Overview

MaskGIT-pytorch

Pytorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Note: this is work in progress

MaskGIT is an extension to the VQGAN paper which improves the second stage transformer part (and leaves the first stage untouched). It switches the unidirectional transformer for a bidirectional transformer. The (second stage) training is pretty similar to BERT by randomly masking out tokens and trying to predict these using the bidirectional transformer (the original work used a GPT architecture randomly replaced tokens by other tokens). Different from BERT, the percentage for the masking is not fixed and uniformly distributed between 0 and 1 for each batch. Furhtermore, a new inference algorithm is suggested in which we start off by a completely masked-out image and then iteratively sample vectors where the model has a high confidence.

If you are only interested in the part of the code that comes from this paper check out transformer.py.

Run the code

The code is ready for training both the VQGAN and the Bidirectional Transformer and can also be used for inference

python training_vqgan.py

python training_transformer.py

(Make sure to edit the path for the dataset etc.)

TODO

  • Implement the gamma functions
  • Implement functions for image editing tasks: inpainting, extrapolation, image manipulation
  • Tune hyperparameters
  • (Provide visual results)
Owner
Dominic Rampas
I started coding in summer 2018 with the age of 17. Half a year later found interest in AI and ML and started to learn it. Mainly coding in Python.
Dominic Rampas
Rule Extraction Methods for Interactive eXplainability

REMIX: Rule Extraction Methods for Interactive eXplainability This repository contains a variety of tools and methods for extracting interpretable rul

Mateo Espinosa Zarlenga 21 Jan 03, 2023
2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案

2020CCF-NER 2020 CCF大数据与计算智能大赛-非结构化商业文本信息中隐私信息识别-第7名方案 bert base + flat + crf + fgm + swa + pu learning策略 + clue数据集 = test1单模0.906 词向量

67 Oct 19, 2022
Visualizing lattice vibration information from phonon dispersion to atoms (For GPUMD)

Phonon-Vibration-Viewer (For GPUMD) Visualizing lattice vibration information from phonon dispersion for primitive atoms. In this tutorial, we will in

Liangting 6 Dec 10, 2022
TensorFlow implementation of "A Simple Baseline for Bayesian Uncertainty in Deep Learning"

TensorFlow implementation of "A Simple Baseline for Bayesian Uncertainty in Deep Learning"

YeongHyeon Park 7 Aug 28, 2022
Evaluating Cross-lingual Sentence Representations

XNLI: The Cross-Lingual NLI Corpus XNLI is an evaluation corpus for language transfer and cross-lingual sentence classification in 15 languages. New:

Meta Research 395 Dec 19, 2022
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and

0 Jan 27, 2022
Distinguishing Commercial from Editorial Content in News

Distinguishing Commercial from Editorial Content in News In this repository you can find the following: An anonymized version of the data used for my

Timo Kats 3 Sep 26, 2022
Unet network with mean teacher for altrasound image segmentation

Unet network with mean teacher for altrasound image segmentation

5 Nov 21, 2022
⚓ Eurybia monitor model drift over time and securize model deployment with data validation

View Demo · Documentation · Medium article 🔍 Overview Eurybia is a Python library which aims to help in : Detecting data drift and model drift Valida

MAIF 172 Dec 27, 2022
Learning Features with Parameter-Free Layers (ICLR 2022)

Learning Features with Parameter-Free Layers (ICLR 2022) Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo | Paper NAVER AI Lab, NAVER CLOVA Up

NAVER AI 65 Dec 07, 2022
Hierarchical User Intent Graph Network for Multimedia Recommendation

Hierarchical User Intent Graph Network for Multimedia Recommendation This is our Pytorch implementation for the paper: Hierarchical User Intent Graph

6 Jan 05, 2023
UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering

UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering This repository holds all the code and data for our recent work on

Mohamed El Banani 118 Dec 06, 2022
Minimal But Practical Image Classifier Pipline Using Pytorch, Finetune on ResNet18, Got 99% Accuracy on Own Small Datasets.

PyTorch Image Classifier Updates As for many users request, I released a new version of standared pytorch immage classification example at here: http:

JinTian 106 Nov 06, 2022
Official code for the ICCV 2021 paper "DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders"

DECA Official code for the ICCV 2021 paper "DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders". All the code is writte

23 Dec 01, 2022
A vision library for performing sliced inference on large images/small objects

SAHI: Slicing Aided Hyper Inference A vision library for performing sliced inference on large images/small objects Overview Object detection and insta

Open Business Software Solutions 2.3k Jan 04, 2023
Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

This repository is the official PyTorch implementation of Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

hippopmonkey 4 Dec 11, 2022
Preprocessed Datasets for our Multimodal NER paper

Unified Multimodal Transformer (UMT) for Multimodal Named Entity Recognition (MNER) Two MNER Datasets and Codes for our ACL'2020 paper: Improving Mult

76 Dec 21, 2022
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intel ISL (Intel Intelligent Systems Lab) 1.3k Dec 28, 2022
Commonsense Ability Tests

CATS Commonsense Ability Tests Dataset and script for paper Evaluating Commonsense in Pre-trained Language Models Use making_sense.py to run the exper

XUHUI ZHOU 28 Oct 19, 2022
BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation

BEAMetrics: Benchmark to Evaluate Automatic Metrics in Natural Language Generation Installing The Dependencies $ conda create --name beametrics python

7 Jul 04, 2022