MaskGIT-pytorch

PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)

Note: this is a work in progress.

MaskGIT extends VQGAN by improving the second-stage transformer while leaving the first stage untouched. It replaces the unidirectional transformer with a bidirectional one. Second-stage training is similar to BERT: tokens are randomly masked out and the bidirectional transformer is trained to predict them (the original work used a GPT architecture and randomly replaced tokens with other tokens). Unlike BERT, the masking percentage is not fixed; it is drawn uniformly between 0 and 1 for each batch. Furthermore, a new inference algorithm is proposed: start from a completely masked-out image, then iteratively sample tokens at the positions where the model has high confidence.

If you are only interested in the part of the code that comes from this paper, check out transformer.py.
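The two ideas above (per-batch random masking and confidence-based iterative decoding) can be illustrated with a minimal, framework-free sketch. It is not the code in transformer.py: the names `MASK`, `mask_tokens`, `iterative_decode` and `toy_predict` are hypothetical, the linear unmasking schedule is an assumption, and the toy predictor is deterministic where a real model would sample from its predicted distribution.

```python
import math
import random

MASK = -1  # hypothetical id for the special mask token


def mask_tokens(tokens, rng):
    """Training-style masking: one ratio per call, drawn uniformly from [0, 1)."""
    ratio = rng.random()
    n_mask = math.ceil(ratio * len(tokens))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    return masked, positions


def iterative_decode(predict, length, steps):
    """Start fully masked; at each step keep the most confident predictions."""
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        masked_idx = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked_idx:
            break
        # predict returns one (token, confidence) pair per position
        preds = predict(tokens)
        # Assumed linear schedule: how many tokens should be known after this step
        target_known = math.ceil(length * step / steps)
        n_new = target_known - (length - len(masked_idx))
        # Fill in the masked positions with the highest confidence
        best = sorted(masked_idx, key=lambda i: preds[i][1], reverse=True)[:n_new]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens


def toy_predict(tokens):
    """Stand-in for the bidirectional transformer: predicts token i at
    position i and is more confident toward the start of the sequence."""
    return [(i, 1.0 / (i + 1)) for i in range(len(tokens))]


if __name__ == "__main__":
    masked, positions = mask_tokens(list(range(16)), random.Random(0))
    print(masked)
    print(iterative_decode(toy_predict, length=8, steps=4))
```

With the toy predictor, decoding fills two positions per step, left to right, until no masked positions remain. In a real setup `predict` would wrap the trained transformer and the schedule would typically be non-linear.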

Run the code

The code is ready for training both the VQGAN and the bidirectional transformer, and it can also be used for inference:

python training_vqgan.py

python training_transformer.py

(Make sure to edit the path for the dataset etc.)

TODO

Implement the gamma functions
Implement functions for image editing tasks: inpainting, extrapolation, image manipulation
Tune hyperparameters
(Provide visual results)
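As a starting point for the first TODO item, here is a rough sketch of what the gamma functions could look like, assuming "gamma" refers to mask scheduling functions that map decoding progress r in [0, 1] to the fraction of tokens that stay masked (1 at the start, 0 at the end). The function names, the helper `tokens_to_mask`, and the choice of rounding down are illustrative assumptions, not part of this repository.

```python
import math


def gamma_linear(r):
    """Masked fraction decreases linearly with progress r."""
    return 1.0 - r


def gamma_cosine(r):
    """Masked fraction follows a quarter cosine wave."""
    return math.cos(r * math.pi / 2)


def gamma_square(r):
    return 1.0 - r ** 2


def gamma_cubic(r):
    return 1.0 - r ** 3


def tokens_to_mask(gamma, step, total_steps, num_tokens):
    """Number of tokens left masked after `step` of `total_steps` iterations.

    Rounding down is an assumption made here so that the final step
    leaves nothing masked."""
    r = step / total_steps
    return math.floor(gamma(r) * num_tokens)


if __name__ == "__main__":
    for step in range(9):
        print(step, tokens_to_mask(gamma_cosine, step, 8, 256))
```

All four schedules start fully masked and end fully unmasked; they differ in how quickly tokens are revealed in between.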

Owner

Dominic Rampas
