My implementation of Image Inpainting - A deep learning Inpainting model

Overview

Image Inpainting

What is Image Inpainting

Image inpainting is a restorative process that allows for the fixing or removal of unwanted parts within images. Typically, this process is done by professionals who use software to change the image to remove the imperfection painstakingly. A deep learning approach bypasses manual labor typically used in this process and applies a neural network to determine the proper fill for the parts of the image.

Examples

To see a higher quality version, click on the images

From left to right: original, interpolated, predicted

alt text alt text

Reasearch and Development

The model architecture is created using a fully convolutional deep residual network. I had pretty good intuition that this type of model would work, as it had on my previous projects for image restoration. I looked into other architectures such as UNET for inpainting but ran into troubles while implementing them.

First, UNET requires you to splice images during inference, meaning that the image splice had to be larger than the white space that the user is trying to inpaint. For example, if the splices you set up for inference were set up to take 64x64 chunks of the image and you managed to get whitespace that fully engulfed this splice, feeding this into the model would result in improper pixels due to the model not having any reference. This would require a different architecture that would detect the size of the white space for images so that you could adequately select the image splice size.

The following architecture I looked into and tried implementing was a GAN (Generative Adversarial Network) based model. I've experimented with GANs and implemented a model that could generate faces using images from the CelebA dataset; however, using GANs for Inpainting proved a much more complex problem. There are issues that I faced with proper ratios of the loss functions being L1 loss and the adversarial loss of the discriminator. Although a GAN-based model would likely drastically improve the output during inference, I could not tune the hyper-parameters enough to balance both the loss functions and the training of the generator and discriminator.

I resolved to use the current architecture described due to its simplicity and relatively adequate results.

Model Architecture

Methods Depth Filters Parameters Training Time
Inpaint Model 50 (49 layers) 192-3 15,945k ~30hrs

Network Architecture:

How do you use this model?

Due to the sheer size of this model, I can't fully upload it onto GitHub. Instead, I have opted to upload it via Google Drive, where you should be able to download it. Place this download '.h5' file and place it inside the 'weights/' directory.

How can you train your own model?

The model is instantiated within network.py. You can play around with hyper-parameters there. First, to train the model, delete the images currently within data/ put your training image data within that file - any large dataset such as ImageNet or an equivalent should work. Finally, mess with hyper-parameters in train.py and run train.py. If youโ€™re training on weaker hardware, Iโ€™d recommend lowering the batch_size below the currently set 4 images.

Qualitative Examples (click on the images for higher quality):

Set 5 Evaluation Set:

Images Left to Right: Original, Interpolated, Predicted alt text alt text alt text alt text

Hardware - Training Statistics

Trained on 3070 ti
Batch Size: 4
Training Image Size: 96x96

Author

Joshua Evans - github/JoshVEvans
Owner
Joshua V Evans
Computer Systems Engineering | Arizona State University '25 | Interested in creating intelligent machines
Joshua V Evans
This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con

12 Nov 22, 2022
This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"

Word-Level Coreference Resolution This is a repository with the code to reproduce the experiments described in the paper of the same name, which was a

79 Dec 27, 2022
Code for C2-Matching (CVPR2021). Paper: Robust Reference-based Super-Resolution via C2-Matching.

C2-Matching (CVPR2021) This repository contains the implementation of the following paper: Robust Reference-based Super-Resolution via C2-Matching Yum

Yuming Jiang 151 Dec 26, 2022
Semantic Segmentation in Pytorch

PyTorch Semantic Segmentation Introduction This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to

Hengshuang Zhao 1.2k Jan 01, 2023
Imagededup - ๐Ÿ˜Ž Finding duplicate images made easy

imagededup is a python package that simplifies the task of finding exact and near duplicates in an image collection.

idealo 4.3k Jan 07, 2023
StackNet is a computational, scalable and analytical Meta modelling framework

StackNet This repository contains StackNet Meta modelling methodology (and software) which is part of my work as a PhD Student in the computer science

Marios Michailidis 1.3k Dec 15, 2022
Performant, differentiable reinforcement learning

deluca Performant, differentiable reinforcement learning Notes This is pre-alpha software and is undergoing a number of core changes. Updates to follo

Google 114 Dec 27, 2022
We have made you a wrapper you can't refuse

We have made you a wrapper you can't refuse We have a vibrant community of developers helping each other in our Telegram group. Join us! Stay tuned fo

20.6k Jan 09, 2023
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it

Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.

mani 1.2k Jan 07, 2023
The backbone CSPDarkNet of YOLOX.

YOLOX-Backbone The backbone CSPDarkNet of YOLOX. In this project, you can enjoy: CSPDarkNet-S CSPDarkNet-M CSPDarkNet-L CSPDarkNet-X CSPDarkNet-Tiny C

Jianhua Yang 9 Aug 22, 2022
Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

Context-Aware-Healthcare Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs Download

LuChang 9 Dec 26, 2022
Exe-to-xlsm - Simple script to create VBscript of exe and inject to xlsm

๐ŸŽ Exe To Office Executable file injection to Office documents: .xlsm, .docm, .p

3 Jan 25, 2022
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
Neon-erc20-example - Example of creating SPL token and wrapping it with ERC20 interface in Neon EVM

Example of wrapping SPL token by ERC2-20 interface in Neon Requirements Install

7 Mar 28, 2022
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

DiffWave DiffWave is a fast, high-quality neural vocoder and waveform synthesizer. It starts with Gaussian noise and converts it into speech via itera

LMNT 498 Jan 03, 2023
PyTorch implementation of the cross-modality generative model that synthesizes dance from music.

Dancing to Music PyTorch implementation of the cross-modality generative model that synthesizes dance from music. Paper Hsin-Ying Lee, Xiaodong Yang,

NVIDIA Research Projects 485 Dec 26, 2022
Code of Adverse Weather Image Translation with Asymmetric and Uncertainty aware GAN

Adverse Weather Image Translation with Asymmetric and Uncertainty-aware GAN (AU-GAN) Official Tensorflow implementation of Adverse Weather Image Trans

Jeong-gi Kwak 36 Dec 26, 2022
Official Implementation of SWAD (NeurIPS 2021)

SWAD: Domain Generalization by Seeking Flat Minima (NeurIPS'21) Official PyTorch implementation of SWAD: Domain Generalization by Seeking Flat Minima.

Junbum Cha 97 Dec 20, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Ibai Gorordo 35 Sep 07, 2022
Aligning Latent and Image Spaces to Connect the Unconnectable

About This repo contains the official implementation of the Aligning Latent and Image Spaces to Connect the Unconnectable paper. It is a GAN model whi

Ivan Skorokhodov 203 Jan 03, 2023