Learned image compression

Last update: Dec 04, 2022

Overview

PyTorch code for our recent work, "A Unified End-to-End Framework for Efficient Deep Image Compression".

As a first step, we release the code for "Variational image compression with a scale hyperprior". We will update the code to the full implementation of our paper.

Prerequisites

Install the libraries this repo depends on:

pip install -r requirements.txt

Data Preparation

First prepare the training and validation data. The training data is from flickr.com. You can obtain it by following the description in CompressionData.

The validation data is the popular Kodak dataset, which can be downloaded with:

bash data/download_kodak.sh

Training

For the high bitrates (4096, 6144, 8192), out_channel_N is 192 and out_channel_M is 320 in 'config_high.json'. For the low bitrates (256, 512, 1024, 2048), out_channel_N is 128 and out_channel_M is 192 in 'config_low.json'.
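The mapping above can be restated as a small lookup. This is only an illustrative sketch: `select_config` and the dictionary layout are hypothetical and not part of this repo; the bitrate values, channel counts, and config file names are the ones listed above.

```python
# Hypothetical helper (not part of this repo): restates the README's
# bitrate-to-config mapping as a lookup.
HIGH_BITRATES = (4096, 6144, 8192)
LOW_BITRATES = (256, 512, 1024, 2048)


def select_config(bitrate: int) -> dict:
    """Return the config file name and channel sizes for a bitrate setting."""
    if bitrate in HIGH_BITRATES:
        return {
            "config": "config_high.json",
            "out_channel_N": 192,
            "out_channel_M": 320,
        }
    if bitrate in LOW_BITRATES:
        return {
            "config": "config_low.json",
            "out_channel_N": 128,
            "out_channel_M": 192,
        }
    raise ValueError(f"unsupported bitrate setting: {bitrate}")


if __name__ == "__main__":
    for b in LOW_BITRATES + HIGH_BITRATES:
        print(b, select_config(b))
```

For example, `select_config(1024)` points to 'config_low.json' with out_channel_N of 128 and out_channel_M of 192.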

Details

PSNR experiments

For the high bitrate of 8192, we first train from scratch as follows.

CUDA_VISIBLE_DEVICES=0 python train.py --config examples/example/config_high.json -n baseline_8192 --train flicker_path --val kodak_path

For the other high bitrates (4096, 6144), we use the converged 8192 model as the pretrained model and set the learning rate to 1e-5. The number of training iterations is set to 500000.

Training for the low bitrates (256, 512, 1024, 2048) follows the same strategy.
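The strategy can be written down as a simple plan. The sketch below is hypothetical and not part of this repo: `training_plan` and its field names are made up for illustration. It also assumes that "the same strategy" for the low bitrates means training the largest one (2048) from scratch and fine-tuning the others from it, which the text above does not state explicitly.

```python
# Hypothetical sketch of the training order described in this README.
# Assumption: within each group, the largest bitrate is trained from scratch
# and the remaining ones are fine-tuned from that converged model.
FINETUNE_LR = 1e-5  # learning rate stated in the README for fine-tuning


def training_plan(bitrates):
    """Return one plan entry per bitrate, with the from-scratch run first."""
    anchor = max(bitrates)
    plan = [{"bitrate": anchor, "init": "scratch", "lr": None}]
    for b in sorted(bitrates, reverse=True):
        if b == anchor:
            continue
        plan.append({"bitrate": b, "init": f"pretrained_{anchor}", "lr": FINETUNE_LR})
    return plan


if __name__ == "__main__":
    for step in training_plan((4096, 6144, 8192)):
        print(step)
```

The learning rate of the from-scratch run is left as `None` because it is not given here; it would come from the config file.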

MS-SSIM experiments

Change the distortion loss to (1 - MS_SSIM), and fine-tune the pretrained PSNR-optimized model to accelerate training. You can find more details in our released paper. The training strategy is similar.

If you find our code helpful for your research, please cite our paper. This code is for research use only.
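As a minimal sketch of the loss change, the functions below compute the (1 - MS_SSIM) distortion term and combine it with a rate term. Both function names are hypothetical, and the weighted-sum form `weight * distortion + rate` is an assumption based on the usual rate-distortion objective in learned compression; the actual loss in train.py may differ.

```python
# Hypothetical sketch (not this repo's code): distortion term for MS-SSIM
# training and an assumed rate-distortion combination.


def ms_ssim_distortion(ms_ssim: float) -> float:
    """Distortion used for MS-SSIM experiments: 1 - MS_SSIM."""
    return 1.0 - ms_ssim


def rd_loss(weight: float, distortion: float, rate: float) -> float:
    """Assumed objective: weighted distortion plus rate (e.g. bits per pixel)."""
    return weight * distortion + rate


if __name__ == "__main__":
    d = ms_ssim_distortion(0.98)
    print(rd_loss(8.0, d, 0.5))
```

A higher MS_SSIM gives a smaller distortion term, so minimizing this loss pushes the reconstruction toward higher MS_SSIM at a given rate.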

@article{liu2020unified,
  title={A Unified End-to-End Framework for Efficient Deep Image Compression},
  author={Liu, Jiaheng and Lu, Guo and Hu, Zhihao and Xu, Dong},
  journal={arXiv preprint arXiv:2002.03370},
  year={2020}
}

Owner

Jiaheng Liu
