Replication of Pix2Seq with Pretrained Model

Last update: Nov 22, 2022

Pretrained-Pix2Seq

We provide a pre-trained Pix2Seq model. This version includes new data augmentation. The model is trained for 300 epochs and achieves 37 mAP on COCO without beam search or nucleus sampling.
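To make the decoding terms concrete, the sketch below contrasts greedy (argmax) token selection with nucleus (top-p) sampling on a single probability vector. It is an illustration only and is not taken from this repository: the function names `greedy_pick`, `top_p_filter`, and `nucleus_pick` are hypothetical, and it is an assumption that the reported score comes from greedy decoding (the text above only says that beam search and nucleus sampling were not used).

```python
import random


def greedy_pick(probs):
    """Return the index of the most likely token (argmax)."""
    return max(range(len(probs)), key=lambda i: probs[i])


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose total probability
    is at least p, and renormalize the kept probabilities to sum to 1."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, total = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        total += prob
        if total >= p:
            break
    return {index: prob / total for index, prob in kept}


def nucleus_pick(probs, p, rng=random):
    """Sample one token index from the renormalized top-p set."""
    nucleus = top_p_filter(probs, p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical next-token distribution over four tokens.
    probs = [0.5, 0.3, 0.15, 0.05]
    print("greedy:", greedy_pick(probs))
    print("nucleus set:", sorted(top_p_filter(probs, 0.7)))
    print("nucleus sample:", nucleus_pick(probs, 0.7))
```

Greedy selection is deterministic, while nucleus sampling draws from a truncated distribution, so repeated runs can produce different sequences.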

Installation

Install PyTorch 1.5+ and torchvision 0.6+ (recommended: torch 1.8.1 and torchvision 0.8.0).

Install pycocotools (for evaluation on COCO):

pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

That's it; you should now be able to train and evaluate detection models.
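As a quick sanity check, the installed versions can be compared against the minimums listed above. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and it reads version metadata through the standard library instead of importing the packages, so it also runs when they are missing.

```python
import re
from importlib import metadata

# Minimum versions taken from the installation notes above.
MINIMUMS = {"torch": "1.5", "torchvision": "0.6"}


def parse_version(text):
    """Turn '1.8.1+cu111' into (1, 8, 1), ignoring any suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return a {package: status} report for each required package."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[package] = f"{installed} ({'ok' if ok else 'needs >= ' + minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Versions are compared as integer tuples because a plain string comparison would rank "1.10" below "1.5".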

Data preparation

Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images

Training

First, link the COCO dataset into the project folder:

ln -s /path/to/coco ./coco
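Before launching a long run, it can help to confirm that the link resolves to the directory layout described under Data preparation. The snippet below is a small sketch under that assumption; `missing_coco_dirs` is a hypothetical helper, and it only checks that the three subdirectories exist, not that their contents are complete.

```python
from pathlib import Path

# Subdirectories expected under the COCO root (see Data preparation).
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(root):
    """Return the expected subdirectories that are absent under root.

    Path.is_dir() follows symbolic links, so a linked ./coco works too.
    """
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("./coco")
    if missing:
        print("Missing under ./coco:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```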

Then start training:

sh train.sh --model pix2seq --output_dir /path/to/save

Evaluation

sh train.sh --model pix2seq --output_dir /path/to/save --resume /path/to/checkpoints --eval

COCO

Method	Backbone	Epochs	Batch Size	AP	AP50	AP75	Weights
Pix2Seq	R50	300	32	37.0	53.4	39.4	weight

Contributors

Qiu Han, Peng Gao, Jingqiu Zhou (Beam Search)

Acknowledgement

Pix2Seq, DETR
