YOLOv2 in PyTorch

Overview

YOLOv2 in PyTorch

NOTE: This project is no longer maintained and may not compatible with the newest pytorch (after 0.4.0).

This is a PyTorch implementation of YOLOv2. This project is mainly based on darkflow and darknet.

I used a Cython extension for postprocessing and multiprocessing.Pool for image preprocessing. Testing an image in VOC2007 costs about 13~20ms.

For details about YOLO and YOLOv2 please refer to their project page and the paper: YOLO9000: Better, Faster, Stronger by Joseph Redmon and Ali Farhadi.

NOTE 1: This is still an experimental project. VOC07 test mAP is about 0.71 (trained on VOC07+12 trainval, reported by @cory8249). See issue1 and issue23 for more details about training.

NOTE 2: I recommend to write your own dataloader using torch.utils.data.Dataset since multiprocessing.Pool.imap won't stop even there is no enough memory space. An example of dataloader for VOCDataset: issue71.

NOTE 3: Upgrade to PyTorch 0.4: https://github.com/longcw/yolo2-pytorch/issues/59

Installation and demo

  1. Clone this repository

    git clone [email protected]:longcw/yolo2-pytorch.git
  2. Build the reorg layer (tf.extract_image_patches)

    cd yolo2-pytorch
    ./make.sh
  3. Download the trained model yolo-voc.weights.h5 and set the model path in demo.py

  4. Run demo python demo.py.

Training YOLOv2

You can train YOLO2 on any dataset. Here we train it on VOC2007/2012.

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Since the program loading the data in yolo2-pytorch/data by default, you can set the data path as following.

    cd yolo2-pytorch
    mkdir data
    cd data
    ln -s $VOCdevkit VOCdevkit2007
  5. Download the pretrained darknet19 model and set the path in yolo2-pytorch/cfgs/exps/darknet19_exp1.py.

  6. (optional) Training with TensorBoard.

    To use the TensorBoard, set use_tensorboard = True in yolo2-pytorch/cfgs/config.py and install TensorboardX (https://github.com/lanpa/tensorboard-pytorch). Tensorboard log will be saved in training/runs.

  7. Run the training program: python train.py.

Evaluation

Set the path of the trained_model in yolo2-pytorch/cfgs/config.py.

cd faster_rcnn_pytorch
mkdir output
python test.py

Training on your own data

The forward pass requires that you supply 4 arguments to the network:

  • im_data - image data.
    • This should be in the format C x H x W, where C corresponds to the color channels of the image and H and W are the height and width respectively.
    • Color channels should be in RGB format.
    • Use the imcv2_recolor function provided in utils/im_transform.py to preprocess your image. Also, make sure that images have been resized to 416 x 416 pixels
  • gt_boxes - A list of numpy arrays, where each one is of size N x 4, where N is the number of features in the image. The four values in each row should correspond to x_bottom_left, y_bottom_left, x_top_right, and y_top_right.
  • gt_classes - A list of numpy arrays, where each array contains an integer value corresponding to the class of each bounding box provided in gt_boxes
  • dontcare - a list of lists

License: MIT license (MIT)

Owner
Long Chen
Computer Vision
Long Chen
10th place solution for Google Smartphone Decimeter Challenge at kaggle.

Under refactoring 10th place solution for Google Smartphone Decimeter Challenge at kaggle. Google Smartphone Decimeter Challenge Global Navigation Sat

12 Oct 25, 2022
Implementation of ConvMixer-Patches Are All You Need? in TensorFlow and Keras

Patches Are All You Need? - ConvMixer ConvMixer, an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer in t

Sayan Nath 8 Oct 03, 2022
Using Streamlit to host a multi-page tool with model specs and classification metrics, while also accepting user input values for prediction.

Predicitng_viability Using Streamlit to host a multi-page tool with model specs and classification metrics, while also accepting user input values for

Gopalika Sharma 1 Nov 08, 2021
What can linearized neural networks actually say about generalization?

What can linearized neural networks actually say about generalization? This is the source code to reproduce the experiments of the NeurIPS 2021 paper

gortizji 11 Dec 09, 2022
DRLib:A concise deep reinforcement learning library, integrating HER and PER for almost off policy RL algos.

DRLib:A concise deep reinforcement learning library, integrating HER and PER for almost off policy RL algos A concise deep reinforcement learning libr

329 Jan 03, 2023
An efficient PyTorch implementation of the evaluation metrics in recommender systems.

recsys_metrics An efficient PyTorch implementation of the evaluation metrics in recommender systems. Overview • Installation • How to use • Benchmark

Xingdong Zuo 12 Dec 02, 2022
As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).

HAKE-Action HAKE-Action (TensorFlow) is a project to open the SOTA action understanding studies based on our Human Activity Knowledge Engine. It inclu

Yong-Lu Li 94 Nov 18, 2022
基于Paddlepaddle复现yolov5,支持PaddleDetection接口

PaddleDetection yolov5 https://github.com/Sharpiless/PaddleDetection-Yolov5 简介 PaddleDetection飞桨目标检测开发套件,旨在帮助开发者更快更好地完成检测模型的组建、训练、优化及部署等全开发流程。 PaddleD

36 Jan 07, 2023
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022
Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Official implementation for paper "Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR"

Ziyue Feng 72 Dec 09, 2022
Implementation of Wasserstein adversarial attacks.

Stronger and Faster Wasserstein Adversarial Attacks Code for Stronger and Faster Wasserstein Adversarial Attacks, appeared in ICML 2020. This reposito

21 Oct 06, 2022
Code repository for the paper: Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild (ICCV 2021)

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild Akash Sengupta, Ignas Budvytis, Robert

Akash Sengupta 149 Dec 14, 2022
Contrastive Learning for Metagenomic Binning

CLMB A simple framework for CLMB - a novel deep Contrastive Learningfor Metagenomic Binning Created by Pengfei Zhang, senior of Department of Computer

1 Sep 14, 2022
The implementation of ICASSP 2020 paper "Pixel-level self-paced learning for super-resolution"

Pixel-level Self-Paced Learning for Super-Resolution This is an official implementaion of the paper Pixel-level Self-Paced Learning for Super-Resoluti

Elon Lin 41 Dec 15, 2022
Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images

Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images In this paper, we present an effective Dynamic Enhancement Anchor

13 Dec 09, 2022
La source de mon module 'pyfade' disponible sur Pypi.

Version: 1.2 Introduction Pyfade est un module permettant de créer des dégradés colorés. Il vous permettra de changer chaque ligne de votre texte par

Billy 20 Sep 12, 2021
[ECE NTUA] 👁 Computer Vision - Lab Projects & Theoretical Problem Sets (2020-2021)

Computer Vision - NTUA (2020-2021) This repository hosts the lab projects and theoretical problem sets of the Computer Vision course held by ECE NTUA

Dimitris Dimos 6 Jul 21, 2022
Doods2 - API for detecting objects in images and video streams using Tensorflow

DOODS2 - Return of DOODS Dedicated Open Object Detection Service - Yes, it's a b

Zach 101 Jan 04, 2023
Python-kafka-reset-consumergroup-offset-example - Python Kafka reset consumergroup offset example

Python Kafka reset consumergroup offset example This is a simple example of how

Willi Carlsen 1 Feb 16, 2022
The Rich Get Richer: Disparate Impact of Semi-Supervised Learning

The Rich Get Richer: Disparate Impact of Semi-Supervised Learning Preprocess file of the dataset used in implicit sub-populations: (Demographic groups

<a href=[email protected]"> 4 Oct 14, 2022