Minimal PyTorch implementation of YOLOv3

Last update: Dec 29, 2022

Overview

PyTorch-YOLOv3

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.

Installation

Installing from source

For normal training and evaluation, we recommend installing the package from source using a poetry virtual environment.

git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install

You need to enter the virtual environment by running poetry shell in this directory before running any of the following commands without the poetry run prefix. If you want to use the commands everywhere without opening a poetry shell, see the pip installation method below.

Download pretrained weights

./weights/download_weights.sh

Download COCO

./data/get_coco_dataset.sh

Install via pip

This installation method is recommended if you want to use this package as a dependency in another Python project. It only includes the code, is less isolated, and may conflict with other packages. Weights and the COCO dataset need to be downloaded as stated above. See API for further information regarding the package's API. This method also makes the CLI tools yolo-detect, yolo-train, and yolo-test available everywhere without any additional commands.

pip3 install pytorchyolo --user

Test

Evaluates the model on the COCO test dataset. To download this dataset as well as the weights, see above.

poetry run yolo-test --weights weights/yolov3.weights

Model	mAP (min. 0.5 IoU)
YOLOv3 608 (paper)	57.9
YOLOv3 608 (this impl.)	57.3
YOLOv3 416 (paper)	55.3
YOLOv3 416 (this impl.)	55.5
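
The mAP figures above count a detection as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. As a minimal sketch of that overlap measure, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function name box_iou is illustrative and is not part of this package:

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (if the boxes overlap at all)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

With this definition, a prediction would be matched to a ground-truth box when box_iou(pred, truth) >= 0.5.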

Inference

Uses pretrained weights to make predictions on images. The table below shows the inference speed (in frames per second) for input images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 measurement marked "this impl." shows the inference speed of this implementation on my 1080ti card.

Backbone	GPU	FPS
ResNet-101	Titan X	53
ResNet-152	Titan X	37
Darknet-53 (paper)	Titan X	76
Darknet-53 (this impl.)	1080ti	74

poetry run yolo-detect --images data/samples/

Train

For argument descriptions, have a look at poetry run yolo-train --help

Example (COCO)

To train on COCO using a Darknet-53 backend pretrained on ImageNet run:

poetry run yolo-train --data config/coco.data --pretrained_weights weights/darknet53.conv.74

Tensorboard

Track training progress in Tensorboard:

Initialize training
Run the command below
Go to http://localhost:6006/

poetry run tensorboard --logdir='logs' --port=6006

Storing the logs on a slow drive can significantly slow down training.

You can adjust the log directory using --logdir when running tensorboard and yolo-train.

Train on Custom Dataset

Custom model

Run the command below to create a custom model definition, replacing <num-classes> with the number of classes in your dataset.

./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'
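
As background on why the number of classes matters for the model definition: in the standard YOLOv3 architecture, each detection scale predicts 3 anchor boxes per grid cell, and each box carries 4 coordinates, 1 objectness score and one score per class. The convolution feeding each detection layer therefore needs 3 * (num_classes + 5) output channels. The sketch below only illustrates that arithmetic. The helper yolo_head_filters is hypothetical, and it is an assumption that the script applies this standard rule.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Output channels of the conv layer in front of a YOLOv3 detection layer.

    Assumes the standard YOLOv3 layout: per anchor, 4 box coordinates,
    1 objectness score and one score per class.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return anchors_per_scale * (num_classes + 5)


print(yolo_head_filters(80))  # 255 for the 80 COCO classes
```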

Classes

Add class names to data/custom/classes.names. This file should have one row per class name.

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects that the annotation file corresponding to the image data/custom/images/train.jpg has the path data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled to the range [0, 1] (relative to the image width and height), and the label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.
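
If your existing annotations are pixel-coordinate corner boxes, they need to be converted to this format first. A minimal sketch of that conversion, assuming boxes are given as (x_min, y_min, x_max, y_max) in pixels; the function name to_yolo_row and the six-decimal precision are illustrative choices, not something the dataloader requires:

```python
def to_yolo_row(label_idx, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into one annotation row.

    Returns 'label_idx x_center y_center width height' with all four
    coordinates scaled to [0, 1] by the image width and height.
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box in a 640x480 image, class index 0
print(to_yolo_row(0, 100, 200, 300, 400, 640, 480))
# 0 0.312500 0.625000 0.312500 0.416667
```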

Define Train and Validation Sets

In data/custom/train.txt and data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.
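
These two files can be written by hand or generated. The sketch below shows one possible way to produce them with a random split. The helper write_split, the 80/20 split, the fixed seed and the list of image extensions are all assumptions made for illustration, and paths are written exactly as derived from the image_dir argument.

```python
import random
from pathlib import Path


def write_split(image_dir, out_dir, valid_fraction=0.2, seed=0):
    """Write train.txt and valid.txt with one image path per line."""
    extensions = {".jpg", ".jpeg", ".png"}
    images = sorted(
        p for p in Path(image_dir).iterdir() if p.suffix.lower() in extensions
    )

    # Shuffle with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(images)

    n_valid = int(len(images) * valid_fraction)
    valid, train = images[:n_valid], images[n_valid:]

    out = Path(out_dir)
    (out / "train.txt").write_text("".join(f"{p.as_posix()}\n" for p in train))
    (out / "valid.txt").write_text("".join(f"{p.as_posix()}\n" for p in valid))
    return len(train), len(valid)
```

Called from the repository root as write_split("data/custom/images", "data/custom"), this would list every image in exactly one of the two files.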

Train

To train on the custom dataset run:

poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data

Add --pretrained_weights weights/darknet53.conv.74 to train using a backend pretrained on ImageNet.

API

You are able to import the modules of this repo in your own project if you install the pip package pytorchyolo.

An example prediction call from a simple OpenCV Python script would look like this:

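import cv2
from pytorchyolo import detect, models

# Load the YOLO model (replace the placeholders with your own paths)
model = models.load_model(
    "<PATH_TO_YOUR_CONFIG_FOLDER>/yolov3.cfg",
    "<PATH_TO_YOUR_WEIGHTS_FOLDER>/yolov3.weights")

# Load the image as a NumPy array
img = cv2.imread("<PATH_TO_YOUR_IMAGE>")

# OpenCV loads images as BGR; convert to RGB before detection
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Run the YOLO model on the image
boxes = detect.detect_image(model, img)

print(boxes)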

For more advanced usage, look at the docstrings of the methods.
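
To turn the raw detections into something readable, the class indices can be mapped back to names. The sketch below assumes each row of the returned array has the form [x1, y1, x2, y2, confidence, class]; check this against the docstring of detect_image before relying on it. The helper describe_detections is hypothetical and not part of the package.

```python
def describe_detections(boxes, class_names):
    """Format detections as human-readable strings.

    Assumes each row is [x1, y1, x2, y2, confidence, class_index].
    """
    lines = []
    for x1, y1, x2, y2, conf, cls in boxes:
        name = class_names[int(cls)]
        lines.append(
            f"{name} {conf:.2f} ({x1:.0f}, {y1:.0f}) - ({x2:.0f}, {y2:.0f})"
        )
    return lines
```

The class_names list would typically be read from a names file with one class per row, such as data/custom/classes.names for a custom dataset.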

Credit

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.

[Paper] [Project Webpage] [Authors' Implementation]

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}

Owner

Erik Linder-Norén
