Minimal PyTorch implementation of YOLOv3

Overview

PyTorch-YOLOv3

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.


Installation

Installing from source

For normal training and evaluation we recommend installing the package from source using a Poetry virtual environment.

git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install

You need to enter the virtual environment by running poetry shell in this directory before running any of the following commands without the poetry run prefix. Also have a look at the other installation method below if you want to use the commands everywhere without opening a Poetry shell.

Download pretrained weights

./weights/download_weights.sh

Download COCO

./data/get_coco_dataset.sh

Install via pip

This installation method is recommended if you want to use this package as a dependency in another Python project. It only includes the code, is less isolated and may conflict with other packages. Weights and the COCO dataset need to be downloaded as stated above. See the API section for further information regarding the package's API. This method also enables the CLI tools yolo-detect, yolo-train, and yolo-test everywhere without any additional commands.

pip3 install pytorchyolo --user

Test

Evaluates the model on the COCO test dataset. To download this dataset as well as the weights, see above.

poetry run yolo-test --weights weights/yolov3.weights

Model                     mAP (min. 50 IoU)
YOLOv3 608 (paper)        57.9
YOLOv3 608 (this impl.)   57.3
YOLOv3 416 (paper)        55.3
YOLOv3 416 (this impl.)   55.5

Inference

Uses pretrained weights to make predictions on images. The table below shows inference times when images scaled to 256x256 are used as input. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 measurement marked "this impl." shows the inference time of this implementation on my 1080ti card.

Backbone                  GPU      FPS
ResNet-101                Titan X  53
ResNet-152                Titan X  37
Darknet-53 (paper)        Titan X  76
Darknet-53 (this impl.)   1080ti   74

poetry run yolo-detect --images data/samples/

Train

For argument descriptions, have a look at poetry run yolo-train --help.

Example (COCO)

To train on COCO using a Darknet-53 backbone pretrained on ImageNet, run:

poetry run yolo-train --data config/coco.data  --pretrained_weights weights/darknet53.conv.74

Tensorboard

Track training progress in Tensorboard:

poetry run tensorboard --logdir='logs' --port=6006

Storing the logs on a slow drive can lead to a significant decrease in training speed.

You can adjust the log directory using --logdir when running tensorboard and yolo-train.

Train on Custom Dataset

Custom model

Run the command below to create a custom model definition, replacing <num-classes> with the number of classes in your dataset.

./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'

Classes

Add class names to data/custom/classes.names. This file should have one row per class name.
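
As a small illustration, the sketch below writes such a file from Python; the class names "cat" and "dog" are placeholders for your own dataset.

# Minimal sketch: write one class name per row to data/custom/classes.names.
# The class names here are placeholders for your own dataset.
class_names = ["cat", "dog"]

with open("data/custom/classes.names", "w") as f:
    for name in class_names:
        f.write(name + "\n")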

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects that the annotation file corresponding to the image data/custom/images/train.jpg has the path data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled to [0, 1], and label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.
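
As an illustration of this layout, here is a minimal sketch that converts a pixel-space box into one such label row; the image size, box coordinates, and class index are made-up example values.

# Minimal sketch: convert a pixel-space box (x_min, y_min, x_max, y_max) into one
# label row of the form "label_idx x_center y_center width height", with all
# coordinates scaled to [0, 1]. The values below are made-up examples.

def to_yolo_row(label_idx, x_min, y_min, x_max, y_max, img_w, img_h):
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{label_idx} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: class index 0, a 200x100 px box with its top-left corner at (50, 80)
# in a 640x480 image, written to the label file for data/custom/images/train.jpg
row = to_yolo_row(0, 50, 80, 250, 180, 640, 480)
with open("data/custom/labels/train.txt", "w") as f:
    f.write(row + "\n")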

Define Train and Validation Sets

In data/custom/train.txt and data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.
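
A minimal sketch for generating both files, assuming all images are .jpg files in data/custom/images/ and an arbitrary 90/10 split:

# Minimal sketch: split the images in data/custom/images/ into train and
# validation lists. The .jpg extension and the 90/10 split are assumptions;
# adjust them to your dataset.
import random
from pathlib import Path

images = sorted(str(p) for p in Path("data/custom/images").glob("*.jpg"))
random.shuffle(images)

split = int(0.9 * len(images))
Path("data/custom/train.txt").write_text("\n".join(images[:split]) + "\n")
Path("data/custom/valid.txt").write_text("\n".join(images[split:]) + "\n")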

Train

To train on the custom dataset run:

poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data

Add --pretrained_weights weights/darknet53.conv.74 to train using a backbone pretrained on ImageNet.

API

You can import the modules of this repo into your own project if you install the pip package pytorchyolo.

An example prediction call from a simple OpenCV Python script would look like this:

import cv2
from pytorchyolo import detect, models

# Load the YOLO model
model = models.load_model(
    "<path/to/your>/yolov3.cfg",
    "<path/to/your>/yolov3.weights")

# Load the image as a BGR numpy array
img = cv2.imread("<path/to/your/image>")

# Convert OpenCV BGR to RGB
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Run the YOLO model on the image
boxes = detect.detect_image(model, img)

print(boxes)

For more advanced usage, look at the methods' docstrings.
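
As a follow-up usage sketch, the detections can be drawn back onto the image from the example above. This assumes each row of boxes is laid out as [x1, y1, x2, y2, confidence, class] in pixel coordinates of the input image; verify this against the detect_image docstring before relying on it.

# Sketch: draw the detections from the example above onto the image and save it.
# Assumes each box row is [x1, y1, x2, y2, confidence, class]; check the
# detect.detect_image docstring to confirm this layout.
for x1, y1, x2, y2, conf, cls in boxes:
    cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    cv2.putText(img, f"{int(cls)} {conf:.2f}", (int(x1), int(y1) - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

# img was converted to RGB above, so convert back to BGR before saving with OpenCV
cv2.imwrite("detections.jpg", cv2.cvtColor(img, cv2.COLOR_RGB2BGR))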

Credit

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.

[Paper] [Project Webpage] [Authors' Implementation]

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}
Owner

Erik Linder-Norén
ML engineer at Apple. Excited about machine learning, basketball and building things.