YOLTv4


YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4. YOLTv4 is designed to detect objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.
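
The core YOLT approach is to slice each large image into small, overlapping windows, run detection on every window, and then stitch the per-window detections back into the coordinate frame of the original image. Below is a minimal sketch of that slicing idea; it is illustrative only (the repository's own slicing utilities are referenced from yoltv4/notebooks/train_test_pipeline.ipynb), and the 416-pixel window simply mirrors the slice_size used in the post-processing example later in this README.

     # Illustrative sliding-window slicer; window size and overlap are example values.
     import numpy as np

     def slice_image(im, window=416, overlap=0.2):
         """Yield (x0, y0, chip) windows that tile an arbitrarily large image array."""
         stride = max(1, int(window * (1 - overlap)))
         h, w = im.shape[:2]
         ys = list(range(0, max(h - window, 0) + 1, stride))
         xs = list(range(0, max(w - window, 0) + 1, stride))
         # ensure the bottom and right edges are covered
         if ys[-1] != max(h - window, 0):
             ys.append(max(h - window, 0))
         if xs[-1] != max(w - window, 0):
             xs.append(max(w - window, 0))
         for y0 in ys:
             for x0 in xs:
                 yield x0, y0, im[y0:y0 + window, x0:x0 + window]

     # Example: a fake 5000 x 7000 pixel "satellite image"
     big_im = np.zeros((5000, 7000, 3), dtype=np.uint8)
     print(sum(1 for _ in slice_image(big_im)), "chips")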

This repository is built upon the impressive work of AlexeyAB's YOLOv4 implementation, which improves both speed and detection performance compared to YOLOv3 (which is implemented in SIMRDWN). We use YOLOv4 instead of "YOLOv5", since YOLOv4 is endorsed by the original creators of YOLO, whereas "YOLOv5" is not; furthermore, YOLOv4 appears to have superior performance.

Below, we provide examples of how to use this repository with the open-source Rareplanes dataset.


Running YOLTv4


0. Installation

YOLTv4 is built to execute within a Docker container on a GPU-enabled machine. The provided Dockerfile creates an Ubuntu 16.04 image with CUDA 9.2, Python 3.6, and conda.

  1. Clone this repository (e.g. to /yoltv4/).

  2. Download model weights to yoltv4/darknet/weights. See:

     https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
     https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
     https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
     https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-csp.conv.142

  3. Install nvidia-docker.

  4. Build the Docker image.

     nvidia-docker build -t yoltv4_image /yoltv4/docker
    
  5. Spin up the docker container (see the docker docs for options).

     NV_GPU=0 nvidia-docker run -it -v /local_data:/local_data -v /yoltv4:/yoltv4 --ipc=host --name yoltv4_gpu0 yoltv4_image
    
  6. Compile the Darknet C program.

    First set GPU=1, CUDNN=1, CUDNN_HALF=1, and OPENCV=1 in /yoltv4/darknet/Makefile, then make:

     cd /yoltv4/darknet
     make
    

1. Train

A. Prepare Data

  1. Make YOLO images and labels (see yoltv4/notebooks/train_test_pipeline.ipynb for further details).

  2. Create a txt file listing the training images.

  3. Create an obj.names file with each desired object name on its own line.

  4. Create an obj.data file in the directory yoltv4/darknet/data that points to the files above (a sketch of steps 2-4 appears after this list). For example:

    /yoltv4/darknet/data/rareplanes_train.data

     classes = 30
     train =  /local_data/cosmiq/wdata/rareplanes/train/txt/train.txt
     valid =  /local_data/cosmiq/wdata/rareplanes/train/txt/valid.txt
     names =  /yoltv4/darknet/data/rareplanes.name
     backup = backup/
    
  5. Prepare config files.

    See the darknet training instructions, or tweak /yoltv4/darknet/cfg/yoltv4_rareplanes.cfg. At minimum, set classes in each [yolo] layer and filters=(classes + 5)*3 in the [convolutional] layer immediately preceding each [yolo] layer.
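
A sketch of steps 2-4 above: the list file is simply one absolute image path per line, obj.names is one class name per line, and darknet expects each training image to have a same-named .txt label file whose rows are "class x_center y_center width height" with coordinates normalized to [0, 1]. The paths and class names below are placeholders, not Rareplanes values.

     # Hypothetical helper for writing train.txt, obj.names, and obj.data.
     # Directory paths and class names are placeholders.
     from pathlib import Path

     image_dir = Path("/local_data/yolo_chips")        # assumed directory of training chips
     data_dir = Path("/yoltv4/darknet/data")
     class_names = ["plane_small", "plane_large"]      # placeholder class list

     data_dir.mkdir(parents=True, exist_ok=True)

     # train.txt: one absolute image path per line
     train_txt = data_dir / "train.txt"
     train_txt.write_text(
         "\n".join(str(p.resolve()) for p in sorted(image_dir.glob("*.png"))) + "\n")

     # obj.names: one class name per line (line order defines the integer class ids)
     names_file = data_dir / "obj.names"
     names_file.write_text("\n".join(class_names) + "\n")

     # obj.data: points darknet at everything above
     # (re-using train.txt as the valid list purely for this sketch)
     (data_dir / "obj.data").write_text(
         f"classes = {len(class_names)}\n"
         f"train = {train_txt}\n"
         f"valid = {train_txt}\n"
         f"names = {names_file}\n"
         "backup = backup/\n")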

B. Execute Training

  1. Execute.

     cd /yoltv4/darknet
     time ./darknet detector train data/rareplanes_train.data  cfg/yoltv4_rareplanes.cfg weights/yolov4.conv.137  -dont_show -mjpeg_port 8090 -map
    
  2. Review progress (plotted at: /yoltv4/darknet/chart_yoltv4_rareplanes.png).


2. Test

A. Prepare Data

  1. Make sliced images (see yoltv4/notebooks/train_test_pipeline.ipynb for further details).

  2. Create a txt file listing the test images.

  3. Create an obj.data file in the directory yoltv4/darknet/data that points to the files above. For example:

    /yoltv4/darknet/data/rareplanes_test.data

     classes = 30
     train =
     valid =  /local_data/cosmiq/wdata/rareplanes/test/txt/test.txt
     names =  /yoltv4/darknet/data/rareplanes.name
     backup = backup/

B. Execute Testing

  1. Execute (proceeds at >80 frames per second on a Tesla P100):

     cd /yoltv4/darknet
     time ./darknet detector valid data/rareplanes_test.data cfg/yoltv4_rareplanes.cfg backup/yoltv4_rareplanes_best.weights
    
  2. Post-process detections:

    A. Move detections into results directory

     mkdir /yoltv4/darknet/results/rareplanes_preds_v0
     mkdir  /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt
     mv /yoltv4/darknet/results/comp4_det_test_*  /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/
    

    B. Stitch detections back together and make plots

     time python /yoltv4/yoltv4/post_process.py \
         --pred_dir=/yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/ \
         --raw_im_dir=/local_data/cosmiq/wdata/rareplanes/test/images/ \
         --sliced_im_dir=/local_data/cosmiq/wdata/rareplanes/test/yoltv4/images_slice/ \
          --out_dir=/yoltv4/darknet/results/rareplanes_preds_v0 \
          --detection_thresh=0.25 \
          --slice_size=416 \
         --n_plots=8
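
    post_process.py handles this stitching step; the underlying idea is sketched below. The sketch assumes (a) that darknet's valid mode writes VOC-style detection files whose lines are "slice_id score x1 y1 x2 y2", and (b) a hypothetical slice-naming convention "scene__x0_y0" that encodes each slice's pixel offset in its parent image. The real naming convention comes from the slicing code in the notebook, so treat this purely as an illustration of shifting boxes back to parent-image coordinates and suppressing duplicates from overlapping slices.

      # Illustrative stitching of per-slice detections, not the repository's post_process.py.
      import glob
      import numpy as np

      def load_detections(pred_dir):
          """Read comp4_det_test_<class>.txt files into (class, slice_id, score, box) rows."""
          rows = []
          for path in glob.glob(pred_dir + "/comp4_det_test_*.txt"):
              cls = path.split("comp4_det_test_")[-1].replace(".txt", "")
              with open(path) as f:
                  for line in f:
                      slice_id, score, x1, y1, x2, y2 = line.split()
                      rows.append((cls, slice_id, float(score),
                                   float(x1), float(y1), float(x2), float(y2)))
          return rows

      def to_global(rows):
          """Shift each slice-frame box by that slice's (x0, y0) offset in the parent image."""
          out = []
          for cls, slice_id, score, x1, y1, x2, y2 in rows:
              scene, offsets = slice_id.split("__")          # hypothetical naming convention
              x0, y0 = (int(v) for v in offsets.split("_")[:2])
              out.append((cls, scene, score, x1 + x0, y1 + y0, x2 + x0, y2 + y0))
          return out

      def nms(boxes, scores, iou_thresh=0.5):
          """Greedy non-maximum suppression over an (N, 4) array of x1, y1, x2, y2 boxes."""
          boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
          order = scores.argsort()[::-1]
          keep = []
          while order.size:
              i = order[0]
              keep.append(int(i))
              xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
              yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
              xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
              yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
              inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
              areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
              iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-9)
              order = order[1:][iou < iou_thresh]
          return keep

    Judging by its flags, the real script additionally applies the --detection_thresh cutoff and renders the --n_plots figures; refer to post_process.py for the authoritative pipeline.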
    

Outputs will look something like the figures below:

[example prediction figures]
