This repository is based on Ultralytics/yolov5, with adjustments to enable polygon prediction boxes.

Overview

Polygon-Yolov5

Section I. Description

The code is based on Ultralytics/yolov5, with several functions added or modified to enable polygon prediction boxes.

The modifications compared with Ultralytics/yolov5 and their brief descriptions are summarized below:

  1. data/polygon_ucas.yaml : Exemplar UCAS-AOD dataset to test the effects of polygon boxes

  2. data/images/UCAS-AOD : For the inference of polygon-yolov5s-ucas.pt

  3. models/common.py :
    3.1. class Polygon_NMS : Non-Maximum Suppression (NMS) module for Polygon Boxes
    3.2. class Polygon_AutoShape : Polygon Version of Original AutoShape, input-robust polygon model wrapper for passing cv2/np/PIL/torch inputs. Includes preprocessing, inference and Polygon_NMS
    3.3. class Polygon_Detections : Polygon detections class for Polygon-YOLOv5 inference results

  4. models/polygon_yolov5s_ucas.yaml : Configuration file of polygon yolov5s for exemplar UCAS-AOD dataset

  5. models/yolo.py :
    5.1. class Polygon_Detect : Detect head for polygon yolov5 models with polygon box prediction
    5.2. class Polygon_Model : Polygon yolov5 models with polygon box prediction

  6. utils/iou_cuda : CUDA extension for iou computation of polygon boxes
    6.1. extensions.cpp : CUDA extension file
    6.2. inter_union_cuda.cu : CUDA code for computing iou of polygon boxes
    6.3. setup.py : for building CUDA extensions module polygon_inter_union_cuda, with two functions polygon_inter_union_cuda and polygon_b_inter_union_cuda

  7. utils/autoanchor.py :
    7.1. def polygon_check_anchors : Polygon version of original check_anchors
    7.2. def polygon_kmean_anchors : Create kmeans-evolved anchors from a polygon-enabled training dataset, using the minimum outer bounding boxes as approximations

  8. utils/datasets.py :
    8.1. def polygon_random_perspective : Data augmentation for datasets with polygon boxes (augmentation effects: HSV-Hue, HSV-Saturation, HSV-Value, rotation, translation, scale, shear, perspective, flip up-down, flip left-right, mosaic, mixup)
    8.2. def polygon_box_candidates : Polygon version of original box_candidates
    8.3. class Polygon_LoadImagesAndLabels : Polygon version of original LoadImagesAndLabels
    8.4. def polygon_load_mosaic : Loads images in a 4-mosaic, with polygon boxes
    8.5. def polygon_load_mosaic9 : Loads images in a 9-mosaic, with polygon boxes
    8.6. def polygon_verify_image_label : Verify one image-label pair for polygon datasets
    8.7. def create_dataloader : Has been modified to include polygon datasets

  9. utils/general.py :
    9.1. def xyxyxyxyn2xyxyxyxy : Convert normalized xyxyxyxy or segments into pixel xyxyxyxy or segments
    9.2. def polygon_segment2box : Convert 1 segment label to 1 polygon box label
    9.3. def polygon_segments2boxes : Convert segment labels to polygon box labels
    9.4. def polygon_scale_coords : Rescale polygon coords (xyxyxyxy) from img1_shape to img0_shape
    9.5. def polygon_clip_coords : Clip polygon bounding boxes (xyxyxyxy) to image shape (height, width)
    9.6. def polygon_inter_union_cpu : iou computation (polygon) with cpu
    9.7. def polygon_box_iou : Compute iou of polygon boxes via cpu or cuda
    9.8. def polygon_b_inter_union_cpu : iou computation (polygon) with cpu for class Polygon_ComputeLoss in loss.py
    9.9. def polygon_bbox_iou : Compute iou of polygon boxes for class Polygon_ComputeLoss in loss.py via cpu or cuda
    9.10. def polygon_non_max_suppression : Runs Non-Maximum Suppression (NMS) on inference results for polygon boxes
    9.11. def polygon_nms_kernel : Non maximum suppression kernel for polygon-enabled boxes
    9.12. def order_corners : Return sorted corners for loss.py::class Polygon_ComputeLoss::build_targets

  10. utils/loss.py :
    10.1. class Polygon_ComputeLoss : Compute loss for polygon boxes

  11. utils/metrics.py :
    11.1. class Polygon_ConfusionMatrix : Polygon version of original ConfusionMatrix

  12. utils/plots.py :
    12.1. def polygon_plot_one_box : Plot one polygon box on image
    12.2. def polygon_plot_one_box_PIL : Plot one polygon box on image via PIL
    12.3. def polygon_output_to_target : Convert model output to target format (batch_id, class_id, x1, y1, x2, y2, x3, y3, x4, y4, conf)
    12.4. def polygon_plot_images : Polygon version of original plot_images
    12.5. def polygon_plot_test_txt : Polygon version of original plot_test_txt
    12.6. def polygon_plot_targets_txt : Polygon version of original plot_targets_txt
    12.7. def polygon_plot_labels : Polygon version of original plot_labels

  13. polygon_train.py : For training polygon-yolov5 models

  14. polygon_test.py : For testing polygon-yolov5 models

  15. polygon_detect.py : For running detection (inference) with polygon-yolov5 models

  16. requirements.txt : Added Python package shapely
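
Several of the utilities listed above (polygon_inter_union_cpu, polygon_box_iou, polygon_bbox_iou) revolve around the IoU of two polygon boxes. The following is a minimal pure-Python sketch of that idea for convex quadrilaterals, using Sutherland-Hodgman clipping and the shoelace formula. It is not the repository's implementation, which may rely on shapely or on the CUDA extension; the names polygon_area, clip_convex and polygon_iou are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(poly: List[Point]) -> float:
    """Signed area via the shoelace formula (positive for counter-clockwise order)."""
    area = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        area += x1 * y2 - x2 * y1
    return area / 2.0


def clip_convex(subject: List[Point], clipper: List[Point]) -> List[Point]:
    """Sutherland-Hodgman: clip `subject` against a convex, counter-clockwise `clipper`."""

    def inside(p: Point, a: Point, b: Point) -> bool:
        # True if p lies on the left of (or on) the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Intersection of the line through p, q with the line through a, b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        points, output = output, []
        for j in range(len(points)):
            p, q = points[j - 1], points[j]
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))
        if not output:
            break  # nothing left: the polygons do not overlap
    return output


def polygon_iou(poly_a: List[Point], poly_b: List[Point]) -> float:
    """IoU of two convex polygons given as lists of (x, y) corners."""
    # Normalise both polygons to counter-clockwise order (assumption of clip_convex).
    if polygon_area(poly_a) < 0:
        poly_a = poly_a[::-1]
    if polygon_area(poly_b) < 0:
        poly_b = poly_b[::-1]
    inter_poly = clip_convex(poly_a, poly_b)
    inter = abs(polygon_area(inter_poly)) if len(inter_poly) >= 3 else 0.0
    union = polygon_area(poly_a) + polygon_area(poly_b) - inter
    return inter / union if union > 0 else 0.0


box_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
box_b = [(1, 0), (3, 0), (3, 2), (1, 2)]
print(polygon_iou(box_a, box_b))  # overlap area 2, union area 6
```

The sketch only handles convex polygons; concave shapes would need a general polygon clipping routine such as the one shapely provides.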

Section II. How Do Polygon Boxes Work? How Do Polygon Boxes Differ from Axis-Aligned Boxes?

  1. build_targets in class Polygon_ComputeLoss & forward in class Polygon_Detect

  2. order_corners in general.py

  3. Illustrations of box loss of polygon boxes
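
The same quadrilateral can be written down with several different corner sequences (any starting corner, either direction), so regression targets generally need one canonical order. The sketch below shows one possible convention: sort corners by angle around the centroid, then start from the top-most (then left-most) corner. This convention is an assumption made for illustration; the text above does not describe the ordering that order_corners in utils/general.py actually uses, and the function name order_corners_by_angle is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def order_corners_by_angle(corners: List[Point]) -> List[Point]:
    """Return the corners in a canonical order (assumed convention, see lead-in)."""
    cx = sum(x for x, _ in corners) / len(corners)
    cy = sum(y for _, y in corners) / len(corners)
    # Sort by polar angle around the centroid; in image coordinates (y pointing
    # down) increasing angle corresponds to a clockwise walk on screen.
    ordered = sorted(corners, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Rotate the sequence so it starts at the top-most, then left-most corner.
    start = min(range(len(ordered)), key=lambda i: (ordered[i][1], ordered[i][0]))
    return ordered[start:] + ordered[:start]


shuffled = [(4, 2), (0, 0), (0, 2), (4, 0)]
print(order_corners_by_angle(shuffled))
```

With such a convention, two labels describing the same box always produce the same 8-value target, regardless of how the annotator listed the corners.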

Section III. Installation

To build the CUDA extension without errors, please use CUDA version >= 11.2. The code has been verified on Ubuntu 16.04 with a Tesla K80 GPU.

# The following commands install CUDA 11.2 from scratch on Ubuntu 16.04; skip them if CUDA is already installed
# If you are using another system version, please check https://tutorialforlinux.com/2019/12/01/how-to-add-cuda-repository-for-ubuntu-based-oses-2/
# Install Ubuntu kernel headers
sudo apt install linux-headers-$(uname -r)

# Pinning CUDA repo
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
sudo mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
# Add CUDA GPG key
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
# Setting up CUDA repo
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/ /"
# Refresh apt repositories
sudo apt update
# Installing CUDA 11.2
sudo apt install cuda-11-2 -y
sudo apt install cuda-toolkit-11-2 -y
# Setting up path
echo 'export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}' >> $HOME/.bashrc
# You are done installing CUDA 11.2

# Check NVIDIA
nvidia-smi
# Update all apts
sudo apt-get update
sudo apt-get -y upgrade

# Begin installing python 3.7
curl -o ~/miniconda.sh -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ~/miniconda.sh
~/miniconda.sh -b
echo "PATH=~/miniconda3/bin:$PATH" >> ~/.bashrc
source ~/.bashrc
conda install -y python=3.7
# You are done installing python

The following commands set up Polygon Yolov5.

# clone git repo
git clone https://github.com/XinzeLee/PolygonObjectDetection
cd PolygonObjectDetection/polygon-yolov5
# install python package requirements
pip install -r requirements.txt
# install CUDA extensions
cd utils/iou_cuda
python setup.py install
# cd back to polygon-yolov5 folder
cd .. && cd ..
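
After the build step, it can be useful to confirm that Python can actually find the extension before starting a long training run. The check below is a small sketch using only the standard library. The module name polygon_inter_union_cuda is taken from the description of setup.py in Section I; the helper name cuda_iou_extension_available is hypothetical.

```python
import importlib.util


def cuda_iou_extension_available(module_name: str = "polygon_inter_union_cuda") -> bool:
    """Return True if a top-level module with this name can be found on sys.path."""
    # find_spec locates the module without importing it, so this is safe to
    # call even on a machine without a GPU.
    return importlib.util.find_spec(module_name) is not None


if cuda_iou_extension_available():
    print("CUDA IoU extension found")
else:
    print("CUDA IoU extension not found; check the output of `python setup.py install`")
```

Finding the module does not prove it was compiled against the CUDA version in use; it only rules out a failed or missing installation.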

Section IV. Polygon-Tutorial 1: Deploy the Polygon Yolov5s

Try Polygon Yolov5s Model by Following Polygon-Tutorial 1

  1. Inference
     $ python polygon_detect.py --weights polygon-yolov5s-ucas.pt --img 1024 --conf 0.75 \
         --source data/images/UCAS-AOD --iou-thres 0.4 --hide-labels

  2. Test
     $ python polygon_test.py --weights polygon-yolov5s-ucas.pt --data polygon_ucas.yaml \
         --img 1024 --iou 0.65 --task val

  3. Train
     $ python polygon_train.py --weights polygon-yolov5s-ucas.pt --cfg polygon_yolov5s_ucas.yaml \
         --data polygon_ucas.yaml --hyp hyp.ucas.yaml --img-size 1024 \
         --epochs 3 --batch-size 12 --noautoanchor --polygon --cache

  4. Performance
    4.1. Confusion Matrix

    4.2. Precision Curve

    4.3. Recall Curve

    4.4. Precision-Recall Curve

    4.5. F1 Curve

Section V. Polygon-Tutorial 2: Transform COCO Dataset to Polygon Labels Using Segmentation

Transform COCO Dataset to Polygon Labels by Following [Polygon-Tutorial 2](https://github.com/XinzeLee/PolygonObjectDetection/blob/main/polygon-yolov5/Polygon-Tutorial2.ipynb)

Transformed Exemplar Figure
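
As a rough illustration of what such a transformation can look like, the sketch below turns one COCO polygon segmentation (a flat list of pixel coordinates x1, y1, x2, y2, ...) into a four-corner label line. It uses a minimum-area enclosing rectangle computed in pure Python, which is one common choice and not necessarily what the notebook does. The label layout "class x1 y1 x2 y2 x3 y3 x4 y4" with coordinates normalized by image width and height is an assumption, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    upper: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points: Sequence[Point]) -> List[Point]:
    """Four corners of the smallest rectangle aligned with some hull edge."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best_area, best_corners = math.inf, []
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        length = math.hypot(x2 - x1, y2 - y1)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length  # unit vector along the edge
        vx, vy = -uy, ux  # unit normal to the edge
        us = [x * ux + y * uy for x, y in hull]  # projections on the edge direction
        vs = [x * vx + y * vy for x, y in hull]  # projections on the normal
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if area < best_area:
            best_area = area
            extents = ((min(us), min(vs)), (max(us), min(vs)),
                       (max(us), max(vs)), (min(us), max(vs)))
            best_corners = [(u * ux + v * vx, u * uy + v * vy) for u, v in extents]
    return best_corners


def coco_segment_to_label(segment: Sequence[float], class_id: int,
                          img_w: int, img_h: int) -> str:
    """Build one label line in the assumed 'class x1 y1 ... x4 y4' normalized layout."""
    points = list(zip(segment[0::2], segment[1::2]))
    coords: List[float] = []
    for x, y in min_area_rect(points):
        # Normalize to [0, 1] and clip corners that fall outside the image.
        coords += [min(max(x / img_w, 0.0), 1.0), min(max(y / img_h, 0.0), 1.0)]
    return " ".join([str(class_id)] + [f"{c:.6f}" for c in coords])


segment = [10, 20, 50, 20, 50, 40, 10, 40, 30, 20]
label = coco_segment_to_label(segment, class_id=0, img_w=100, img_h=100)
print(label)
```

A rotated rectangle keeps labels compact but discards shape detail; datasets whose objects are not roughly rectangular may be better served by a different four-point approximation.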

Section VI. Expansion to More Than Four Corners


Section VII. References

Comments
  • NMS time limit 10.0s exceeded

    Thanks for sharing great works!

    I am trying to train on the COCO dataset. When calculating mAP on the val data, I got the warning below.

    WARNING: NMS time limit 10.0s exceeded

    I think that when many bboxes are found, NMS costs too much, so the time limit is exceeded.

    So I changed conf_thres to 0.1 (the default is 0.001), and it works with no warning.

    But I am afraid that this change will affect learning performance. What do you think?

    opened by tak-s 10
  • Strange behaviour in overlapping bounding boxes

    @XinzeLee I have an issue when there are two adjacent or overlapping objects that I want to detect. I created a diagram to give an example.

    PolygonProblem

    The objects I am trying to detect have a rectangular shape and are represented in the image by black rectangles with grey outlines. The bounding boxes predicted by the model are in red.

    As you can see, boxes 1 and 3 are correct and box 2 is incorrect. Increasing the confidence or decreasing the IoU threshold causes only box 2 to be visible.

    This happens in almost every case where 2 or more objects are close together.

    Any idea on the root of the problem?

    opened by AntMorais 2
  • TypeError: test() got an unexpected keyword argument 'polygon'

    https://github.com/XinzeLee/PolygonObjectDetection/blob/f3333f560a08b7fccba4285f0c99cd5af03dc45a/polygon-yolov5/polygon_train.py#L444

    The parameter 'polygon' is not defined in test(): https://github.com/XinzeLee/PolygonObjectDetection/blob/f3333f560a08b7fccba4285f0c99cd5af03dc45a/polygon-yolov5/polygon_test.py#L25

    opened by tak-s 2
  • What is polygon yolov5 mAP on UCAS dataset

    First, thanks for your work. I have a question: did you test the polygon-yolov5 model on UCAS? I want to compare its results with other object detectors. Results for some current SOTA satellite-image object detectors are in this repo: https://github.com/ming71/UCAS-AOD-benchmark

    opened by vpeopleonatank 2
  • about multi-scale argument during training

    Does the argument help improve mAP? I'm asking because we are generating anchors for a fixed image size. Since the multi-scale arg varies the image size by -/+50%, will it have a negative effect on mAP?

    opened by nsabir2011 1
  • Expected all tensors to be on the same device, but found at least two devices, cuda: 0 and cpu!

    When following the first tutorial on Google Colab, I am trying to run !python polygon_test.py --weights polygon-yolov5s-ucas.pt --data polygon_ucas.yaml --img 1024 --iou 0.65 --task val --device 0 as in the example. I get the following error:

    Traceback (most recent call last):
      File "polygon_test.py", line 325, in <module>
        test(**vars(opt))
      File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "polygon_test.py", line 224, in test
        for j in (ious > iouv[index_ap50]).nonzero(as_tuple=False):
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

    I have not modified any code or data and cannot figure out where the issue is. Any help would be much appreciated. Thank you!

    opened by sac3tf 1
  • custom data polygon transform using Polygon-Tutorial2.ipynb

    hello @XinzeLee ,

    I'm using my custom dataset and have a JSON file in COCO format. I'm trying to use Polygon-Tutorial2.ipynb for it, but it throws an error. Can you please help me run it and get rotated bounding boxes from polygon annotations? Thank you in advance.

    This is the error I'm getting:
    Traceback (most recent call last):
      File "C:/yolo/tranform.py", line 173, in <module>
        main()
      File "C:/yolo/tranform.py", line 170, in main
        seg2poly(r'C:\Users\exp', plot=True)
      File "C:/yolo/tranform.py", line 62, in seg2poly
        img_dir = img_dir / prefix
    UnboundLocalError: local variable 'prefix' referenced before assignment

    opened by apanand14 1
  • How to use this along with basic yolov5

    My project reads license plates. I use your polygon model to detect the 4 corners of the plate and transform it, then use the basic yolov5 model to detect the characters. But I got this error when using both models in one runtime. There is no problem if I use them separately.

    opened by NMT201 0
  • Two error

    1. 'Upsample' object has no attribute 'recompute_scale_factor'
       Edit torch==1.10.0 in requirements.txt

    2. ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.
       For CPU polygon_detect.py, in lines 813 and 815 of utils/general.py:
       Edit boxes1[i, :].view(4,2) -> boxes1[i, :].view(4,2).numpy()
       Edit boxes2[j, :].view(4,2) -> boxes2[j, :].view(4,2).numpy()

    opened by tdat97 0
  • Invalid sos parameters for sequential JPEG and zero box value during training

    • I'm using JPG images and get "Invalid SOS parameters for sequential JPEG".
    • I get zero for the box value, P, R and mAP when training.
    • How do I choose this value, and if I comment it out, will I get an error?
    opened by Aun0124 0
  • RuntimeError: result type Float can't be cast to the desired output type long int

    Facing error while training. Training command:
    !python polygon_train.py --weights yolov5s.pt --cfg polygon_yolov5s_ucas.yaml \
        --data data/custom.yaml --hyp hyp.ucas.yaml --img-size 1024 \
        --epochs 3 --batch-size 12 --noautoanchor --polygon --cache

    opened by poojatambe 0
  • when I train my own dataset.P,R,map is zero

    I annotated some images and found that training starts successfully, but P, R and mAP are zero all the time. I trained for 200 epochs.

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
    64/199     3.17G   0.03754   0.01328         0   0.05082        11       640: 100%|█| 21/21 [00:02<00:00,  9.
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|█| 2/2 [00:00<00:00,
                 all         19          0          0          0          0          0

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
    65/199     3.17G   0.03064    0.0133         0   0.04394         9       640: 100%|█| 21/21 [00:02<00:00,  9.
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|█| 2/2 [00:00<00:00,
                 all         19          0          0          0          0          0

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
    66/199     3.17G   0.03182   0.01224         0   0.04405        12       640:  62%|▌| 13/21 [00:01<00:00,  9.
    
    opened by futureflsl 1
Releases(v1.0)
Owner
xinzelee