这是一个yolox-pytorch的源码,可以用于训练自己的模型。

Overview

YOLOX:You Only Look Once目标检测模型在Pytorch当中的实现


目录

  1. 性能情况 Performance
  2. 实现的内容 Achievement
  3. 所需环境 Environment
  4. 小技巧的设置 TricksSet
  5. 文件下载 Download
  6. 训练步骤 How2train
  7. 预测步骤 How2predict
  8. 评估步骤 How2eval
  9. 参考资料 Reference

性能情况

训练数据集 权值文件名称 测试数据集 输入图片大小 mAP 0.5:0.95 mAP 0.5
COCO-Train2017 yolox_s.pth COCO-Val2017 640x640 38.2 57.7
COCO-Train2017 yolox_m.pth COCO-Val2017 640x640 44.8 63.9
COCO-Train2017 yolox_l.pth COCO-Val2017 640x640 47.9 66.6
COCO-Train2017 yolox_x.pth COCO-Val2017 640x640 49.0 67.7

实现的内容

  • 主干特征提取网络:使用了Focus网络结构。
  • 分类回归层:Decoupled Head,在YoloX中,Yolo Head被分为了分类回归两部分,最后预测的时候才整合在一起。
  • 训练用到的小技巧:Mosaic数据增强、CIOU(原版是IOU和GIOU,CIOU效果类似,都是IOU系列的,甚至更新一些)、学习率余弦退火衰减。
  • Anchor Free:不使用先验框
  • SimOTA:为不同大小的目标动态匹配正样本。

所需环境

pytorch==1.2.0

小技巧的设置

在train.py文件下:
1、mosaic参数可用于控制是否实现Mosaic数据增强。
2、Cosine_scheduler可用于控制是否使用学习率余弦退火衰减。
3、label_smoothing可用于控制是否Label Smoothing平滑。

文件下载

训练所需的权值可在百度网盘中下载。
链接: https://pan.baidu.com/s/1OnM-uWKETFJh_uFCAK6Vlg 提取码: b6km

VOC数据集下载地址如下:
VOC2007+2012训练集
链接: https://pan.baidu.com/s/16pemiBGd-P9q2j7dZKGDFA 提取码: eiw9

VOC2007测试集
链接: https://pan.baidu.com/s/1BnMiFwlNwIWG9gsd4jHLig 提取码: dsda

训练步骤

a、数据集的准备

1、本文使用VOC格式进行训练,训练前需要自己制作好数据集,如果没有自己的数据集,可以通过Github连接下载VOC12+07的数据集尝试下。
2、训练前将标签文件放在VOCdevkit文件夹下的VOC2007文件夹下的Annotation中。
3、训练前将图片文件放在VOCdevkit文件夹下的VOC2007文件夹下的JPEGImages中。

b、数据集的预处理

1、训练数据集时,在model_data文件夹下建立一个cls_classes.txt,里面写所需要区分的类别。
2、设置根目录下的voc_annotation.py里的一些参数。第一次训练可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,即:

classes_path = 'model_data/cls_classes.txt'

model_data/cls_classes.txt文件内容为:

cat
dog
...

3、设置完成后运行voc_annotation.py,生成训练所需的2007_train.txt以及2007_val.txt。

c、开始网络训练

1、通过voc_annotation.py,我们已经生成了2007_train.txt以及2007_val.txt,此时我们可以开始训练了。
2、设置根目录下的train.py里的一些参数。第一次训练可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,设置方式与b、数据集的预处理类似。训练自己的数据集必须要修改!
3、设置完成后运行train.py开始训练了,在训练多个epoch后,权值会生成在logs文件夹中。
4、训练的参数较多,大家可以在下载库后仔细看注释,其中最重要的部分依然是train.py里的classes_path。

d、训练结果预测

1、训练结果预测需要用到两个文件,分别是yolo.py和predict.py。
2、设置根目录下的yolo.py里的一些参数。第一次预测可以仅修改model_path以及classes_path。训练自己的数据集必须要修改。model_path指向训练好的权值文件,在logs文件夹里。classes_path指向检测类别所对应的txt。
3、设置完成后运行predict.py开始预测了,具体细节查看预测步骤。
4、预测的参数较多,大家可以在下载库后仔细看注释,其中最重要的部分依然是yolo.py里的model_path以及classes_path。

预测步骤

a、使用预训练权重

1、下载完库后解压,在百度网盘下载各个权值,放入model_data,默认使用yolox_s.pth,其它可调整,运行predict.py,输入

img/street.jpg

2、在predict.py里面进行设置可以进行video视频检测、fps测试、批量文件测试与保存。

b、使用自己训练的权重

1、按照训练步骤训练。
2、在yolo.py文件里面,在如下部分修改model_path和classes_path使其对应训练好的文件;model_path对应logs文件夹下面的权值文件,classes_path是model_path对应分的类

_defaults = {
    #--------------------------------------------------------------------------#
    #   使用自己训练好的模型进行预测一定要修改model_path和classes_path!
    #   model_path指向logs文件夹下的权值文件,classes_path指向model_data下的txt
    #   如果出现shape不匹配,同时要注意训练时的model_path和classes_path参数的修改
    #--------------------------------------------------------------------------#
    "model_path"        : 'model_data/yolox_s.pth',
    "classes_path"      : 'model_data/coco_classes.txt',
    #---------------------------------------------------------------------#
    #   输入图片的大小,必须为32的倍数。
    #---------------------------------------------------------------------#
    "input_shape"       : [640, 640],
    #---------------------------------------------------------------------#
    #   所使用的YoloX的版本。s、m、l、x
    #---------------------------------------------------------------------#
    "phi"               : 's',
    #---------------------------------------------------------------------#
    #   只有得分大于置信度的预测框会被保留下来
    #---------------------------------------------------------------------#
    "confidence"        : 0.5,
    #---------------------------------------------------------------------#
    #   非极大抑制所用到的nms_iou大小
    #---------------------------------------------------------------------#
    "nms_iou"           : 0.3,
    #---------------------------------------------------------------------#
    #   该变量用于控制是否使用letterbox_image对输入图像进行不失真的resize,
    #   在多次测试后,发现关闭letterbox_image直接resize的效果更好
    #---------------------------------------------------------------------#
    "letterbox_image"   : True,
    #-------------------------------#
    #   是否使用Cuda
    #   没有GPU可以设置成False
    #-------------------------------#
    "cuda"              : True,
}

3、运行predict.py,输入

img/street.jpg

4、在predict.py里面进行设置可以进行video视频检测、fps测试、批量文件测试与保存。

评估步骤

1、本文使用VOC格式进行评估。
2、划分测试集,如果在训练前已经运行过voc_annotation.py文件,代码会自动将数据集划分成训练集、验证集和测试集。
3、如果想要修改测试集的比例,可以修改voc_annotation.py文件下的trainval_percent。trainval_percent用于指定(训练集+验证集)与测试集的比例,默认情况下 (训练集+验证集):测试集 = 9:1。train_percent用于指定(训练集+验证集)中训练集与验证集的比例,默认情况下 训练集:验证集 = 9:1。
4、设置根目录下的yolo.py里的一些参数。第一次评估可以仅修改model_path以及classes_path。训练自己的数据集必须要修改。model_path指向训练好的权值文件,在logs文件夹里。classes_path指向检测类别所对应的txt。
5、设置根目录下的get_map.py里的一些参数。第一次评估可以仅修改classes_path,classes_path用于指向检测类别所对应的txt,评估自己的数据集必须要修改。与yolo.py中分开设置的原因是可以让使用者自己选择评估什么类别,而非所有类别。
6、运行get_map.py即可获得评估结果,评估结果会保存在map_out文件夹中。

Reference

https://github.com/Megvii-BaseDetection/YOLOX

Comments
  • 在使用YOLOX模型 对视频进行预测时,出现了如下错误

    在使用YOLOX模型 对视频进行预测时,出现了如下错误

    在使用YOLOX模型 对视频进行预测时,出现了如下错误: cv2.error: OpenCV(4.5.4-dev) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor' image

    请问如何解决呀?

    opened by MasterMiao919 7
  • 训练自己数据,MAP出现问题

    训练自己数据,MAP出现问题

    hi,博主 关于训练自己的数据集, 已经将对应格式的文件放到相同路径下的文件夹内,新添了自己的cls.txt。训练完成后,也有训练框。 但是,测试map时,仍然有原voc的测试类别,想问一下这是什么情况呢? ( classes_path = 'model_data/fire.txt'已经就改)

    opened by theDeep1nteresting 3
  • 网络输出的代码错了吧?output      = torch.cat([reg_output, obj_output, cls_output], 1)

    网络输出的代码错了吧?output = torch.cat([reg_output, obj_output, cls_output], 1)

    复现代码是 output = torch.cat([reg_output, obj_output, cls_output], 1) 源代码是 output = torch.cat( [reg_output, obj_output.sigmoid(), cls_output.sigmoid()], 1 ) 复现代码没加激活函数啊?

    opened by mepleleo 1
  • ModuleNotFoundError: No module named 'models'

    ModuleNotFoundError: No module named 'models'

    Traceback (most recent call last): File "O:\graduate\yolov7-bubbliiiing\predict.py", line 15, in yolo = YOLO() File "O:\graduate\yolov7-bubbliiiing\yolo.py", line 95, in init self.generate() File "O:\graduate\yolov7-bubbliiiing\yolo.py", line 108, in generate self.net.load_state_dict(torch.load(self.model_path, map_location=device)) File "D:\Anaconda\envs\pytorch-gpu\lib\site-packages\torch\serialization.py", line 592, in load return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args) File "D:\Anaconda\envs\pytorch-gpu\lib\site-packages\torch\serialization.py", line 851, in _load result = unpickler.load() ModuleNotFoundError: No module named 'models'

    这是跑大佬您的yolov7时运行predict.py的问题,发错地方了😂

    opened by lip111 1
  • involution卷积替换问题

    involution卷积替换问题

    up,最近看了一些论文显示involution卷积效果不错,想来替换试试,但是involution官方代码,参数和yolox的不太匹配,调整了好久都一直报错,能麻烦up指点一下 参数该如何修改呢qwq? import torch.nn as nn from mmcv.cnn import ConvModule

    class involution(nn.Module):

    def __init__(self,
                 channels,
                 kernel_size,
                 stride):
        super(involution, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.channels = channels
        reduction_ratio = 4
        self.group_channels = 16
        self.groups = self.channels // self.group_channels
        self.conv1 = ConvModule(
            in_channels=channels,
            out_channels=channels // reduction_ratio,
            kernel_size=1,
            conv_cfg=None,
            norm_cfg=dict(type='BN'),
            act_cfg=dict(type='ReLU'))
        self.conv2 = ConvModule(
            in_channels=channels // reduction_ratio,
            out_channels=kernel_size**2 * self.groups,
            kernel_size=1,
            stride=1,
            conv_cfg=None,
            norm_cfg=None,
            act_cfg=None)
        if stride > 1:
            self.avgpool = nn.AvgPool2d(stride, stride)
        self.unfold = nn.Unfold(kernel_size, 1, (kernel_size-1)//2, stride)
    
    def forward(self, x):
        weight = self.conv2(self.conv1(x if self.stride == 1 else self.avgpool(x)))
        b, c, h, w = weight.shape
        weight = weight.view(b, self.groups, self.kernel_size**2, h, w).unsqueeze(2)
        out = self.unfold(x).view(b, self.groups, self.group_channels, self.kernel_size**2, h, w)
        out = (weight * out).sum(dim=3).view(b, self.channels, h, w)
        return out
    
    opened by right135 3
Releases(v2.1)
Owner
Bubbliiiing
Bubbliiiing
PlenOctree Extraction algorithm

PlenOctrees_NeRF-SH This is an implementation of the Paper PlenOctrees for Real-time Rendering of Neural Radiance Fields. Not only the code provides t

49 Nov 05, 2022
Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

xTune Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning. Environment DockerFile: dancingsoul/pytorch:xTune Install the f

Bo Zheng 42 Dec 09, 2022
Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

SML (ICCV 2021, Oral) : Official Pytorch Implementation This repository provides the official PyTorch implementation of the following paper: Standardi

SangHun 61 Dec 27, 2022
Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Getting Started This repository contains the code used for the following publications: Probabilistic Guarantees for Safe Deep Reinforcement Learning (

Edoardo Bacci 5 Aug 31, 2022
This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

ObjProp Introduction This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Insta

Anirudh S Chakravarthy 6 May 03, 2022
Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments Paper: arXiv (ICRA 2021) Video : https://youtu.be/CC

Sachini Herath 68 Jan 03, 2023
Flower classification model that classifies flowers in 10 classes made using transfer learning (~85% accuracy).

flower-classification-inceptionV3 Flower classification model that classifies flowers in 10 classes. Training and validation are done using a pre-anot

Ivan R. Mršulja 1 Dec 12, 2021
基于pytorch构建cyclegan示例

cyclegan-demo 基于Pytorch构建CycleGAN示例 如何运行 准备数据集 将数据集整理成4个文件,分别命名为 trainA, trainB:训练集,A、B代表两类图片 testA, testB:测试集,A、B代表两类图片 例如 D:\CODE\CYCLEGAN-DEMO\DATA

Koorye 3 Oct 18, 2022
ivadomed is an integrated framework for medical image analysis with deep learning.

Repository on the collaborative IVADO medical imaging project between the Mila and NeuroPoly labs.

144 Dec 19, 2022
Simple cross-platform application for DaVinci surgical video frame annotation

About DaVid is a simple cross-platform GUI for annotating robotic and endoscopic surgical actions for use in deep-learning research. Features Simple a

Cyril Zakka 4 Oct 09, 2021
A sample pytorch Implementation of ACL 2021 research paper "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

Span-ASTE-Pytorch This repository is a pytorch version that implements Ali's ACL 2021 research paper Learning Span-Level Interactions for Aspect Senti

来自丹麦的天籁 10 Dec 06, 2022
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

📖 Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) 🔥 If DaGAN is helpful in your photos/projects, please hel

Fa-Ting Hong 503 Jan 04, 2023
Cobalt Strike teamserver detection.

Cobalt-Strike-det Cobalt Strike teamserver detection. usage: cobaltstrike_verify.py [-l TARGETS] [-t THREADS] optional arguments: -h, --help show this

TimWhite 17 Sep 27, 2022
PixelPick This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick."

PixelPick This is an official implementation of the paper "All you need are a few pixels: semantic segmentation with PixelPick." [Project page] [Paper

Gyungin Shin 59 Sep 25, 2022
Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

APR The repo for the paper Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study. Environment setu

ielab 8 Nov 26, 2022
ReferFormer - Official Implementation of ReferFormer

The official implementation of the paper: Language as Queries for Referring Video Object Segmentation Language as Queries for Referring Video Object S

Jonas Wu 232 Dec 29, 2022
Code for "The Box Size Confidence Bias Harms Your Object Detector"

The Box Size Confidence Bias Harms Your Object Detector - Code Disclaimer: This repository is for research purposes only. It is designed to maintain r

Johannes G. 24 Dec 07, 2022
OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

OrienMask This repository implements the framework OrienMask for real-time instance segmentation. It achieves 34.8 mask AP on COCO test-dev at the spe

45 Dec 13, 2022
Implementations of orthogonal and semi-orthogonal convolutions in the Fourier domain with applications to adversarial robustness

Orthogonalizing Convolutional Layers with the Cayley Transform This repository contains implementations and source code to reproduce experiments for t

CMU Locus Lab 36 Dec 30, 2022
Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini!

ConversorDeMedidas_CapuccinoGelado Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini! Requirem

Arthur Ottoni Ribeiro 48 Nov 15, 2022