object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

Overview

赛题背景

在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛 (ACM MM2021 Robust Logo Detection)。挑战赛需要参赛者处理小目标检测长尾对象类别对抗性干扰图像。这一挑战集中于现实世界机器学习和多媒体系统中安全问题的最新研究和未来方向。

前言

  • 本方案初赛排名 TOP 6,复赛排名 TOP 2,综合排名 TOP 2
  • 特别感谢阿里安全组织的这场比赛,一方面锻炼了我们的能力,一方面给了我们探索实际业务场景中各种困难的机会;

赛题攻关

检测模型的选取:

我们选择了detectoRS作为我们的检测模型,该模型当前在目标检测coco排行榜排名第七,我们选择了其级联的框架。Cascade_detectoRS网络结构如下:

framework

其中SAC (Switchable Atrous Convolution) 表示可切换的空洞卷积,它可以自适应选择感受野 ,其中 RFP (Recursive Feature Pyramid)采用循环结构来反复利用和精炼提取的特征,“H”表示检测头,我们使用了广泛使用的三级级联结构,“B”表示回归框,“C”表示类别预测结果,“B0”表示RPN网络的输出结果。除此之外,整个网络模型还使用了基于注意力的特征融合机制和类似SENet的全局上下文模块来增强网络的表征能力。显而易见,该网络结构基本集合了常见的检测涨点的tricks。对于backbone的选取,一般而言,越大越复杂的特征提取网络,往往具有较好的性能表现,通过对网络复杂程度与训练耗时tradeoff的多次实验和思考,我们最终选择了ResNet50作为基本的网络,并将网络中标准的3*3卷积换成SAC卷积模块。

小目标处理策略:

对于小目标问题,我们并没有单独地去设计网络模块或者对数据进行小目标增广处理。由于网络模型的backbone是基于ResNet50的,所以我们可以使用较大尺度的输入,比如800,900,1000等。在兼顾小目标检测问题的同时,如何提升检测性能也是很重要的,在实验中我们采用了多尺度训练的机制,训练尺度最大边长为1333,短边尺度范围为800~1100。

长尾分布处理:

对于长尾数据的处理,容易想到的处理方案有重采样和均衡损失这两种。正如前面所介绍的那样,重采样会增加训练的时间,而且采样率也不容易设置,对于大数据集很不友好。均衡损失,往往是通过对类别项概率进行权重矫正实现的。这种方案的缺陷在于忽略了背景候选区域的影响。此外,固定的权重值不一定适合不同训练阶段网络的学习。因此,我们使用了EQ-Loss V2作为检测模型中的分类损失函数,RPN处的分类损失,我们依然使用交叉熵损失函数。EQL_V2 loss的特点是基于梯度引导的,它根据正梯度与负梯度的累积比,分别对正负梯度进行加权矫正。

对抗干扰的处理:

从初赛到复赛,我们都使用了multi-scale testing。尺度范围在训练尺度范围的基础我增加了(1333,1200),这种较大尺度的使用主要作用在于提升小目标检测的recall值以及提升模型鲁棒性。当然我们也尝试了一系列图像数字领域防御的方法,例如 JPEG compression,quantization,denoise 等经典的手段,这些方法都会造成检测性能的下降,其主要的原因在于测试数据集中包含图像数字攻击的对抗样本很少,绝大多数的对抗样本基本都是无限制攻击下产生的扰动,这些扰动的生成和常见的防御机理存在很大程度的不一致性,因此这些方法基本没有任何防御效果。甚至由于防御造成的信息损失,导致最终正常图片性能的下降。 因此,对于对抗干扰的可能有效的处理方法,是模拟当前噪声的生成,通过增广数据来提升模型的鲁棒性,我们尝试加Gaussian噪声、加雨、加雾、图像模糊化等方法,有一定程度的性能提升效果。

检测效果图:

results

一些思考:

通过此次比赛,我们发现其实每个检测模型都可以很鲁棒,这种鲁棒性并不需要特定的网络结构,模型鲁棒性很大程度取决于训练后网络收敛的位置点,不同的收敛点往往具有较大的鲁棒性差别。如何研究出有效的网络训练策略或许会是网络鲁棒性的基本保证! 💥

Paper Link: https://arxiv.org/abs/2108.00422

checkpoint: Link:https://pan.baidu.com/s/1u12MAVgoIke6HobJLfbebg

Password:6j2o

Please cite:

@misc{jia2021effective, title={An Effective and Robust Detector for Logo Detection}, author={Xiaojun Jia and Huanqian Yan and Yonglin Wu and Xingxing Wei and Xiaochun Cao and Yong Zhang}, year={2021}, eprint={2108.00422}, archivePrefix={arXiv}, primaryClass={cs.CV} }

Owner
student
Reimplementation of Dynamic Multi-scale filters for Semantic Segmentation.

Paddle implementation of Dynamic Multi-scale filters for Semantic Segmentation.

Hongqiang.Wang 2 Nov 01, 2021
This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" ([email protected])

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

Wanyu Du 18 Dec 29, 2022
A demo of how to use JAX to create a simple gravity simulation

JAX Gravity This repo contains a demo of how to use JAX to create a simple gravity simulation. It uses JAX's experimental ode package to solve the dif

Cristian Garcia 16 Sep 22, 2022
Source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree.

self-driving-car In this repository I will share the source code of all the projects of Udacity Self-Driving Car Engineer Nanodegree. Hope this might

Andrea Palazzi 2.4k Dec 29, 2022
Official repository for CVPR21 paper "Deep Stable Learning for Out-Of-Distribution Generalization".

StableNet StableNet is a deep stable learning method for out-of-distribution generalization. This is the official repo for CVPR21 paper "Deep Stable L

120 Dec 28, 2022
PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer.

Unsupervised_IEPGAN This is the PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer. Ha

25 Oct 26, 2022
The code uses SegFormer for Semantic Segmentation on Drone Dataset.

SegFormer_Segmentation The code uses SegFormer for Semantic Segmentation on Drone Dataset. The details for the SegFormer can be obtained from the foll

Dr. Sander Ali Khowaja 1 May 08, 2022
SuRE Evaluation: A Supplementary Material

SuRE Evaluation: A Supplementary Material This repository contains supplementary material regarding the evaluations presented in the paper Visual Expl

NYU Visualization Lab 0 Dec 14, 2021
Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak.

DeepCreamPy Decensoring Hentai with Deep Neural Networks. Formerly named DeepMindBreak. A deep learning-based tool to automatically replace censored a

616 Jan 06, 2023
Bayesian optimisation library developped by Huawei Noah's Ark Library

Bayesian Optimisation Research This directory contains official implementations for Bayesian optimisation works developped by Huawei R&D, Noah's Ark L

HUAWEI Noah's Ark Lab 395 Dec 30, 2022
这是一个facenet-pytorch的库,可以用于训练自己的人脸识别模型。

Facenet:人脸识别模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download 预测步骤 How2predict 训练步骤 How2train 参考资料 Reference 性能情况 训练数据

Bubbliiiing 210 Jan 06, 2023
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data

SEDE SEDE (Stack Exchange Data Explorer) is new dataset for Text-to-SQL tasks with more than 12,000 SQL queries and their natural language description

Rupert. 83 Nov 11, 2022
Dense Prediction Transformers

Vision Transformers for Dense Prediction This repository contains code and models for our paper: Vision Transformers for Dense Prediction René Ranftl,

Intel ISL (Intel Intelligent Systems Lab) 1.3k Dec 28, 2022
Subdivision-based Mesh Convolutional Networks

Subdivision-based Mesh Convolutional Networks The official implementation of SubdivNet in our paper, Subdivion-based Mesh Convolutional Networks Requi

Zheng-Ning Liu 181 Dec 28, 2022
This repository builds a basic vision transformer from scratch so that one beginner can understand the theory of vision transformer.

vision-transformer-from-scratch This repository includes several kinds of vision transformers from scratch so that one beginner can understand the the

1 Dec 24, 2021
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

Everything you need to know about NumPy( Creating Arrays, Indexing, Math,Statistics,Reshaping).

1 Feb 14, 2022
Proto-RL: Reinforcement Learning with Prototypical Representations

Proto-RL: Reinforcement Learning with Prototypical Representations This is a PyTorch implementation of Proto-RL from Reinforcement Learning with Proto

Denis Yarats 74 Dec 06, 2022
Migration of Edge-based Distributed Federated Learning

FedFly: Towards Migration in Edge-based Distributed Federated Learning About the research Due to mobility, a device participating in Federated Learnin

qub-blesson 11 Nov 13, 2022