A light and fast one class detection framework for edge devices. We provide face detector, head detector, pedestrian detector, vehicle detector......

Overview

A Light and Fast Face Detector for Edge Devices

Big News: LFD, which is a big update of LFFD, now is released (2021.03.09). It is strongly recommended to use LFD instead !!! Visit LFD Repo here. This repo will not be maintained from now on.

Recent Update

  • 2019.07.25 This repos is first online. Face detection code and trained models are released.
  • 2019.08.15 This repos is formally released. Any advice and error reports are sincerely welcome.
  • 2019.08.22 face_detection: latency evaluation on TX2 is added.
  • 2019.08.25 face_detection: RetinaFace-MobileNet-0.25 is added for comparison (both accuracy and latency).
  • 2019.09.09 LFFD is ported to NCNN (link) and MNN (link) by SyGoing, great thanks to SyGoing.
  • 2019.09.10 face_detection: important bug fix: vibration offset should be subtracted by shift in data iterator. This bug may result in lower accuracy, inaccurate bbox prediction and bbox vibration in test phase. We will upgrade v1 and v2 as soon as possible (should have higher accuracy and more stable).
  • 2019.09.17 face_detection: model v2 is upgraded! After fixing the bug, we have fine-tuned the old v2 model. The accuracy on WIDER FACE is improved significantly! Please try new v2.
  • 2019.09.18 pedestrian_detection: preview version of model v1 for Caltech Pedestrian Dataset is released.
  • 2019.09.23 head_detection: model v1 for brainwash dataset is released.
  • 2019.10.02 license_plate_detection: model v1 for CCPD dataset is released. (The accuracy is very high and the latency is very short! Have a try.)
  • 2019.10.02 Currently, we have provided some application-oriented detectors. Subsequently, we will put most energy to next generation framework for single-class detection. Any feedback is welcome.
  • 2019.10.16 face_detection: the preview of PyTorch version is ready (link). Any feedback is welcome.
  • 2019.10.16 Tips: data preparation is important, irrational values of (x,y,w,h) may introduce nan in training; we trained models with convs followed by BNs. But we found that the convergence is not stable, and can not reach a good point.
  • 2019.11.08 face_detection: caffe version of LFFD is provided by vicwer (great thanks). Guys who are familiar with caffe can navigate to /face_detection/caffemodel for details.
  • 2020.03.27 license_plate_detection: model v1_small for CCPD dataset is released. v1_small has much less parameters than v1, hence it is much faster. The AP of v1_small is 0.982 (vs v1-0.989). Please check README.md. Besides, a commercial-ready license plate recognition repo which adopted LFFD as the detector is hightly recommended!

Introduction

This repo releases the source code of paper "LFFD: A Light and Fast Face Detector for Edge Devices". Our paper presents a light and fast face detector (LFFD) for edge devices. LFFD considerably balances both accuracy and latency, resulting in small model size, fast inference speed while achieving excellent accuracy. Understanding the essence of receptive field makes detection networks interpretable.

In practical, we have deployed it in cloud and edge devices (like NVIDIA Jetson series and ARM-based embedding system). The comprehensive performance of LFFD is robust enough to support our applications.

In fact, our method is a general detection framework that applicable to one class detection, such as face detection, pedestrian detection, head detection, vehicle detection and so on. In general, an object class, whose average ratio of the longer side and the shorter side is less than 5, is appropriate to apply our framework for detection.

Several practical advantages:

  1. large scale coverage, and easy to extend to larger scales by adding more layers without much latency gain.
  2. detect small objects (as small as 10 pixels) in images with extremely large resolution (8K or even larger) in only one inference.
  3. easy backbone with very common operators makes it easy to deploy anywhere.

Accuracy and Latency

We train LFFD on train set of WIDER FACE benchmark. All methods are evaluated on val/test sets under the SIO schema (please refer to the paper for details).

  • Accuracy on val set of WIDER FACE (The values in () are results from the original papers):
Method Easy Set Medium Set Hard Set
DSFD 0.949(0.966) 0.936(0.957) 0.850(0.904)
PyramidBox 0.937(0.961) 0.927(0.950) 0.867(0.889)
S3FD 0.923(0.937) 0.907(0.924) 0.822(0.852)
SSH 0.921(0.931) 0.907(0.921) 0.702(0.845)
FaceBoxes 0.840 0.766 0.395
FaceBoxes3.2× 0.798 0.802 0.715
LFFD 0.910 0.881 0.780
  • Accuracy on test set of WIDER FACE (The values in () are results from the original papers):
Method Easy Set Medium Set Hard Set
DSFD 0.947(0.960) 0.934(0.953) 0.845(0.900)
PyramidBox 0.926(0.956) 0.920(0.946) 0.862(0.887)
S3FD 0.917(0.928) 0.904(0.913) 0.821(0.840)
SSH 0.919(0.927) 0.903(0.915) 0.705(0.844)
FaceBoxes 0.839 0.763 0.396
FaceBoxes3.2× 0.791 0.794 0.715
LFFD 0.896 0.865 0.770
  • Accuracy on FDDB:
Method Disc ROC curves score
DFSD 0.984
PyramidBox 0.982
S3FD 0.981
SSH 0.977
FaceBoxes3.2× 0.905
FaceBoxes 0.960
LFFD 0.973

In the paper, three hardware platforms are used for latency evaluation: NVIDIA GTX TITAN Xp, NVIDIA TX2 and Rasberry Pi 3 Model B+ (ARM A53).

We report the latency of inference only (for NVIDIA hardwares, data transfer is included), excluding pre-processing and post-processing. The batchsize is set to 1 for all evaluations.

  • Latency on NVIDIA GTX TITAN Xp (MXNet+CUDA 9.0+CUDNN7.1):
Resolution-> 640×480 1280×720 1920×1080 3840×2160
DSFD 78.08ms(12.81 FPS) 187.78ms(5.33 FPS) 392.82ms(2.55 FPS) 1562.50ms(0.64 FPS)
PyramidBox 50.51ms(19.08 FPS) 143.34ms(6.98 FPS) 331.93ms(3.01 FPS) 1344.07ms(0.74 FPS)
S3FD 21.75ms(45.95 FPS) 55.73ms(17.94 FPS) 119.53ms(8.37 FPS) 471.31ms(2.21 FPS)
SSH 22.44ms(44.47 FPS) 55.29ms(18.09 FPS) 118.43ms(8.44 FPS) 463.10ms(2.16 FPS)
FaceBoxes3.2× 6.80ms(147.00 FPS) 12.96ms(77.19 FPS) 25.37ms(39.41 FPS) 111.98ms(8.93 FPS)
LFFD 7.60ms(131.40 FPS) 16.37ms(61.07 FPS) 31.27ms(31.98 FPS) 87.79ms(11.39 FPS)
  • Latency on NVIDIA TX2 (MXNet+CUDA 9.0+CUDNN7.1) presented in the paper:
Resolution-> 160×120 320×240 640×480
FaceBoxes3.2× 11.20ms(89.29 FPS) 19.62ms(50.97 FPS) 72.74ms(13.75 FPS)
LFFD 7.30ms(136.99 FPS) 19.64ms(50.92 FPS) 64.70ms(15.46 FPS)
  • Latency on Respberry Pi 3 Model B+ (ncnn) presented in the paper:
Resolution-> 160×120 320×240 640×480
FaceBoxes3.2× 167.20ms(5.98 FPS) 686.19ms(1.46 FPS) 3232.26ms(0.31 FPS)
LFFD 118.45ms(8.44 FPS) 409.19ms(2.44 FPS) 4114.15ms(0.24 FPS)

On NVIDIA platform, TensorRT is the best choice for inference. So we conduct additional latency evaluations using TensorRT (the latency is dramatically decreased!!!). As for ARM based platform, we plan to use MNN and Tengine for latency evaluation. Details can be found in the sub-project face_detection.

Getting Started

We implement the proposed method using MXNet Module API.

Prerequirements (global)

  • Python>=3.5
  • numpy>=1.16 (lower versions should work as well, but not tested)
  • MXNet>=1.4.1 (install guide)
  • cv2=3.x (pip3 install opencv-python==3.4.5.20, other version should work as well, but not tested)

Tips:

  • use MXNet with cudnn.
  • build numpy from source with OpenBLAS. This will improve the training efficiency.
  • make sure cv2 links to libjpeg-turbo, not libjpeg. This will improve the jpeg decode efficiency.

Sub-directory description

  • face_detection contains the code of training, evaluation and inference for LFFD, the main content of this repo. The trained models of different versions are provided for off-the-shelf deployment.
  • head_detection contains the trained models for head detection. The models are obtained by the proposed general one class detection framework.
  • pedestrian_detection contains the trained models for pedestrian detection. The models are obtained by the proposed general one class detection framework.
  • vehicle_detection contains the trained models for vehicle detection. The models are obtained by the proposed general one class detection framework.
  • ChasingTrainFramework_GeneralOneClassDetection is a simple wrapper based on MXNet Module API for general one class detection.

Installation

  1. Download the repo:
git clone https://github.com/YonghaoHe/A-Light-and-Fast-Face-Detector-for-Edge-Devices.git
  1. Refer to the corresponding sub-project for detailed usage.

Citation

If you benefit from our work in your research and product, please kindly cite the paper

@inproceedings{LFFD,
title={LFFD: A Light and Fast Face Detector for Edge Devices},
author={He, Yonghao and Xu, Dezhong and Wu, Lifang and Jian, Meng and Xiang, Shiming and Pan, Chunhong},
booktitle={arXiv:1904.10633},
year={2019}
}

To Do List

Contact

Yonghao He

E-mails: [email protected] / [email protected]

If you are interested in this work, any innovative contributions are welcome!!!

Internship is open at NLPR, CASIA all the time. Send me your resumes!

Owner
YonghaoHe
Assistant Professor
YonghaoHe
Pytorch implementation of paper: "NeurMiPs: Neural Mixture of Planar Experts for View Synthesis"

NeurMips: Neural Mixture of Planar Experts for View Synthesis This is the official repo for PyTorch implementation of paper "NeurMips: Neural Mixture

James Lin 101 Dec 13, 2022
Library of various Few-Shot Learning frameworks for text classification

FewShotText This repository contains code for the paper A Neural Few-Shot Text Classification Reality Check Environment setup # Create environment pyt

Thomas Dopierre 47 Jan 03, 2023
Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

jBrain Software associated with the AAAI 2022 paper Francesco D'Amore, Daniel Mitropolsky, Pierluigi Crescenzi, Emanuele Natale, Christos H. Papadimit

Pierluigi Crescenzi 1 Apr 10, 2022
Self-Supervised Learning for Domain Adaptation on Point-Clouds

Self-Supervised Learning for Domain Adaptation on Point-Clouds Introduction Self-supervised learning (SSL) allows to learn useful representations from

Idan Achituve 66 Dec 20, 2022
Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources.

Illumination_Decomposition Code for TIP 2017 paper --- Illumination Decomposition for Photograph with Multiple Light Sources. This code implements the

QAY 7 Nov 15, 2020
天勤量化开发包, 期货量化, 实时行情/历史数据/实盘交易

TqSdk 天勤量化交易策略程序开发包 TqSdk 是一个由信易科技发起并贡献主要代码的开源 python 库. 依托快期多年积累成熟的交易及行情服务器体系, TqSdk 支持用户使用极少的代码量构建各种类型的量化交易策略程序, 并提供包含期货、期权、股票的 历史数据-实时数据-开发调试-策略回测-

信易科技 2.8k Dec 30, 2022
A pytorch implementation of faster RCNN detection framework (Use detectron2, it's a masterpiece)

Notice(2019.11.2) This repo was built back two years ago when there were no pytorch detection implementation that can achieve reasonable performance.

Ruotian(RT) Luo 1.8k Jan 01, 2023
Tensorboard for pytorch (and chainer, mxnet, numpy, ...)

tensorboardX Write TensorBoard events with simple function call. The current release (v2.3) is tested on anaconda3, with PyTorch 1.8.1 / torchvision 0

Tzu-Wei Huang 7.5k Dec 28, 2022
Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capability)

Protein GLM (wip) Implementation of a protein autoregressive language model, but with autoregressive infilling objective (editing subsequences capabil

Phil Wang 17 May 06, 2022
A transformer model to predict pathogenic mutations

MutFormer MutFormer is an application of the BERT (Bidirectional Encoder Representations from Transformers) NLP (Natural Language Processing) model wi

Wang Genomics Lab 2 Nov 29, 2022
GUPNet - Geometry Uncertainty Projection Network for Monocular 3D Object Detection

GUPNet This is the official implementation of "Geometry Uncertainty Projection Network for Monocular 3D Object Detection". citation If you find our wo

Yan Lu 103 Dec 28, 2022
This repository contains the implementation of the paper: "Towards Frequency-Based Explanation for Robust CNN"

RobustFreqCNN About This repository contains the implementation of the paper "Towards Frequency-Based Explanation for Robust CNN" arxiv. It primarly d

Sarosij Bose 2 Jan 23, 2022
Curated list of awesome GAN applications and demo

gans-awesome-applications Curated list of awesome GAN applications and demonstrations. Note: General GAN papers targeting simple image generation such

Minchul Shin 4.5k Jan 07, 2023
Pytorch Implementation for NeurIPS (oral) paper: Pixel Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation

Pixel-Level Cycle Association This is the Pytorch implementation of our NeurIPS 2020 Oral paper Pixel-Level Cycle Association: A New Perspective for D

87 Oct 19, 2022
Old Photo Restoration (Official PyTorch Implementation)

Bringing Old Photo Back to Life (CVPR 2020 oral)

Microsoft 11.3k Dec 30, 2022
基于AlphaPose的TensorRT加速

1. Requirements CUDA 11.1 TensorRT 7.2.2 Python 3.8.5 Cython PyTorch 1.8.1 torchvision 0.9.1 numpy 1.17.4 (numpy版本过高会出报错 this issue ) python-package s

52 Dec 06, 2022
The official PyTorch code implementation of "Personalized Trajectory Prediction via Distribution Discrimination" in ICCV 2021.

Personalized Trajectory Prediction via Distribution Discrimination (DisDis) The official PyTorch code implementation of "Personalized Trajectory Predi

25 Dec 20, 2022
Object recognition using Azure Custom Vision AI and Azure Functions

Step by Step on how to create an object recognition model using Custom Vision, export the model and run the model in an Azure Function

El Bruno 11 Jul 08, 2022
Accurate identification of bacteriophages from metagenomic data using Transformer

PhaMer is a python library for identifying bacteriophages from metagenomic data. PhaMer is based on a Transorfer model and rely on protein-based vocab

Kenneth Shang 9 Nov 30, 2022
Convert Mission Planner (ArduCopter) Waypoint Missions to Litchi CSV Format to execute on DJI Drones

Mission Planner to Litchi Convert Mission Planner (ArduCopter) Waypoint Surveys to Litchi CSV Format to execute on DJI Drones Litchi doesn't support S

Yaros 24 Dec 09, 2022