CVNets: A library for training computer vision networks

Last update: Jan 03, 2023

This repository contains the source code for training computer vision models. Specifically, it contains the source code of the MobileViT paper for the following tasks:

Image classification on the ImageNet dataset
Object detection using SSD
Semantic segmentation using DeepLabv3

Note: Any image classification backbone can be used with the object detection and semantic segmentation models.

Training can be done with two samplers:

Standard distributed sampler
Multi-scale distributed sampler

We recommend using the multi-scale sampler, as it improves generalization and leads to better performance. See the MobileViT paper for details.
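To make the idea of multi-scale sampling concrete, here is a minimal, framework-free sketch. It is not the CVNets implementation: the function name `multi_scale_batches`, the list of resolutions, and the base batch size are all hypothetical. It assumes that each batch draws one spatial resolution at random and that the batch size is scaled so the number of pixels per batch stays roughly constant. Sharding across distributed workers is left out.

```python
import random
from typing import Iterator, List, Tuple


def multi_scale_batches(
    num_samples: int,
    base_size: Tuple[int, int] = (256, 256),   # hypothetical reference resolution
    base_batch_size: int = 32,                 # hypothetical batch size at base_size
    scales: Tuple[Tuple[int, int], ...] = (
        (160, 160), (192, 192), (256, 256), (288, 288), (320, 320),
    ),                                         # hypothetical candidate resolutions
    seed: int = 0,
) -> Iterator[Tuple[int, int, List[int]]]:
    """Yield (height, width, sample_indices) for one epoch.

    Assumption: the batch size shrinks for large images and grows for
    small ones so that height * width * batch_size stays close to the
    value at the base resolution.
    """
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)

    base_h, base_w = base_size
    start = 0
    while start < num_samples:
        # Pick one resolution for the whole batch.
        h, w = rng.choice(scales)
        # Scale the batch size inversely with the image area.
        batch_size = max(1, (base_h * base_w * base_batch_size) // (h * w))
        yield h, w, indices[start:start + batch_size]
        start += batch_size


if __name__ == "__main__":
    for h, w, batch in multi_scale_batches(num_samples=200):
        print(f"resolution={h}x{w} batch_size={len(batch)}")
```

In a real training loop, each yielded batch would be loaded and resized to the chosen resolution before being passed to the model.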

Installation

CVNets can be installed in a local Python environment using the commands below:

    git clone git@github.com:apple/ml-cvnets.git
    cd ml-cvnets
    pip install -r requirements.txt
    pip install --editable .

We recommend using Python 3.6+ and PyTorch (version >= v1.8.0) in a conda environment. For setting up a Python environment with conda, see here.
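After installation, a quick check like the one below can confirm that the environment meets the recommended versions. This is only a sketch and not part of CVNets: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing assumes PyTorch version strings such as `1.8.0+cu111`.

```python
import re
import sys
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor[.patch]' of a version string."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment(min_torch: str = "1.8.0") -> bool:
    """Check Python >= 3.6 and, if PyTorch is importable, its version."""
    python_ok = sys.version_info >= (3, 6)

    torch_version: Optional[str] = None
    try:
        import torch  # imported lazily so the check runs without PyTorch
        torch_version = torch.__version__
    except ImportError:
        pass

    torch_ok = torch_version is not None and meets_minimum(torch_version, min_torch)
    print(f"Python {sys.version.split()[0]}: {'OK' if python_ok else 'too old'}")
    print(f"PyTorch {torch_version or 'not installed'}: {'OK' if torch_ok else 'check version'}")
    return python_ok and torch_ok


if __name__ == "__main__":
    check_environment()
```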

Getting Started

General instructions for training and evaluating different models are given here.
Examples for training and evaluating a specific model are provided in the examples folder. Right now, we support the following models.
For converting PyTorch models to CoreML, see README-pytorch-to-coreml.md.

Citation

If you find our work useful, please cite the following paper:

@article{mehta2021mobilevit,
  title={MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer},
  author={Mehta, Sachin and Rastegari, Mohammad},
  journal={arXiv preprint arXiv:2110.02178},
  year={2021}
}

Owner

Apple
