PyTorch implementation of Weak-shot Fine-grained Classification via Similarity Transfer

Overview

SimTrans-Weak-Shot-Classification

This repository contains the official PyTorch implementation of the following paper:

Weak-shot Fine-grained Classification via Similarity Transfer

Junjie Chen, Li Niu, Liu Liu, Liqing Zhang
MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
https://arxiv.org/abs/2009.09197
Accepted by NeurIPS2021.

Abstract

Recognizing fine-grained categories remains a challenging task, due to the subtle distinctions among different subordinate categories, which results in the need of abundant annotated samples. To alleviate the data-hungry problem, we consider the problem of learning novel categories from web data with the support of a clean set of base categories, which is referred to as weak-shot learning. In this setting, we propose to transfer pairwise semantic similarity from base categories to novel categories. Specifically, we firstly train a similarity net on clean data, and then leverage the transferred similarity to denoise web training data using two simple yet effective strategies. In addition, we apply adversarial loss on similarity net to enhance the transferability of similarity. Comprehensive experiments on three fine-grained datasets demonstrate the effectiveness of our setting and method.

1. Setting

In practice, we often have a set of base categories with sufficient well-labeled data, and the problem is how to learn novel categories with less expense, in which base categories and novel categories have no overlap. Such problem motivates zero-shot learning, few-shot learning, as well as our setting. To bridge the gap between base categories and novel categories, zero-shot learning requires category-level semantic representation for all categories, while few-shot learning requires a few clean examples for novel categories. Considering the drawbacks of zero/few-shot learning and the accessibility of free web data, we intend to learn novel categories by virtue of web data with the support of a clean set of base categories.

2. Our Method

Specifically, our framework consists of two training phases. Firstly, we train a similarity net (SimNet) on base training set, which feeds in two images and outputs the semantic similarity. Secondly, we apply the trained SimNet to obtain the semantic similarities among web images. In this way, the similarity is transferred from base categories to novel categories. Based on the transferred similarities, we design two simple yet effective methods to assist in learning the main classifier on novel training set. (1) Sample weighting (i.e., assign small weights to the images dissimilar to others) reduces the impact of outliers (web images with incorrect labels) and thus alleviates the problem of noise overfitting. (2) Graph regularization (i.e., pull close the features of semantically similar samples) prevents the feature space from being disturbed by noisy labels. In addition, we propose to apply adversarial loss on SimNet to make it indistinguishable for base categories and novel categories, so that the transferability of similarity is strengthened.

3. Results

Extensive experiments on three fine-grained datasets have demonstrated the potential of our learning scenario and the effectiveness of our method. For qualitative analysis, on the one hand, the clean images are assigned with high weights, while the images belonging to outlier are assigned with low weights; on the other hand, the transferred similarities accurately portray the semantic relations among web images.

4. Experiment Codebase

4.1 Data

We provide the packages of CUB, Car, FGVC, and WebVision at Baidu Cloud (access code: BCMI).

The original packages are split by split -b 10G ../CUB.zip CUB.zip., thus we need merge by cat CUB.zip.a* > CUB.zip before decompression.

The ImageNet dataset is publicly available, and all data files are configured as:

├── CUB
├── Car
├── Air
├── WebVision
├── ImageNet:
  ├── train
      ├── ……
  ├── val
      ├── ……
  ├── ILSVRC2012_validation_ground_truth.txt
  ├── meta.mat
  ├── train_files.txt

Just employ --data_path ANY_PATH/CUB to specify the data dir.

4.2 Install

See requirement.txt.

4.3 Evaluation

The trained models are released as trained_models.zip at Baidu Cloud (access code: BCMI).

The command in _scripts/DATASET_NAME/eval.sh is used to evaluate the model.

4.4 Training

We provide the full scripts for CUB dataset in _scripts/CUB/ dir as an example.

For other datasets, just change the data path, i.e., --data_path ANY_PATH/WebVision.

Bibtex

If you find this work is useful for your research, please cite our paper using the following BibTeX [pdf] [supp] [arxiv]:

@inproceedings{SimTrans2021,
title={Weak-shot Fine-grained Classification via Similarity Transfer},
author={Chen, Junjie and Niu, Li and Liu, Liu and Zhang, Liqing},
booktitle={NeurIPS},
year={2021}}
Owner
BCMI
Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University.
BCMI
GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

GemNet: Universal Directional Graph Neural Networks for Molecules Reference implementation in PyTorch of the geometric message passing neural network

Data Analytics and Machine Learning Group 124 Dec 30, 2022
Transformer in Computer Vision

Transformer-in-Vision A paper list of some recent Transformer-based CV works. If you find some ignored papers, please open issues or pull requests. **

506 Dec 26, 2022
PyTorch implementation of Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network

hierarchical-multi-label-text-classification-pytorch Hierarchical Multi-label Text Classification: An Attention-based Recurrent Network Approach This

Mingu Kang 17 Dec 13, 2022
Bridging Vision and Language Model

BriVL BriVL (Bridging Vision and Language Model) 是首个中文通用图文多模态大规模预训练模型。BriVL模型在图文检索任务上有着优异的效果,超过了同期其他常见的多模态预训练模型(例如UNITER、CLIP)。 BriVL论文:WenLan: Bridgi

235 Dec 27, 2022
Real time Human Detection Counting

In this python project, we are going to build the Human Detection and Counting System through Webcam or you can give your own video or images. This is a deep learning project on computer vision, whic

Mir Nawaz Ahmad 2 Jun 17, 2022
Ladder Variational Autoencoders (LVAE) in PyTorch

Ladder Variational Autoencoders (LVAE) PyTorch implementation of Ladder Variational Autoencoders (LVAE) [1]: where the variational distributions q at

Andrea Dittadi 63 Dec 22, 2022
A platform for intelligent agent learning based on a 3D open-world FPS game developed by Inspir.AI.

Wilderness Scavenger: 3D Open-World FPS Game AI Challenge This is a platform for intelligent agent learning based on a 3D open-world FPS game develope

46 Nov 24, 2022
A repository with exploration into using transformers to predict DNA ↔ transcription factor binding

Transcription Factor binding predictions with Attention and Transformers A repository with exploration into using transformers to predict DNA ↔ transc

Phil Wang 62 Dec 20, 2022
NeoPlay is the project dedicated to ESport events.

NeoPlay is the project dedicated to ESport events. On this platform users can participate in tournaments with prize pools as well as create their own tournaments.

3 Dec 18, 2021
PyTorch implementation for the Neuro-Symbolic Sudoku Solver leveraging the power of Neural Logic Machines (NLM)

Neuro-Symbolic Sudoku Solver PyTorch implementation for the Neuro-Symbolic Sudoku Solver leveraging the power of Neural Logic Machines (NLM). Please n

Ashutosh Hathidara 60 Dec 10, 2022
Pun Detection and Location

Pun Detection and Location “The Boating Store Had Its Best Sail Ever”: Pronunciation-attentive Contextualized Pun Recognition Yichao Zhou, Jyun-yu Jia

lawson 3 May 13, 2022
TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat

Mandar Sharma 7 Oct 31, 2021
CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable.

CausalNLP CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable. Install pip install -U

Arun S. Maiya 95 Jan 03, 2023
Augmented CLIP - Training simple models to predict CLIP image embeddings from text embeddings, and vice versa.

Train aug_clip against laion400m-embeddings found here: https://laion.ai/laion-400-open-dataset/ - note that this used the base ViT-B/32 CLIP model. S

Peter Baylies 55 Sep 13, 2022
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 EAMLP will come soon Jitto

MenghaoGuo 357 Dec 11, 2022
This project is used for the paper Differentiable Programming of Isometric Tensor Network

This project is used for the paper "Differentiable Programming of Isometric Tensor Network". (arXiv:2110.03898)

Chenhua Geng 15 Dec 13, 2022
Code for PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing

PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing CVPR 2021. Project page: https://kai-46.github.io/

Kai Zhang 141 Dec 14, 2022
Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them

TensorFlow Serving + Streamlit! ✨ 🖼️ Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them! This is a pretty simple S

Álvaro Bartolomé 18 Jan 07, 2023
Skipgram Negative Sampling in PyTorch

PyTorch SGNS Word2Vec's SkipGramNegativeSampling in Python. Yet another but quite general negative sampling loss implemented in PyTorch. It can be use

Jamie J. Seol 287 Dec 14, 2022