PASSL includes contrastive-learning-based image self-supervised algorithms such as SimCLR, MoCo, BYOL, and CLIP, as well as vision Transformer models such as Vision Transformer, Swin Transformer, BEiT, CvT, T2T, and MLP-Mixer.

Overview

PASSL

Introduction

PASSL is a vision library for state-of-the-art self-supervised learning research, built on PaddlePaddle. PASSL aims to accelerate the research cycle in self-supervised learning: from designing a new self-supervised task to evaluating the learned representations.

  • Reproducible implementations of SOTA self-supervised methods: SimCLR, MoCo (v1), MoCo (v2), MoCo-BYOL, and CLIP are implemented, with BYOL coming soon. Supervised training is also supported.
  • Modular: Easy to build new tasks and reuse the existing components from other tasks (Trainer, models and heads, data transforms, etc.).
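
A minimal sketch of what this modular composition can look like is shown below; the class and argument names are illustrative assumptions for exposition, not the actual PASSL API:

    import paddle.nn as nn

    # Illustrative sketch only: names are assumptions, not the PASSL API.
    # A task is assembled from interchangeable parts (backbone + head),
    # so new tasks can reuse existing components.
    class ContrastiveTask(nn.Layer):
        def __init__(self, backbone: nn.Layer, head: nn.Layer):
            super().__init__()
            self.backbone = backbone  # e.g. a ResNet-50 feature extractor
            self.head = head          # e.g. a projection MLP plus a contrastive loss

        def forward(self, view_q, view_k):
            # Two augmented views of the same batch share the backbone;
            # the head turns the resulting features into a loss.
            return self.head(self.backbone(view_q), self.backbone(view_k))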

Installation

Implemented Models

Benchmark: Linear Image Classification on ImageNet-1K

| Method | epochs | official results | passl results | Backbone | Model |
| --- | --- | --- | --- | --- | --- |
| MoCo | 200 | 60.6 | 60.64 | ResNet-50 | download |
| SimCLR | 100 | 64.5 | 65.3 | ResNet-50 | download |
| MoCo v2 | 200 | 67.7 | 67.72 | ResNet-50 | download |
| MoCo-BYOL | 300 | 71.56 | 72.10 | ResNet-50 | download |
| BYOL | 300 | 72.50 | 71.62 | ResNet-50 | download |
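
The numbers above follow the standard linear-probe protocol: the pretrained backbone is frozen and only a linear classifier is trained on ImageNet-1K labels. A minimal sketch of this idea in Paddle (illustrative only, not the PASSL evaluation code):

    import paddle.nn as nn

    # Linear-probe sketch (illustrative, not the PASSL evaluation code):
    # freeze the pretrained backbone and train only a linear classifier
    # on top of its features.
    def build_linear_probe(backbone: nn.Layer, feat_dim: int = 2048, num_classes: int = 1000):
        for param in backbone.parameters():
            param.stop_gradient = True  # freeze the pretrained weights
        return nn.Sequential(backbone, nn.Linear(feat_dim, num_classes))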

Getting Started

Please see GETTING_STARTED.md for the basic usage of PASSL.

Tutorials

Comments
  • MLP-Mixer: An all-MLP Architecture for Vision

    Are the Top-1 numbers for the two models in the README swapped? The larger model's accuracy is lower than the smaller model's.

    | Arch | Weight | Top-1 Acc | Top-5 Acc | Crop ratio | # Params |
    | -- | -- | -- | -- | -- | -- |
    | mlp_mixer_b16_224 | pretrain 1k | 76.60 | 92.23 | 0.875 | 60.0M |
    | mlp_mixer_l16_224 | pretrain 1k | 72.06 | 87.67 | 0.875 | 208.2M |

    opened by gaorui999 3
  • I am closely following progress on self-supervised image classification

    I would like to ask what the current state of self-supervised learning for image classification is. For example, what accuracy can be reached on a typical binary task such as cat vs. dog classification, and on ImageNet-1K? Which self-supervised algorithms or models for image classification does PASSL provide, and could you give an example of how to use them? So far PASSL has only one issue and I cannot find any documentation. Would it be possible to chat over WeChat or QQ? I care a great deal about self-supervised image classification. One more question: with a self-supervised image classification model, if I provide a set of images, will the model group them into classes after it runs, and do I need to specify the number of classes? In short, I want to understand the usage workflow for self-supervised image classification. Now that the project has reached 1.0, it should be usable. If the model groups the images into N classes after running, how can I judge whether the grouping is correct, and is there an algorithm for that? I have asked many questions; I would really appreciate an answer to each one. Thank you.

    opened by yuwoyizhan 2
  • Unintended behavior in clip_logit_scale

    https://github.com/PaddlePaddle/PASSL/blob/83c49e6a5ba3444cee7f054122559d7759152764/passl/modeling/backbones/clip.py#L317

    Check this issue for reference: https://github.com/PaddlePaddle/Paddle/issues/43710

    Suggested approach (with non-public API)

    # clip() returns a new tensor, so on its own it does not modify the parameter;
    # _share_buffer_to (non-public API) writes the clipped values back into self.logit_scale.
    logit_scale_buffer = self.logit_scale.clip(-4.6, 4.6)
    logit_scale_buffer._share_buffer_to(self.logit_scale)
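
    For comparison, a variant that stays on public Paddle APIs might look like the standalone sketch below; using Parameter.set_value under paddle.no_grad is an assumption for illustration, not necessarily how PASSL will resolve the issue:

    import paddle

    # Standalone illustration: clamp a learnable logit scale in place using
    # only public APIs. clip() returns a new tensor; set_value() writes the
    # result back into the parameter's storage.
    logit_scale = paddle.create_parameter(
        shape=[1], dtype='float32',
        default_initializer=paddle.nn.initializer.Constant(5.0))

    with paddle.no_grad():
        logit_scale.set_value(logit_scale.clip(-4.6, 4.6))

    print(float(logit_scale))  # 4.6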
    
    opened by minogame 1
  • Suggestions

    1. Much of PASSL's text, including the quick-start and other documentation, is in English; it would be great to provide Chinese documentation. 2. I would like to know how far research on self-supervised learning for image classification has progressed, e.g. the accuracy on a binary cat-vs-dog task and on ImageNet, and whether the total number of classes must be provided when using PASSL for image classification. 3. Could we chat over QQ or WeChat? I have some questions. QQ: 1226194560 WeChat: 18820785964

    opened by yuwoyizhan 1
  • fix bug of mixup for DeiT

    DeiT-B/16 pretrained on ImageNet-1K:

    [01/21 02:54:46] passl.engine.trainer INFO: Validate Epoch [290] acc1 (81.336), acc5 (95.544)
    [01/21 03:02:31] passl.engine.trainer INFO: Validate Epoch [291] acc1 (81.328), acc5 (95.580)
    [01/21 03:10:20] passl.engine.trainer INFO: Validate Epoch [292] acc1 (81.390), acc5 (95.608)
    [01/21 03:18:10] passl.engine.trainer INFO: Validate Epoch [293] acc1 (81.484), acc5 (95.636)
    [01/21 03:26:00] passl.engine.trainer INFO: Validate Epoch [294] acc1 (81.452), acc5 (95.600)
    [01/21 03:33:52] passl.engine.trainer INFO: Validate Epoch [295] acc1 (81.354), acc5 (95.528)
    [01/21 03:41:38] passl.engine.trainer INFO: Validate Epoch [296] acc1 (81.338), acc5 (95.562)
    [01/21 03:49:25] passl.engine.trainer INFO: Validate Epoch [297] acc1 (81.344), acc5 (95.542)
    [01/21 03:57:15] passl.engine.trainer INFO: Validate Epoch [298] acc1 (81.476), acc5 (95.550)
    [01/21 04:05:03] passl.engine.trainer INFO: Validate Epoch [299] acc1 (81.476), acc5 (95.572)
    [01/21 04:12:51] passl.engine.trainer INFO: Validate Epoch [300] acc1 (81.386), acc5 (95.536)
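
    For context, mixup blends pairs of images and their soft labels with a single coefficient; a generic sketch of the idea (not the PASSL/DeiT implementation touched by this fix):

    import numpy as np
    import paddle

    # Generic mixup sketch: blend a batch with a shuffled copy of itself and
    # mix the one-hot labels with the same coefficient.
    def mixup_batch(images, one_hot_labels, alpha=0.8):
        lam = float(np.random.beta(alpha, alpha))
        index = paddle.randperm(images.shape[0])
        mixed_images = lam * images + (1 - lam) * paddle.index_select(images, index)
        mixed_labels = lam * one_hot_labels + (1 - lam) * paddle.index_select(one_hot_labels, index)
        return mixed_images, mixed_labels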
    
    opened by GuoxiaWang 1
  • BYOL pretraining seems to use gt_label?

    • In the BYOL config, num_classes=1000 is set: https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/configs/byol/byol_r50_IM.yaml#L34
    • In the model, self.classifier = nn.Linear(embedding_dim, num_classes) is defined, and forward passes classif_out together with label to the head

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/architectures/BYOL.py#L263

    • In the L2 head, the contrastive loss and the supervised CE loss are added together and returned

    https://github.com/PaddlePaddle/PASSL/blob/9d7a9fd4af41772e29120553dddab1c162e4cb70/passl/modeling/heads/l2_head.py#L43
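
    A minimal sketch of the pattern described above (names are illustrative, not the actual PASSL code): the head returns a BYOL-style similarity loss plus a supervised cross-entropy term that consumes the ground-truth label.

    import paddle.nn.functional as F

    # Illustrative sketch of the combined loss the issue points out:
    # a BYOL-style regression loss plus a supervised CE term on gt_label.
    def l2_head_loss(online_pred, target_proj, classif_out, gt_label):
        byol_loss = 2 - 2 * F.cosine_similarity(online_pred, target_proj, axis=-1).mean()
        ce_loss = F.cross_entropy(classif_out, gt_label)  # supervised term using gt_label
        return byol_loss + ce_loss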

    opened by youqingxiaozhua 0
  • [PaddlePaddle Paper Reproduction Challenge (Round 6)] (85) Emerging Properties in Self-Supervised Vision Transformers

    PR types

    New features

    PR changes

    APIs

    Describe

    • Task: https://github.com/PaddlePaddle/Paddle/issues/41482
    • Add passl.model.architectures.dino

    Performance

    | Model | Official | PASSL |
    | ---- | ---- | ---- |
    | DINO | 74.0 | 73.6 |

    • [x] Pretraining and linear probe code
    • [ ] Pretraining and linear probe weights
    • [ ] Documentation
    • [ ] TIPC
    opened by fuqianya 0
Releases(v1.0.0)
  • v1.0.0(Feb 24, 2022)

    • Added the XCiT vision Transformer model; training metrics for the xcit_nano_12_p8_224 distilled model are aligned. Thanks to @BrilliantYuKaimin for the high-quality contribution 🎉 🎉 🎉

    PASSL is the core PaddlePaddle library for self-supervised learning. It provides a large number of high-accuracy self-supervised vision models and vision Transformer models, supports distributed training of very large vision models, aims to improve the modeling efficiency of PaddlePaddle developers in the self-supervised domain, and offers best practices for very large vision models based on PaddlePaddle framework 2.2.

    Source code(tar.gz)
    Source code(zip)