SW-CV-ModelZoo

Repo for my TensorFlow/Keras CV experiments, mostly revolving around the Danbooru20xx dataset.

Overview


Framework: TF/Keras 2.7

Training SQLite DB built using fire-eggs' tools: https://github.com/fire-eggs/Danbooru2019

Currently training on Danbooru2021, 512px SFW subset (sans the rating:q images that had been included in the 2022-01-21 release of the dataset)

Reference:

Anonymous, The Danbooru Community, & Gwern Branwen; “Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, 2022-01-21. Web. Accessed 2022-01-28 https://www.gwern.net/Danbooru2021


Journal

06/02/2022: great news crew! TRC allowed me to use a bunch of TPUs!

To make better use of this amount of compute I had to overhaul a number of components, so some things have likely fallen to bitrot in the process. I can only guarantee that NFNet works pretty much as before with the right arguments.
The NFResNet changes should have left it backwards compatible with the previous version.
ResNet has been streamlined to be mostly in line with the Bag-of-Tricks paper (arXiv:1812.01187), with the exception of the stem. It is not compatible with the previous version of the code.

The training labels have been included in the 2021_0000_0899 folder for convenience.
The list of files used for training is going to be uploaded as a GitHub Release.

Now for some numbers:
Compared to my previous best run, the one that resulted in NFNetL1V1-100-0.57141:

  • I'm using 1.86x the number of images: 2.8M vs 1.5M
  • I'm training bigger models: 61M vs 45M params
  • ... in less time: 232 vs 700 hours of processor time
  • don't get me started on actual wall clock time
  • with a few amenities thrown in: ECA for channel attention, SiLU activation

And it's all thanks to the folks at TRC, so shout out to them!

I currently have a few runs in progress across a couple of dimensions:

  • effect of model size with NFNet L0/L1/L2, with SiLU and ECA for all three of them
  • effect of activation function with NFNet L0, with SiLU/HSwish/ReLU, no ECA

Once the experiments are over, the plan is to select the network definitions that lie on the Pareto curve between throughput and F1 score and release the trained weights.
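For reference, picking the Pareto-optimal runs from a table of (throughput, F1) measurements is a small exercise; the sketch below shows the idea with placeholder run names and numbers, not actual results from these experiments.

```python
# Minimal sketch: keep only the runs that no other run beats on both
# throughput (img/s, higher is better) and F1 (higher is better).
# The entries below are placeholders, not real measurements.
runs = [
    {"name": "run_a", "throughput": 900.0, "f1": 0.650},
    {"name": "run_b", "throughput": 500.0, "f1": 0.670},
    {"name": "run_c", "throughput": 450.0, "f1": 0.660},  # dominated by run_b
]

def pareto_front(runs):
    front = []
    for r in runs:
        dominated = any(
            o is not r and o["throughput"] >= r["throughput"] and o["f1"] >= r["f1"]
            for o in runs
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r["throughput"], reverse=True)

print([r["name"] for r in pareto_front(runs)])  # ['run_a', 'run_b']
```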

One last thing.
I'd like to call your attention to the tools/cleanlab_stuff.py script.
It reads two files: one with the binarized labels from the database, the other with the predicted probabilities.
It then uses the cleanlab package to estimate whether an image in the set could be missing a given label, and stores its conclusions in a JSON file.
That file could, potentially, be used by some tool to assist a human in adding the missing tags.
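For the curious, the gist of the script is roughly the sketch below: each tag is treated as an independent binary problem, and cleanlab flags images whose current 0 label disagrees with the model's confidence. The file names, array shapes and the cleanlab 2.x call are assumptions for illustration, not a copy of the actual script.

```python
# Rough sketch of the idea behind tools/cleanlab_stuff.py (assumed file names
# and cleanlab 2.x API; the real script may differ).
import json

import numpy as np
from cleanlab.filter import find_label_issues

labels = np.load("binarized_labels.npy").astype(int)  # (n_images, n_tags), 0/1 from the DB
probs = np.load("predicted_probs.npy")                # (n_images, n_tags), model outputs

suspected_missing = {}
for tag_idx in range(labels.shape[1]):
    y = labels[:, tag_idx]
    p = probs[:, tag_idx]
    pred_probs = np.stack([1.0 - p, p], axis=1)       # binary class probabilities
    issues = find_label_issues(labels=y, pred_probs=pred_probs)
    # Candidates for a missing tag: currently labeled 0, but flagged by cleanlab.
    missing = np.where(issues & (y == 0))[0]
    if missing.size:
        suspected_missing[str(tag_idx)] = missing.tolist()

with open("suspected_missing_tags.json", "w") as f:
    json.dump(suspected_missing, f)
```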

Releases
  • models_db2021_5500_2022_10_21(Oct 21, 2022)

    ConvNext B, ViT B16
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    top 5500 tags (2021_0000_0899_5500/selected_tags.csv)
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model (see the preprocessing sketch after the release list)

    | run_name                         | definition_name | params_human | image_size |  thres |     F1 |
    |:---------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | ConvNextBV1_09_25_2022_05h13m55s | B               | 93.2M        |        448 | 0.3673 | 0.6941 |
    | ViTB16_09_25_2022_04h53m38s      | B16             | 90.5M        |        448 | 0.3663 | 0.6918 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_09_25_2022_05h13m55s.7z(322.58 MB)
    ViTB16_09_25_2022_04h53m38s.7z(312.96 MB)
  • convnexts_db2021_2022_03_22(Mar 22, 2022)

    ConvNext, T/S/B
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name                         | definition_name | params_human | image_size |  thres |     F1 |
    |:---------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | ConvNextBV1_03_10_2022_21h41m23s | B               | 90.01M       |        448 | 0.3372 | 0.6892 |
    | ConvNextSV1_03_11_2022_17h49m56s | S               | 51.28M       |        384 | 0.3301 | 0.6798 |
    | ConvNextTV1_03_05_2022_15h56m42s | T               | 29.65M       |        320 | 0.3259 | 0.6595 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_03_10_2022_21h41m23s.7z(311.29 MB)
    ConvNextSV1_03_11_2022_17h49m56s.7z(177.36 MB)
    ConvNextTV1_03_05_2022_15h56m42s.7z(102.96 MB)
  • nfnets_db2021_2022_03_04(Mar 4, 2022)

    NFNet, L0/L1/L2 (based on timm Lx model definitions)
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name                         | definition_name | params_human | image_size |  thres |     F1 |
    |:---------------------------------|:----------------|:-------------|-----------:|-------:|-------:|
    | NFNetL2V1_02_20_2022_10h27m08s   | L2              | 60.96M       |        448 | 0.3231 | 0.6785 |
    | NFNetL1V1_02_17_2022_20h18m38s   | L1              | 45.65M       |        384 | 0.3259 | 0.6691 |
    | NFNetL0V1_02_10_2022_17h50m14s   | L0              | 27.32M       |        320 | 0.3190 | 0.6509 |

    Source code(tar.gz)
    Source code(zip)
    NFNetL0V1_02_10_2022_17h50m14s.7z(94.98 MB)
    NFNetL1V1_02_17_2022_20h18m38s.7z(157.97 MB)
    NFNetL2V1_02_20_2022_10h27m08s.7z(210.49 MB)
  • nfnet_tpu_training(Feb 6, 2022)

  • NFNetL1V1-100-0.57141(Dec 31, 2021)

    • NFNet, L1 (based on timm Lx model definitions), 100 epochs, F1 @ 0.4 at the end of the 100th epoch was 0.57141
    • Trained on Danbooru2020 512px SFW subset, modulos 0000-0599
    • 320px per side
    • alpha to white
    • padding to make the image square is white
    • channel order is BGR, scaled to 0-1
    • mixup alpha = 0.2 during epochs 76-100
    • analyze_metrics on Danbooru2020 original set, modulos 0984-0999: {'thres': 0.3485, 'F1': 0.6133, 'F2': 0.6133, 'MCC': 0.6094, 'A': 0.9923, 'R': 0.6133, 'P': 0.6133}
    • analyze_metrics on image IDs 4970000-5000000: {'thres': 0.3148, 'F1': 0.5942, 'F2': 0.5941, 'MCC': 0.5892, 'A': 0.9901, 'R': 0.5940, 'P': 0.5943}
    Source code(tar.gz)
    Source code(zip)
    NFNetL1V1-100-0.57141.7z(158.09 MB)
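
The preprocessing shared by the release notes above (alpha flattened onto white, white padding to a square, BGR channel order, raw 0...255 input rescaled to -1...1 inside the model) translates roughly to the sketch below. The model path, the saved-model format and the example image are assumptions for illustration; the 0.3673 threshold is the one listed for ConvNextBV1_09_25_2022_05h13m55s.

```python
# Minimal inference sketch for the released models (assumed paths and format).
import numpy as np
import tensorflow as tf
from PIL import Image

def load_image(path, target_size=448):
    img = Image.open(path).convert("RGBA")
    # Flatten alpha onto a white background.
    white = Image.new("RGBA", img.size, (255, 255, 255, 255))
    img = Image.alpha_composite(white, img).convert("RGB")
    # Pad to a white square, then resize to the model's input size.
    side = max(img.size)
    canvas = Image.new("RGB", (side, side), (255, 255, 255))
    canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
    img = canvas.resize((target_size, target_size), Image.BICUBIC)
    # RGB -> BGR; keep raw 0...255 values, the model rescales to -1...1 internally.
    arr = np.asarray(img, dtype=np.float32)[:, :, ::-1]
    return arr[None, ...]  # add batch dimension

model = tf.keras.models.load_model("ConvNextBV1_09_25_2022_05h13m55s")  # assumed SavedModel dir
probs = model.predict(load_image("example.jpg"))[0]
predicted = np.where(probs >= 0.3673)[0]  # indices into 2021_0000_0899_5500/selected_tags.csv
print(predicted)
```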