Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

Overview

SW-CV-ModelZoo

Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset


Framework: TF/Keras 2.7

Training SQLite DB built using fire-egg's tools: https://github.com/fire-eggs/Danbooru2019

Currently training on Danbooru2021, 512px SFW subset (sans the rating:q images that had been included in the 2022-01-21 release of the dataset)

Reference:

Anonymous, The Danbooru Community, & Gwern Branwen; “Danbooru2021: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset”, 2022-01-21. Web. Accessed 2022-01-28 https://www.gwern.net/Danbooru2021


Journal

06/02/2022: great news crew! TRC allowed me to use a bunch of TPUs!

To make better use of this amount of compute I had to overhaul a number of components, so a bunch of things are likely to have fallen to bitrot in the process. I can only guarantee NFNet can work pretty much as before with the right arguments.
NFResNet changes should have left it retrocompatible with the previous version.
ResNet has been streamlined to be mostly in line with the Bag-of-Tricks paper (arXiv:1812.01187) with the exception of the stem. It is not compatible with the previous version of the code.

The training labels have been included in the 2021_0000_0899 folder for convenience.
The list of files used for training is going to be uploaded as a GitHub Release.

Now for some numbers:
compared to my previous best run, the one that resulted in NFNetL1V1-100-0.57141:

  • I'm using 1.86x the amount of images: 2.8M vs 1.5M
  • I'm training bigger models: 61M vs 45M params
  • ... in less time: 232 vs 700 hours of processor time
  • don't get me started on actual wall clock time
  • with a few amenities thrown in: ECA for channel attention, SiLU activation

And it's all thanks to the folks at TRC, so shout out to them!

I currently have a few runs in progress across a couple of dimensions:

  • effect of model size with NFNet L0/L1/L2, with SiLU and ECA for all three of them
  • effect of activation function with NFNet L0, with SiLU/HSwish/ReLU, no ECA

Once the experiments are over, the plan is to select the network definitions that lay on the Pareto curve between throughput and F1 score and release the trained weights.

One last thing.
I'd like to call your attention to the tools/cleanlab_stuff.py script.
It reads two files: one with the binarized labels from the database, the other with the predicted probabilities.
It then uses the cleanlab package to estimate whether if an image in a set could be missing a given label. At the end it stores its conclusions in a json file.
This file could, potentially, be used in some tool to assist human intervention to add the missing tags.

You might also like...
Human head pose estimation using Keras over TensorFlow.
Human head pose estimation using Keras over TensorFlow.

RealHePoNet: a robust single-stage ConvNet for head pose estimation in the wild.

Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

QKeras: a quantization deep learning library for Tensorflow Keras

QKeras github.com/google/qkeras QKeras 0.8 highlights: Automatic quantization using QKeras; Stochastic behavior (including stochastic rouding) is disa

Hyperparameter Optimization for TensorFlow, Keras and PyTorch
Hyperparameter Optimization for TensorFlow, Keras and PyTorch

Hyperparameter Optimization for Keras Talos • Key Features • Examples • Install • Support • Docs • Issues • License • Download Talos radically changes

MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Deep GPs built on top of TensorFlow/Keras and GPflow

GPflux Documentation | Tutorials | API reference | Slack What does GPflux do? GPflux is a toolbox dedicated to Deep Gaussian processes (DGP), the hier

tf2onnx - Convert TensorFlow, Keras and Tflite models to ONNX.

tf2onnx converts TensorFlow (tf-1.x or tf-2.x), tf.keras and tflite models to ONNX via command line or python api.

Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Releases(models_db2021_5500_2022_10_21)
  • models_db2021_5500_2022_10_21(Oct 21, 2022)

    ConvNext B, ViT B16
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    top 5500 tags (2021_0000_0899_5500/selected_tags.csv)
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_09_25_2022_05h13m55s | B | 93.2M | 448 | 0.3673 | 0.6941 | | ViTB16_09_25_2022_04h53m38s | B16 | 90.5M | 448 | 0.3663 | 0.6918 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_09_25_2022_05h13m55s.7z(322.58 MB)
    ViTB16_09_25_2022_04h53m38s.7z(312.96 MB)
  • convnexts_db2021_2022_03_22(Mar 22, 2022)

    ConvNext, T/S/B
    Trained on Danbooru2021 512px SFW subset, modulos 0000-0899
    alpha to white
    padding to make the image square is white
    channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | ConvNextBV1_03_10_2022_21h41m23s | B | 90.01M | 448 | 0.3372 | 0.6892 | | ConvNextSV1_03_11_2022_17h49m56s | S | 51.28M | 384 | 0.3301 | 0.6798 | | ConvNextTV1_03_05_2022_15h56m42s | T | 29.65M | 320 | 0.3259 | 0.6595 |

    Source code(tar.gz)
    Source code(zip)
    ConvNextBV1_03_10_2022_21h41m23s.7z(311.29 MB)
    ConvNextSV1_03_11_2022_17h49m56s.7z(177.36 MB)
    ConvNextTV1_03_05_2022_15h56m42s.7z(102.96 MB)
  • nfnets_db2021_2022_03_04(Mar 4, 2022)

    NFNet, L0/L1/L2 (based on timm Lx model definitions) Trained on Danbooru2021 512px SFW subset, modulos 0000-0899 alpha to white padding to make the image square is white channel order is BGR, input is 0...255, scaled to -1...1 within the model

    | run_name | definition_name | params_human | image_size | thres | F1 | |:---------------------------------|:------------------|:---------------|-------------:|--------:|-------:| | NFNetL2V1_02_20_2022_10h27m08s | L2 | 60.96M | 448 | 0.3231 | 0.6785 | | NFNetL1V1_02_17_2022_20h18m38s | L1 | 45.65M | 384 | 0.3259 | 0.6691 | | NFNetL0V1_02_10_2022_17h50m14s | L0 | 27.32M | 320 | 0.3190 | 0.6509 |

    Source code(tar.gz)
    Source code(zip)
    NFNetL0V1_02_10_2022_17h50m14s.7z(94.98 MB)
    NFNetL1V1_02_17_2022_20h18m38s.7z(157.97 MB)
    NFNetL2V1_02_20_2022_10h27m08s.7z(210.49 MB)
  • nfnet_tpu_training(Feb 6, 2022)

  • NFNetL1V1-100-0.57141(Dec 31, 2021)

    • NFNet, L1 (based on timm Lx model definitions), 100 epochs, F1 @ 0.4 at the end of the 100th epoch was 0.57141
    • Trained on Danbooru2020 512px SFW subset, modulos 0000-0599
    • 320px per side
    • alpha to white
    • padding to make the image square is white
    • channel order is BGR, scaled to 0-1
    • mixup alpha = 0.2 during epochs 76-100
    • analyze_metrics on Danbooru2020 original set, modulos 0984-0999: {'thres': 0.3485, 'F1': 0.6133, 'F2': 0.6133, 'MCC': 0.6094, 'A': 0.9923, 'R': 0.6133, 'P': 0.6133}
    • analyze_metrics on image IDs 4970000-5000000: {'thres': 0.3148, 'F1': 0.5942, 'F2': 0.5941, 'MCC': 0.5892, 'A': 0.9901, 'R': 0.5940, 'P': 0.5943}
    Source code(tar.gz)
    Source code(zip)
    NFNetL1V1-100-0.57141.7z(158.09 MB)
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks.

The Lottery Ticket Hypothesis for Pre-trained BERT Networks Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks. [NeurIPS

VITA 122 Dec 14, 2022
Run Keras models in the browser, with GPU support using WebGL

**This project is no longer active. Please check out TensorFlow.js.** The Keras.js demos still work but is no longer updated. Run Keras models in the

Leon Chen 4.9k Dec 29, 2022
Official code for On Path Integration of Grid Cells: Group Representation and Isotropic Scaling (NeurIPS 2021)

On Path Integration of Grid Cells: Group Representation and Isotropic Scaling This repo contains the official implementation for the paper On Path Int

Ruiqi Gao 39 Nov 10, 2022
On Generating Extended Summaries of Long Documents

ExtendedSumm This repository contains the implementation details and datasets used in On Generating Extended Summaries of Long Documents paper at the

Georgetown Information Retrieval Lab 76 Sep 05, 2022
Learning Calibrated-Guidance for Object Detection in Aerial Images

Learning Calibrated-Guidance for Object Detection in Aerial Images arxiv We propose a simple yet effective Calibrated-Guidance (CG) scheme to enhance

51 Sep 22, 2022
Framework for abstracting Amiga debuggers and access to AmigaOS libraries and devices.

Framework for abstracting Amiga debuggers. This project provides abstration to control an Amiga remotely using a debugger. The APIs are not yet stable

Roc Vallès 39 Nov 22, 2022
a pytorch implementation of auto-punctuation learned character by character

Learning Auto-Punctuation by Reading Engadget Articles Link to Other of my work 🌟 Deep Learning Notes: A collection of my notes going from basic mult

Ge Yang 137 Nov 09, 2022
ICRA 2021 "Towards Precise and Efficient Image Guided Depth Completion"

PENet: Precise and Efficient Depth Completion This repo is the PyTorch implementation of our paper to appear in ICRA2021 on "Towards Precise and Effic

232 Dec 25, 2022
"Neural Turing Machine" in Tensorflow

Neural Turing Machine in Tensorflow Tensorflow implementation of Neural Turing Machine. This implementation uses an LSTM controller. NTM models with m

Taehoon Kim 1k Dec 06, 2022
An Object Oriented Programming (OOP) interface for Ontology Web language (OWL) ontologies.

Enabling a developer to use Ontology Web Language (OWL) along with its reasoning capabilities in an Object Oriented Programming (OOP) paradigm, by pro

TheEngineRoom-UniGe 7 Sep 23, 2022
Unofficial PyTorch implementation of MobileViT based on paper "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer".

MobileViT RegNet Unofficial PyTorch implementation of MobileViT based on paper MOBILEVIT: LIGHT-WEIGHT, GENERAL-PURPOSE, AND MOBILE-FRIENDLY VISION TR

Hong-Jia Chen 91 Dec 02, 2022
Official Implementation of SWAGAN: A Style-based Wavelet-driven Generative Model

Official Implementation of SWAGAN: A Style-based Wavelet-driven Generative Model SWAGAN: A Style-based Wavelet-driven Generative Model Rinon Gal, Dana

55 Dec 06, 2022
A Small and Easy approach to the BraTS2020 dataset (2D Segmentation)

BraTS2020 A Light & Scalable Solution to BraTS2020 | Medical Brain Tumor Segmentation (2D Segmentation) Developed the segmentation models for segregat

Gunjan Haldar 0 Jan 19, 2022
This repository contains the map content ontology used in narrative cartography

Narrative-cartography-ontology This repository contains the map content ontology used in narrative cartography, which is associated with a submission

Weiming Huang 0 Oct 31, 2021
A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data

Easy-ERA5-Trck Easy-ERA5-Trck Galleries Install Usage Repository Structure Module Files Version iteration Easy-ERA5-Trck is a super lightweight Lagran

Zhenning Li 26 Nov 19, 2022
An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.

PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,

CV Lab @ Yonsei University 87 Dec 30, 2022
Pytorch modules for paralel models with same architecture. Ideal for multi agent-based systems

WideLinears Pytorch parallel Neural Networks A package of pytorch modules for fast paralellization of separate deep neural networks. Ideal for agent-b

1 Dec 17, 2021
Explainable Medical ImageSegmentation via GenerativeAdversarial Networks andLayer-wise Relevance Propagation

MedAI: Transparency in Medical Image Segmentation What is this repo This repo contains the code and experiments that are implemented to contribute in

Awadelrahman M. A. Ahmed 1 Nov 22, 2021
Supervised forecasting of sequential data in Python.

Supervised forecasting of sequential data in Python. Intro Supervised forecasting is the machine learning task of making predictions for sequential da

The Alan Turing Institute 54 Nov 15, 2022