Implements Gradient Centralization and allows it to use as a Python package in TensorFlow

Overview

Gradient Centralization TensorFlow Twitter

PyPI Upload Python Package Flake8 Lint Python Version

Binder Open In Colab

GitHub license PEP8 GitHub stars GitHub forks GitHub watchers

This Python package implements Gradient Centralization in TensorFlow, a simple and effective optimization technique for Deep Neural Networks as suggested by Yong et al. in the paper Gradient Centralization: A New Optimization Technique for Deep Neural Networks. It can both speedup training process and improve the final generalization performance of DNNs.

Installation

Run the following to install:

pip install gradient-centralization-tf

Usage

gctf.centralized_gradients_for_optimizer

Create a centralized gradients functions for a specified optimizer.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.

Example:

>>> opt = tf.keras.optimizers.Adam(learning_rate=0.1)
>>> optimizer.get_gradients = gctf.centralized_gradients_for_optimizer(opt)
>>> model.compile(optimizer = opt, ...)

gctf.get_centralized_gradients

Computes the centralized gradients.

This function is ideally not meant to be used directly unless you are building a custom optimizer, in which case you could point get_gradients to this function. This is a modified version of tf.keras.optimizers.Optimizer.get_gradients.

Arguments:

  • optimizer: a tf.keras.optimizers.Optimizer object. The optimizer you are using.
  • loss: Scalar tensor to minimize.
  • params: List of variables.

Returns:

A gradients tensor.

gctf.optimizers

Pre built updated optimizers implementing GC.

This module is speciially built for testing out GC and in most cases you would be using gctf.centralized_gradients_for_optimizer though this module implements gctf.centralized_gradients_for_optimizer. You can directly use all optimizers with tf.keras.optimizers updated for GC.

Example:

>>> model.compile(optimizer = gctf.optimizers.adam(learning_rate = 0.01), ...)
>>> model.compile(optimizer = gctf.optimizers.rmsprop(learning_rate = 0.01, rho = 0.91), ...)
>>> model.compile(optimizer = gctf.optimizers.sgd(), ...)

Returns:

A tf.keras.optimizers.Optimizer object.

Developing gctf

To install gradient-centralization-tf, along with tools you need to develop and test, run the following in your virtualenv:

git clone [email protected]:Rishit-dagli/Gradient-Centralization-TensorFlow
# or clone your own fork

pip install -e .[dev]

License

Copyright 2020 Rishit Dagli

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Comments
  • On windows Tensorflow 2.5 it gives error

    On windows Tensorflow 2.5 it gives error

    On windows 10 with miniconda enviroment tensorflow 2.5 gives error on centralized_gradients.py file.

    the solution is change import keras.backend as K with import tensorflow.keras.backend as K

    bug 
    opened by mgezer 5
  • The results in the mnist example are wrong/misleading

    The results in the mnist example are wrong/misleading

    Describe the bug The results in your colab ipython notebook are misleading: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    In this example, the model is first trained with a normal Adam optimizer:

    model.compile(optimizer = tf.keras.optimizers.Adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics = ['accuracy'])
    
    history_no_gctf = model.fit(training_images, training_labels, epochs=5, callbacks = [time_callback_no_gctf])
    

    And afterwards the same model is recompiled with the gctf.optimizers.adam(). However, recompiling a keras model does not reset the weights. This means that in the first fit call the model is trained and then in the second fit call with the new optimizer the same model is used and of course then the results are better.

    This can be fixed, by recreating the model for the second run, by just adding these few lines:

    import gctf #import gctf
    
    time_callback_gctf = TimeHistory()
    
    # Model architecture
    model = tf.keras.models.Sequential([
                                        tf.keras.layers.Flatten(), 
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(256, activation=tf.nn.relu),
                                        tf.keras.layers.Dense(64, activation=tf.nn.relu), 
                                        tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
    
    model.compile(optimizer = gctf.optimizers.adam(),
                  loss = 'sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    history_gctf = model.fit(training_images, training_labels, epochs=5, callbacks=[time_callback_gctf])
    

    However, then the results are not better than without gctf:

    Type                   Execution time    Accuracy      Loss
    -------------------  ----------------  ----------  --------
    Model without gctf:           24.7659    0.88825   0.305801
    Model with gctf               24.7881    0.889567  0.30812
    

    Could you please clarify what happens here. I tried this gctf.optimizers.adam() optimizer in my own research and it didn't change the results at all and now after seeing it doesn't work in the example which was constructed here. Makes me question the results of this paper.

    To Reproduce Execute the colab file given in the repository: https://colab.research.google.com/github/Rishit-dagli/Gradient-Centralization-TensorFlow/blob/main/examples/gctf_mnist.ipynb

    Expected behavior The right comparison would be if both models start from a random initialization, not that the second model can start with the already pre-trained weights.

    Looking forward to a fast a swift explanation.

    Best, Max

    question 
    opened by themasterlink 2
  • Wider dependency requirements

    Wider dependency requirements

    The package as of now to be installed requires tensorflow ~= 2.4.0 and keras ~= 2.4.0. It turns out that this is sometimes problematic for folks who have custom installations of TensorFlow and a winder requirement could be set up.

    enhancement 
    opened by Rishit-dagli 1
  • Release 0.0.3

    Release 0.0.3

    This release includes some fixes and improvements

    ✅ Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14 )
    • Fixed incorrect usage example in docstrings and description for centralized_gradients_for_optimizer (#13 )
    • Add clear aims for each of the examples of using gctf (#15 )
    • Updates PyPi classifiers to clearly show the aims of this project. This should have no changes in the way you use this package (#18 )
    • Add clear instructions for using this with custom optimizers i.e. directly use get_centralized_gradients however a complete example has not been pushed due to the reasons mentioned in the issue (#16 )
    opened by Rishit-dagli 0
  • Add an

    Add an "About The Examples" section

    Add an "About The Examples" section which contains a summary of the usage example notebooks and links to run it on Binder and Colab.


    Close #15

    opened by Rishit-dagli 0
  • Update relevant pypi classifiers

    Update relevant pypi classifiers

    Add PyPI classifiers for:

    • Development status
    • Intended Audience
    • Topic

    Further also added the Programming Language :: Python :: 3 :: Only classifer


    Closes #18

    opened by Rishit-dagli 0
  • Update pypi classifiers

    Update pypi classifiers

    I am specifically thinking of adding three more categories of pypi classifiers:

    • Development status
    • Intended Audience
    • Topic

    Apart from this I also think it would be great to add the Programming Language :: Python :: 3 :: Only to make sure the audience to know that this package is intended for Python 3 only.

    opened by Rishit-dagli 0
  • Add an

    Add an "About the examples" section

    It would be great to write an "About the example" section which could demonstrate in short what the example notebooks aim to achieve and show.

    documentation 
    opened by Rishit-dagli 0
  • Error in usage example for gctf.centralized_gradients_for_optimizer

    Error in usage example for gctf.centralized_gradients_for_optimizer

    I noticed that the docstrings for gctf.centralized_gradients_for_optimizer have an error in the example usage section. The example creates an Adam optimizer instance and saves it to opt however the centralized_gradients_for_optimizer is applied on optimizer which ideally does not exist and running the example would result in an error.

    documentation 
    opened by Rishit-dagli 0
  • [ImgBot] Optimize images

    [ImgBot] Optimize images

    opened by imgbot[bot] 0
  • [ImgBot] Optimize images

    [ImgBot] Optimize images

    opened by imgbot[bot] 0
Releases(v0.0.3)
  • v0.0.3(Mar 11, 2021)

    This release includes some fixes and improvements

    ✅ Bug Fixes / Improvements

    • Allow wider versions for TensorFlow and Keras while installing the package (#14 )
    • Fixed incorrect usage example in docstrings and description for centralized_gradients_for_optimizer (#13 )
    • Add clear aims for each of the examples of using gctf (#15 )
    • Updates PyPi classifiers to clearly show the aims of this project. This should have no changes in the way you use this package (#18 )
    • Add clear instructions for using this with custom optimizers i.e. directly use get_centralized_gradients however a complete example has not been pushed due to the reasons mentioned in the issue (#16 )
    Source code(tar.gz)
    Source code(zip)
  • v0.0.2(Feb 21, 2021)

    This release includes some fixes and improvements

    ✅ Bug Fixes / Improvements

    • Fix the issue of supporting multiple modules
    • Fix multiple typos.
    Source code(tar.gz)
    Source code(zip)
  • v0.0.1(Feb 20, 2021)

Owner
Rishit Dagli
High School, Ted-X, Ted-Ed speaker|Mentor, TFUG Mumbai|International Speaker|Microsoft Student Ambassador|#ExploreML Facilitator
Rishit Dagli
Deformable DETR is an efficient and fast-converging end-to-end object detector.

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

2k Jan 05, 2023
[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

Yi Wei 369 Dec 24, 2022
A general-purpose encoder-decoder framework for Tensorflow

READ THE DOCUMENTATION CONTRIBUTING A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summariz

Google 5.5k Jan 07, 2023
COD-Rank-Localize-and-Segment (CVPR2021)

COD-Rank-Localize-and-Segment (CVPR2021) Simultaneously Localize, Segment and Rank the Camouflaged Objects Full camouflage fixation training dataset i

JingZhang 52 Dec 20, 2022
This repository contains part of the code used to make the images visible in the article "How does an AI Imagine the Universe?" published on Towards Data Science.

Generative Adversarial Network - Generating Universe This repository contains part of the code used to make the images visible in the article "How doe

Davide Coccomini 9 Dec 18, 2022
Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation

Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation This repository contains the Pytorch implementation of the proposed

Devavrat Tomar 19 Nov 10, 2022
This is an official implementation for "Video Swin Transformers".

Video Swin Transformer By Ze Liu*, Jia Ning*, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin and Han Hu. This repo is the official implementation of "V

Swin Transformer 981 Jan 03, 2023
Drone Task1 - Drone Task1 With Python

Drone_Task1 Matching Results 3.mp4 1.mp4

MLV Lab (Machine Learning and Vision Lab at Korea University) 11 Nov 14, 2022
Convolutional Neural Network for Text Classification in Tensorflow

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post. It is slightly simplified implementation of Kim's Convo

Denny Britz 5.5k Jan 02, 2023
Code for the RA-L (ICRA) 2021 paper "SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition"

SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition [ArXiv+Supplementary] [IEEE Xplore RA-L 2021] [ICRA 2021 YouTube Video]

Sourav Garg 63 Dec 12, 2022
CNN designed for pansharpening

PROGRESSIVE BAND-SEPARATED CONVOLUTIONAL NEURAL NETWORK FOR MULTISPECTRAL PANSHARPENING This repository contains main code for the paper PROGRESSIVE B

SerendipitysX 3 Dec 29, 2021
Changing the Mind of Transformers for Topically-Controllable Language Generation

We will first introduce the how to run the IPython notebook demo by downloading our pretrained models. Then, we will introduce how to run our training and evaluation code.

IESL 20 Dec 06, 2022
FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

HKBU High Performance Machine Learning Lab 6 Nov 18, 2022
A naive ROS interface for visualDet3D.

YOLO3D ROS Node This repo contains a Monocular 3D detection Ros node. Base on https://github.com/Owen-Liuyuxuan/visualDet3D All parameters are exposed

Yuxuan Liu 19 Oct 08, 2022
Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[TensorFlow] Protein Interface Prediction using Graph Convolutional Networks Unofficial TensorFlow implementation of Protein Interface Prediction usin

YeongHyeon Park 9 Oct 25, 2022
My 1st place solution at Kaggle Hotel-ID 2021

1st place solution at Kaggle Hotel-ID My 1st place solution at Kaggle Hotel-ID to Combat Human Trafficking 2021. https://www.kaggle.com/c/hotel-id-202

Kohei Ozaki 18 Aug 19, 2022
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

RIFE RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation Ported from https://github.com/hzwer/arXiv2020-RIFE Dependencies NumPy

49 Jan 07, 2023
Generating Band-Limited Adversarial Surfaces Using Neural Networks

Generating Band-Limited Adversarial Surfaces Using Neural Networks This is the official repository of the technical report that was published on arXiv

3 Jul 26, 2022
A distributed deep learning framework that supports flexible parallelization strategies.

FlexFlow FlexFlow is a deep learning framework that accelerates distributed DNN training by automatically searching for efficient parallelization stra

528 Dec 25, 2022
(CVPR 2022) Energy-based Latent Aligner for Incremental Learning

Energy-based Latent Aligner for Incremental Learning Accepted to CVPR 2022 We illustrate an Incremental Learning model trained on a continuum of tasks

Joseph K J 37 Jan 03, 2023