SPEAR: Semi suPErvised dAta progRamming

Overview

PyPI docs license website GitHub repo size



Semi-Supervised Data Programming for Data Efficient Machine Learning

SPEAR is a library for data programming with semi-supervision. The package implements several recent data programming approaches including facility to programmatically label and build training data.

Pipeline

  • Design Labeling functions(LFs)
  • generate pickle file containing labels by passing raw data to LFs
  • Use one of the Label Aggregators(LA) to get final labels



SPEAR provides functionality such as

  • development of LFs/rules/heuristics for quick labeling
  • compare against several data programming approaches
  • compare against semi-supervised data programming approaches
  • use subset selection to make best use of the annotation efforts

Labelling Functions (LFs)

  • discrete LFs - Users can define LFs that return discrete labels
  • continuous LFs - return continuous scores/confidence to the labels assigned

Approaches Implemented

You can read this paper to know about below approaches

  • Only-L
  • Learning to Reweight
  • Posterior Regularization
  • Imply Loss
  • CAGE
  • Joint Learning

Data folder for SMS can be found here. This folder needs to be placed in the same directory as notebooks folder is in, to run the notebooks or examples.

Installation

Method 1

To install latest version of SPEAR package using PyPI:

pip install decile-spear

Method 2

SPEAR requires Python 3.6 or later. First install submodlib. Then install SPEAR:

git clone https://github.com/decile-team/spear.git
cd spear
pip install -r requirements/requirements.txt

Citation

@misc{abhishek2021spear,
      title={SPEAR : Semi-supervised Data Programming in Python}, 
      author={Guttu Sai Abhishek and Harshad Ingole and Parth Laturia and Vineeth Dorna and Ayush Maheshwari and Ganesh Ramakrishnan and Rishabh Iyer},
      year={2021},
      eprint={2108.00373},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Quick Links

Acknowledgment

SPEAR takes inspiration, builds upon, and uses pieces of code from several open source codebases. These include Snorkel, Snuba & Imply Loss. Also, SPEAR uses SUBMODLIB for subset selection, which is provided by DECILE too.

Team

SPEAR is created and maintained by Ayush, Abhishek, Vineeth, Harshad, Parth, Pankaj, Rishabh Iyer, and Ganesh Ramakrishnan. We look forward to have SPEAR more community driven. Please use it and contribute to it for your research, and feel free to use it for your commercial projects. We will add the major contributors here.

Publications

[1] Maheshwari, Ayush, et al. Data Programming using Semi-Supervision and Subset Selection, In Findings of ACL (Long Paper) 2021.

[2] Chatterjee, Oishik, Ganesh Ramakrishnan, and Sunita Sarawagi. Data Programming using Continuous and Quality-Guided Labeling Functions, In AAAI 2020.

[3] Sahay, Atul, et al. Rule augmented unsupervised constituency parsing, In Findings of ACL (Short Paper) 2021.

You might also like...
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images

SASSnet Code for paper: Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images(MICCAI 2020) Our code is origin from UA-MT You can fin

Semi-supervised Learning for Sentiment Analysis

Neural-Semi-supervised-Learning-for-Text-Classification-Under-Large-Scale-Pretraining Code, models and Datasets for《Neural Semi-supervised Learning fo

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

 From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement (CVPR'2020)
From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement (CVPR'2020)

Under-exposure introduces a series of visual degradation, i.e. decreased visibility, intensive noise, and biased color, etc. To address these problems, we propose a novel semi-supervised learning approach for low-light image enhancement.

Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021, Pytorch)

S2VD Semi-supervised Video Deraining with Dynamical Rain Generator (CVPR, 2021) Requirements and Dependencies Ubuntu 16.04, cuda 10.0 Python 3.6.10, P

[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.
[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.

MiVOS (CVPR 2021) - Mask Propagation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [arXiv] [Paper PDF] [Project Page] [Papers with Code] This repo impleme

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation Input Image Initial CAM Successive Maps with adversar

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision
[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

TorchSemiSeg [CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision by Xiaokang Chen1, Yuhui Yuan2, Gang Zeng1, Jingdong Wang

Comments
  • Updated condition for Gold Label check and passing parameter name passing

    Updated condition for Gold Label check and passing parameter name passing

    1. Current Version of Spear fails when we are trying to do LF analysis without passing Gold Labels and their values is passed as None and is causing the following error as it is not checked

    Y = np.array([self.mapping[v] for v in Y]) TypeError: 'NoneType' object is not iterable

    1. Also their is a function call of confusion_matrix in lf_summary method, which requires the parameter name to execute properly else it fails with following error of argument passing

    confusion_matrix(Y, self.L[:, i], labels)[1:, 1:] for i in range(m) TypeError: confusion_matrix() takes 2 positional arguments but 3 were given

    The current code change fixes these two issues.

    opened by kasuba-badri-vishal 1
  • sms_jl.ipynb ISSUE with

    sms_jl.ipynb ISSUE with "Some Labelling Functions" code snippet

    I have changed the directory of previously glove_w2v.txt and then ran on my local pc and installed all reqd libraries but it shows an invalid literal for int() with base 10: 'import'

    I think its an issue with gensim but can;t seem to resolve it

    i'm attaching a picture down below :

    https://cdn.discordapp.com/attachments/754057588714373325/989172192078098442/unknown.png

    opened by Brshank 1
Releases(v1.0.0)
Owner
decile-team
DECILE: Data EffiCient machIne LEarning
decile-team
Densely Connected Convolutional Networks, In CVPR 2017 (Best Paper Award).

Densely Connected Convolutional Networks (DenseNets) This repository contains the code for DenseNet introduced in the following paper Densely Connecte

Zhuang Liu 4.5k Jan 03, 2023
A collection of Google research projects related to Federated Learning and Federated Analytics.

Federated Research Federated Research is a collection of research projects related to Federated Learning and Federated Analytics. Federated learning i

Google Research 483 Jan 05, 2023
Personal project about genus-0 meshes, spherical harmonics and a cow

How to transform a cow into spherical harmonics ? Spot the cow, from Keenan Crane's blog Context In the field of Deep Learning, training on images or

3 Aug 22, 2022
Official implementation of the MM'21 paper Constrained Graphic Layout Generation via Latent Optimization

[MM'21] Constrained Graphic Layout Generation via Latent Optimization This repository provides the official code for the paper "Constrained Graphic La

Kotaro Kikuchi 73 Dec 27, 2022
Veri Setinizi Yolov5 Formatına Dönüştürün

Veri Setinizi Yolov5 Formatına Dönüştürün! Bu Repo da Neler Var? Xml Formatındaki Veri Setini .Txt Formatına Çevirme Xml Formatındaki Dosyaları Silme

Kadir Nar 4 Aug 22, 2022
[CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator'

[CVPR2021] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator Overview This is the entire codebase for the paper

35 Dec 01, 2022
🧠 A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation.', ECCV 2016

Deep CORAL A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation. B Sun, K Saenko, ECCV 2016' Deep CORAL can learn

Andy Hsu 200 Dec 25, 2022
Angular & Electron desktop UI framework. Angular components for native looking and behaving macOS desktop UI (Electron/Web)

Angular Desktop UI This is a collection for native desktop like user interface components in Angular, especially useful for Electron apps. It starts w

Marc J. Schmidt 49 Dec 22, 2022
An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

This is the code for the paper: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [Journal PDF] J

226 Nov 05, 2022
Understanding Convolutional Neural Networks from Theoretical Perspective via Volterra Convolution

nnvolterra Run Code Compile first: make compile Run all codes: make all Test xconv: make npxconv_test MNIST dataset needs to be downloaded, converted

1 May 24, 2022
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

AnimeGAN A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing. Randomly Generated Images The images are

Jie Lei 雷杰 1.2k Jan 03, 2023
Tesla Light Show xLights Guide With python

Tesla Light Show xLights Guide Welcome to the Tesla Light Show xLights guide! You can create and run your own light shows on Tesla vehicles. Running a

Tesla, Inc. 2.5k Dec 29, 2022
"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices This repository contains the official PyTorch implemen

Yandex Research 21 Oct 18, 2022
Boston House Prediction Valuation Tool

Boston-House-Prediction-Valuation-Tool From Below Anlaysis The Valuation Tool is Designed Correlation Matrix Regrssion Analysis Between Target Vs Pred

0 Sep 09, 2022
Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

This repository is the official PyTorch implementation of Cascaded Deep Video Deblurring Using Temporal Sharpness Prior and Non-local Spatial-Temporal Similarity

hippopmonkey 4 Dec 11, 2022
Seq2seq - Sequence to Sequence Learning with Keras

Seq2seq Sequence to Sequence Learning with Keras Hi! You have just found Seq2Seq. Seq2Seq is a sequence to sequence learning add-on for the python dee

Fariz Rahman 3.1k Dec 18, 2022
Trading and Backtesting environment for training reinforcement learning agent or simple rule base algo.

TradingGym TradingGym is a toolkit for training and backtesting the reinforcement learning algorithms. This was inspired by OpenAI Gym and imitated th

Yvictor 1.1k Jan 02, 2023
Replication package for the manuscript "Using Personality Detection Tools for Software Engineering Research: How Far Can We Go?" submitted to TOSEM

tosem2021-personality-rep-package Replication package for the manuscript "Using Personality Detection Tools for Software Engineering Research: How Far

Collaborative Development Group 1 Dec 13, 2021
QT Py Media Knob using rotary encoder & neopixel ring

QTPy-Knob QT Py USB Media Knob using rotary encoder & neopixel ring The QTPy-Knob features: Media knob for volume up/down/mute with "qtpy-knob.py" Cir

Tod E. Kurt 56 Dec 30, 2022
Implementing DeepMind's Fast Reinforcement Learning paper

Fast Reinforcement Learning This is a repo where I implement the algorithms in the paper, Fast reinforcement learning with generalized policy updates.

Marcus Chiam 6 Nov 28, 2022