Extreme Dynamic Classifier Chains - XGBoost for Multi-label Classification

Related tags

Deep LearningXDCC
Overview

Extreme Dynamic Classifier Chains

Classifier chains is a key technique in multi-label classification, sinceit allows to consider label dependencies effectively. However, the classifiers arealigned according to a static order of the labels. In the concept of dynamic classifier chains (DCC) the label ordering is chosen for each prediction dynamically depending on the respective instance at hand. We combine this concept with the boosting of extreme gradient boosted trees (XGBoot), an effective and scalable state-of-the-art technique, and incorporate DCC in a fast multi-label extension of XGBoost which we make publicly available. As only positive labels have to be predicted and these are usually only few, the training costs can be further substantially reduced. Moreover, as experiments on ten datasets show, the length of the chain allows for a more control over the usage of previous predictions and hence over the measure one want to optimize,

Installation

The first step requires to build the modified multilabel version of XGBoost and install the resulting python package to build the dynamic chain model. This requires MinGW, i.e. the mingw32-make command, and Python 3. To start the build run the following commands:

cd XGBoost_ML
mingw32-make -j4

After a successful execution the python package can be installed.

cd python-package
python setup.py install

You should now be able to import the package into your Python project:

import xgboost as xgb

Training the Dynamic Chain Model

We recommend running the models by calling train_dcc.py from within a console. Place all datasets as .arff files into the datasets directory. Append -train to the train set and -test to the test set.

Parameters:

The following parameters are available:

Parameter Short Description Required
--filename <string> -f Name of your dataset .arff file located in the datasets sub-directory yes
--num_labels <int> -l Number of Labels in the dataset yes
--models <string> -m Specifies all models that will be build. Available options:
  • dcc: The proposed dynamic chain model
  • sxgb: A single multilabel XGBoost model
  • cc-dcc: A classifier chain with the label order of a previously built dynamic chain
  • cc-freq: A classifier chain with a label order sorted by label frequency (frequent to rare) in the train set
  • cc-rare: A classifier chain with a label order sorted by label frequency (rare to frequent) in the train set
  • cc-rand: A classifier chain with a random label order
  • br: A binary relevance model
example: -m "dc,br"
yes
--validation <int> -v Size of validation set. The first XX% of the train set will be used for validating the model. If the parameter is not set, the test set will be used for evaluation. Example: --validation 20 The frist 20% will be used for evaluation, the last 80% for training. (default: 0) no
--max_depth <int> -d Max depth of each XGBoost multilabel tree (default: 10) no
--num_rounds <int> -r Number of boosting rounds of each XGBoost model (default: 10) no
--chain_length <int> -c Length of the chain. Represents number of labeling-rounds. Each round builds a new XGBoost model that will predict a single label per instance (default: num_labels) no
--split <int> -s Index of split method used for building the trees. Available options:
  • maxGain: 1
  • maxWeight: 2
  • sumGain: 3
  • sumWeight: 4
  • maxAbsGain: 5
  • sumAbsGain: 6
(default: 1)
no
--parameters <string> -p XGBoost parameters used for each model in the chain. Example: -p "{'silent':1, 'eta':0.1}" (default: {}) no
--features_to_transform <string> -t A list of all features in the dataset that have to be encoded. XGBoost can only process numerical features. Use this parameter to encode categorical features. Example: -t "featureA,featureB" no
--output_extra -o Write extended log and json files (default: True) no

Example

We train two models, the dynamic chain and a binary relevance model, on a dataset called emotions with 6 labels. So we specify the models with -m "dc, br" and the dataset with -f "emotions". Additionally we place the files for training and testing into the datasets directory:

project
│   README.md
│   train_dcc.py   
│
└───datasets
│   │   emotions-train.arff
│   │   emotions-test.arff
│   
└───XGBoost_ML
    │   ...

The dcc model should build a full chain with 6 models, so we use -l 6. All XGBoost models, also the one for binary relevance, should train for 100 rounds with a maximum tree depth of 10 and a step size of 0.1. Therefore we add -p "{'eta':0.1}" -r 100 -d 10

The full command to train and evaluate both models is:

 train_dcc.py -p "{'eta':0.1}" -f "emotions" -l 6 -r 100 -d 10 -c 6 -m 'dcc, br'
The code for replicating the experiments from the LFI in SSMs with Unknown Dynamics paper.

Likelihood-Free Inference in State-Space Models with Unknown Dynamics This package contains the codes required to run the experiments in the paper. Th

Alex Aushev 0 Dec 27, 2021
Official PyTorch code for "BAM: Bottleneck Attention Module (BMVC2018)" and "CBAM: Convolutional Block Attention Module (ECCV2018)"

BAM and CBAM Official PyTorch code for "BAM: Bottleneck Attention Module (BMVC2018)" and "CBAM: Convolutional Block Attention Module (ECCV2018)" Updat

Jongchan Park 1.7k Jan 01, 2023
Western-3DSlicer-Modules - Point-Set Registrations for Ultrasound Probe Calibrations

Point-Set Registrations for Ultrasound Probe Calibrations -Undergraduate Thesis-

Matteo Tanzi 0 May 04, 2022
PyGCL: Graph Contrastive Learning Library for PyTorch

PyGCL: Graph Contrastive Learning for PyTorch PyGCL is an open-source library for graph contrastive learning (GCL), which features modularized GCL com

GCL: Graph Contrastive Learning Library for PyTorch 594 Jan 08, 2023
Visual dialog agents with pre-trained vision-and-language encoders.

Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation Or READ-UP: Referring Expression Agent Dialog with Unified Pretr

7 Oct 08, 2022
PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io

PyStan NOTE: This documentation describes a BETA release of PyStan 3. PyStan is a Python interface to Stan, a package for Bayesian inference. Stan® is

Stan 229 Dec 29, 2022
Code for the bachelors-thesis flaky fault localization

Flaky_Fault_Localization Scripts for the Bachelors-Thesis: "Flaky Fault Localization" by Christian Kasberger. The thesis examines the usefulness of sp

Christian Kasberger 1 Oct 26, 2021
Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences forImage-Text Retrieval

NSGDC Some codes in this repo are copied/modified from opensource implementations made available by UNITER, PyTorch, HuggingFace, OpenNMT, and Nvidia.

Zhihao Fan 2 Nov 07, 2022
code for our ECCV 2020 paper "A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation"

Code for our ECCV (2020) paper A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation. Prerequisites: python == 3.6.8 pytorch ==1.1.0

32 Nov 27, 2022
Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection

fpn.pytorch Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection Introduction This project inherits the property of our pytorc

Jianwei Yang 912 Dec 21, 2022
The official PyTorch code for NeurIPS 2021 ML4AD Paper, "Does Thermal data make the detection systems more reliable?"

MultiModal-Collaborative (MMC) Learning Framework for integrating RGB and Thermal spectral modalities This is the official code for NeurIPS 2021 Machi

NeurAI 12 Nov 02, 2022
Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images (ICCV 2021)

Table of Content Introduction Getting Started Datasets Installation Experiments Training & Testing Pretrained models Texture fine-tuning Demo Toward R

VinAI Research 42 Dec 05, 2022
Generating Videos with Scene Dynamics

Generating Videos with Scene Dynamics This repository contains an implementation of Generating Videos with Scene Dynamics by Carl Vondrick, Hamed Pirs

Carl Vondrick 706 Jan 04, 2023
Flower classification model that classifies flowers in 10 classes made using transfer learning (~85% accuracy).

flower-classification-inceptionV3 Flower classification model that classifies flowers in 10 classes. Training and validation are done using a pre-anot

Ivan R. Mršulja 1 Dec 12, 2021
Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Hierarchical Memory Matching Network for Video Object Segmentation Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Hongje Seong 72 Dec 14, 2022
Source code for the paper "Periodic Traveling Waves in an Integro-Difference Equation With Non-Monotonic Growth and Strong Allee Effect"

Source code for the paper "Periodic Traveling Waves in an Integro-Difference Equation With Non-Monotonic Growth and Strong Allee Effect" by Michael Ne

M Nestor 1 Apr 19, 2022
A Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images.

Lobe This is a Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images. This component lets you easily use an exported m

Kendell R 4 Feb 28, 2022
DCGAN-tensorflow - A tensorflow implementation of Deep Convolutional Generative Adversarial Networks

DCGAN in Tensorflow Tensorflow implementation of Deep Convolutional Generative Adversarial Networks which is a stabilize Generative Adversarial Networ

Taehoon Kim 7.1k Dec 29, 2022
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

flow-dev 2 Aug 21, 2022
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

Wenxuan Zhou 146 Nov 29, 2022