Repository for Multimodal AutoML Benchmark

Last update: Nov 24, 2022

Overview

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

Repository for the NeurIPS 2021 Dataset Track submission "Benchmarking Multimodal AutoML for Tabular Data with Text Fields" (Link, Full Paper with Appendix). An earlier version of the paper, titled "Multimodal AutoML on Structured Tables with Text Fields" (Link), was accepted as an oral presentation at the ICML 2021 AutoML workshop. We have since updated the benchmark with more datasets, so the version used in the AutoML workshop paper is archived in the icml_workshop branch.

This benchmark contains a diverse collection of tabular datasets. Each dataset contains numeric/categorical as well as text columns. The goal is to evaluate the performance of (automated) ML systems for supervised learning (classification and regression) with such multimodal data. The folder multimodal_text_benchmark/scripts/benchmark/ provides Python scripts to run different variants of the AutoGluon and H2O AutoML tools on the benchmark.

Datasets used in the Benchmark

Here's a brief summary of the datasets in our benchmark. Each dataset is described in greater detail in the multimodal_text_benchmark/ folder.

ID	key	#Train	#Test	Task	Metric	Prediction Target
prod	product_sentiment_machine_hack	5,091	1,273	multiclass	accuracy	sentiment related to product
salary	data_scientist_salary	15,841	3,961	multiclass	accuracy	salary range in data scientist job listings
airbnb	melbourne_airbnb	18,316	4,579	multiclass	accuracy	price of Airbnb listing
channel	news_channel	20,284	5,071	multiclass	accuracy	category of news article
wine	wine_reviews	84,123	21,031	multiclass	accuracy	variety of wine
imdb	imdb_genre_prediction	800	200	binary	roc_auc	whether film is a drama
fake	fake_job_postings2	12,725	3,182	binary	roc_auc	whether job postings are fake
kick	kick_starter_funding	86,052	21,626	binary	roc_auc	will Kickstarter get funding
jigsaw	jigsaw_unintended_bias100K	100,000	25,000	binary	roc_auc	whether comments are toxic
qaa	google_qa_answer_type_reason_explanation	4,863	1,216	regression	r2	type of answer
qaq	google_qa_question_type_reason_explanation	4,863	1,216	regression	r2	type of question
book	bookprice_prediction	4,989	1,248	regression	r2	price of books
jc	jc_penney_products	10,860	2,715	regression	r2	price of JC Penney products
cloth	women_clothing_review	18,788	4,698	regression	r2	review score
ae	ae_price_prediction	22,662	5,666	regression	r2	American-Eagle item prices
pop	news_popularity2	24,007	6,002	regression	r2	news article popularity online
house	california_house_price	37,951	9,488	regression	r2	sale price of houses in California
mercari	mercari_price_suggestion100K	100,000	25,000	regression	r2	price of Mercari products
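
The Task and Metric columns pair each task type with one score: accuracy for multiclass, ROC AUC for binary, and R² for regression. As a rough illustration of what those scores measure, here is a minimal pure-Python sketch. The function names and the METRIC_FOR_TASK mapping are made up for this example and are not part of the benchmark code, which may compute its metrics differently (for instance through a library such as scikit-learn).

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Labels are assumed to be 0/1. Ties between a positive and a negative
    count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical mapping that mirrors the Task and Metric columns of the table.
METRIC_FOR_TASK = {"multiclass": accuracy, "binary": roc_auc, "regression": r2}

print(METRIC_FOR_TASK["multiclass"](["a", "b", "c"], ["a", "b", "a"]))
print(METRIC_FOR_TASK["binary"]([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(METRIC_FOR_TASK["regression"]([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

The pairwise ROC AUC above is quadratic in the number of examples, so it is only meant to show the definition; for the larger datasets in the table a rank-based implementation would be the practical choice.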

License

The versions of the datasets in this benchmark are released under the CC BY-NC-SA license. Note that these datasets are modified versions of original copies that were already publicly available, and we do not own any of them. Any data from this benchmark that has previously been published elsewhere falls under the license of the original source. Please refer to the licenses of each original source linked in multimodal_text_benchmark/README.md.

Install the Benchmark Suite

cd multimodal_text_benchmark
# Install the benchmarking suite
python3 -m pip install -U -e .

To quickly test the installation, run the dataset tests from the tests folder (the path below is relative to the repository root):

cd multimodal_text_benchmark/tests
python3 -m pytest test_datasets.py

To work with one of the datasets, use the following code:

from auto_mm_bench.datasets import dataset_registry

print(dataset_registry.list_keys())  # list of all dataset names
dataset_name = 'product_sentiment_machine_hack'

train_dataset = dataset_registry.create(dataset_name, 'train')
test_dataset = dataset_registry.create(dataset_name, 'test')
print(train_dataset.data)
print(test_dataset.data)
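
If a dataset name is misspelled, it can help to fail early and show the valid keys. The helper below is a small sketch along those lines. It relies only on the two registry calls shown above, list_keys() and create(name, split); the name load_splits is hypothetical and not part of auto_mm_bench.

```python
def load_splits(registry, dataset_name):
    """Return (train, test) for dataset_name, failing early on unknown names.

    `registry` is expected to offer list_keys() and create(name, split),
    the two calls used in the snippet above. This helper is hypothetical
    and not part of auto_mm_bench.
    """
    known = list(registry.list_keys())
    if dataset_name not in known:
        raise KeyError(
            f"Unknown dataset {dataset_name!r}; choose from {sorted(known)}"
        )
    train = registry.create(dataset_name, "train")
    test = registry.create(dataset_name, "test")
    return train, test


# Usage with the real registry (requires the benchmark suite to be installed):
#   from auto_mm_bench.datasets import dataset_registry
#   train_dataset, test_dataset = load_splits(dataset_registry, "imdb_genre_prediction")
#   print(train_dataset.data)
```

Passing the registry in as an argument keeps the helper independent of the installed package, which also makes it easy to try out with a stand-in object.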

To access all datasets that comprise the benchmark:

from auto_mm_bench.datasets import create_dataset, TEXT_BENCHMARK_ALIAS_MAPPING

for dataset_name in list(TEXT_BENCHMARK_ALIAS_MAPPING.values()):
    print(dataset_name)
    dataset = create_dataset(dataset_name)
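
The loop above reads dataset names from the values of TEXT_BENCHMARK_ALIAS_MAPPING, which suggests that its keys are the short IDs from the dataset table. Under that assumption, which is not confirmed here, resolving either a short ID or a full key could look like the sketch below. ALIAS_TO_KEY is a hypothetical three-row excerpt copied from the table, and resolve_dataset_name is not part of auto_mm_bench.

```python
# Hypothetical excerpt: short IDs and keys copied from the dataset table.
# The real TEXT_BENCHMARK_ALIAS_MAPPING is assumed, not confirmed, to have this shape.
ALIAS_TO_KEY = {
    "prod": "product_sentiment_machine_hack",
    "imdb": "imdb_genre_prediction",
    "book": "bookprice_prediction",
}


def resolve_dataset_name(name, alias_to_key=ALIAS_TO_KEY):
    """Accept either a short ID or a full key and return the full key."""
    if name in alias_to_key:
        return alias_to_key[name]
    if name in alias_to_key.values():
        return name
    raise KeyError(f"Unknown dataset ID or key: {name!r}")


print(resolve_dataset_name("prod"))
print(resolve_dataset_name("imdb_genre_prediction"))
```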

Run Experiments

Go to multimodal_text_benchmark/scripts/benchmark to see how to run some baseline ML methods over the benchmark.

References

BibTeX entry of the ICML Workshop Version:

@article{agmultimodaltext,
  title={Multimodal AutoML on Structured Tables with Text Fields},
  author={Shi, Xingjian and Mueller, Jonas and Erickson, Nick and Li, Mu and Smola, Alexander},
  journal={8th ICML Workshop on Automated Machine Learning (AutoML)},
  year={2021}
}

Owner

Xingjian Shi
