Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision

Last update: Dec 10, 2022

Overview

MLP Mixer

This repository implements the paper MLP-Mixer: An all-MLP Architecture for Vision. Give us a star if you like this repo.

Author:

Github: bangoc123
Email: [email protected]

This library is part of our Papers-Videos-Code project, in which we implement state-of-the-art AI papers and publish all source code. Videos explaining these models will also be uploaded to the ProtonX YouTube channels.

[Note] You can train this model on your own data.

I. Set up environment

Make sure you have installed Miniconda. If not, see the setup document here.
cd into mlp-mixer and run conda env create -f environment.yml to set up the environment.
Activate the conda environment with conda activate mlp-mixer.

II. Set up your dataset.

Create two folders, train and validation, inside the data folder (which already exists). Then copy your images into these folders, grouped into one subfolder per class.

The train folder is used for training.
The validation folder is used to validate the model after each epoch.

This library uses the image_dataset_from_directory API from TensorFlow 2 to load images. Make sure you understand how it works by reading its documentation.

The folders should be structured as follows:

train/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
...class_c/
......c_image_1.jpg
......c_image_2.jpg

validation/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
...class_c/
......c_image_1.jpg
......c_image_2.jpg
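
Before training, it can help to confirm that both folders follow the layout above. The following is a minimal sketch using only the Python standard library; it is not part of this repository, and the helper names count_images_per_class and check_splits are hypothetical. The extension list mirrors the image formats that image_dataset_from_directory accepts.

```python
from pathlib import Path

# Image formats accepted by image_dataset_from_directory.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split folder, e.g. data/train."""
    counts = {}
    class_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def check_splits(train_dir, valid_dir):
    """Return a list of human-readable problems; an empty list means the layout looks fine."""
    train = count_images_per_class(train_dir)
    valid = count_images_per_class(valid_dir)
    problems = []
    # Both splits should contain the same class subfolders.
    if set(train) != set(valid):
        problems.append(
            f"class folders differ: train={sorted(train)}, validation={sorted(valid)}"
        )
    # Every class subfolder should contain at least one image.
    for split_name, counts in (("train", train), ("validation", valid)):
        for class_name, n in counts.items():
            if n == 0:
                problems.append(f"{split_name}/{class_name} contains no images")
    return problems
```

Calling check_splits("data/train", "data/validation") from the repository root returns an empty list when every class folder exists in both splits and holds at least one image.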

III. Train your model by running this command line

python train.py --epochs ${epochs} --num-classes ${num_classes}

For example, to train a model for 10 epochs on a binary classification problem (2 classes):

python train.py --epochs 10 --num-classes 2

The script accepts several important arguments that you should consider when running it:

train-folder: The folder containing the training images
valid-folder: The folder containing the validation images
model-folder: The folder where the trained model is saved
num-classes: The number of classes in your problem
batch-size: The batch size used when loading the dataset
c: Patch projection dimension
dc: Token-mixing units, as described on page 3 of the paper
ds: Channel-mixing units, as described on page 3 of the paper
num-of-mlp-blocks: The number of MLP blocks
learning-rate: The learning rate of the Adam optimizer
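
The arguments above could be declared with argparse roughly as follows. This is only a sketch of one possible command-line interface, not the actual contents of train.py: the flag names come from the list above, but every default value and type here is an assumption chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the flags listed above; all defaults are illustrative assumptions."""
    parser = argparse.ArgumentParser(description="Train an MLP-Mixer classifier (sketch)")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--num-classes", type=int, default=2)
    parser.add_argument("--train-folder", type=str, default="data/train")
    parser.add_argument("--valid-folder", type=str, default="data/validation")
    parser.add_argument("--model-folder", type=str, default="output")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--c", type=int, default=512, help="Patch projection dimension")
    parser.add_argument("--dc", type=int, default=2048)
    parser.add_argument("--ds", type=int, default=256)
    parser.add_argument("--num-of-mlp-blocks", type=int, default=8)
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    return parser


# Parse the same flags as the example command: python train.py --epochs 10 --num-classes 2
args = build_parser().parse_args(["--epochs", "10", "--num-classes", "2"])
# argparse converts hyphens in flag names to underscores in attribute names.
print(args.epochs, args.num_classes, args.num_of_mlp_blocks)
```

Note that a flag such as --num-of-mlp-blocks is read back as args.num_of_mlp_blocks.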

After training completes successfully, your model is saved to the model-folder you specified.
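
To give a feel for how the patch projection dimension, the two hidden widths, and the number of MLP blocks in the argument list affect model size, here is a rough parameter count for the Mixer blocks. It is a sketch under stated assumptions rather than a description of this repository's code: it assumes that each block holds a token-mixing MLP (two dense layers with biases acting across patches), a channel-mixing MLP (two dense layers with biases acting across channels), and two LayerNorm layers over the channel dimension, as described in the paper. The names num_patches, token_hidden, and channel_hidden are hypothetical, and the numbers in the final call are illustrative only.

```python
def dense_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def mixer_block_params(num_patches, channels, token_hidden, channel_hidden):
    """Approximate trainable parameters in one Mixer block (assumptions in the text above)."""
    # Token-mixing MLP: patches -> token_hidden -> patches, shared across channels.
    token_mlp = dense_params(num_patches, token_hidden) + dense_params(token_hidden, num_patches)
    # Channel-mixing MLP: channels -> channel_hidden -> channels, shared across patches.
    channel_mlp = dense_params(channels, channel_hidden) + dense_params(channel_hidden, channels)
    # Two LayerNorm layers, each with a scale and an offset per channel.
    layer_norms = 2 * (2 * channels)
    return token_mlp + channel_mlp + layer_norms


def mixer_blocks_params(num_blocks, num_patches, channels, token_hidden, channel_hidden):
    """Parameters in the stack of Mixer blocks, excluding patch projection and classifier head."""
    return num_blocks * mixer_block_params(num_patches, channels, token_hidden, channel_hidden)


# Illustrative values only: 196 patches (a 224x224 image cut into 16x16 patches).
print(mixer_blocks_params(8, 196, 512, 256, 2048))
```

The count grows linearly with the number of blocks, while the channel-mixing term usually dominates because both the channel count and its hidden width are large compared with the number of patches.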

IV. Testing model with a new image

We provide a script for testing a trained model on a new image from the command line:

python predict.py --test-file-path ${test_file_path}

where test_file_path is the path of your test image.

Example:

python predict.py --test-file-path ./data/test/cat.2000.jpg
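
How predict.py turns the model output into a class name is not shown here. The sketch below illustrates one common approach, assuming the model returns one score per class and that class names follow the alphanumeric order of the subfolder names, which is the order image_dataset_from_directory uses when it infers labels from directories. The function predicted_label and the class names are hypothetical.

```python
def predicted_label(scores, class_names):
    """Return (class_name, score) for the highest-scoring class."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    # Index of the largest score (argmax).
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best], scores[best]


# Hypothetical class names, sorted the same way as the training subfolders.
class_names = sorted(["dog", "cat"])
label, score = predicted_label([0.9, 0.1], class_names)
print(label, score)
```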

V. Feedback

If you run into any issues when using this library, please let us know via the issues submission tab.

Owner

Ngoc Nguyen Ba
