Active Learning demo using two small datasets

Last update: Nov 10, 2021

Related tags

Data Analysis ActiveLearningDemo

Overview

ActiveLearningDemo

How to run

Step one

Put the dataset folder in place, then use the command below to split the dataset into the required structure:

run utils.py

Each dataset should include six .mat files: TrainingMatrix.mat, TrainingLabels.mat, TestingMatrix.mat, TestingLabels.mat, UnlabeledMatrix.mat and UnlabeledLabels.mat.
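
Before training, it can help to confirm that a case folder really contains all six files. The following is a minimal sketch using only the standard library; `missing_mat_files` and `REQUIRED_FILES` are hypothetical names introduced here for illustration and are not part of the project.

```python
from pathlib import Path

# The six file names listed above; each case folder is expected to hold all of them.
REQUIRED_FILES = (
    "TrainingMatrix.mat",
    "TrainingLabels.mat",
    "TestingMatrix.mat",
    "TestingLabels.mat",
    "UnlabeledMatrix.mat",
    "UnlabeledLabels.mat",
)


def missing_mat_files(case_dir):
    """Return the required .mat file names absent from case_dir (hypothetical helper)."""
    case_dir = Path(case_dir)
    return [name for name in REQUIRED_FILES if not (case_dir / name).is_file()]
```

An empty list means the folder has the expected structure; any names returned are the files that still need to be produced by the split.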

Step two

Train the model. The following arguments can be set:

Active learning

optional arguments:
  -h, --help            show this help message and exit
  --src SRC             dataset path
  --dst DST             destination path
  --type TYPE           sample strategy: random, entropy, combine
  --solver SOLVER       model solver
  --max_iter MAX_ITER   max iteration of each training
  --k K                 samples added for each iteration
  --n N                 number of iterations
  --plot_type PLOT_TYPE
                        plot single for one case(single) or plot average for
                        entire database(average)
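
A help message like the one above could come from an `argparse` parser along these lines. This is a sketch, not the project's actual code: only the "newton-cg" and "combine" defaults are stated in this README, and the numeric defaults and the `--plot_type` default below are placeholders chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser mirroring the help text above (several defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Active learning")
    parser.add_argument("--src", type=str, help="dataset path")
    parser.add_argument("--dst", type=str, help="destination path")
    # These two defaults are the ones the README states.
    parser.add_argument("--type", type=str, default="combine",
                        help="sample strategy: random, entropy, combine")
    parser.add_argument("--solver", type=str, default="newton-cg",
                        help="model solver")
    # Placeholder defaults: the README does not give the real values.
    parser.add_argument("--max_iter", type=int, default=100,
                        help="max iteration of each training")
    parser.add_argument("--k", type=int, default=10,
                        help="samples added for each iteration")
    parser.add_argument("--n", type=int, default=50,
                        help="number of iterations")
    parser.add_argument("--plot_type", type=str, default="average",
                        help="plot one case (single) or the average over "
                             "the entire database (average)")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--src", "./datasets/MMI", "--dst", "./img"])
```

With this sketch, `args.type` and `args.solver` fall back to "combine" and "newton-cg" when the flags are omitted.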

The dataset path can point either to a dataset with multiple subsets inside or to a single case containing only the six .mat files. By default, the "newton-cg" solver and the "combine" type are used; "combine" trains the model with both sampling strategies (random and entropy) at once. To get results on a dataset directly, you can use:

python main.py --src <your dataset path, e.g. ./datasets/MMI> --dst <output path, e.g. ./img>
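
The README does not show how the "random" and "entropy" strategies pick samples, so the following is only a sketch of the usual approach: random sampling draws k unlabeled samples uniformly, while entropy sampling picks the k samples whose predicted class probabilities have the highest Shannon entropy. The function names and the toy probabilities are assumptions made for illustration.

```python
import math
import random


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_entropy(probabilities, k):
    """Indices of the k unlabeled samples with the most uncertain predictions."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: entropy(probabilities[i]),
                    reverse=True)
    return ranked[:k]


def select_random(num_unlabeled, k, seed=None):
    """Indices of k unlabeled samples drawn uniformly without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(num_unlabeled), min(k, num_unlabeled))


# Toy predicted probabilities for four unlabeled samples (made-up values).
probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [1.0, 0.0]]
chosen = select_entropy(probs, 2)  # -> [1, 2], the two least confident predictions
```

In each iteration the chosen samples would be moved from the unlabeled pool into the training set before the model is retrained; a "combine" run would repeat the loop once per strategy so the two learning curves can be compared.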

Result

MMI dataset

Plot of results with the "lbfgs" solver

Plot of results with the "newton-cg" solver

MindReading dataset

Plot of results with the "lbfgs" solver

Plot of results with the "newton-cg" solver
