An implementation of a discriminant function over a normal distribution to help classify datasets.

Overview

CS4044D Machine Learning Assignment 1

By Dev Sony, B180297CS

The question, report and source code can be found here.

Github Repo

Solution 1

Based on the formula given: Formula

The function has been defined:

def discriminant_function(x, mean, cov, d, P):
    if d == 1:
        output = -0.5*(x - mean) * (1/cov)
        output = output * (x - mean)
        output += -0.5*d*log(2*pi) - 0.5*log(cov) 

    else: 
        output = np.matmul(-0.5*(x - mean), np.linalg.inv(cov))
        output = np.matmul(output, (x - mean).T)
        output += -0.5*d*log(2*pi) - 0.5*log(np.linalg.det(cov)) 

    # Adding Prior Probability
    output += log(P)

    return output

It also accomdatees the case if only one feature is used, thus using only scalar quantities.

The variables can be configured based on the scenario. Here, it's assumed that prior probabilities are equally distributed and all features are taken:

n = len(data)
P = [1/n for i in range(n)]
d = len(data[0][0])

The input is the sample dataset, each set separated by the class they belong to as given below:

data = [
    # W1
    np.array([
        [-5.01, -8.12, -3.68],
        [-5.43, -3.48, -3.54],
        [1.08, -5.52, 1.66],
        [0.86, -3.78, -4.11],
        [-2.67, 0.63, 7.39],
        [4.94, 3.29, 2.08],
        [-2.51, 2.09, -2.59],
        [-2.25, -2.13, -6.94],
        [5.56, 2.86, -2.26],
        [1.03, -3.33, 4.33]
    ]),

    # W2
    np.array([
        [-0.91, -0.18, -0.05],
        [1.30, -2.06, -3.53],
        [-7.75, -4.54, -0.95],
        [-5.47, 0.50, 3.92],
        [6.14, 5.72, -4.85],
        [3.60, 1.26, 4.36],
        [5.37, -4.63, -3.65],
        [7.18, 1.46, -6.66],
        [-7.39, 1.17, 6.30],
        [-7.50, -6.32, -0.31]
    ]),

    # W3
    np.array([
        [5.35, 2.26, 8.13],
        [5.12, 3.22, -2.66],
        [-1.34, -5.31, -9.87],
        [4.48, 3.42, 5.19],
        [7.11, 2.39, 9.21],
        [7.17, 4.33, -0.98],
        [5.75, 3.97, 6.65],
        [0.77, 0.27, 2.41],
        [0.90, -0.43, -8.71],
        [3.52, -0.36, 6.43]
    ]) 
]

In order to classify the sample data, we first run the function through our sample dataset, classwise. On each sample, we find the class which gives the maximum output from its discriminant function.

A count and total count is maintained in order to find the success and failiure rates.

for j in range(n):
    print("\nData classes should be classified as:", j+1)
    total_count, count = 0, 0

    # Taking x as dataset belonging to class j + 1
    for x in data[j]:
        g_values = [0 for g in range(n)]        

        # Itering through each class' discriminant function
        for i in range(n):
            g_values[i] = discriminant_function(x, means[i], cov[i], d, P[i])

        # Now to output the maximum result 
        result = g_values.index(max(g_values)) + 1
        print(x, "\twas classified as", result)
        total_count, count = total_count + 1, (count + 1 if j == result - 1 else count)
        
    print("Success Rate:", (count/total_count)*100,"%")
    print("Fail Rate:", 100 - ((count/total_count))*100,"%")

Assuming that all classes have an equal prior probability (as per the configuration in the example picture), the following output is produced:

Output

Solution 2

Part (a) and (b)

In order to match the question, the configuration variables are altered.

  • data-1 for n indicates that only 2 classes will be considered (the final class would not be considered as its Prior probability is 0, implying that it wouldn't appear.)
  • We iterate through n + 1 in the outer loop as datasets of all 3 classes are being classified. (Althought class 3 is fully misclassified.)
  • The d value is changed to 1, indicating that only 1 feature will be used. (which is x1 )
n = len(data) - 1
P = [0.5, 0.5, 0]
d = 1

The configuration parameters being passed are also changed.

  • x[0] indicates that only x1 will be used.
  • means[i][0] indiciates that we need the mean only for x1).
  • cov[i][0][0] indicates the variance of feature x1).
for j in range(n + 1):
    print("\nData classes should be classified as:", j+1)
    total_count, count = 0, 0

    # Taking x as dataset belonging to class j + 1
    for x in data[j]:
        g_values = [0 for g in range(n)]        # Array for all discrminant function outputs.

        # Itering through each class' discriminant function
        for i in range(n):
            g_values[i] = discriminant_function(x[0], means[i][0], cov[i][0][0], d, P[i])

        # Now to output the maximum result 
        result = g_values.index(max(g_values)) + 1
        print(x, "\twas classified as", result)
        total_count, count = total_count + 1, (count + 1 if j == result - 1 else count)
        
    print("Success Rate:", (count/total_count)*100,"%")
    print("Fail Rate:", 100 - ((count/total_count))*100,"%")

This results in the following output:

Output1

Part (c)

Here, the configuration parameters are changed slightly.

  • d is changed to 2, as now we are considering the first and second features.
  • The matrix paramateres passed now include necessary values for the same reason.
n = len(data) - 1
P = [0.5, 0.5, 0]
d = 2

This results in the following output: Output2

Part (d)

Here again, the configurations are changed in a similiar fashion as in (c).

  • d values is changed to 3 as all three features are now considered.
  • The matrix paramaeteres are now passed without slicing as all values are important.
n = len(data) - 1
P = [0.5, 0.5, 0]
d = 3

The resuls in the following output:

Output2

Part (e)

On comparing the three outputs, using one or three features give more accurate results than using the first and second features.

Output3

The reason for this could be because the covariance with the third feature is much higher than the ones associated with the second feature.

Variance

Part (f)

In order to consider the possible configurations mentioned, the code takes an input vector and goes through all of them.

General Configuration values
n = len(data) - 1
P = [0.5, 0.5, 0]
g_values = [0 for i in range(n)]
Get input
x = list(map(float, input("Enter the input vector: ").strip().split()))
Case A
d = 1
print("Case A: Using only feature vector x1")
for i in range(n):
    g_values[i] = discriminant_function(x[0], means[i][0], cov[i][0][0], d, P[i])

result = g_values.index(max(g_values)) + 1
print(x, "\twas classified as", result)
Case B
d = 2
print("\nCase B: Using only feature vectors x1 and x2")
for i in range(n):
    g_values[i] = discriminant_function(x[0:2], means[i][0:2], cov[i][0:2, 0:2], d, P[i])

result = g_values.index(max(g_values)) + 1
print(x, "\twas classified as", result)
Case C
d = 3
print("\nCase C: Using all feature vectors")
for i in range(n):
    g_values[i] = discriminant_function(x, means[i], cov[i], d, P[i])

result = g_values.index(max(g_values)) + 1
print(x, "\twas classified as", result)

Here are the outputs for the 4 input vectors mentioned in the question: Output4

Owner
Dev Sony
I do stuff
Dev Sony
Data and analysis code for an MS on SK VOC genomes phenotyping/neutralisation assays

Description Summary of phylogenomic methods and analyses used in "Immunogenicity of convalescent and vaccinated sera against clinical isolates of ance

Finlay Maguire 1 Jan 06, 2022
Semantic Segmentation Architectures Implemented in PyTorch

pytorch-semseg Semantic Segmentation Algorithms Implemented in PyTorch This repository aims at mirroring popular semantic segmentation architectures i

Meet Shah 3.3k Dec 29, 2022
Code for Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing(ICCV21)

NeuralGIF Code for Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing(ICCV21) We present Neural Generalized Implicit F

Garvita Tiwari 104 Nov 18, 2022
Drone Task1 - Drone Task1 With Python

Drone_Task1 Matching Results 3.mp4 1.mp4

MLV Lab (Machine Learning and Vision Lab at Korea University) 11 Nov 14, 2022
Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Example scripts for the detection of lanes using the ultra fast lane detection model in ONNX.

Ibai Gorordo 35 Sep 07, 2022
Rethinking Transformer-based Set Prediction for Object Detection

Rethinking Transformer-based Set Prediction for Object Detection Here are the code for the ICCV paper. The code is adapted from Detectron2 and AdelaiD

Zhiqing Sun 62 Dec 03, 2022
A large-image collection explorer and fast classification tool

IMAX: Interactive Multi-image Analysis eXplorer This is an interactive tool for visualize and classify multiple images at a time. It written in Python

Matias Carrasco Kind 23 Dec 16, 2022
[ICCV 2021 Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Just Ask: Learning to Answer Questions from Millions of Narrated Videos Webpage • Demo • Paper This repository provides the code for our paper, includ

Antoine Yang 87 Jan 05, 2023
A standard framework for modelling Deep Learning Models for tabular data

PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike.

801 Jan 08, 2023
Official implementation of Rich Semantics Improve Few-Shot Learning (BMVC, 2021)

Rich Semantics Improve Few-Shot Learning Paper Link Abstract : Human learning benefits from multi-modal inputs that often appear as rich semantics (e.

Mohamed Afham 11 Jul 26, 2022
Project NII pytorch scripts

project-NII-pytorch-scripts By Xin Wang, National Institute of Informatics, since 2021 I am a new pytorch user. If you have any suggestions or questio

Yamagishi and Echizen Laboratories, National Institute of Informatics 184 Dec 23, 2022
Self-supervised Product Quantization for Deep Unsupervised Image Retrieval - ICCV2021

Self-supervised Product Quantization for Deep Unsupervised Image Retrieval Pytorch implementation of SPQ Accepted to ICCV 2021 - paper Young Kyun Jang

Young Kyun Jang 71 Dec 27, 2022
LibFewShot: A Comprehensive Library for Few-shot Learning.

LibFewShot Make few-shot learning easy. Supported Methods Meta MAML(ICML'17) ANIL(ICLR'20) R2D2(ICLR'19) Versa(NeurIPS'18) LEO(ICLR'19) MTL(CVPR'19) M

<a href=[email protected]&L"> 603 Jan 05, 2023
Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search This is an implementation for our paper Contextual Non-Loca

Tencent YouTu Research 50 Dec 03, 2022
[WACV 2022] Contextual Gradient Scaling for Few-Shot Learning

CxGrad - Official PyTorch Implementation Contextual Gradient Scaling for Few-Shot Learning Sanghyuk Lee, Seunghyun Lee, and Byung Cheol Song In WACV 2

Sanghyuk Lee 4 Dec 05, 2022
Joint Discriminative and Generative Learning for Person Re-identification. CVPR'19 (Oral)

Joint Discriminative and Generative Learning for Person Re-identification [Project] [Paper] [YouTube] [Bilibili] [Poster] [Supp] Joint Discriminative

NVIDIA Research Projects 1.2k Dec 30, 2022
Pytorch implementation of the paper Time-series Generative Adversarial Networks

TimeGAN-pytorch Pytorch implementation of the paper Time-series Generative Adversarial Networks presented at NeurIPS'19. Jinsung Yoon, Daniel Jarrett

Zhiwei ZHANG 21 Nov 24, 2022
How to Learn a Domain Adaptive Event Simulator? ACM MM, 2021

LETGAN How to Learn a Domain Adaptive Event Simulator? ACM MM 2021 Running Environment: pytorch=1.4, 1 NVIDIA-1080TI. More details can be found in pap

CVTEAM 4 Sep 20, 2022
Introduction to AI assignment 1 HCM University of Technology, term 211

Sokoban Bot Introduction to AI assignment 1 HCM University of Technology, term 211 Abstract This is basically a solver for Sokoban game using Breadth-

Quang Minh 4 Dec 12, 2022
Diffgram - Supervised Learning Data Platform

Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning

Diffgram 1.6k Jan 07, 2023