CNNs for Sentence Classification in PyTorch

Overview

Introduction

This is the implementation of Kim's Convolutional Neural Networks for Sentence Classification paper in PyTorch.

  1. Kim's implementation of the model in Theano: https://github.com/yoonkim/CNN_sentence
  2. Denny Britz has an implementation in Tensorflow: https://github.com/dennybritz/cnn-text-classification-tf
  3. Alexander Rakhlin's implementation in Keras; https://github.com/alexander-rakhlin/CNN-for-Sentence-Classification-in-Keras

Requirement

  • python 3
  • pytorch > 0.1
  • torchtext > 0.1
  • numpy

Result

I just tried two dataset, MR and SST.

Dataset Class Size Best Result Kim's Paper Result
MR 2 77.5%(CNN-rand-static) 76.1%(CNN-rand-nostatic)
SST 5 37.2%(CNN-rand-static) 45.0%(CNN-rand-nostatic)

I haven't adjusted the hyper-parameters for SST seriously.

Usage

./main.py -h

or

python3 main.py -h

You will get:

CNN text classificer

optional arguments:
  -h, --help            show this help message and exit
  -batch-size N         batch size for training [default: 50]
  -lr LR                initial learning rate [default: 0.01]
  -epochs N             number of epochs for train [default: 10]
  -dropout              the probability for dropout [default: 0.5]
  -max_norm MAX_NORM    l2 constraint of parameters
  -cpu                  disable the gpu
  -device DEVICE        device to use for iterate data
  -embed-dim EMBED_DIM
  -static               fix the embedding
  -kernel-sizes KERNEL_SIZES
                        Comma-separated kernel size to use for convolution
  -kernel-num KERNEL_NUM
                        number of each kind of kernel
  -class-num CLASS_NUM  number of class
  -shuffle              shuffle the data every epoch
  -num-workers NUM_WORKERS
                        how many subprocesses to use for data loading
                        [default: 0]
  -log-interval LOG_INTERVAL
                        how many batches to wait before logging training
                        status
  -test-interval TEST_INTERVAL
                        how many epochs to wait before testing
  -save-interval SAVE_INTERVAL
                        how many epochs to wait before saving
  -predict PREDICT      predict the sentence given
  -snapshot SNAPSHOT    filename of model snapshot [default: None]
  -save-dir SAVE_DIR    where to save the checkpoint

Train

./main.py

You will get:

Batch[100] - loss: 0.655424  acc: 59.3750%
Evaluation - loss: 0.672396  acc: 57.6923%(615/1066) 

Test

If you has construct you test set, you make testing like:

/main.py -test -snapshot="./snapshot/2017-02-11_15-50-53/snapshot_steps1500.pt

The snapshot option means where your model load from. If you don't assign it, the model will start from scratch.

Predict

  • Example1

     ./main.py -predict="Hello my dear , I love you so much ." \
               -snapshot="./snapshot/2017-02-11_15-50-53/snapshot_steps1500.pt" 
    

    You will get:

     Loading model from [./snapshot/2017-02-11_15-50-53/snapshot_steps1500.pt]...
     
     [Text]  Hello my dear , I love you so much .
     [Label] positive
    
  • Example2

     ./main.py -predict="You just make me so sad and I have to leave you ."\
               -snapshot="./snapshot/2017-02-11_15-50-53/snapshot_steps1500.pt" 
    

    You will get:

     Loading model from [./snapshot/2017-02-11_15-50-53/snapshot_steps1500.pt]...
     
     [Text]  You just make me so sad and I have to leave you .
     [Label] negative
    

Your text must be separated by space, even punctuation.And, your text should longer then the max kernel size.

Reference

Owner
Shawn Ng
Now, I focus on the Natural Language Processing, such as QA
Shawn Ng
Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification

Companion code for the paper Theoretical characterization of uncertainty in high-dimensional linear classification Usage The required packages are lis

0 Feb 07, 2022
Code for CPM-2 Pre-Train

CPM-2 Pre-Train Pre-train CPM-2 此分支为110亿非 MoE 模型的预训练代码,MoE 模型的预训练代码请切换到 moe 分支 CPM-2技术报告请参考link。 0 模型下载 请在智源资源下载页面进行申请,文件介绍如下: 文件名 描述 参数大小 100000.tar

Tsinghua AI 136 Dec 28, 2022
FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

FCOS: Fully Convolutional One-Stage Object Detection This project hosts the code for implementing the FCOS algorithm for object detection, as presente

Tian Zhi 3.1k Jan 05, 2023
This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields Project Page | Paper | Supplementary | Video | Slides | Blog | Talk If

1.1k Dec 30, 2022
End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p

ESPnet 5.9k Jan 04, 2023
CenterFace(size of 7.3MB) is a practical anchor-free face detection and alignment method for edge devices.

CenterFace Introduce CenterFace(size of 7.3MB) is a practical anchor-free face detection and alignment method for edge devices. Recent Update 2019.09.

StarClouds 1.2k Dec 21, 2022
Awesome Monocular 3D detection

Awesome Monocular 3D detection Paper list of 3D detetction, keep updating! Contents Paper List 2022 2021 2020 2019 2018 2017 2016 KITTI Results Paper

Zhikang Zou 184 Jan 04, 2023
An LSTM based GAN for Human motion synthesis

GAN-motion-Prediction An LSTM based GAN for motion synthesis has a few issues reading H3.6M data from A.Jain et al , will fix soon. Prediction of the

Amogh Adishesha 9 Jun 17, 2022
Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
We have made you a wrapper you can't refuse

We have made you a wrapper you can't refuse We have a vibrant community of developers helping each other in our Telegram group. Join us! Stay tuned fo

20.6k Jan 09, 2023
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021
A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking

PoseRBPF: A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking PoseRBPF Paper Self-supervision Paper Pose Estimation Video Robot Manipulati

NVIDIA Research Projects 107 Dec 25, 2022
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
A CNN model to detect hand gestures.

Software Used python - programming language used, tested on v3.8 miniconda - for managing virtual environment Libraries Used opencv - pip install open

Shivanshu 6 Jul 14, 2022
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
Logistic Bandit experiments. Official code for the paper "Jointly Efficient and Optimal Algorithms for Logistic Bandits".

Code for the paper Jointly Efficient and Optimal Algorithms for Logistic Bandits, by Louis Faury, Marc Abeille, Clément Calauzènes and Kwang-Sun Jun.

Faury Louis 1 Jan 22, 2022
PyTorch implementation of Constrained Policy Optimization

PyTorch implementation of Constrained Policy Optimization (CPO) This repository has a simple to understand and use implementation of CPO in PyTorch. A

Sapana Chaudhary 25 Dec 08, 2022
Fake News Detection Using Machine Learning Methods

Fake-News-Detection-Using-Machine-Learning-Methods Fake news is always a real and dangerous issue. However, with the presence and abundance of various

Achraf Safsafi 1 Jan 11, 2022
object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

赛题背景 在商品知识产权领域,知识产权体现为在线商品的设计和品牌。不幸的是,在每一天,存在着非法商户通过一些对抗手段干扰商标识别来逃避侵权,这带来了很高的知识产权风险和财务损失。为了促进先进的多媒体人工智能技术的发展,以保护企业来之不易的创作和想法免受恶意使用和剽窃,因此提出了鲁棒性标识检测挑战赛

65 Dec 22, 2022
Fashion Entity Classification

Fashion-Entity-Classification - Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grays

ADITYA SHAH 1 Jan 04, 2022