Process text, including tokenizing and representing sentences as vectors, and apply concepts such as RNN, LSTM and GRU to build a classifier that detects which of 17 languages a sentence is written in.

Last update: Dec 15, 2022

Related tags

Deep Learning Language-Identifier

Overview

Language Identifier

What is this?

The goal of this project is to build a model that predicts the language of a given sentence. The pipeline covers text processing (tokenizing sentences and representing them as vectors) and applies RNN, LSTM and GRU layers to build a classifier that distinguishes among 17 languages.
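To make the text-processing step concrete, here is a minimal sketch of tokenizing sentences and representing them as fixed-length integer vectors. It uses only the Python standard library and is not the project's actual preprocessing code. The whitespace tokenizer, the reserved ids `PAD` and `OOV`, the function names, and the `max_words` and `max_len` values are all assumptions made for illustration.

```python
from collections import Counter

PAD, OOV = 0, 1  # assumed reserved ids: padding and out-of-vocabulary


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return sentence.lower().split()


def build_vocab(sentences, max_words=10000):
    """Map the most frequent tokens to integer ids, starting at 2."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_words))}


def vectorize(sentence, vocab, max_len=8):
    """Turn a sentence into a fixed-length id list: truncate, then right-pad with PAD."""
    ids = [vocab.get(tok, OOV) for tok in tokenize(sentence)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy two-sentence corpus, made up for this example.
corpus = ["the cat sat on the mat", "le chat est sur le tapis"]
vocab = build_vocab(corpus)
vector = vectorize("the cat is here", vocab)
print(vector)
```

Integer sequences like these are the kind of input an embedding layer followed by an RNN, LSTM or GRU layer would typically consume. Words unseen when the vocabulary was built ("is" and "here" above) map to the `OOV` id.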

Dataset

Language Detection: a small language detection dataset consisting of text samples in 17 different languages.

Results

All models achieved high accuracy, even when a single convolution layer was used instead of LSTM or GRU, but GRU achieved the highest accuracy: 99% training accuracy and 94% validation accuracy.
Using a convolution layer also achieved high accuracy, about 95% validation accuracy.
Using fewer embedding dimensions lets the model reach high accuracy faster, but in the Embedding Projector a lot of words end up grouped with words from other languages.

32 Embedding dimensions examples

3 Embedding dimensions examples

GRU Accuracy and Loss

GRU Confusion matrix
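For readers who want to reproduce a confusion matrix and the accuracy figure derived from it, the following is a minimal pure-Python sketch. It is not the project's evaluation code, and the labels are made up for three hypothetical languages rather than the 17 used here. In practice, `sklearn.metrics.confusion_matrix` from Scikit-learn (listed under Libraries) provides the same computation.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true labels, columns are predicted labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correct predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up labels for 3 hypothetical languages (illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)            # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(accuracy(cm))  # 4 correct out of 6
```

Off-diagonal cells show which languages get confused with each other, which an overall accuracy number alone does not reveal.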

Libraries

TensorFlow
Scikit-learn
NumPy
Pandas
Matplotlib


Owner

Hossam Asaad
