Gesture-Detection-and-Depth-Estimation

This is my graduation project.

(1) In this project, I use the YOLOv3 object detection model to detect gestures in RGB images. I trained the model on a self-made gesture dataset to obtain a deep-learning-based gesture detection model. Testing it on the test dataset showed that the model meets the requirements of real-time gesture detection while maintaining high accuracy.
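Detection accuracy is usually judged by how well a predicted box overlaps the labeled box. As a minimal sketch (not taken from this repository), the function below computes intersection over union (IoU) for two boxes. The `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

A prediction is commonly counted as correct when its IoU with the labeled box exceeds a chosen threshold such as 0.5.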

(2) I then tried monocular depth estimation based on deep learning to estimate the depth of the gesture object from a single RGB image, using two methods: the FastDepth algorithm and an improved detection model based on YOLOv3. FastDepth was trained and tested on a self-made gesture-depth dataset. For the second method, I added a depth vector to the output dimensions of YOLOv3 and modified the loss function so that the model also estimates target depth, then trained and tested the modified model on the same gesture-depth dataset. The experimental results show that both methods can estimate the depth of the gesture object in an RGB image to a certain extent.
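To make the YOLOv3 modification more concrete, here is a rough sketch of what "adding a depth vector to the output dimensions and modifying the loss function" could look like. It is not the code from this repository: it assumes one extra depth value per anchor, a mean squared error depth term computed only where an object is present, and a hypothetical weight `depth_weight`. All function names are illustrative.

```python
def head_channels(num_classes, anchors_per_scale=3, with_depth=True):
    """Output channels of one YOLOv3 detection head."""
    per_anchor = 4 + 1 + num_classes  # box (x, y, w, h) + objectness + class scores
    if with_depth:
        per_anchor += 1  # assumption: one extra depth value per anchor
    return anchors_per_scale * per_anchor


def depth_loss(pred_depth, true_depth, obj_mask):
    """Mean squared depth error over predictions that are responsible for an object."""
    errors = [(p - t) ** 2
              for p, t, m in zip(pred_depth, true_depth, obj_mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


def total_loss(detection_loss, pred_depth, true_depth, obj_mask, depth_weight=1.0):
    """Original detection loss plus a weighted depth term (assumed form)."""
    return detection_loss + depth_weight * depth_loss(pred_depth, true_depth, obj_mask)
```

With the 80 classes of the standard COCO configuration, `head_channels(80, with_depth=False)` gives the familiar 255 channels, and enabling the assumed depth output raises it to 258. Masking the depth term with `obj_mask` keeps background cells, which have no labeled depth, from influencing training.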

Gesture detection:

Depth data:

Estimate target depth:

(3) I also developed a simple PyOpenGL program that uses the gesture information to draw simple shapes in three-dimensional space.
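One way to turn a detected gesture and its estimated depth into a 3D drawing position is to back-project the box center through a pinhole camera model. The sketch below illustrates that idea only and is not the program from this repository. The intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical values that would come from camera calibration, and the function names are made up for this example.

```python
def bbox_center(box):
    """Center pixel (u, v) of a box in (x1, y1, x2, y2) format (assumed)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth with a pinhole model.

    Returns camera coordinates with +x right, +y down, +z forward.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


def camera_to_opengl(point):
    """Convert to the usual OpenGL view convention: +y up, camera looks down -z."""
    x, y, z = point
    return x, -y, -z
```

For example, with hypothetical intrinsics `fx = fy = 600`, `cx = 320`, `cy = 240`, a gesture centered at pixel `(320, 240)` with estimated depth `0.5` maps to the point `(0.0, 0.0, 0.5)` on the optical axis. The resulting coordinates could then be passed as vertices to the drawing code.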

Try to draw a cube:

For more information, you can check my final paper.

The YOLOv3 model is based on coldlarry's model: https://github.com/coldlarry/YOLOv3-complete-pruning


Owner

ChaosAT
