GuideDog is an AI/ML-based, 100% voice-controlled mobile app designed to assist the visually impaired

Last update: Nov 24, 2021

GuideDog

Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma

GuideDog is an AI/ML-based mobile app designed to assist visually impaired people, and it is 100% voice-controlled. As the name suggests, you can think of it as a "speaking guide dog." It offers three key features, all based on the scene captured by your mobile phone:

Reads text upon command
Describes the scene around you upon command
Warns you if there is an obstacle in front of you

Check out this demo video to learn more about our app!

Android App

UI/UX
- Simple and Responsive
- Voice-assistant architecture for the target audience
Libraries / APIs
- Google Cloud (GC) Speech-to-Text and Text-to-Speech
- Android SDK, AndroidX
- ML Kit Object Detection and Tracking API
- TensorFlow Lite MobileNet Image Classification Model

Backend

Flask API
- Image Captioning
- Optical Character Recognition
Deployment
- Google App Engine
- Fast central API with different endpoints
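
To illustrate what a central Flask API with separate endpoints can look like, here is a minimal sketch. It is not the project's actual backend: the routes `/caption` and `/ocr`, the `image` form field, and the helper functions `caption_image` and `read_text` are all hypothetical names chosen for this example, and the helpers return placeholder strings instead of running a model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def caption_image(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the image captioning model."""
    return f"caption placeholder for {len(image_bytes)} bytes"


def read_text(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the OCR step."""
    return f"ocr placeholder for {len(image_bytes)} bytes"


def uploaded_image():
    """Return the raw bytes of the uploaded 'image' file, or None if absent."""
    upload = request.files.get("image")
    return None if upload is None else upload.read()


@app.route("/caption", methods=["POST"])
def caption():
    data = uploaded_image()
    if data is None:
        return jsonify(error="missing 'image' file field"), 400
    return jsonify(caption=caption_image(data))


@app.route("/ocr", methods=["POST"])
def ocr():
    data = uploaded_image()
    if data is None:
        return jsonify(error="missing 'image' file field"), 400
    return jsonify(text=read_text(data))


if __name__ == "__main__":
    # Local development server only; a hosted deployment would use its own entry point.
    app.run(port=8080)
```

In a layout like this, the mobile client would send a captured frame as a multipart upload to one endpoint or the other, depending on the voice command, and speak the returned string.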

Image Captioning

We used TensorFlow to build and train an image captioning model on MS-COCO 2014, based on the paper Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. The model uses a standard convolutional network as an encoder to extract features from images (we use Inception V3) and feeds the extracted features into an attention-based decoder to generate sentences. While the paper used an LSTM as the decoder, we use a simpler RNN instead.
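
The attention step at the heart of such a decoder can be sketched as follows. This is a toy illustration of additive (soft) attention in the style of the paper, not the project's TensorFlow code: it is written in plain Python so it runs without dependencies, the weights are random rather than trained, and all sizes and names (`additive_attention`, `w_feat`, `w_hid`, `v`) are assumptions made for this example.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(features, hidden, w_feat, w_hid, v):
    """Score each image location against the decoder state.

    features: one encoder feature vector per image location
    hidden:   current decoder hidden state
    Returns the context vector (weighted sum of features) and the weights.
    """
    hid_proj = matvec(w_hid, hidden)
    scores = []
    for feat in features:
        feat_proj = matvec(w_feat, feat)
        energy = [math.tanh(a + b) for a, b in zip(feat_proj, hid_proj)]
        scores.append(sum(vi * ei for vi, ei in zip(v, energy)))
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return context, weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Toy sizes; a real encoder produces far more locations and channels.
rng = random.Random(0)
num_locations, feat_dim, hidden_dim, attn_dim = 4, 6, 5, 3
features = random_matrix(num_locations, feat_dim, rng)
hidden = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
w_feat = random_matrix(attn_dim, feat_dim, rng)
w_hid = random_matrix(attn_dim, hidden_dim, rng)
v = [rng.uniform(-0.1, 0.1) for _ in range(attn_dim)]

context, weights = additive_attention(features, hidden, w_feat, w_hid, v)
```

At each decoding step, a recurrent decoder would combine a context vector like this with the embedding of the previous word, update its hidden state, and predict the next word, so the weights show which image regions the model attends to for each word.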

Get more insights: Devpost