AIsight

Helping you see the world!

How does AIsight work?

AIsight describes ongoing events captured through a camera. This helps visually impaired people learn about their surroundings in greater detail.

Who is AIsight for?

People with Visual Impairment / Blindness

Abstract

How do you help your visually impaired friends see the world? We propose a solution, AIsight, which acts as a constant companion by describing the events happening in your surroundings. AIsight is a virtual companion for visually impaired people that describes what is in front of them.

For instance, suppose a visually impaired person is walking down a street and wants to know what is around them. AIsight processes the live video and describes what is happening in front of the person at that moment.

Why Us?

There are applications on the market that identify objects in the surroundings. What makes AIsight different is scene description, which helps the person understand their surroundings better as events unfold. Our goal is a near-real-time audio description generated from the streaming video.

Logical Steps of the Proposed Solution:

When the user scans the surroundings using the Raspberry Pi, image frames are sent to the Image Captioning Engine on the server.
The Image Captioning Engine generates a caption for each image frame and sends it back to the Raspberry Pi.
The received caption is converted to speech by Pico2wave on the Raspberry Pi.
The user then hears a spoken description of the surroundings.
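
The steps above can be sketched in Python as a single loop. This is a minimal sketch, not the project's code: `describe_surroundings`, `request_caption` and `speak` are hypothetical names, and the camera, the server call and the speaker are passed in as plain functions so that only the control flow is shown.

```python
from typing import Callable, Iterable, List


def describe_surroundings(
    frames: Iterable[bytes],
    request_caption: Callable[[bytes], str],
    speak: Callable[[str], None],
) -> List[str]:
    """Run the capture -> caption -> speech loop once per frame.

    Each frame is captioned and spoken before the next frame is
    processed, so descriptions are spoken in capture order.
    """
    spoken = []
    for frame in frames:
        caption = request_caption(frame).strip()
        if not caption:
            # Skip empty captions rather than speaking nothing.
            continue
        speak(caption)
        spoken.append(caption)
    return spoken


if __name__ == "__main__":
    # Stand-ins for the camera frames and the captioning server (assumptions).
    fake_frames = [b"frame-1", b"frame-2"]
    fake_captions = {
        b"frame-1": "a person walking a dog",
        b"frame-2": "a car parked on the street",
    }
    describe_surroundings(fake_frames, fake_captions.__getitem__, print)
```

On a real device, `request_caption` would send the frame to the server and `speak` would call the text-to-speech tool; here both are replaced by stand-ins.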

Architecture Diagram

Different Approaches

On-edge:

The model runs directly on the Raspberry Pi.

Drawback: It takes 40-45 seconds to generate a caption for a single image.
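
To put that latency in perspective, the helper below converts seconds per caption into captions per minute and times a single captioning call. It is a rough sketch: `time_caption` and `captions_per_minute` are hypothetical names, and `caption_fn` stands for whichever captioning call is being measured.

```python
import time
from typing import Callable, Tuple


def time_caption(
    caption_fn: Callable[[bytes], str],
    frame: bytes,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[str, float]:
    """Return the caption and the seconds caption_fn took to produce it."""
    start = clock()
    caption = caption_fn(frame)
    return caption, clock() - start


def captions_per_minute(seconds_per_caption: float) -> float:
    """Convert a per-caption latency into captions per minute."""
    if seconds_per_caption <= 0:
        raise ValueError("seconds_per_caption must be positive")
    return 60.0 / seconds_per_caption


if __name__ == "__main__":
    for latency in (40.0, 45.0):
        rate = captions_per_minute(latency)
        print(f"{latency:.0f} s per caption -> {rate:.2f} captions per minute")
```

At 40-45 seconds per caption this works out to roughly 1.3 to 1.5 captions per minute, which is far from a near-real-time description.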

On-cloud (using scp):

The Raspberry Pi sends images and receives text using scp. These transfers are asynchronous, so the generated captions do not arrive in the proper order.
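
The ordering problem can be illustrated with a small simulation. This sketch assumes each frame carries a sequence number, which the source does not say the scp approach did, and `in_order_only` is a hypothetical helper. It shows how a receiver could detect late captions; it is not how the project addressed the issue.

```python
from typing import Iterable, List, Tuple


def in_order_only(arrivals: Iterable[Tuple[int, str]]) -> List[str]:
    """Keep captions whose frame number is newer than any already kept.

    arrivals holds (frame_number, caption) pairs in arrival order.
    A caption for a frame older than one already kept is dropped.
    """
    newest = -1
    kept = []
    for frame_number, caption in arrivals:
        if frame_number <= newest:
            continue  # stale: a later frame was already described
        newest = frame_number
        kept.append(caption)
    return kept


if __name__ == "__main__":
    # The caption for frame 2 arrives before the caption for frame 1,
    # as can happen with independent asynchronous transfers.
    arrivals = [
        (0, "a quiet street"),
        (2, "a bus approaching"),
        (1, "a person crossing"),
    ]
    print(in_order_only(arrivals))
```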

On-cloud (using Flask):

Flask is used on the server side to receive images and generate captions, which keeps the captions synchronized with the frames.
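
On the Raspberry Pi side, a synchronous exchange with such a server might look like the sketch below. Several details are assumptions, not taken from the project: that the server accepts the raw JPEG bytes in a POST body, that it answers with JSON containing a `caption` field, and the function names `request_caption` and `caption_frames`. The `urlopen` parameter exists only so the network call can be replaced in tests.

```python
import json
import urllib.request
from typing import Callable, Iterable, List


def request_caption(
    server_url: str,
    jpeg_bytes: bytes,
    timeout: float = 30.0,
    urlopen: Callable = urllib.request.urlopen,
) -> str:
    """POST one frame and block until the server returns its caption."""
    req = urllib.request.Request(
        server_url,
        data=jpeg_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )
    with urlopen(req, timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    # "caption" is an assumed field name in the server's JSON reply.
    return payload["caption"]


def caption_frames(
    server_url: str,
    frames: Iterable[bytes],
    urlopen: Callable = urllib.request.urlopen,
) -> List[str]:
    """Caption frames one at a time; result order matches frame order."""
    return [request_caption(server_url, f, urlopen=urlopen) for f in frames]
```

Because each call waits for its response before the next frame is sent, captions come back in the same order as the frames, at the cost of one round trip per frame.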

Technology Stack

Raspberry Pi: A Raspberry Pi 3 captures images and sends them to the Flask server running in the cloud. It receives the text generated in the cloud and outputs it as speech.
OpenCV: The machine learning model uses OpenCV for image processing.
Flask: The server uses the Flask framework.
Lua, Torch: The pre-trained machine learning model is written in Lua and uses Torch.
Pico2wave: Text-to-speech conversion is done on the Raspberry Pi using Pico2wave.
Docker: The machine learning model, which takes an image as input and generates text, is Dockerized and runs on GCP.
Google Cloud Platform: The cloud service that hosts the server-side code for AIsight.
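
The text-to-speech step in the stack above could be driven from Python roughly as follows. This is a sketch under assumptions: the WAV path, the `en-US` voice, the use of `aplay` for playback, and the function names are illustrative choices, not details from the project. The `run` parameter allows the commands to be inspected without executing them.

```python
import subprocess
from typing import Callable, List


def pico2wave_command(
    text: str, wav_path: str = "/tmp/aisight.wav", lang: str = "en-US"
) -> List[str]:
    """Build the pico2wave command that writes text as speech to wav_path."""
    return ["pico2wave", "-l", lang, "-w", wav_path, text]


def speak(
    text: str,
    wav_path: str = "/tmp/aisight.wav",
    run: Callable = subprocess.run,
) -> None:
    """Synthesize text with pico2wave, then play the WAV file with aplay."""
    # Passing argument lists (not a shell string) avoids quoting problems
    # when a caption contains quotes or other special characters.
    run(pico2wave_command(text, wav_path), check=True)
    run(["aplay", "-q", wav_path], check=True)
```

Calling `speak("a person walking a dog")` on a Raspberry Pi with both tools installed would write the WAV file and then play it.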

Team

Mihir Patel
Nishit Doshi
Jainish Parikh
Apoorva Banubakode
