Control the computer's mouse pointer using gaze estimation. A deep-learning gaze-estimation model infers where the user's eyes are looking and moves the mouse pointer accordingly. The gaze-estimation model depends on the outputs of three other models: face detection, head-pose estimation, and facial landmarks. The application is therefore an integration of the face-detection, head-pose-estimation, facial-landmarks, and gaze-estimation models.
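The data flow between the four models looks roughly like this. A minimal sketch, assuming hypothetical wrapper objects (`fd`, `hp`, `lm`, `gm`, `mouse`) with illustrative method names rather than the actual classes under `core/`:

[source,python]
----
def process_frame(frame, fd, hp, lm, gm, mouse):
    """One pipeline step; fd/hp/lm/gm/mouse are hypothetical model wrappers."""
    # 1. Face detection: find and crop the face out of the frame.
    x, y, w, h = fd.detect(frame)
    face = frame[y:y + h, x:x + w]

    # 2. Head pose (yaw, pitch, roll) and eye crops, both from the face region.
    angles = hp.estimate(face)
    left_eye, right_eye = lm.eye_crops(face)

    # 3. Gaze estimation consumes the eye crops plus the head-pose angles.
    gaze_x, gaze_y = gm.estimate(left_eye, right_eye, angles)

    # 4. Move the mouse pointer along the estimated gaze vector.
    mouse.move(gaze_x, gaze_y)
----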
Project structure:

----
.
├── core        # Core components: face, gaze, headpos, landmarks and mouse_controller
├── images      # Supporting images for README.adoc
├── resource    # Demo videos
├── scripts     # Script to download the pre-trained models
├── utils       # Helper files
├── main.py     # Main driver file
├── LICENSE
└── README.md
----
To run the application in this tutorial, the OpenVINO™ toolkit and its dependencies must already be installed and verified using the included demos. Installation instructions may be found at: https://software.intel.com/en-us/articles/OpenVINO-Install-Linux or https://github.com/udacity/nd131-openvino-fundamentals-project-starter/blob/master/linux-setup.md
The steps below were tested on Ubuntu 16.04:
[source,bash]
----
# Install OpenVINO
wget http://registrationcenter-download.intel.com/akdlm/irc_nas/16612/l_openvino_toolkit_p_2020.2.120.tgz
tar -xvf l_openvino_toolkit_p_2020.2.120.tgz
cd l_openvino_toolkit_p_2020.2.120
sed -i 's/decline/accept/g' silent.cfg
sudo ./install.sh -s silent.cfg

# System dependencies
sudo apt update
sudo apt-get install python3-pip
pip3 install numpy
pip3 install paho-mqtt
sudo apt install libzmq3-dev libkrb5-dev
sudo apt install ffmpeg
sudo apt-get install cmake
sudo apt-get install python3-venv

# Create a virtual environment
python3 -m venv openvino-env
source openvino-env/bin/activate

# Project dependencies
pip3 install -r requirements.txt
----
1. Clone the repository to the desired location:
+
[source,bash]
----
git clone https://github.com/nullbyte91/Computer-Pointer-Controller.git
----
2. Configure the build environment for the OpenVINO toolkit by sourcing the `setupvars.sh` script:
+
[source,bash]
----
source /opt/intel/openvino/bin/setupvars.sh
----
3. Change to the top of the git repository:
+
[source,bash]
----
cd Computer-Pointer-Controller/
----
4. Download all the models necessary for this project:
+
[source,bash]
----
bash scripts/download_models.sh
----
----
usage: main.py [-h] -i INPUT -m_fd MODE_FACE_DETECTION
               [-d_fd {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}] [-t_fd [0..1]]
               [-o_fd] -m_hp MODEL_HEAD_POSITION
               [-d_hp {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}] [-o_hp] -m_lm
               MODEL_LANDMARK_REGRESSOR
               [-d_lm {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}] [-o_lm] -m_gm
               MODEL_GAZE [-d_gm {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}] [-o_gm]
               [-o_mc] [-pc] [-exp_r_fd NUMBER] [-cw CROP_WIDTH] [-v]
               [-l PATH] [-c PATH] [--no_show] [-tl] [-o PATH]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Path to an image or video file, or "cam" for webcam
  -m_fd MODE_FACE_DETECTION, --mode_face_detection MODE_FACE_DETECTION
                        Path to an .xml file with a trained Face Detection
                        model
  -d_fd {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}
                        (optional) Target device for the Face Detection model
                        (default: CPU)
  -t_fd [0..1]          (optional) Probability threshold for face detections
                        (default: 0.4)
  -o_fd                 (optional) Show face detection output
  -m_hp MODEL_HEAD_POSITION, --model_head_position MODEL_HEAD_POSITION
                        Path to an .xml file with a trained Head Pose
                        Estimation model
  -d_hp {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}
                        (optional) Target device for the Head Position model
                        (default: CPU)
  -o_hp                 (optional) Show head position output
  -m_lm MODEL_LANDMARK_REGRESSOR, --model_landmark_regressor MODEL_LANDMARK_REGRESSOR
                        Path to an .xml file with a trained Facial Landmarks
                        Regression model
  -d_lm {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}
                        (optional) Target device for the Facial Landmarks
                        Regression model (default: CPU)
  -o_lm                 (optional) Show landmark detection output
  -m_gm MODEL_GAZE, --model_gaze MODEL_GAZE
                        Path to an .xml file with a trained Gaze Estimation
                        model
  -d_gm {CPU,GPU,FPGA,MYRIAD,HETERO,HDDL}
                        (optional) Target device for the Gaze Estimation model
                        (default: CPU)
  -o_gm                 (optional) Show gaze estimation output
  -o_mc                 (optional) Run mouse controller
  -pc, --perf_stats     (optional) Output detailed per-layer performance stats
  -exp_r_fd NUMBER      (optional) Scaling ratio for bboxes passed to face
                        recognition (default: 1.15)
  -cw CROP_WIDTH, --crop_width CROP_WIDTH
                        (optional) Crop the input stream to this width
                        (default: no crop). Both -cw and -ch parameters should
                        be specified to use crop.
  -v, --verbose         (optional) Be more verbose
  -l PATH, --cpu_lib PATH
                        (optional) For MKLDNN (CPU)-targeted custom layers, if
                        any: path to a shared library with custom layers
                        implementations
  -c PATH, --gpu_lib PATH
                        (optional) For clDNN (GPU)-targeted custom layers, if
                        any: path to the XML file with descriptions of the
                        kernels
  --no_show             (optional) Do not display output
  -tl, --timelapse      (optional) Auto-pause after each frame
  -o PATH, --output PATH
                        (optional) Path to save the output video to
----
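The `-exp_r_fd` option deserves a short explanation: the detected face box is scaled up around its centre before being handed to the downstream models, so the crop keeps some context around the face. A minimal sketch of that kind of expansion, using a hypothetical `expand_box` helper (the real logic lives in the face-detection wrapper under `core/`):

[source,python]
----
def expand_box(x, y, w, h, ratio=1.15, frame_w=1920, frame_h=1080):
    """Scale an (x, y, w, h) bounding box about its centre by `ratio`,
    clamped to the frame. Hypothetical helper for illustration only."""
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * ratio, h * ratio
    nx = max(0.0, cx - new_w / 2.0)
    ny = max(0.0, cy - new_h / 2.0)
    new_w = min(new_w, frame_w - nx)
    new_h = min(new_h, frame_h - ny)
    return int(nx), int(ny), int(new_w), int(new_h)

print(expand_box(100, 100, 200, 200))  # -> (85, 85, 230, 230)
----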
==== Face Detection
[source,bash]
----
python3.6 main.py -i resource/demo.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml -o_fd
----

==== Head Pose Estimation
[source,bash]
----
python3.6 main.py -i resource/test_2.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml -o_hp
----

==== Facial Landmarks
[source,bash]
----
python3.6 main.py -i resource/test_2.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml -o_lm
----

==== Gaze Estimation
[source,bash]
----
python3.6 main.py -i resource/test_2.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml -o_gm
----
==== Mouse Pointer
[source,bash]
----
python3.6 main.py -i resource/test_2.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml -o_mc
----
The pointer can feel sluggish because the mouse position is only updated every tenth frame; removing the `if frame_count % 10 == 0:` check makes the pointer respond on every frame (see the sketch below).
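A minimal sketch of that throttling, with hypothetical stand-ins for the pipeline and mouse controller (the real ones live in `main.py` and `core/`):

[source,python]
----
import cv2

# Hypothetical stand-ins for illustration only.
def run_pipeline(frame):
    return 0.0, 0.0  # placeholder gaze vector

class Mouse:
    def move(self, x, y):
        print(f"move pointer by ({x}, {y})")

mouse = Mouse()
cap = cv2.VideoCapture("resource/test_2.mp4")
frame_count = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    frame_count += 1
    gaze_x, gaze_y = run_pipeline(frame)
    # The pointer is only moved every 10th frame; deleting this check
    # makes the mouse follow the gaze on every frame, at the cost of jitter.
    if frame_count % 10 == 0:
        mouse.move(gaze_x, gaze_y)
cap.release()
----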
[source,bash]
----
# To run on a video feed, use -i video_path
python3.6 main.py -i resource/test_2.mp4

# To run on a camera feed
python3.6 main.py -i cam
----
[quote]
By default the application opens camera device 0. Modify this if your webcam is on a different device node.
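For reference, a minimal sketch of how the capture device is selected with OpenCV (device index 0 is what `cam` maps to by default):

[source,python]
----
import cv2

# Device index 0 corresponds to /dev/video0; change the index if your
# webcam is registered on a different node (e.g. /dev/video1 -> index 1).
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open camera device 0")
----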
Hardware configuration: i7-6820HQ CPU
[options="header"]
|===
| Model Combination | Model Sizes | FPS | Load Time (s)

| face-detection-adas (FP32) + head-pose-estimation-adas (FP32) + facial-landmarks-35-adas (FP32) + gaze-estimation-adas (FP32)
| 1.8M + 7.3M + 18M + 7.2M
| 42
| 0.0866

| face-detection-adas (FP32) + head-pose-estimation-adas (FP16) + facial-landmarks-35-adas (FP16) + gaze-estimation-adas (FP16)
| 1.8M + 3.7M + 8.8M + 3.6M
| 43.5
| 0.0576
|===
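The load-time figures can be reproduced with a simple timer around the Inference Engine calls. A minimal sketch using the standard OpenVINO 2020.x Python API (the exact measurement code in `main.py` may differ):

[source,python]
----
import time
from openvino.inference_engine import IECore

ie = IECore()
start = time.perf_counter()
# Reading the IR and loading it onto the device dominates startup cost.
net = ie.read_network(
    model="mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.xml",
    weights="mo_model/intel/gaze-estimation-adas-0002/FP32/gaze-estimation-adas-0002.bin",
)
exec_net = ie.load_network(network=net, device_name="CPU")
print(f"Model load time: {time.perf_counter() - start:.4f} s")
----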
The face-detection output is the key input passed to the other three models, so its precision matters most. Keeping face detection at FP32 while running the other models at FP16 reduces both model size and load time, whereas INT8 precision produced very poor output. The best performance in terms of inference time and core utilization came from FP32 face detection combined with FP16 for the other three models.
Detailed per-layer performance statistics can be collected with `--perf_stats`:

[source,bash]
----
python3.6 main.py -i resource/test_2.mp4 -m_fd mo_model/intel/face-detection-adas-binary-0001/FP32-INT1/face-detection-adas-binary-0001.xml -m_hp mo_model/intel/head-pose-estimation-adas-0001/FP32-INT8/head-pose-estimation-adas-0001.xml -m_lm mo_model/intel/facial-landmarks-35-adas-0002/FP32-INT8/facial-landmarks-35-adas-0002.xml -m_gm mo_model/intel/gaze-estimation-adas-0002/FP32-INT8/gaze-estimation-adas-0002.xml --perf_stats
----
* ✓ Docker Compose for deployment
* ✓ Hotspot analysis of the models using VTune