RGB-D Semantic Segmentation with UNet and ENet

Authors

Project Background

Image segmentation is the process by which the pixels of an image are partitioned into multiple segments based on shared characteristics that can include color, texture, and intensity. The goal of semantic image segmentation is to predict a class for every pixel in an image. Semantic image segmentation has important applications in fields such as medical imaging and autonomous vehicles.

This project investigates the use of depth information in semantic image segmentation. We evaluate two network architectures, Efficient Neural Network (ENet) and a UNet with skip connections, on semantic image segmentation using both RGB and RGB-D images. All experiments were conducted on the NYUv2 dataset. A combination of Dice loss and cross-entropy loss was used to stabilize the loss during training. Our results show that incorporating depth information improves segmentation accuracy, with depth giving an improvement of up to 17% mean IoU (intersection over union) on the NYUv2 test set. We achieved our best results when training each network for 400 epochs.
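As a rough illustration of how a Dice term and a cross-entropy term can be combined, here is a minimal pure-Python sketch. It is not taken from the training scripts in this repo: the function names, the `dice_weight` parameter, and the equal weighting are all assumptions made for the example, and a real training loop would compute the same quantities on framework tensors.

```python
import math

EPS = 1e-7  # guards against log(0) and division by zero


def cross_entropy(probs, targets):
    """Mean negative log-likelihood of the true class over all pixels.

    probs:   list of per-pixel class probability lists
    targets: list of per-pixel integer class labels
    """
    total = sum(math.log(max(p[t], EPS)) for p, t in zip(probs, targets))
    return -total / len(targets)


def dice_loss(probs, targets, num_classes):
    """One minus the soft Dice coefficient, averaged over classes."""
    dice_sum = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, t in zip(probs, targets) if t == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for t in targets if t == c)
        dice_sum += (2 * intersection + EPS) / (pred_mass + true_mass + EPS)
    return 1.0 - dice_sum / num_classes


def combined_loss(probs, targets, num_classes, dice_weight=0.5):
    """Weighted sum of the two terms; dice_weight is a hypothetical knob."""
    ce = cross_entropy(probs, targets)
    dice = dice_loss(probs, targets, num_classes)
    return (1.0 - dice_weight) * ce + dice_weight * dice


if __name__ == "__main__":
    # Two pixels, two classes, maximally uncertain predictions.
    uniform = [[0.5, 0.5], [0.5, 0.5]]
    print(combined_loss(uniform, [0, 1], num_classes=2))
```

The cross-entropy term scores each pixel independently, while the Dice term measures per-class overlap across the whole image, so it is less dominated by classes that cover many pixels.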

The reference paper for this repository is linked here.

Model Example

The figures below show our model running on the NYUv2 dataset. The first figure shows the model's output, while the second shows the input and the ground-truth mask.

Model outputs

Expected outputs

Model Results

Our results indicate that depth can significantly improve semantic segmentation results. While ENet saw only a marginal improvement, our UNet model saw an improvement of nearly 17% in mean IoU, giving an mIoU of about 48%. Our results are tabulated below.
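For readers unfamiliar with the metric, the sketch below shows one common way to compute mean IoU from flattened per-pixel label lists. It is an illustration only, not the evaluation code used for the numbers above. The function name, the `ignore_index` parameter, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection over union across classes.

    pred, target: flat lists of integer class labels, one per pixel.
    Pixels whose target equals ignore_index are excluded (assumed convention).
    Classes absent from both pred and target are skipped (assumed convention).
    """
    ious = []
    for c in range(num_classes):
        intersection = 0
        union = 0
        for p, t in zip(pred, target):
            if t == ignore_index:
                continue
            pred_is_c = p == c
            target_is_c = t == c
            intersection += pred_is_c and target_is_c
            union += pred_is_c or target_is_c
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```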

Real Time Model View

In addition to providing a number of model testing utilities, this repo also lets you view your models working in real time. The Python script real_time.py uses freenect to interface with an Xbox Kinect v1. Install freenect on your system, then run the script and watch the magic happen!

Example of the model running in real time:

Installation

To set up the environment, run

pip3 install -r requirements.txt

Usage

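Once installed, you can train the networks using any of the training scripts (train_unet.py, train_unet_depth.py, train_enet_no_depth.py, train_enet_depth.py). You can visualize the results with the associated notebooks (unet_display_no_depth.ipynb, unet_display_depth.ipynb, enet_display_no_depth.ipynb, enet_display_depth.ipynb).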

License

MIT
