Kinect Fusion builds a 3D model of a scene using the depth and color data recorded by an RGB-D camera. We have recorded our own data using the Intel RealSense depth camera. For the implementation we have followed the approach in "KinectFusion: Real-Time Dense Surface Mapping and Tracking" by Newcombe et al., with some modifications.
-
Data Acquisition and Depth Map Conversion
Acquire the depth and color data from an RGB-D camera. The depth data from the sensor is back-projected into camera space to create the vertex map. The normal map is created by taking the cross product of approximate tangent vectors at each pixel, as sketched below.
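A minimal sketch of this step, assuming pinhole intrinsics (fx, fy, cx, cy) and a row-major depth map in meters; the function and variable names are illustrative, not the actual implementation:

#include <Eigen/Dense>
#include <vector>

// Back-project a depth map into camera-space vertices and estimate normals
// from the cross product of approximate tangent vectors (neighbour differences).
void computeVertexAndNormalMaps(const std::vector<float>& depth, int width, int height,
                                float fx, float fy, float cx, float cy,
                                std::vector<Eigen::Vector3f>& vertexMap,
                                std::vector<Eigen::Vector3f>& normalMap) {
    vertexMap.assign(width * height, Eigen::Vector3f::Zero());
    normalMap.assign(width * height, Eigen::Vector3f::Zero());

    // Back-project each pixel (u, v) with depth d into camera space.
    for (int v = 0; v < height; ++v) {
        for (int u = 0; u < width; ++u) {
            float d = depth[v * width + u];            // depth in meters, 0 = invalid
            vertexMap[v * width + u] = Eigen::Vector3f((u - cx) * d / fx,
                                                       (v - cy) * d / fy,
                                                       d);
        }
    }

    // Normal from the cross product of the differences to the right and lower neighbours.
    for (int v = 0; v < height - 1; ++v) {
        for (int u = 0; u < width - 1; ++u) {
            const Eigen::Vector3f& p  = vertexMap[v * width + u];
            const Eigen::Vector3f& px = vertexMap[v * width + u + 1];
            const Eigen::Vector3f& py = vertexMap[(v + 1) * width + u];
            if (p.z() > 0 && px.z() > 0 && py.z() > 0) {
                normalMap[v * width + u] = (px - p).cross(py - p).normalized();
            }
        }
    }
}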
Pose Estimation
This step estimates the 6-DoF camera pose. For each frame we use the linearized least-squares optimization of the Iterative Closest Point (ICP) algorithm with a point-to-plane error metric and projective data association for correspondence finding. The estimated pose gives the transformation from camera space to global space.
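A sketch of one linearized point-to-plane ICP step, assuming correspondences have already been found by projective data association and the pose increment is parameterized by a small-angle rotation and a translation; names are illustrative:

#include <Eigen/Dense>
#include <vector>

// Build and solve the 6x6 normal equations for one point-to-plane ICP iteration.
// srcPoints: current-frame points under the current pose estimate (global space),
// dstPoints/dstNormals: corresponding model points and normals.
// Returns the pose increment (rx, ry, rz, tx, ty, tz).
Eigen::Matrix<float, 6, 1> solvePointToPlaneStep(
        const std::vector<Eigen::Vector3f>& srcPoints,
        const std::vector<Eigen::Vector3f>& dstPoints,
        const std::vector<Eigen::Vector3f>& dstNormals) {
    Eigen::Matrix<float, 6, 6> ATA = Eigen::Matrix<float, 6, 6>::Zero();
    Eigen::Matrix<float, 6, 1> ATb = Eigen::Matrix<float, 6, 1>::Zero();

    for (size_t i = 0; i < srcPoints.size(); ++i) {
        const Eigen::Vector3f& s = srcPoints[i];
        const Eigen::Vector3f& d = dstPoints[i];
        const Eigen::Vector3f& n = dstNormals[i];

        Eigen::Matrix<float, 6, 1> A;
        A.head<3>() = s.cross(n);          // derivative w.r.t. rotation (small-angle)
        A.tail<3>() = n;                   // derivative w.r.t. translation
        float b = n.dot(d - s);            // point-to-plane residual

        ATA += A * A.transpose();
        ATb += A * b;
    }
    // Solve the normal equations; apply the increment to the pose and iterate.
    return ATA.ldlt().solve(ATb);
}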
Volumetric Representation and Integration
We use a voxel grid to represent the global volumetric model. Each voxel stores a truncated signed distance (TSDF) value that encodes how far the voxel is from the nearest observed surface.
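A sketch of how a single depth frame could be fused into such a grid, assuming a dense cubic grid stored as flat tsdf/weight arrays, known intrinsics, and the frame's world-to-camera pose; all names and the grid layout are hypothetical:

#include <Eigen/Dense>
#include <vector>
#include <algorithm>

// Integrate one depth frame into the TSDF volume using a weighted running average.
void integrateFrame(std::vector<float>& tsdf, std::vector<float>& weight,
                    int dim, float voxelSize, const Eigen::Vector3f& origin,
                    const std::vector<float>& depth, int width, int height,
                    float fx, float fy, float cx, float cy,
                    const Eigen::Matrix4f& worldToCamera, float truncation) {
    for (int z = 0; z < dim; ++z)
    for (int y = 0; y < dim; ++y)
    for (int x = 0; x < dim; ++x) {
        // Voxel center in world coordinates, then transformed into camera coordinates.
        Eigen::Vector4f pw(origin.x() + (x + 0.5f) * voxelSize,
                           origin.y() + (y + 0.5f) * voxelSize,
                           origin.z() + (z + 0.5f) * voxelSize, 1.0f);
        Eigen::Vector4f pc = worldToCamera * pw;
        if (pc.z() <= 0) continue;

        // Project the voxel center into the depth image.
        int u = static_cast<int>(fx * pc.x() / pc.z() + cx + 0.5f);
        int v = static_cast<int>(fy * pc.y() / pc.z() + cy + 0.5f);
        if (u < 0 || u >= width || v < 0 || v >= height) continue;

        float d = depth[v * width + u];
        if (d <= 0) continue;                      // invalid measurement

        // Signed distance along the viewing ray, truncated to [-1, 1] * truncation.
        float sdf = d - pc.z();
        if (sdf < -truncation) continue;           // far behind the surface: skip
        float t = std::min(1.0f, sdf / truncation);

        // Weighted running average of the TSDF value.
        size_t idx = (static_cast<size_t>(z) * dim + y) * dim + x;
        tsdf[idx] = (tsdf[idx] * weight[idx] + t) / (weight[idx] + 1.0f);
        weight[idx] += 1.0f;
    }
}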
Surface Prediction via Raycasting
This step generates a view of the implicit surface by raycasting the volume and rendering the surface at the zero crossing. The resulting vertex and normal maps provide a better estimate of global coordinates and normals for each frame.
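A sketch of the per-pixel raycast, assuming a sampleTsdf callback that interpolates the TSDF at a world-space point; the step size and clipping distances are illustrative parameters:

#include <Eigen/Dense>
#include <functional>

// March along a camera ray in world space, detect the positive-to-negative
// zero crossing of the TSDF, and refine the hit point by linear interpolation.
bool raycastPixel(const std::function<float(const Eigen::Vector3f&)>& sampleTsdf,
                  const Eigen::Vector3f& rayOrigin, const Eigen::Vector3f& rayDir,
                  float nearDist, float farDist, float step,
                  Eigen::Vector3f& hitPoint) {
    float prevValue = sampleTsdf(rayOrigin + nearDist * rayDir);
    for (float t = nearDist + step; t <= farDist; t += step) {
        float value = sampleTsdf(rayOrigin + t * rayDir);
        // Zero crossing: previous sample in front of the surface, current sample behind it.
        if (prevValue > 0.0f && value < 0.0f) {
            // Linear interpolation between the two samples for sub-step accuracy.
            float tHit = t - step * value / (value - prevValue);
            hitPoint = rayOrigin + tHit * rayDir;
            return true;
        }
        prevValue = value;
    }
    return false;   // no surface found along this ray
}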
Volume Visualization
In order to visualize the fused volume, the Marching Cubes algorithm is used to extract a triangle mesh.
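A sketch of the vertex placement step inside Marching Cubes: when the TSDF changes sign between two corners of a voxel cell, the mesh vertex is placed on the connecting edge by linear interpolation of the two TSDF values (the standard edge and triangle lookup tables used to assemble triangles per cell are omitted here):

#include <Eigen/Dense>

// Place a mesh vertex where the TSDF zero level crosses the edge between
// two cell corners p1 and p2 with TSDF values tsdf1 and tsdf2 of opposite sign.
Eigen::Vector3f interpolateEdgeVertex(const Eigen::Vector3f& p1, float tsdf1,
                                      const Eigen::Vector3f& p2, float tsdf2) {
    float mu = tsdf1 / (tsdf1 - tsdf2);   // fraction of the edge where the sign flips
    return p1 + mu * (p2 - p1);
}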
-
Install FreeImage
sudo apt-get install libfreeimage3 libfreeimage-dev
Install Ceres and dependencies
sudo apt-get install libeigen3-dev
sudo apt-get install libgoogle-glog-dev
sudo apt-get install libceres-dev