To get started: Clone this repository and all its submodule dependencies, then issue

    git clone --recursive http://github.com/[username]/geometry-processing-registration.git
See introduction.
Once built, you can execute the assignment from inside the build/ directory using

    ./registration [path to mesh1.obj] [path to mesh2.obj]
In this assignment, we will be implementing a version of the iterative closest point (ICP) algorithm, not to be confused with Insane Clown Posse.
Rather than registering multiple point clouds, we will register multiple triangle mesh surfaces.
This algorithm and its many variants have been used for quite some time to align discrete shapes. One of the first descriptions is given in "A Method for Registration of 3-D Shapes" by Besl & McKay 1992. However, the award-winning PhD thesis of Sofien Bouaziz ("Realtime Face Tracking and Animation" 2015, sections 3.2-3.3) contains a more modern view that unifies many of the variants with respect to how they impact the same core optimization problem.
For our assignment, we will assume that we have one triangle mesh representing a complete scan of the surface $Y$ of some rigid object and another triangle mesh representing a partial scan of that surface, $X$. These meshes will not have the same number of vertices or even the same topology. We will first explore different ways to measure how well aligned two surfaces are and then how to optimize the rigid alignment of the partial surface $X$ to the complete surface $Y$.
We would like to compute a single scalar number that measures how poorly two surfaces are matched. In other words, we would like to measure the distance between two surfaces. Let's start by reviewing more familiar distances:
The usual Euclidean distance between two points $\mathbf{x}$ and $\mathbf{y}$ is the norm of their difference:

$$ d(\mathbf{x},\mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|. $$
When we consider the distance between a point $\mathbf{x}$ and some larger object $Y$ (a line, a triangle, a surface), the natural generalization is the distance to the closest point $\mathbf{y}$ on $Y$:

$$ d(\mathbf{x},Y) = \inf_{\mathbf{y} \in Y} d(\mathbf{x},\mathbf{y}). $$

Written in this way, the infimum considers all possible points $\mathbf{y}$ on $Y$. Equivalently, we can write this distance in terms of the closest-point projection $P_Y(\mathbf{x})$ of $\mathbf{x}$ onto $Y$:

$$ d(\mathbf{x},Y) = \|\mathbf{x} - P_Y(\mathbf{x})\|. $$

If the closest point is not unique (e.g., when $Y$ is not convex), any closest point will do for our purposes.
We might be tempted to define the distance from surface $X$ to surface $Y$ as the infimum of point-to-projection distances over all points $\mathbf{x}$ on $X$, but this will not be useful for registering two surfaces: it will measure zero if even just a single point of $X$ happens to lie on $Y$. Instead, we should take the supremum of point-to-projection distances over all points $\mathbf{x}$ on $X$:

$$ D_{\overrightarrow{H}}(X,Y) = \sup_{\mathbf{x} \in X} \|\mathbf{x} - P_Y(\mathbf{x})\|. $$
This surface-to-surface distance measure is called the directed Hausdorff distance. We may interpret this as taking the worst of the best: we let each point $\mathbf{x}$ on $X$ report its shortest distance to $Y$, and then keep the largest of those reports. It is easy to verify that $D_{\overrightarrow{H}}(X,Y)$ will only equal zero if all points on $X$ also lie exactly on $Y$. The converse is not true: if $D_{\overrightarrow{H}}(X,Y) = 0$, there may still be points on $Y$ that do not lie on $X$ (for example, $X$ might cover only a patch of $Y$).
We can approximate a lower bound on the directed Hausdorff distance between two meshes by densely sampling the surface $X$ with a set of points $\{\mathbf{x}_1,\dots,\mathbf{x}_k\}$ and taking the maximum point-to-mesh distance:

$$ D_{\overrightarrow{H}}(X,Y) \geq \max_{i} \|\mathbf{x}_i - P_Y(\mathbf{x}_i)\|, $$

where we should be careful to ensure that the projection $P_Y(\mathbf{x}_i)$ is computed against the entire continuous surface $Y$, not merely against a sampling of it. As our sampling of $X$ becomes denser and denser, this lower bound approaches the true directed Hausdorff distance.
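As a sanity check on this bound, here is a minimal pure-Python sketch (the names and the toy setup are ours, not part of the assignment code) in which the "target surface" $Y$ is the continuous unit circle, so the closest-point projection is available in closed form:

```python
import math

def project_to_unit_circle(x):
    """Closest-point projection onto the continuous 'surface' Y: the unit circle."""
    r = math.hypot(x[0], x[1])
    return (x[0] / r, x[1] / r)

# Sample X (here: a circle of radius 2) and keep the worst point-to-Y distance.
samples = [(2.0 * math.cos(2.0 * math.pi * i / 100),
            2.0 * math.sin(2.0 * math.pi * i / 100)) for i in range(100)]
lower_bound = max(math.dist(x, project_to_unit_circle(x)) for x in samples)
# Every point of the radius-2 circle is exactly distance 1 from the unit
# circle, so here the sampled bound already equals the true directed
# Hausdorff distance of 1.
```

Note that the projection is onto the *continuous* circle; had we projected onto a sparse sampling of $Y$ instead, we could overestimate the distance and lose the lower-bound property.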
Even if it were cheap to compute, the Hausdorff distance is difficult to optimize when aligning two surfaces. If we treat the Hausdorff distance between surfaces $X$ and $Y$ as an energy to be minimized, then only motions of the single worst-offending point change the energy; every other point contributes nothing, so the energy provides very little guidance toward a good alignment. Hausdorff distance can serve as a validation measure, while we turn to integrated, squared distances for defining an energy to optimize.
We would like a distance measure between two surfaces that — like Hausdorff
distance — does not require a shared parameterization. Unlike Hausdorff
distance, we would like this distance to diffuse the measurement over the
entire surfaces rather than generate it from the sole worst offender. We can
accomplish this by replacing the supremum in the Hausdorff distance ($L^\infty$) with an integral of squared distances ($L^2$):

$$ D(X,Y) = \sqrt{\int_X \|\mathbf{x} - P_Y(\mathbf{x})\|^2 \, dA}. $$

This distance will only be zero if all points on $X$ also lie on $Y$, but unlike the Hausdorff distance, every point contributes to the measurement.
This distance is suitable to define a matching energy, but is not necessarily
welcoming for optimization: the function inside the square is non-linear. Let's
dig into it a bit. We'll define a directed matching energy from $Z$ to $Y$ as the integral of squared proximities:

$$ E(Z,Y) = \int_Z \|\mathbf{f}_Y(\mathbf{z})\|^2 \, dA, $$

where we introduce the proximity function $\mathbf{f}_Y(\mathbf{z}) = \mathbf{z} - P_Y(\mathbf{z})$, the vector from a point $\mathbf{z}$ to its closest-point projection onto $Y$.
Suppose $Y$ were just a single point $Y = \{\mathbf{y}\}$. In that case, $\mathbf{f}_Y(\mathbf{z}) = \mathbf{z} - \mathbf{y}$ is clearly linear in $\mathbf{z}$. Similarly, suppose $Y$ were an infinite plane through a point $\mathbf{p}$ with unit normal $\mathbf{n}$. Then $\mathbf{f}_Y(\mathbf{z}) = \left((\mathbf{z}-\mathbf{p})\cdot\mathbf{n}\right)\mathbf{n}$ is also linear in $\mathbf{z}$. But in general, if $Y$ is an interesting surface, $\mathbf{f}_Y$ will be non-linear in $\mathbf{z}$; it may not even be a continuous function.
In optimization, a common successful strategy to minimize energies composed of squaring a non-linear function is to iteratively linearize the function and minimize the square of the linearization instead. This is the core idea behind gradient descent and the Gauss-Newton methods:
    minimize f(z)^2
      z0 ← initial guess
      repeat until convergence
        f0 ← linearize f(z) around z0
        z0 ← minimize f0(z)^2
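To make this loop concrete, here is a tiny scalar instance (our own toy example, not part of the assignment): the squared linearization $f_0(z)^2 = (f(z_0) + f'(z_0)(z - z_0))^2$ is minimized where $f_0(z) = 0$, giving the update $z_0 \leftarrow z_0 - f(z_0)/f'(z_0)$.

```python
def gauss_newton(f, fprime, z0, iters=20):
    """Repeatedly linearize f around z0 and minimize the squared linearization.

    The linearization f0(z) = f(z0) + f'(z0) (z - z0), squared, is minimized
    at z = z0 - f(z0) / f'(z0), which becomes the next iterate.
    """
    z = z0
    for _ in range(iters):
        z = z - f(z) / fprime(z)
    return z

# Minimize f(z)^2 with f(z) = z^2 - 2: the minimizer is sqrt(2).
root = gauss_newton(lambda z: z * z - 2.0, lambda z: 2.0 * z, z0=1.0)
```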
Since our directed matching energy integrates the square of the non-linear proximity function $\mathbf{f}_Y$, we can apply this same strategy: replace $\mathbf{f}_Y$ with a linearization about the current guess and minimize the resulting quadratic energy.
If we make the convenient—however unrealistic—assumption that in the neighborhood of the closest-point projection $P_Y(\mathbf{z}_0)$ of the current guess $\mathbf{z}_0$, the surface $Y$ is simply the point $P_Y(\mathbf{z}_0)$, then the proximity function reduces to the linear function

$$ \mathbf{f}_{\text{point}}(\mathbf{z}) = \mathbf{z} - P_Y(\mathbf{z}_0). $$

In effect, we are assuming that the surface $Y$ is a single point. Minimizing the integrated square of this linearization is known as the point-to-point matching energy.
If we make a slightly more appropriate assumption that in the neighborhood of the closest-point projection $P_Y(\mathbf{z}_0)$ the surface $Y$ is the plane passing through $P_Y(\mathbf{z}_0)$ with unit normal $\mathbf{n}$, then the proximity function becomes

$$ \mathbf{f}_{\text{plane}}(\mathbf{z}) = \left(\left(\mathbf{z} - P_Y(\mathbf{z}_0)\right)\cdot\mathbf{n}\right)\mathbf{n}, $$

where the plane that best approximates $Y$ locally at $P_Y(\mathbf{z}_0)$ is the tangent plane, whose normal $\mathbf{n}$ is the surface normal of $Y$ at $P_Y(\mathbf{z}_0)$. Minimizing the integrated square of this linearization is known as the point-to-plane matching energy.
Equipped with these linearizations, we may now describe an optimization
algorithm
for minimizing the matching energy between a surface $Z$ and another surface $Y$. So far we have derived distances between a surface $Z$ and another surface $Y$; in our registration problem, the surface $Z$ will be a rigid transformation of the partial scan $X$ that we would like to match to $Y$.
Our matching problem can be written as an optimization problem to find the best possible rotation $\mathbf{R}$ and translation $\mathbf{t}$:

$$ \min_{\mathbf{R},\mathbf{t}} E(\mathbf{R}X + \mathbf{t}, Y) \quad \text{subject to } \mathbf{R}^\top\mathbf{R} = \mathbf{I},\ \det\mathbf{R} = 1. $$

Even if $X$ is a triangle mesh, computing this integrated energy exactly is difficult. Instead, the iterative closest point (ICP) method approximates the integral with a sum over a sampling of points on $X$, where each sample point is compared against its closest point on $Y$.
As the name implies, the method proceeds by iteratively finding the closest point on $Y$ to each sample point and then finding the rigid transformation that best matches the sample points to their closest points. If V_X and F_X are the vertices and faces of a triangle mesh surface $X$ (and likewise for $Y$), then we can summarize a generic ICP algorithm in pseudocode:
    icp(V_X, F_X, V_Y, F_Y)
      R, t ← initialize (e.g., set to identity transformation)
      repeat until convergence
        X  ← sample source mesh (V_X, F_X)
        P0 ← project all X onto target mesh (V_Y, F_Y)
        R, t ← update rigid transformation to best match X and P0
        V_X ← rigidly transform original source mesh by R and t
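The structure of this loop can be sketched end-to-end in a toy setting. The following pure-Python 2D analogue is our own illustration (all helper names are hypothetical): a brute-force nearest-point search stands in for the mesh projection, and a closed-form 2D rigid fit (rotation angle via `atan2`, translation aligning centroids) stands in for the SVD-based 3D solve derived below.

```python
import math

def nearest(z, Y):
    """Brute-force stand-in for the closest-point projection onto the target."""
    return min(Y, key=lambda y: (y[0] - z[0]) ** 2 + (y[1] - z[1]) ** 2)

def best_rigid_2d(X, P):
    """Closed-form 2D point-to-point rigid fit: rotation via atan2 of summed
    cross/dot products of centered points, then centroid-aligning translation."""
    n = len(X)
    cx = (sum(x[0] for x in X) / n, sum(x[1] for x in X) / n)
    cp = (sum(p[0] for p in P) / n, sum(p[1] for p in P) / n)
    a = b = 0.0
    for (x0, x1), (p0, p1) in zip(X, P):
        xb0, xb1 = x0 - cx[0], x1 - cx[1]
        pb0, pb1 = p0 - cp[0], p1 - cp[1]
        a += xb0 * pb0 + xb1 * pb1       # sum of dot products
        b += xb0 * pb1 - xb1 * pb0       # sum of 2D cross products
    th = math.atan2(b, a)
    c, s = math.cos(th), math.sin(th)
    t = (cp[0] - (c * cx[0] - s * cx[1]), cp[1] - (s * cx[0] + c * cx[1]))
    return c, s, t

def icp_2d(X, Y, iters=20):
    """Toy ICP: alternately match each point to its nearest target point and
    apply the best rigid fit (here we re-transform the current points rather
    than keeping the original mesh, unlike the pseudocode above)."""
    Z = list(X)
    for _ in range(iters):
        P = [nearest(z, Y) for z in Z]
        c, s, t = best_rigid_2d(Z, P)
        Z = [(c * z0 - s * z1 + t[0], s * z0 + c * z1 + t[1]) for (z0, z1) in Z]
    return Z

# Target: an asymmetric 2D point set; source: a rigidly displaced copy.
Y = [(float(i), math.sin(i)) for i in range(10)]
c0, s0, t0 = math.cos(0.05), math.sin(0.05), (0.02, 0.01)
X = [(c0 * y0 - s0 * y1 + t0[0], s0 * y0 + c0 * y1 + t0[1]) for (y0, y1) in Y]
Z = icp_2d(X, Y)
err = max(math.hypot(z0 - y0, z1 - y1) for (z0, z1), (y0, y1) in zip(Z, Y))
```

Because the displacement here is smaller than half the minimum point spacing, the nearest-point correspondences are exact on the first iteration and the fit recovers the transformation; in general ICP only converges to a local minimum.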
We would like to find the rotation matrix $\mathbf{R}$ and translation vector $\mathbf{t}$ that bring the sampled points $\mathbf{X}$ as close as possible to their closest-point projections $\mathbf{P}$.
ICP using the point-to-point matching energy linearization is slow to converge.
ICP using the point-to-plane matching energy linearization is faster.
In either case, this is still a non-linear optimization problem, this time due to the constraint that $\mathbf{R}$ be a rotation rather than due to the energy term.
In an effort to provide an alternative to "Least-Squares Rigid Motion Using SVD" [Sorkine 2009], this derivation purposefully avoids the trace operator and its various nice properties.
The point-to-point (gradient descent) rigid matching problem solves:

$$ \min_{\mathbf{R},\mathbf{t}} \sum_{i=1}^{k} \|\mathbf{R}\mathbf{x}_i + \mathbf{t} - \mathbf{p}_i\|^2 \quad \text{subject to } \mathbf{R}^\top\mathbf{R} = \mathbf{I},\ \det\mathbf{R} = 1. $$
This is a variant of what's known as a Procrustes problem, named after a mythical psychopath who would kidnap people and force them to fit in his bed by stretching them or cutting off their legs. In our case, we are forcing the transformation applied to $\mathbf{X}$ to be perfectly rigid: rotation and translation only, with no stretching or cutting.
This energy is quadratic in $\mathbf{t}$, and there are no constraints on its value. Writing the sampled points and their matches as rows of matrices $\mathbf{X}, \mathbf{P} \in \mathbb{R}^{k \times 3}$, the energy becomes $\|\mathbf{X}\mathbf{R}^\top + \mathbf{1}\mathbf{t}^\top - \mathbf{P}\|_F^2$ (where the squared Frobenius norm $\|\mathbf{A}\|_F^2$ is sum(sum(A.^2)) in MATLAB). Setting the partial derivative with respect to $\mathbf{t}$ to zero and solving reveals the optimal translation in terms of $\mathbf{R}$.
Rearranging terms above reveals that the optimal $\mathbf{t}$ is the vector aligning the centroid of $\mathbf{P}$ with the centroid of $\mathbf{X}$ rotated by the as-yet-unknown $\mathbf{R}$:

$$ \mathbf{t} = \hat{\mathbf{p}} - \mathbf{R}\hat{\mathbf{x}}, \qquad \hat{\mathbf{p}} = \frac{1}{k}\sum_{i} \mathbf{p}_i, \qquad \hat{\mathbf{x}} = \frac{1}{k}\sum_{i} \mathbf{x}_i. $$

Now we have a formula for the optimal translation vector $\mathbf{t}$ in terms of the unknown rotation $\mathbf{R}$. Substituting this formula back into the energy eliminates $\mathbf{t}$:

$$ \min_{\mathbf{R}} \|\overline{\mathbf{X}}\mathbf{R}^\top - \overline{\mathbf{P}}\|_F^2, $$

where we introduce the centered point matrices $\overline{\mathbf{X}}$ and $\overline{\mathbf{P}}$, whose rows are $\mathbf{x}_i - \hat{\mathbf{x}}$ and $\mathbf{p}_i - \hat{\mathbf{p}}$ respectively.
Now we have the canonical form of the orthogonal Procrustes problem. To find the optimal rotation matrix $\mathbf{R}$, we expand the squared Frobenius norm into Frobenius inner products (where the Frobenius inner product $\left<\mathbf{A},\mathbf{B}\right>_F$ is sum(sum(A.*B)) in MATLAB):

$$ \|\overline{\mathbf{X}}\mathbf{R}^\top - \overline{\mathbf{P}}\|_F^2 = \|\overline{\mathbf{X}}\mathbf{R}^\top\|_F^2 - 2\left<\overline{\mathbf{X}}\mathbf{R}^\top, \overline{\mathbf{P}}\right>_F + \|\overline{\mathbf{P}}\|_F^2. $$

This can be further reduced: since $\mathbf{R}$ is orthonormal, $\|\overline{\mathbf{X}}\mathbf{R}^\top\|_F^2 = \|\overline{\mathbf{X}}\|_F^2$, so only the middle term depends on $\mathbf{R}$, and minimizing the energy is equivalent to maximizing $\left<\overline{\mathbf{X}}\mathbf{R}^\top, \overline{\mathbf{P}}\right>_F = \left<\mathbf{R}\overline{\mathbf{X}}^\top, \overline{\mathbf{P}}^\top\right>_F$.
Question: What is $\mathbf{R}^\top \mathbf{R}$?

Hint: 👁️
Letting $\mathbf{M} = \overline{\mathbf{P}}^\top\overline{\mathbf{X}}$, our problem becomes equivalently maximizing $\left<\mathbf{R},\mathbf{M}\right>_F$ over all rotations $\mathbf{R}$.
Question: How can we prove that $\left<\mathbf{R}\overline{\mathbf{X}}^\top,\overline{\mathbf{P}}^\top\right>_F = \left<\mathbf{R},\overline{\mathbf{P}}^\top\overline{\mathbf{X}}\right>_F$?

Hint: Recall some linear algebra properties:

1. Matrix multiplication (on the left) can be understood as acting on each column: $\mathbf{A} \mathbf{B} = \mathbf{A} [\mathbf{B}_1 \ \mathbf{B}_2 \ \ldots \ \mathbf{B}_n] = [\mathbf{A} \mathbf{B}_1 \ \mathbf{A} \mathbf{B}_2 \ \ldots \ \mathbf{A} \mathbf{B}_n]$,
2. The Kronecker product $\mathbf{I} \otimes \mathbf{A}$ of the identity matrix $\mathbf{I}$ of size $k$ and a matrix $\mathbf{A}$ simply repeats $\mathbf{A}$ along the diagonal $k$ times. In MATLAB, repdiag(A,k),
3. Properties 1. and 2. imply that the vectorization of a matrix product $\mathbf{B}\mathbf{C}$ can be written as the Kronecker product of the #-columns-in-$\mathbf{C}$ identity matrix and $\mathbf{B}$ times the vectorization of $\mathbf{C}$: $\text{vec}(\mathbf{B}\mathbf{C}) = (\mathbf{I} \otimes \mathbf{B})\text{vec}(\mathbf{C})$,
4. The transpose of a Kronecker product is the Kronecker product of transposes: $(\mathbf{A} \otimes \mathbf{B})^{\top} = \mathbf{A}^{\top} \otimes \mathbf{B}^{\top}$,
5. The Frobenius inner product can be written as a dot product of vectorized matrices: $\left<\mathbf{A},\mathbf{B}\right>_F = \text{vec}(\mathbf{A}) \cdot \text{vec}(\mathbf{B}) = \text{vec}(\mathbf{A})^{\top} \text{vec}(\mathbf{B})$,
6. Properties 3., 4., and 5. imply that the Frobenius inner product of a matrix $\mathbf{A}$ and a matrix product $\mathbf{B}\mathbf{C}$ is equal to the Frobenius inner product of $\mathbf{B}^\top\mathbf{A}$ and $\mathbf{C}$:

$$ \begin{align*}\left<\mathbf{A},\mathbf{B}\mathbf{C}\right>_F &= \text{vec}(\mathbf{A})^{\top} \text{vec}(\mathbf{B}\mathbf{C}) \\ &= \text{vec}(\mathbf{A})^{\top} (\mathbf{I} \otimes \mathbf{B})\text{vec}(\mathbf{C}) \\ &= \text{vec}(\mathbf{A})^{\top} (\mathbf{I} \otimes \mathbf{B}^{\top})^{\top} \text{vec}(\mathbf{C}) \\ &= \text{vec}(\mathbf{B}^{\top}\mathbf{A})^{\top} \text{vec}(\mathbf{C}) \\ &= \left<\mathbf{B}^{\top} \mathbf{A},\mathbf{C}\right>_F \end{align*}. $$
Any matrix can be written in terms of its singular value decomposition. Let's do this for our covariance matrix:

$$ \mathbf{M} = \mathbf{U} \Sigma \mathbf{V}^\top. $$

We can use the permutation property of the Frobenius inner product again to move the products by $\mathbf{U}$ and $\mathbf{V}^\top$ from one argument to the other:

$$ \left<\mathbf{R},\mathbf{U}\Sigma\mathbf{V}^\top\right>_F = \left<\mathbf{U}^\top\mathbf{R}\mathbf{V},\Sigma\right>_F. $$

Now, $\Omega = \mathbf{U}^\top\mathbf{R}\mathbf{V}$ is itself an orthonormal matrix (a product of orthonormal matrices), so our problem reduces to maximizing $\left<\Omega,\Sigma\right>_F = \sum_i \sigma_i \Omega_{ii}$. Recall that $\Sigma \in \mathbb{R}^{3\times 3}$ is a non-negative diagonal matrix of singular values sorted so that the smallest value is in the bottom right corner. Because the entries of an orthonormal matrix are bounded by 1, the inner product would be maximized by $\Omega = \mathbf{I}$; however, we must also ensure that the result is a rotation, $\det\mathbf{R} = 1$, which forces $\det\Omega = \det(\mathbf{U}\mathbf{V}^\top)$. The cheapest way to satisfy this is to flip the sign associated with the smallest singular value:

$$ \Omega^{*} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \det(\mathbf{U}\mathbf{V}^\top) \end{pmatrix}. $$

This ensures that as a result $\mathbf{R}$ is a rotation and not a reflection. Finally, we have a formula for our optimal rotation:

$$ \mathbf{R}^{*} = \mathbf{U}\,\Omega^{*}\,\mathbf{V}^\top. $$
The point-to-plane (Gauss-Newton) rigid matching problem solves:

$$ \min_{\mathbf{R},\mathbf{t}} \sum_{i=1}^{k} \left(\left(\mathbf{R}\mathbf{x}_i + \mathbf{t} - \mathbf{p}_i\right)\cdot\mathbf{n}_i\right)^2 \quad \text{subject to } \mathbf{R}^\top\mathbf{R} = \mathbf{I},\ \det\mathbf{R} = 1, $$

where $\mathbf{n}_i$ is the unit normal of the target surface at the closest point $\mathbf{p}_i$. Unlike the point-to-point problem above, there is no closed-form solution to this problem. Instead, we will ensure that $\mathbf{R}$ stays a rotation by linearizing the rotation constraint around the current estimate and taking a small Gauss-Newton step.
If we simply optimize the 9 matrix entries of $\mathbf{R}$ directly, the minimizer will in general not be a rotation. Instead, we linearize the constraint that $\mathbf{R}$ stays a rotation and work with a reduced set of variables. Any rotation $\mathbf{R}$ in 3D can be written as a rotation by some angle $\theta$ about some unit axis $\mathbf{a}$.
If we parameterize rotations this way, we need a formula for the matrix in terms of $\theta$ and $\mathbf{a}$. For a general rotation axis $\mathbf{a}$, Rodrigues' rotation formula gives

$$ \mathbf{R}(\theta,\mathbf{a}) = \mathbf{I} + \sin\theta\,[\mathbf{a}]_\times + (1-\cos\theta)\,[\mathbf{a}]_\times^2, $$

where $[\mathbf{a}]_\times$ denotes the skew-symmetric cross-product matrix, so that $[\mathbf{a}]_\times\mathbf{x} = \mathbf{a}\times\mathbf{x}$.
In this form, we can linearize by considering a small angle: for small $\theta$, $\sin\theta \approx \theta$ and $\cos\theta \approx 1$, so $\mathbf{R} \approx \mathbf{I} + \theta\,[\mathbf{a}]_\times$. By defining $\bar{\mathbf{a}} = \theta\mathbf{a}$, the linearized rotation is $\mathbf{I} + [\bar{\mathbf{a}}]_\times$, or written in terms of its action on a vector $\mathbf{x}$: $\mathbf{R}\mathbf{x} \approx \mathbf{x} + \bar{\mathbf{a}}\times\mathbf{x}$.
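The quality of this small-angle approximation is easy to check numerically. Here is a short pure-Python sketch of our own comparing the exact Rodrigues matrix against the first-order approximation; the approximation error shrinks quadratically in the angle:

```python
import math

def rodrigues(axis, theta):
    """Exact rotation about a unit axis: R = I + sin(t) K + (1 - cos(t)) K^2."""
    x, y, z = axis
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    I = [[float(i == j) for j in range(3)] for i in range(3)]
    sn, cc = math.sin(theta), 1.0 - math.cos(theta)
    return [[I[i][j] + sn * K[i][j] + cc * K2[i][j] for j in range(3)]
            for i in range(3)]

def linearized(axis, theta):
    """First-order approximation R ≈ I + theta [a]x used in the Gauss-Newton step."""
    x, y, z = axis
    return [[1.0, -theta * z, theta * y],
            [theta * z, 1.0, -theta * x],
            [-theta * y, theta * x, 1.0]]

axis, theta = (0.0, 0.0, 1.0), 1e-3
R = rodrigues(axis, theta)
L = linearized(axis, theta)
# worst entrywise difference is O(theta^2) (about 5e-7 for theta = 1e-3)
max_diff = max(abs(R[i][j] - L[i][j]) for i in range(3) for j in range(3))
```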
If we apply our linearization of $\mathbf{R}$ to the point-to-plane energy, the energy becomes quadratic in six unknowns:

$$ \min_{\bar{\mathbf{a}},\mathbf{t}} \sum_{i=1}^{k} \left(\left(\mathbf{x}_i + \bar{\mathbf{a}}\times\mathbf{x}_i + \mathbf{t} - \mathbf{p}_i\right)\cdot\mathbf{n}_i\right)^2. $$

Let's gather a vector of unknowns: $\mathbf{u} = [\bar{\mathbf{a}};\mathbf{t}] \in \mathbb{R}^6$. Using the identity $(\bar{\mathbf{a}}\times\mathbf{x}_i)\cdot\mathbf{n}_i = \bar{\mathbf{a}}\cdot(\mathbf{x}_i\times\mathbf{n}_i)$, each residual is linear in $\mathbf{u}$: writing $\mathbf{c}_i = [\mathbf{x}_i\times\mathbf{n}_i;\,\mathbf{n}_i]$ and $d_i = \mathbf{n}_i\cdot(\mathbf{x}_i - \mathbf{p}_i)$, the $i$-th residual is $\mathbf{c}_i^\top\mathbf{u} + d_i$. Expanding all terms, moving the summations inside like terms, we can expose this in familiar quadratic energy minimization form:

$$ \min_{\mathbf{u}}\; \mathbf{u}^\top\mathbf{A}\mathbf{u} + 2\,\mathbf{u}^\top\mathbf{b} + c. $$

Gathering coefficients into $\mathbf{A} = \sum_i \mathbf{c}_i\mathbf{c}_i^\top \in \mathbb{R}^{6\times 6}$ and $\mathbf{b} = \sum_i d_i\,\mathbf{c}_i \in \mathbb{R}^{6}$ and setting the gradient to zero, the solution is revealed as

$$ \mathbf{u}^{*} = -\mathbf{A}^{-1}\mathbf{b}. $$
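Assembling and solving this system takes only a few lines. Here is a pure-Python sketch of our own (a real implementation would use Eigen; it assumes the accumulated matrix is invertible) that builds the 6×6 normal equations and solves them by Gaussian elimination:

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def solve6(A, b):
    """Solve the 6x6 system A u = b by Gaussian elimination with partial pivoting."""
    n = 6
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for cc in range(col, n + 1):
                M[r][cc] -= f * M[col][cc]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][cc] * u[cc] for cc in range(r + 1, n))) / M[r][r]
    return u

def point_to_plane_step(X, P, N):
    """Build and solve the normal equations for u = [a_bar; t] minimizing
    sum_i ( n_i . (x_i + a_bar x x_i + t - p_i) )^2."""
    A = [[0.0] * 6 for _ in range(6)]
    b = [0.0] * 6                       # right-hand side: -sum_i d_i c_i
    for x, p, n in zip(X, P, N):
        c = list(cross(x, n)) + list(n)  # residual_i = c . u + d
        d = sum(n[k] * (x[k] - p[k]) for k in range(3))
        for i in range(6):
            b[i] -= d * c[i]
            for j in range(6):
                A[i][j] += c[i] * c[j]
    return solve6(A, b)                  # u* = [a_bar; t]
```

When the targets are generated exactly by the linear model $\mathbf{p}_i = \mathbf{x}_i + \bar{\mathbf{a}}\times\mathbf{x}_i + \mathbf{t}$, this recovers $\bar{\mathbf{a}}$ and $\mathbf{t}$ exactly (given enough independent constraints).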
Question: How do we know that $\mathbf{u}^*$ is a minimizer and not a maximizer of the quadratic expression above?

Hint: 🥣

Question: For our problem, can we reasonably assume that $\mathbf{A}$ will be invertible?

Hint: 🎰
Solving this small $6\times 6$ system gives us an updated $\bar{\mathbf{a}}$ and $\mathbf{t}$. If we directly convert $\bar{\mathbf{a}}$ into a rotation matrix via the linearization $\mathbf{I} + [\bar{\mathbf{a}}]_\times$, then our transformation will not be rigid. Instead, we should recover the axis and angle of rotation from $\bar{\mathbf{a}}$ (via $\theta = \|\bar{\mathbf{a}}\|$ and $\mathbf{a} = \bar{\mathbf{a}}/\theta$) and construct a true rotation matrix, e.g., with Rodrigues' formula.
Our last missing piece is to sample the surface of a triangle mesh $X$ uniformly. We would like our random variable $\mathbf{x} \in X$ to have a uniform density function over the surface: patches of equal area should be equally likely to receive samples. Suppose we have a way to evaluate a continuous random point uniformly within a single triangle; then we can sample the whole mesh by first picking a triangle with probability proportional to its area and then picking a uniform random point inside it.
In order to pick a point uniformly randomly in a triangle with corners $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in \mathbb{R}^3$, we can draw two uniform random numbers $\alpha, \beta \in [0,1]$ and form

$$ \mathbf{x} = \mathbf{v}_1 + \alpha(\mathbf{v}_2 - \mathbf{v}_1) + \beta(\mathbf{v}_3 - \mathbf{v}_1), $$

where we replace $(\alpha,\beta)$ with $(1-\alpha, 1-\beta)$ whenever $\alpha + \beta > 1$, reflecting samples that land in the far half of the parallelogram back into the triangle.
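A sketch of this reflection trick (the function name is ours):

```python
import random

def uniform_point_in_triangle(v1, v2, v3, rng=random):
    """Sample x = v1 + a (v2 - v1) + b (v3 - v1) with (a, b) drawn uniformly
    from the unit square, reflected into the triangle whenever a + b > 1."""
    a, b = rng.random(), rng.random()
    if a + b > 1.0:
        a, b = 1.0 - a, 1.0 - b  # reflect the far half of the parallelogram
    return tuple(v1[k] + a * (v2[k] - v1[k]) + b * (v3[k] - v1[k])
                 for k in range(len(v1)))

# Seeded demo: samples stay inside the triangle, and their mean approaches
# the centroid (1/3, 1/3).
rng = random.Random(7)
samples = [uniform_point_in_triangle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), rng)
           for _ in range(10000)]
```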
Assuming we know how to draw a continuous uniform random variable $\gamma \in [0,1]$, we would now like to pick a triangle with probability proportional to its relative area $A_i / A_{\text{total}}$. We can achieve this by first computing the cumulative sum $C_i = \sum_{j=1}^{i} A_j / A_{\text{total}}$ of the relative triangle areas. Then our random index is found by identifying the first entry in $C$ whose value is greater than $\gamma$.
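A sketch of this search using a cumulative sum and binary search (the function name is ours; in the assignment, igl::cumsum plays the role of the accumulation):

```python
from bisect import bisect_right
from itertools import accumulate

def area_weighted_index(areas, gamma):
    """Return the first index whose normalized cumulative area exceeds gamma
    (gamma in [0, 1)), i.e., pick a triangle proportionally to its area."""
    cum = list(accumulate(areas))          # unnormalized cumulative sums C_i
    return bisect_right(cum, gamma * cum[-1])

# Areas 1, 3, 6 occupy the fractions [0, 0.1), [0.1, 0.4), [0.4, 1.0) of [0, 1).
picks = [area_weighted_index([1.0, 3.0, 6.0], g) for g in (0.05, 0.2, 0.95)]
```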
Try profiling your code. Where is most of the computation time spent?
If you have done things right, the majority of time is spent computing
point-to-mesh distances. For each query point, the computational
complexity of computing its distance to a mesh with $m$ faces is $O(m)$. This can be dramatically improved (e.g., to $O(\log m)$ on average) by using a spatial acceleration data structure such as an axis-aligned bounding box (AABB) tree. You could follow this assignment from our graphics course to learn how to implement an AABB tree.
This reading task is not directly graded, but it's expected that you read and understand sections 3.2-3.3 of Sofien Bouaziz's PhD thesis "Realtime Face Tracking and Animation" 2015. Understanding this may require digging into Wikipedia, other online resources, or other papers.
You may not use the following libigl functions:
igl::AABB
igl::fit_rotations
igl::hausdorff
igl::iterative_closest_point
igl::point_mesh_squared_distance
igl::point_simplex_squared_distance
igl::polar_dec
igl::polar_svd3x3
igl::polar_svd
igl::random_points_on_mesh
igl::rigid_alignment
Eigen::umeyama
You are encouraged to use the following libigl functions:
igl::cumsum (computes cumulative sums)
igl::doublearea (computes triangle areas)
igl::per_face_normals (computes normal vectors for each triangle face)
Eigen::JacobiSVD (computes the singular value decomposition of a matrix)
Generate n random points uniformly sampled on a given triangle mesh with vertex positions VX and face indices FX.
Compute the distance d between a given point x and the closest point p on a given triangle with corners a, b, and c.
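One way to implement this is the standard region-classification approach from Ericson's *Real-Time Collision Detection* (section 5.1.5). The following pure-Python sketch (our own naming, not the assignment's C++ signature) classifies which barycentric region of the triangle the query point projects into, then returns the corresponding closest point:

```python
def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle abc via barycentric region tests
    (Ericson, Real-Time Collision Detection, 5.1.5)."""
    def sub(u, v): return (u[0] - v[0], u[1] - v[1], u[2] - v[2])
    def dot(u, v): return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    ab, ac, ap = sub(b, a), sub(c, a), sub(p, a)
    d1, d2 = dot(ab, ap), dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return a                                   # vertex region A
    bp = sub(p, b)
    d3, d4 = dot(ab, bp), dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return b                                   # vertex region B
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        v = d1 / (d1 - d3)                         # edge region AB
        return tuple(a[k] + v * ab[k] for k in range(3))
    cp = sub(p, c)
    d5, d6 = dot(ab, cp), dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return c                                   # vertex region C
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        w = d2 / (d2 - d6)                         # edge region AC
        return tuple(a[k] + w * ac[k] for k in range(3))
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        w = (d4 - d3) / ((d4 - d3) + (d5 - d6))    # edge region BC
        return tuple(b[k] + w * (c[k] - b[k]) for k in range(3))
    denom = 1.0 / (va + vb + vc)                   # interior region
    v, w = vb * denom, vc * denom
    return tuple(a[k] + v * ab[k] + w * ac[k] for k in range(3))
```

The distance d is then simply the norm of the difference between the query point and the returned closest point.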
Compute the distances D between a set of given points X and their closest points P on a given mesh with vertex positions VY and face indices FY. For each point in P, also output a corresponding normal in N. It is OK to assume that all points in P lie inside a face (rather than exactly at vertices or exactly along edges) for the purposes of the normal computation in N.
Compute a lower bound on the directed Hausdorff distance from a given mesh (VX, FX) to another mesh (VY, FY). This function should be implemented by randomly sampling the first mesh and taking the maximum point-to-mesh distance to the second.
Given a 3×3 matrix M, find the closest rotation matrix R.
Given a set of source points X and corresponding target points P, find the optimal rigid transformation (R,t) that aligns X to P, minimizing the point-to-point matching energy.
Given a set of source points X and corresponding target points P and their normals N, find the optimal rigid transformation (R, t) that aligns X to the planes passing through P orthogonal to N, minimizing the point-to-plane matching energy.
Conduct a single iteration of the iterative closest point method aligning (VX, FX) to (VY, FY) by finding the rigid transformation (R, t) minimizing the matching energy. The caller can specify the number of samples num_samples used to approximate the integral over X and the method (point-to-point or point-to-plane).