Mathematical models and data play an important role in modern science and engineering. The field of inverse problems bridges the two and studies if and how one can infer model parameters from relevant observations. A prominent example is the famous black hole image, shown in {numref}`blackhole`, which was constructed by merging data from various telescopes {cite}`akiyama`.
```{figure}
---
height: 150px
name: blackhole
---
Depiction of a black hole as reconstructed from data recorded by the Event Horizon Telescope.
```
A prominent subfield of inverse problems is that of imaging, with applications in chemistry, biology and medicine. Various methods for imaging have been developed over the past four decades and have partly been made available for clinical use. What sets many of these imaging modalities apart from conventional photography is that the raw data are not directly interpretable as an image. Instead, significant processing using mathematical models is required to obtain a usable image. Next, we give a few examples.
Images are of great importance in many applications, ranging from microscopy and medicine to astronomy. In all these applications, the recorded images are distorted or corrupted versions of the ideal image. The inverse problem consists of reconstructing the idealised image from the measured one.
We can think of a (digital) image as a two-dimensional array of pixels, where each element could, for example, represent the grey value of the corresponding pixel. A colour image could be represented by a tensor containing the RGB intensities for each pixel. Alternatively, we can think of an idealised image as a (vector-valued) function on a bounded domain (say, the unit square).
A mathematical model for the observed image should describe the process by which the image was obtained. This often involves convolving the image with a filter and adding noise. For example, for a digital image with pixel values $u_{ij}$, the observed image may be modelled as

$$f_{ij} = \sum_{k,l} a_{kl}\, u_{i-k,j-l} + n_{ij},$$

with $a$ the filter coefficients and $n$ the noise. Representing the image as a function $u$ on the unit square, the observation can similarly be modelled as

$$f(x) = \int k(x-y)\, u(y)\, \mathrm{d}y + n(x),$$

where $k$ denotes the convolution kernel (the *point spread function*) and $n$ again denotes the noise. The resulting inverse problem is to undo the convolution. In some cases, part of the image is missing altogether, and the corresponding inverse problem is to fill in (*inpaint*) the missing part. Both are illustrated below.
```{code-cell} ipython3
:tags: [hide-input]
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
from scipy.signal import convolve2d as conv2
from skimage import color, data, restoration
from skimage.restoration import inpaint
# load test image and convert to gray-scale
astro = color.rgb2gray(data.astronaut())
## image deblurring
# define blurring kernel (pointspread function)
psf = np.ones((5, 5)) / 25
# convolve image with kernel
astro_blur = conv2(astro, psf, 'same')
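# add noise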
astro_blur += (np.random.poisson(lam=25, size=astro.shape) - 10) / 255.
# Restore Image using Richardson-Lucy algorithm
astro_restored = restoration.richardson_lucy(astro_blur, psf, num_iter=10)
# plot results
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(8, 5))
plt.gray()
for a in (ax[0], ax[1], ax[2]):
a.axis('off')
ax[0].imshow(astro, vmin=0, vmax=1)
ax[0].set_title('Original image')
ax[1].imshow(astro_blur,vmin=0, vmax=1)
ax[1].set_title('Blurred image')
ax[2].imshow(astro_restored, vmin=0, vmax=1)
ax[2].set_title('Restored image')
plt.figtext(0,0.25,"An example of image deblurring. Do you think we would ever be able to fully recover the original image?",{"fontsize":12})
plt.show()
```
```{code-cell} ipython3
:tags: [hide-input]
# import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
from scipy.signal import convolve2d as conv2
from skimage import color, data, restoration
from skimage.restoration import inpaint
# load test image and convert to gray-scale
astro = color.rgb2gray(data.astronaut())
## image inpainting
# Create mask with three defect regions: left, middle, right respectively
mask = np.zeros((200, 200), dtype=bool)
mask[20:60, 0:20] = 1
mask[160:180, 70:155] = 1
mask[30:60, 170:195] = 1
# remove parts of the image
astro_defect = astro[0:200, 0:200].copy()
astro_defect[np.where(mask)] = 0
# inpaint
astro_inpainted = inpaint.inpaint_biharmonic(astro_defect, mask)
# plot results
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(8, 5))
plt.gray()
for a in (ax[0], ax[1], ax[2]):
a.axis('off')
ax[0].imshow(astro[0:200, 0:200], vmin=0, vmax=1)
ax[0].set_title('Original image')
ax[1].imshow(astro_defect,vmin=0, vmax=1)
ax[1].set_title('Damaged image')
ax[2].imshow(astro_inpainted, vmin=0, vmax=1)
ax[2].set_title('Restored image')
plt.figtext(0,0.25,"An example of image inpainting. Do you think we would ever be able to fully recover the original image?",{"fontsize":12})
plt.show()
```
A CT-scanner is a common tool for medical diagnosis, providing detailed images of the anatomy of a patient. An example is shown in {numref}`ModernCT`. In other applications, such as materials science, similar techniques are used to obtain three-dimensional reconstructions of the inner structure of objects.
```{figure}
---
height: 150px
name: ModernCT
---
Modern CT scanner (left). Slice of high resolution CT scan of a patient's lung (right). Source for both: Wikimedia Commons.
```
In this case, the measurements are modelled by the *Radon transform*, which maps the image $u$ to the collection of its line integrals:

$$f(\theta,s) = \int_{L(\theta,s)} u(x)\, \mathrm{d}x,$$

with $L(\theta,s)$ denoting the line along which the X-rays travel, parametrised by the angle $\theta$ and offset $s$. The measurements, as a function of $(\theta,s)$, are usually called a *sinogram*.
An example of an image and its corresponding sinogram is shown in {numref}`walnut`. It is clear that the sinogram itself is not directly suited for interpretation, and we have to solve the inverse problem to get a useful image.
```{figure}
---
height: 150px
name: walnut
---
Two-dimensional slice through a walnut and the corresponding sinogram.
```
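To get a feel for the forward operator, the following sketch computes a sinogram with scikit-image's `radon` transform and inverts it using filtered back-projection (`iradon`); the Shepp-Logan phantom and the choice of angles are merely illustrative.

```{code-cell} ipython3
:tags: [hide-input]
# a minimal sketch of the Radon transform: compute a sinogram and a
# reconstruction by filtered back-projection (phantom and angles are illustrative)
import numpy as np
import matplotlib.pyplot as plt
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon

# test image
phantom = shepp_logan_phantom()

# compute the sinogram: one column of line integrals per angle
theta = np.linspace(0., 180., 180, endpoint=False)
sinogram = radon(phantom, theta=theta)

# reconstruct by filtered back-projection
reconstruction = iradon(sinogram, theta=theta)

# plot results
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(8, 5))
ax[0].imshow(phantom, cmap='gray')
ax[0].set_title('Image')
ax[1].imshow(sinogram, cmap='gray', aspect='auto')
ax[1].set_title('Sinogram')
ax[2].imshow(reconstruction, cmap='gray')
ax[2].set_title('Reconstruction')
for a in ax:
    a.axis('off')
plt.show()
```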
+++
Magnetic resonance imaging (MRI) is another imaging modality for clinical diagnosis. As opposed to CT, MRI does not rely on ionising radiation and is hence considered much safer. A disadvantage of MRI is that the time needed for a scan is much longer than for CT.
Here, the measurements are samples of the Fourier transform of the image:

$$\widehat{u}(\xi) = \int u(x)\, e^{-\imath \xi \cdot x}\, \mathrm{d}x.$$

The measurements consist of samples $f_i = \widehat{u}(\xi_i)$ at a given set of frequencies $\{\xi_i\}_{i=1}^m$. Note that when a fully sampled spectrum is available, we can easily retrieve $u$ by applying the inverse Fourier transform. The challenge here lies in retrieving $u$ from a partially sampled spectrum, as taking fewer samples directly reduces the scan time.
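The following sketch illustrates this idealised (noiseless, discrete) model: the full spectrum of a test image is "measured" with the FFT and inverted exactly, while retaining only the lowest frequencies produces a blurred reconstruction. The test image and sampling pattern are illustrative.

```{code-cell} ipython3
:tags: [hide-input]
# a sketch of the idealised MRI model: a fully sampled (discrete) spectrum is
# inverted exactly by the inverse FFT; a partially sampled one is not
import numpy as np
import matplotlib.pyplot as plt
from skimage import color, data

u = color.rgb2gray(data.astronaut())

# fully sampled spectrum and its inverse
f_hat = np.fft.fft2(u)
u_full = np.real(np.fft.ifft2(f_hat))

# retain only the lowest frequencies (the corners in the unshifted FFT layout)
mask = np.zeros(u.shape, dtype=bool)
n = u.shape[0] // 8
mask[:n, :n] = mask[:n, -n:] = mask[-n:, :n] = mask[-n:, -n:] = True
u_partial = np.real(np.fft.ifft2(f_hat * mask))

# plot results
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(8, 5))
ax[0].imshow(u_full, cmap='gray', vmin=0, vmax=1)
ax[0].set_title('Fully sampled')
ax[1].imshow(u_partial, cmap='gray', vmin=0, vmax=1)
ax[1].set_title('Partially sampled')
for a in ax:
    a.axis('off')
plt.show()
```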
+++
In seismic inversion, the goal is to infer physical properties of the subsurface from seismic measurements, which record the response of the subsurface to (acoustic) waves generated at the surface. A typical acquisition setup is depicted in {numref}`seismic_acquisition`. A typical image resulting from such data is shown in {numref}`seismic_image`.
```{figure}
---
height: 150px
name: seismic_acquisition
---
A typical seismic acquisition setup.
```

```{figure}
---
height: 150px
name: seismic_image
---
A typical seismic image depicting various earth layers.
```
+++
The simplest example is when the underlying physics can be described by a simple scalar wave equation:

$$c(x)^{-2}\frac{\partial^2 v}{\partial t^2}(t,x) - \Delta v(t,x) = q(t,x),$$

where $v$ denotes the wavefield, $c$ the speed of sound in the medium, and $q$ the source. The measurements $f$ consist of samples of the wavefield taken at receiver locations near the surface. The inverse problem consists of retrieving either the source $q$ or the soundspeed $c$ from the measurements:
- **Inverse source problem**: The measurements $f$ are linear in terms of $q$, leading to a linear inverse problem to retrieve $q$. In earthquake localisation, for example, the source is parametrised as $q(t,x) = u_1(t)u_2(x)$, in which case the goal is to retrieve $(u_1,u_2)$.
- **Inverse medium problem**: Here the goal is to retrieve $c$, sometimes parametrised as $c(x) = c_0(1 + u(x))$. The measurements generally depend non-linearly on $u$. However, when $|u| \ll 1$, the relation between $f$ and $u$ may be linearised.
+++
We can abstractly formulate most (if not all) inverse problems as finding a solution $u$ of the equation

```{math}
:label: ip
K(u) = f.
```

Here, $f \in \mathcal{F}$ denotes the measured data, $u \in \mathcal{U}$ the quantity we wish to recover, and $K : \mathcal{U} \rightarrow \mathcal{F}$ the *forward operator* that models the measurement process. For the purposes of this course, we will assume that the forward operator is known and that $\mathcal{U}$ and $\mathcal{F}$ are suitably chosen normed vector spaces.
An important notion in inverse problems is ill-posedness. We call a problem ill-posed if it is not well-posed:
```{admonition} Definition: *well-posedness*
:class: important
The equation $K(u) = f$ is *well-posed* if *all* three criteria are met:

* **Existence.** There exists at least one $u$ for which $K(u) = f$.
* **Uniqueness.** There is exactly one $u$ for which $K(u) = f$.
* **Stability.** The solution depends continuously on the data, i.e., there is a constant $C < \infty$ such that $\|u - u'\| \leq C \|f - f'\|$ where $K(u) = f$ and $K(u') = f'$.

If the problem is not well-posed, we call it *ill-posed*.
```
It may seem strange that equation {eq}`ip` may not have a solution. After all, are the measurements not the result of the forward operator applied to some ground truth $\overline{u}$? In practice, however, the measurements are corrupted by noise:

$$f^{\delta} = K(\overline{u}) + n^{\delta},$$

where $n^{\delta}$ denotes the noise and $\delta$ its magnitude, e.g., $\|n^{\delta}\| \leq \delta$. There is then no guarantee that the noisy data $f^{\delta}$ still lies in the range of $K$.
Given a continuous function $g : \mathbb{R} \rightarrow \mathbb{R}$, consider solving $g(u) = f$. If $g$ is strictly monotone, with $|g(u) - g(u')| \geq L\,|u - u'|$ for some constant $L > 0$, the solution is unique whenever it exists. We can thus derive the error bound

$$|u - u^{\delta}| \leq L^{-1}\,|f - f^{\delta}|,$$

with $u$ and $u^{\delta}$ denoting the solutions corresponding to data $f$ and $f^{\delta}$, respectively. We conclude that the problem of finding a root of such a function is stable under perturbations of the data.
Matrix inversion is a prime example of a linear inverse problem. These can be written in the form of {eq}`ip` with $K \in \mathbb{R}^{n\times n}$ an invertible matrix and $u, f \in \mathbb{R}^n$. Under these conditions, existence and uniqueness of a solution to {eq}`ip` are guaranteed. Regarding stability, however, it is well known from numerical linear algebra that a small perturbation of the data may be greatly amplified:

$$\frac{\|u - u^{\delta}\|}{\|u\|} \leq \kappa(K)\, \frac{\|f - f^{\delta}\|}{\|f\|},$$

where $\kappa(K) = \|K\|\|K^{-1}\|$ denotes the *condition number* of $K$. For large $\kappa(K)$, even small measurement errors can render the solution meaningless.
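This effect is easily demonstrated numerically; a small sketch with an illustrative ill-conditioned matrix and perturbation:

```{code-cell} ipython3
:tags: [hide-input]
# error amplification by an ill-conditioned matrix
# (the matrix and the perturbation are illustrative)
import numpy as np

K = np.array([[1., 1.], [1., 1.001]])
u = np.array([1., 1.])
f = K @ u

# perturb the data slightly and solve again
f_delta = f + np.array([0., 1e-3])
u_delta = np.linalg.solve(K, f_delta)

print("condition number        :", np.linalg.cond(K))
print("relative data error     :", np.linalg.norm(f - f_delta) / np.linalg.norm(f))
print("relative solution error :", np.linalg.norm(u - u_delta) / np.linalg.norm(u))
```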
Here, we are given a continuously differentiable function $f : [0,1] \rightarrow \mathbb{R}$ and wish to recover its derivative $u = f'$. Suppose the data are given as $f^{\delta}(x) = f(x) + \delta \sin(x/\delta)$. We immediately find that the solution is given by $u^{\delta}(x) = f'(x) + \cos(x/\delta)$. It is not hard to show in this case that the forward error $\|f^{\delta} - f\|_{L^{\infty}([0,1])} \rightarrow 0$ as $\delta \rightarrow 0$ but that the backward error $\|u - u^{\delta}\|_{L^{\infty}([0,1])}$ is constant. In the exercises we will see that the ill-posedness can be alleviated to some extent by demanding more regularity from the noise.
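This is easily verified numerically; a small sketch that differentiates the noisy data with `np.gradient` (the choice $f(x) = x^2$ is illustrative):

```{code-cell} ipython3
:tags: [hide-input]
# the perturbation delta*sin(x/delta) vanishes in the sup-norm as delta -> 0,
# but the resulting error in the derivative does not
import numpy as np

x = np.linspace(0, 1, 100000)
f = x**2   # clean data, so u = f' = 2x
u = 2 * x

for delta in [1e-1, 1e-2, 1e-3]:
    f_delta = f + delta * np.sin(x / delta)
    u_delta = np.gradient(f_delta, x)  # numerical derivative of the noisy data
    print("delta = %.0e, forward error = %.2e, backward error = %.2f"
          % (delta, np.max(np.abs(f - f_delta)), np.max(np.abs(u - u_delta))))
```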
+++
In some special cases we can derive an explicit expression for (an approximation of) the solution of {eq}`ip`.
Consider the matrix
$$
K = \left(\begin{matrix} 1 & 1\\ 2 & 2 \end{matrix}\right).
$$
Obviously this matrix is singular, so there is no way to define the inverse in the usual sense. However, modifying the matrix slightly
$$
\widetilde{K} = \left(\begin{matrix} 1 + \alpha & 1\\ 2 & 2 + \alpha\end{matrix}\right),
$$
allows us to compute the inverse. Indeed, given $f = K\overline{u}$ with $\overline{u} = (1,2)$ we have $f = (3,6)$. Applying the inverse of $\widetilde{K}$ we get $\widetilde{u} \approx (1,2)$. The corresponding equations are visualized in {numref}`matrix_inversion`.
```{glue:figure} matrix_inversion
:figwidth: 300px
:name: "matrix_inversion"
Original and regularized equations. We see that the regularized equations have a unique solution, but have thereby implicitly selected one particular solution of the original system.
```
```{code-cell} ipython3
:tags: [hide-cell]
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
from myst_nb import glue
u1 = np.linspace(0,5,100)
fig,ax = plt.subplots(1,1)
alpha = 1e-1
ax.plot(u1,3-u1,'k',label=r'$Ku=f$')
ax.plot(u1,3-(1+alpha)*u1,'r--',label=r'$\widetilde{K}u=f$')
ax.plot(u1,(6-2*u1)/(2+alpha),'r--')
ax.set_xlabel(r'$u_1$')
ax.set_ylabel(r'$u_2$')
ax.set_xlim([0.5,1.5])
ax.set_ylim([1.5,2.5])
ax.set_aspect(1)
ax.legend()
plt.show()
glue("matrix_inversion", fig, display=False)
```
Variational methods are a popular and quite generally applicable technique for solving inverse problems. The idea is to replace the original equation {eq}`ip` by a minimisation problem of the form

$$\min_u \|K(u) - f\|^2 + \alpha R(u),$$

where further a-priori information on the class of images can be incorporated through the regularisation term $R$, and the parameter $\alpha > 0$ balances fitting the data against this a-priori information.
We illustrate this principle by a simple example. We search for a *small* solution of the following quadratic equation
$$
u^2 - f_1 u + f_2 = 0,
$$
with the given data $f_1=1.1$ and $f_2 = 0.1$.
With the solution formula for quadratic equations we obtain two solutions
$$
u_1 = 0.1 \qquad \text{and} \qquad u_2 = 1.
$$
With our a-priori information of a *small* solution we obviously choose the solution $u_1$. However, this is only possible in this specific case, because we can indeed compute *all* solutions. For more complicated problems this is not possible anymore. Moreover, we have to face measurement errors, so it is preferable to compute a single approximate solution in a robust manner rather than many exact ones. A variational approach can be realised for this example by minimizing the functional
$$
J(u) = (u^2 - f_1 u + f_2)^2 + \alpha u^2.
$$
For $\alpha$ large, this functional has a global minimum around $u = 0$, while for $\alpha$ small, it still has multiple minima. However, for suitable $\alpha$, we have a global minimum at $u \approx 0.1$ and minimizing the regularized problem yields (an approximation of) the smaller solution, as illustrated in {numref}`quadratic`.
```{glue:figure} quadratic
:figwidth: 300px
:name: "quadratic"
Original quadratic equation and functional to be minimized. We see that minimizing the functional picks out one of the two solutions.
```
```{code-cell} ipython3
:tags: [hide-cell]
import numpy as np
import matplotlib.pyplot as plt
from myst_nb import glue
u = np.linspace(0,5,100)
fig,ax = plt.subplots(1,1)
alpha = [1e-2, 1e-1, 1]
ax.plot(u,u**2 - 1.1*u + 0.1,'k')
ax.plot(0.1,0,'ko')
ax.plot(1,0,'ko')
for k in range(len(alpha)):
ax.plot(u,(u**2 - 1.1*u + 0.1)**2 + alpha[k]*(u**2),'--',label=r'$\alpha=$'+str(alpha[k]))
ax.set_xlabel(r'$u$')
ax.set_xlim([0,1.5])
ax.set_ylim([-0.25,.25])
ax.set_aspect(3)
ax.legend()
plt.show()
glue("quadratic", fig, display=False)
```
```{bibliography}
:style: plain
```
We want to solve a system of linear equations $Ku = f$ for the following pairs of $K$ and $f$:
1. $$K = \left(\begin{matrix} 1 & 2 \\ 2 & 1\end{matrix}\right), \quad f = \left(\begin{matrix} 1 \\ 1 \end{matrix}\right),$$
2. $$K = \left(\begin{matrix} 1 & 1 \\ 1 & 1.001\end{matrix}\right), \quad f = \left(\begin{matrix} 1 \\ 1 \end{matrix}\right),$$
3. $$K = \left(\begin{matrix} 1 & 1 \\ 2 & 1 \\ 2 & 2\end{matrix}\right), \quad f = \left(\begin{matrix} 1 \\ 1 \\ 1\end{matrix}\right),$$
4. $$K = \left(\begin{matrix} 1 & 1 \\ 2 & 1 \\ 2 & 2\end{matrix}\right), \quad f = \left(\begin{matrix} 1 \\ 2 \\ 2\end{matrix}\right),$$
5. $$K = \left(\begin{matrix} 1 & 2 & 1 \\ 2 & 0 & 2\end{matrix}\right), \quad f = \left(\begin{matrix} 1 \\ 1 \end{matrix}\right),$$
6. $$K = \left(\begin{matrix} 1 & 0 \\ 0 & 0\end{matrix}\right), \quad f = \left(\begin{matrix} 0 \\ 1 \end{matrix}\right),$$
Is the problem ill-posed? Why? You can compute eigenvalues and vectors numerically, as shown in the example below.
```{code-cell} ipython3
import numpy as np
K = np.array([[1,2],[2,1]])
f = np.array([1,1])
l, V = np.linalg.eig(K)
print("The eigenvalues are:", l)
print("The eigenvectors are:",V[:,0], V[:,1])
```{admonition} Answer
:class: hint, dropdown
1. This matrix is invertible, so we only need to check the condition number, which we find to be $3$. This is not so large as to cause problems numerically.
2. This matrix is nearly singular (the columns are almost linearly dependent), but it is still invertible. The condition number is $\kappa(K) \approx 4\cdot 10^3$. While this would not lead to problems when computing solutions numerically, it may amplify realistic measurement errors. We would call this problem ill-posed.
3. This equation does not have a solution.
4. This equation does have a unique solution, $\overline{u} = (1,0)$. It is not immediately clear how to analyse stability in general. We could, for example, look at the first two equations (the last one is a multiple of the first) and compute the condition number of the corresponding matrix. This yields $\approx 6.85$, so the system is not ill-posed. Another way to look at it, perhaps, is that adding a small perturbation may make the system inconsistent. As such, we could call the inverse problem ill-posed. We will learn how to deal with this situation in general in the next chapter.
5. Here, the solutions are of the form $\overline{u} = (1/4,1/4,1/4) + t\,(-1,0,1)$ for any $t$. Hence, the system is definitely ill-posed.
6. Again, an inconsistent system with no solution.
```
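Solvability is also easy to check numerically: a solution exists precisely when $K$ and the augmented matrix $(K \,|\, f)$ have the same rank. A small sketch for cases 3 and 5:

```{code-cell} ipython3
# a solution exists iff rank(K) == rank([K | f]); shown for cases 3 and 5
import numpy as np

K3 = np.array([[1, 1], [2, 1], [2, 2]])
f3 = np.array([1, 1, 1])
print(np.linalg.matrix_rank(K3), np.linalg.matrix_rank(np.column_stack([K3, f3])))  # 2 vs 3: no solution

K5 = np.array([[1, 2, 1], [2, 0, 2]])
f5 = np.array([1, 1])
print(np.linalg.matrix_rank(K5), np.linalg.matrix_rank(np.column_stack([K5, f5])))  # 2 vs 2: solutions exist
```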
Given a symmetric, positive definite matrix $K \in \mathbb{R}^{n \times n}$, consider the equation $Ku = f$. Such a matrix admits an eigenvalue decomposition

$$K = \sum_{i=1}^{n} \lambda_i k_i k_i^{T},$$

where the $k_i$ are orthonormal eigenvectors and $\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_n > 0$ are the corresponding eigenvalues. To study the well-posedness of the equation we want to bound the backward error $\|u - u^{\delta}\|_2$ in terms of the forward error $\|f - f^{\delta}\|_2$, where $Ku^{\delta} = f^{\delta}$.
1. Show that $\|u - u^\delta\|_2 \leq \lambda_n^{-1} \|f - f^{\delta}\|_2.$
2. Show that the relative error is bounded by $\frac{\|u - u^\delta\|_2}{\|u\|_2} \leq \lambda_1\lambda_n^{-1} \frac{\|f - f^{\delta}\|_2}{\|f\|_2}.$
```{admonition} Answer
:class: hint, dropdown
1. We have $\|u - u^\delta\|_2 = \|K^{-1}(f - f^\delta)\|_2 \leq \|K^{-1}\|_2 \|f - f^\delta\|_2$, with $\|K^{-1}\|_2$ denoting the operator norm of $K^{-1}$, defined as $\sup_{\|v\|_2=1} \|K^{-1}v\|_2$. The supremum is attained at $v = k_n$, which leads to $\|K^{-1}\|_2 = \lambda_n^{-1}$.
2. Similarly, we find $\|f\|_2 = \|Ku\|_2 \leq \lambda_1 \|u\|_2$. Combining this with the bound from part 1 yields the result.
```
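The worst case in the first bound is attained by perturbing $f$ along the eigenvector $k_n$; a small numerical check with an illustrative matrix:

```{code-cell} ipython3
# perturb f along the eigenvector of the smallest eigenvalue (the worst case)
# and compare the backward error to the bound; matrix and noise are illustrative
import numpy as np

K = np.array([[2., 1.], [1., 2.]])  # symmetric positive definite
l, V = np.linalg.eigh(K)            # eigenvalues in ascending order

u = np.array([1., 1.])
f = K @ u
delta = 1e-2
f_delta = f + delta * V[:, 0]       # perturbation along the smallest-eigenvalue direction

u_delta = np.linalg.solve(K, f_delta)
print("backward error :", np.linalg.norm(u - u_delta))
print("bound          :", np.linalg.norm(f - f_delta) / l[0])
```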
We are given a continuously differentiable function $f : [0,1] \rightarrow \mathbb{R}$ and noisy measurements $f^{\delta} = f + n^{\delta}$, with $n^{\delta}(x) = \delta \sin(kx/\delta)$ for given constants $\delta, k > 0$. The goal is to recover the derivative $u = f'$. It is readily verified that we can find a (unique) solution by differentiation: $u^{\delta} = (f^{\delta})'$. We measure the errors in the $L^{\infty}$-norm,

$$\|g\|_{L^{\infty}([0,1])} = \sup_{x\in[0,1]} |g(x)|.$$

1. Show that the forward error $f - f^{\delta}$ is bounded in the $L^{\infty}$ norm; in particular, $\|f - f^{\delta}\|_{L^{\infty}([0,1])} = \delta$.
2. Show that the backward error $u - u^{\delta}$ can be arbitrarily large, even if $\delta\downarrow 0$: $\|u - u^{\delta}\|_{L^{\infty}([0,1])} = k$.
3. Is the inverse problem ill-posed?
```{admonition} Answer
:class: hint, dropdown
1. Since $|\sin(\cdot)| \leq 1$ we immediately get the desired result.
2. By linearity we have $u - u^\delta = -k\cos(kx/\delta)$, and we immediately find the desired result.
3. This shows that the problem is *ill-posed*; a small forward error does not guarantee a small backward error, implying that the inverse map is not continuous.
```
The analysis in the previous exercise depends crucially on the type of noise we allow. If we assume that the noise is small in a stronger norm, the situation improves. Let

$$\|g\|_{C^{1}([0,1])} = \|g\|_{L^{\infty}([0,1])} + \|g'\|_{L^{\infty}([0,1])}$$

denote the $C^1$-norm (the $W^{1,\infty}$-Sobolev norm).

1. Assuming that $\|n^{\delta}\|_{C^1([0,1])} = \delta$, show that $\|u - u^{\delta}\|_{L^{\infty}([0,1])} \rightarrow 0$ when $\delta \rightarrow 0$.
2. Can you come up with a type of noise that obeys the assumed bound? Is it reasonable to make such assumptions on the noise?
+++