
02 Organising Data

matthew-gaddes edited this page Sep 18, 2020 · 2 revisions

Spatial data

To recover spatially independent sources, each interferogram is treated as a variable and each pixel as an observation. In the style of most ICA literature, we therefore organise each interferogram as a row vector. For a time series of i interferograms, each of p pixels, this produces an input matrix of size i × p.
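As a minimal sketch of this layout, the snippet below flattens a stack of synthetic interferograms into an i × p matrix with NumPy. The array names (`ifgs`, `mixtures`) and the image size are illustrative assumptions, and pixel masking is ignored here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical time series: i = 5 interferograms, each 3 x 4 pixels (p = 12).
n_ifgs, ny, nx = 5, 3, 4
ifgs = rng.normal(size=(n_ifgs, ny, nx))

# Flatten each interferogram into a row vector:
# one row per interferogram (variable), one column per pixel (observation).
mixtures = ifgs.reshape(n_ifgs, ny * nx)

print(mixtures.shape)
```

Each row of `mixtures` is one interferogram, so `mixtures[k]` holds the same values as `ifgs[k].ravel()`.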

The data do not need to be mean centred (i.e. the mean of each row set to 0), but the outputs (spatial sources and time courses) reconstruct the mean centred mixtures. Should a user wish to reconstruct the time series using a selection of the recovered sources, the mean centring can be reversed by adding mixtures_mean back to the product of the time courses and the spatial sources.
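The reconstruction step can be sketched as follows. This is an illustration under stated assumptions, not ICASAR's own code: an SVD stands in for the ICA decomposition simply so that the product of the two factors equals the mean centred mixtures, `mixtures_mean` is assumed to be a column vector holding one mean per row, and the names `time_courses` and `sources` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
mixtures = rng.normal(loc=3.0, size=(5, 12))   # i = 5 interferograms, p = 12 pixels

# Mean centre each row (interferogram), keeping the means for later.
mixtures_mean = mixtures.mean(axis=1, keepdims=True)   # shape (5, 1), assumed layout
mixtures_mc = mixtures - mixtures_mean

# Stand-in factorisation (SVD, not ICA): time_courses @ sources == mixtures_mc.
U, s, Vt = np.linalg.svd(mixtures_mc, full_matrices=False)
time_courses = U * s    # shape (5, 5): one column per component
sources = Vt            # shape (5, 12): one row per component

# Reconstruct with every component, then undo the mean centring.
reconstruction = time_courses @ sources + mixtures_mean

# Reconstruct with a selection of components only (here the first two).
keep = [0, 1]
partial = time_courses[:, keep] @ sources[keep, :] + mixtures_mean
```

With every component retained, `reconstruction` matches the original `mixtures`; with a subset, `partial` has the same shape but contains only the contribution of the chosen components plus the row means.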

Masked arrays are used to convert between row vectors and 2D images. The figure below shows a data matrix, its associated pixel mask, and the effect of applying the pixel mask to one of the row vectors.

figure_1
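The conversion between row vectors and 2D images can be sketched with `numpy.ma`. The helper names `row_to_image` and `image_to_row` are hypothetical, and the snippet assumes the NumPy convention that `True` in the mask marks a pixel that is masked out.

```python
import numpy as np
import numpy.ma as ma

ny, nx = 3, 4

# Pixel mask: True marks pixels that are masked out (assumed convention).
mask = np.zeros((ny, nx), dtype=bool)
mask[0, 0] = True
mask[2, 3] = True


def row_to_image(row, mask):
    """Place a row vector of valid pixels into a 2D masked array."""
    data = np.zeros(mask.shape)
    data[~mask] = row                  # fill valid pixels in row-major order
    return ma.array(data, mask=mask)


def image_to_row(image):
    """Extract the valid (unmasked) pixels of a 2D masked array as a row vector."""
    return image.compressed()          # same row-major order as above


row = np.arange(10.0)                  # 12 pixels minus 2 masked = 10 valid pixels
image = row_to_image(row, mask)
recovered = image_to_row(image)
```

Because both directions traverse the valid pixels in the same row-major order, converting a row vector to an image and back returns the original values.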

Temporal data

ICASAR also supports temporal data, such as GPS time series. Given mixtures of the following form instead:

temporal_mixtures

The clusters can be explored. It appears that there are three compact and isolated clusters that relate to real sources (blue, green, and orange), but the red and purple clusters are surrounded by grey points (labelled as noise by HDBSCAN), and are likely to correspond only to noise terms.

interative_clustering

The sources most representative of each cluster are then returned, ordered by cluster quality. Given the results of the previous figure, the last two components can be discarded as they are likely to be noise.

temporal_ICASAR

These results contrast with PCA, which is unable to recover the sources well:

temporal_pca

They also contrast with FastICA, which is able to recover the sources well but provides no information on whether a source represents a true signal or is just a noise term:

temporal_FastICA
