wesbarnett/diffusion-map: Comparison of principal component analysis with diffusion maps on toy data sets and a molecular simulation trajectory

Personal code for principal component analysis and diffusion map examples. It was written to test the methods on a few well-known types of data, but the source needs only small changes to work with a different data set or distance metric.
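To illustrate where the distance metric enters, here is a minimal Python sketch (not the repository's Fortran code) that builds the row-normalized Gaussian kernel matrix a diffusion map is based on. The function names, the `exp(-d^2 / bandwidth)` kernel form, and the plain row normalization are assumptions made for illustration; the actual library may normalize differently.

```python
import math

def euclidean(a, b):
    """Default metric; swap in any function that takes two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def markov_matrix(points, bandwidth, metric=euclidean):
    """Row-normalized Gaussian kernel matrix (assumed kernel form)."""
    n = len(points)
    kernel = [[math.exp(-metric(points[i], points[j]) ** 2 / bandwidth)
               for j in range(n)] for i in range(n)]
    # Dividing each row by its sum turns it into transition probabilities.
    return [[k / sum(row) for k in row] for row in kernel]

# Hypothetical toy data: two nearby points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
P = markov_matrix(points, bandwidth=1.0)
```

Changing the metric only means passing a different `metric` function; the eigenvectors of a matrix like `P` are what the eigenvalue solver would then be asked for.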

Compilation

$ make

This first compiles a library containing the classes used by the main program, then links the main program against it. The main program requires json-fortran. The library requires LAPACK to calculate the eigenvectors and eigenvalues of various matrices.

Running

Modify dmap.json. Then do:

$ ./run dmap.json

You can also run principal component analysis using the following file:

$ ./run pca.json

bandwidth.json is for running the program iteratively over different bandwidth values. See Figure S1 in this document for what I was going for with this. This would be more helpful for analyzing simulation data, but the main program is not set up for that.
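As a rough illustration of a bandwidth scan, the Python sketch below sums all Gaussian kernel entries for a range of log-spaced bandwidths. A common heuristic is to plot the log of this sum against the log of the bandwidth and choose a value in the roughly linear region; whether that is exactly what the referenced figure shows is an assumption here. The data, the names, and the `exp(-d^2 / bandwidth)` kernel form are hypothetical.

```python
import math

def kernel_sum(sq_dists, bandwidth):
    """Sum of all Gaussian kernel entries for one bandwidth value."""
    return sum(math.exp(-d2 / bandwidth) for row in sq_dists for d2 in row)

# Hypothetical toy data: five points on a line.
xs = [0.0, 0.5, 1.0, 4.0, 4.5]
sq_dists = [[(a - b) ** 2 for b in xs] for a in xs]

# Log-spaced bandwidths from 1e-3 to 1e3.
bandwidths = [10.0 ** p for p in range(-3, 4)]
sums = [kernel_sum(sq_dists, eps) for eps in bandwidths]

for eps, s in zip(bandwidths, sums):
    print(f"{eps:10.3g}  {s:8.3f}")
```

For very small bandwidths only the diagonal contributes, so the sum approaches the number of points; for very large bandwidths every entry approaches 1, so the sum approaches the number of points squared.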

Extras

The extra folder contains the source code of two programs that help generate example data sets. No configuration files are provided, so you will need to edit the source.
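For readers who want a quick data set without building the extra programs, a swiss roll can be sketched in a few lines of Python. This is not the repository's generator; the angle range, the height of 21, and the function name are common illustrative choices and are assumptions here.

```python
import math
import random

def swiss_roll(n_points, seed=0):
    """Sample points on a swiss roll: a spiral in x-z extended along y."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # Angle along the spiral, between 1.5*pi and 4.5*pi.
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())
        height = 21.0 * rng.random()
        points.append((t * math.cos(t), height, t * math.sin(t)))
    return points

roll = swiss_roll(1000)
```

The spiral radius equals the angle `t`, so coloring points by `t` is one way to show where each point sits relative to the center of the roll.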

Examples

A few examples using this program.

Compare the swiss roll and punctured sphere results with those found in this paper, specifically in Section 3.1. Note that my value of bandwidth is the square of what they call sigma (I am not squaring the denominator of the Gaussian kernel in my code).
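The relationship between the two conventions can be checked numerically. In the Python sketch below, both kernel forms are assumptions: any constant factor in the denominator (such as a 2) is omitted, and only the squared versus unsquared denominator is illustrated.

```python
import math

def kernel_bandwidth(distance, bandwidth):
    """Gaussian kernel with an unsquared denominator (this code's convention)."""
    return math.exp(-distance ** 2 / bandwidth)

def kernel_sigma(distance, sigma):
    """Gaussian kernel with a squared denominator (sigma convention)."""
    return math.exp(-distance ** 2 / sigma ** 2)

sigma = 0.5
# Setting bandwidth = sigma**2 makes the two kernels agree.
same = math.isclose(kernel_bandwidth(1.2, sigma ** 2), kernel_sigma(1.2, sigma))
```

So a paper value of sigma = 0.5 would correspond to a bandwidth of 0.25 under these assumed forms.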

Cluster of points

Colors indicate each point's position along the axis of greatest variance.

Original data

Principal component analysis

Diffusion maps

Multiple clusters

Colors indicate original cluster.

Original data

Principal component analysis

Diffusion maps

Swiss roll

Colors indicate each point's position relative to the center of the swiss roll.

Original data

Principal component analysis

Diffusion maps

Punctured sphere

Colors indicate each point's position along the axis that passes through the holes in the sphere.

Original data

Principal component analysis

Diffusion maps

Simulation of octane in water

The original data is from a molecular dynamics simulation I performed of a single octane in water. For the diffusion map calculation I used the RMSD between each pair of simulation snapshots of the octane as the distance metric (1,000 snapshots total). For the principal component analysis I used the dihedral angles as the input coordinates. The colors indicate the radius of gyration of the octane. Compare these results with Figure S2.C from this paper's SI (PDF).
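A pairwise RMSD distance matrix of the kind described above can be sketched in Python as follows. This is an illustration, not the code in the repository: it assumes the snapshots have already been superimposed (no fitting is done per pair), that atoms appear in the same order in every frame, and the frame data shown are hypothetical.

```python
import math

def rmsd(frame_a, frame_b):
    """RMSD between two snapshots with identical atom ordering.

    Assumes the frames are already superimposed; no fitting is done here.
    """
    sq = sum((a - b) ** 2
             for atom_a, atom_b in zip(frame_a, frame_b)
             for a, b in zip(atom_a, atom_b))
    return math.sqrt(sq / len(frame_a))

def distance_matrix(frames):
    """Symmetric matrix of RMSD values between every pair of frames."""
    n = len(frames)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = rmsd(frames[i], frames[j])
    return d

# Hypothetical two-atom frames, each atom given as (x, y, z).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
]
D = distance_matrix(frames)
```

Only the upper triangle is computed and then mirrored, since the RMSD between two frames does not depend on their order.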

The alkane branch contains the modified code that performs these calculations. The original simulation trajectory is too large to post here. To reproduce the data, run the simulation in GROMACS with this input file. Then use gmx trjconv to fit out the octane's translational and rotational motion, saving only the octane's coordinates. Use the output coordinate file (xtc) as the input for this analysis. By default the simulation outputs 10,000 frames, so you may want to use fewer frames for the diffusion map analysis, which is very memory intensive.
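To get a feel for the memory cost, the Python sketch below estimates the size of one dense frame-by-frame matrix and shows a simple stride-based way to thin a frame list. It assumes 8-byte double-precision values and a single matrix in memory; the actual program may hold several such matrices at once, and the helper names are hypothetical.

```python
def dense_matrix_bytes(n_frames, bytes_per_value=8):
    """Bytes needed for one dense n_frames x n_frames matrix."""
    return n_frames * n_frames * bytes_per_value

def subsample(frames, stride):
    """Keep every stride-th frame."""
    return frames[::stride]

for n in (1_000, 10_000):
    print(n, "frames:", dense_matrix_bytes(n) / 1e6, "MB")

# Thin 10,000 frame indices down to 1,000 by keeping every 10th.
frame_indices = list(range(10_000))
kept = subsample(frame_indices, 10)
```

Under these assumptions one matrix for 10,000 frames takes about 800 MB, compared with about 8 MB for 1,000 frames, because the cost grows with the square of the frame count.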

