[WIP] Analysis #38

lilyminium · 2020-02-07T00:12:26Z

Closes #4, #7, most of #29

This is taking much longer than I thought it would. I would appreciate any feedback on what I have so far, what I should prioritise next, and also any changes I should make before I try to finish the rest of the notebooks.

Every notebook aims to show a useful use case of a class or function in MDAnalysis, point out variables that users will likely need to change, visualise the results, and interpret them in some way.

Notebooks on Analysis modules

Other notebooks added

References and executing the notebooks

In #34 I mentioned that a pro of having examples in Jupyter notebooks is that the examples get checked by executing the notebook. To ensure that the notebooks get checked every time the documentation is built, I've written a script scripts/clean_example_notebooks.py that takes in a list of notebooks. For each notebook, it:

Goes through and collects references
Converts any shorthand references to an inline citation that links to the paper (in the notebook) or the actual reference (online HTML)
Builds a bibliography cell at the end of the notebook
Executes the notebook
On success, adds or updates a Last executed line to the first cell
If successful, rewrites the notebook.
On failure, it collects all the errors and raises them all at once. So doc building should fail.

This script already successfully fails on the harmonic ensemble clustering notebook. In this notebook, the shorthand references haven't been converted yet.
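The collect-then-raise behaviour in the last step could be sketched roughly as below. This is not the code in scripts/clean_example_notebooks.py: `run_all`, `NotebookErrors`, and the injected `execute` callable are hypothetical names, and `execute` stands in for whatever actually runs a notebook.

```python
from typing import Callable, Iterable


class NotebookErrors(Exception):
    """Raised once, after every notebook has been tried (hypothetical name)."""

    def __init__(self, failures: dict):
        self.failures = failures
        lines = [f"{path}: {err!r}" for path, err in failures.items()]
        super().__init__("notebooks failed to execute:\n" + "\n".join(lines))


def run_all(paths: Iterable[str], execute: Callable[[str], None]) -> list:
    """Run `execute` on each notebook path; return the paths that succeeded.

    Errors are collected instead of stopping at the first failure, then
    raised together so a docs build fails with the complete list.
    """
    succeeded, failures = [], {}
    for path in paths:
        try:
            execute(path)
        except Exception as err:  # keep going; report everything at the end
            failures[path] = err
        else:
            succeeded.append(path)
    if failures:
        raise NotebookErrors(failures)
    return succeeded
```

Passing the executor in as a callable keeps the error handling separate from the notebook machinery, so the reporting logic can be checked without starting a kernel.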

Caveats

The NGLView widget can only save the molecule data if it is displayed in a browser. These notebooks are not executed in a browser with the script, and there are only a few of them, so right now I just re-run them after executing through the script. If this becomes too annoying I can convert to selenium.

(The script will print which notebooks have NGLView in them and need to be re-executed when it's done.)
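One way to flag the notebooks that need a manual re-run is to scan the code cells for the string "nglview". The sketch below assumes the standard nbformat v4 JSON layout; `uses_nglview` is a hypothetical helper, not necessarily how the script does it.

```python
import json


def uses_nglview(notebook_json: str) -> bool:
    """Return True if any code cell's source mentions nglview."""
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source as either a string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        if "nglview" in source.lower():
            return True
    return False
```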

References

nbsphinx can handle citations through sphinxcontrib-bibtex, so I moved the user guide references to a bibtex file references.bib. Pasting citations gets quite tedious when making notebooks, so I use a shorthand with the pattern '#{bibtex-key}'. e.g. #michaud-agrawal_mdanalysis_2011 gets converted to the reference with a bibtex key of michaud-agrawal_mdanalysis_2011. The full HTML in the notebook looks like <a data-cite="michaud-agrawal_mdanalysis_2011" href="https://doi.org/10.1002/jcc.21787">Michaud-Agrawal *et al.*, 2011</a>

The inline display in the browser looks quite ugly (e.g. first cell here) but I'm not sure how to change that in sphinxcontrib-bibtex.
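The shorthand-to-citation conversion could look something like the following. It is a minimal sketch: `BIBLIOGRAPHY` is a hypothetical stand-in for entries parsed from references.bib, and `expand_citations` is an illustrative name rather than a function in the script.

```python
import re

# Hypothetical stand-in for entries parsed from references.bib:
# bibtex key -> (DOI URL, display label)
BIBLIOGRAPHY = {
    "michaud-agrawal_mdanalysis_2011": (
        "https://doi.org/10.1002/jcc.21787",
        "Michaud-Agrawal *et al.*, 2011",
    ),
}

# '#' followed by a bibtex-style key (letters, digits, '_' and '-')
SHORTHAND = re.compile(r"#([A-Za-z0-9_-]+)")


def expand_citations(text: str, bibliography: dict = BIBLIOGRAPHY) -> str:
    """Replace '#key' with an <a data-cite=...> tag when the key is known."""

    def _replace(match):
        key = match.group(1)
        if key not in bibliography:
            return match.group(0)  # e.g. issue numbers like '#34' stay as-is
        url, label = bibliography[key]
        return f'<a data-cite="{key}" href="{url}">{label}</a>'

    return SHORTHAND.sub(_replace, text)
```

Only keys found in the bibliography are rewritten, so other uses of `#` in a notebook (issue numbers, markdown headings) are left untouched.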

Structure of docs

Right now, the notebooks are all displayed in Examples and in the sidebar of Analysis. It can be a bit distracting to click on a notebook from the main Analysis page, and suddenly realise you've moved to the Examples section in the sidebar. I might fiddle with this later.

lilyminium · 2020-02-07T00:19:06Z

As always, a built version is at https://lilyminium.github.io/UserGuide/

orbeckst · 2020-02-07T17:55:19Z

RE: PSA — we did a more real-life example in https://www.mdanalysis.org/SPIDAL-MDAnalysis-Midas-tutorial/ and the data set is available as https://www.mdanalysis.org/MDAnalysisData/adk_transitions.html ; @sseyler created a tutorial in 2015 https://github.com/Becksteinlab/PSAnalysisTutorial but this might be out-of-date with the current code.

By the way, finding pairs is a useful tool once one wants to know why two trajectories differ.

I don't suggest that you change the example that you have. But perhaps add links to the other PSA tutorials, with a note that these show more complicated cases and more advanced functions but might be somewhat out of date. If we have someone particularly interested in PSA then they can update the User Guide.

orbeckst · 2020-02-07T17:56:45Z

MDAnalysis.analysis.hbonds is indeed going to be only in 1.0 and we recommend MDAnalysis.analysis.hydrogenbonds. Don't bother adding examples with hbonds.

orbeckst · 2020-02-07T17:59:37Z

It seems pretty clear that we will go to 1.0.0 and that we're not bothering with 0.21.0 — @richardjgowers ?? — so change references from 0.21 to 1.0. In general, I would say you can simply state that the User Guide requires MDAnalysis 1.0 or better. If you link to specific classes such as AverageStructure then they have their own versionadded notes.

orbeckst · 2020-02-07T18:06:45Z

https://lilyminium.github.io/UserGuide/examples/analysis/alignment_and_rms/rmsf.html#Plotting-RMSF : "As we can see, the LID and NMP residues indeed move much more compared to the rest of the enzyme."

I would not say this about the PSF/DCD data from the tests. It's a non-equilibrium trajectory that is biased to make a transition, so computing fluctuations on it seems inappropriate. A better example would be a real AdK equilibrium trajectory https://www.mdanalysis.org/MDAnalysisData/adk_equilibrium.html

If you want to stick with the test trajectory then I would say that the apparent fluctuations in these domains are high but in this particular case it is due to the fact that these domains move from their closed to their open conformation. Note that this trajectory does not sample from the equilibrium distribution so it's not really meaningful to calculate these fluctuations (but we do it anyway for demonstration purposes).

richardjgowers · 2020-02-07T19:54:33Z

Yeah next is 1.0 but I’m gonna convert all 0.21.0 to 1.0 so either works.

lilyminium · 2020-02-08T11:23:31Z

@orbeckst Thanks for the comments, I wasn't aware of the MDAnalysisData package. This makes things a lot easier.

In general, I would say you can simply state that the User Guide requires MDAnalysis 1.0 or better.

Should I remove the Minimum version lines from the first cell then? I was also wondering if the Last updated was overkill.

Edit: I mostly look at this as a Jupyter notebook where the first cell looks neat, but all the info takes up quite a lot of space in the HTML conversion.

vs

orbeckst · 2020-02-17T17:30:46Z

doc/source/examples/analysis/polymers_and_membranes/README.rst

+
+   /examples/analysis/polymers_and_membranes/polymer
+   /examples/analysis/polymers_and_membranes/hole
+   /examples/analysis/polymers_and_membranes/hole2


The hole2 docs show up with the same title as the hole ones. Perhaps put "Deprecated", "Old", or something else in the title of the old module?

References in the hole notebooks did not resolve, e.g. the Note:

The classes in MDAnalysis.analysis.hole are wrappers around the HOLE program. Please cite ([smart_pore_1993], [smart_hole_1996]) when using this module in published work.

Hole2 docs still contain reference to HOLEtraj

One of the limitations of the hole program is that it can only accept PDB files. In order to use other formats with hole, or to run hole on trajectories, we can use the hole.HOLEtraj class with an MDAnalysis.Universe

and ht.profiles instead of ha.

The gif of the hole surface is cute but the gA channel is not properly shown in VMD because of the distortions along the normal mode. To get a nice representation do the following:

import MDAnalysis as mda
from MDAnalysis.tests import datafiles as data

# write out the structure without ETA (which elNemo ignored :-p)
u = mda.Universe(data.PDB_HOLE)
u.select_atoms("not resname ETA").write("elNemo_ga.pdb")

Use the above

vmd elNemo_ga.pdb path/to/1grm_elNemo_mode7.pdb

and delete the first frame in the trajectory.

We need to remove ETA (the ethanolamine C-terminus) from the original PDB file because the elNemo trajectory does not contain it and so you cannot directly use PDB_HOLE as the "topology" for MULTI_PDB_HOLE.

This should give you a nicer representation of the channel around the pore.

Can you use your new plot_order_parameters(..., aggregator=...) function to do the most common application of a hole trajectory profile, the "average HOLE profile" with "mean(R) ± std(R)" (as at the end of https://github.com/MDAnalysis/binder-notebook/blob/master/notebooks/analysis/hole-basics.ipynb )?

No, but you could plot the standard deviation with fill_between on the axis returned from aggregator=np.mean, or just make the plot yourself with the data returned from over_order_parameters. I could add another plotting function for this?
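For reference, the "mean(R) ± std(R)" band itself is a short NumPy computation. The sketch below assumes the per-frame radii have already been placed on a common pore coordinate; `mean_and_band` and the toy data are illustrative only and are not part of the hole analysis API.

```python
import numpy as np


def mean_and_band(radii: np.ndarray):
    """Return (mean, lower, upper) across frames for each bin.

    `radii` has shape (n_frames, n_bins): one pore-radius profile per frame,
    already interpolated onto a common pore coordinate (an assumption here).
    """
    mean = radii.mean(axis=0)
    std = radii.std(axis=0)
    return mean, mean - std, mean + std


# Toy data: 3 frames, 4 bins along the pore axis
radii = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 3.0, 6.0],
    [3.0, 2.0, 3.0, 8.0],
])
mean, lower, upper = mean_and_band(radii)
# With matplotlib, the band would be drawn on an existing axis `ax` as:
#   ax.plot(z, mean); ax.fill_between(z, lower, upper, alpha=0.3)
```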

lilyminium · 2020-02-19T14:52:24Z

Thanks @orbeckst. Is this approximately what you were thinking? (Is there a way to use GROMACS to wrap molecules back into the unit cell?)

richardjgowers · 2020-02-19T15:07:33Z

Those embedded nglviews are awesome!

orbeckst · 2020-02-19T19:28:28Z

Awesome, I am so happy to see the transform + density workflow actually working (probably glacially slow but that's just an engineering problem ;-) ).

Apparently, nglview can display density data, at least it is mentioned in the API docs. Maybe @arose can give us a hint how to load a DX file ... or directly use a NumPy array with a density?

orbeckst · 2020-02-19T19:32:02Z

(Is there a way to use GROMACS to wrap molecules back into the unit cell?)
The standard Gromacs workflow is

printf "protein\nsystem\n"| gmx trjconv -f md.xtc -s md.tpr -o md_centered.xtc -pbc mol -center -ur compact
printf "backbone\nsystem\n" | gmx trjconv -s md.tpr -f md_centered.xtc -o md_fit.xtc -fit rot+trans

The -pbc mol -center step will pack all molecules around whatever is specified as the center (in my example the protein) and make all molecules whole. This means that some water molecules or lipids will poke outside the actual box boundaries but that's generally what you want to end up with. Once this is done, the rotational/translational superposition is performed on the centered system.

EDIT: Gromacs also has the really nice -ur compact feature where it converts the triclinic unitcell into a compact representation, namely truncated octahedron or rhombic dodecahedron. MDAnalysis cannot do this at the moment.

arose · 2020-02-19T19:45:33Z

Maybe @arose can give us a hint how to load a DX file ...

I think view.add_component('my.dx') should be enough. There is no way to directly use a numpy array. Instead of .dx I would suggest to use a binary format like .ccp4 for faster parsing if you don't already have the dx file. Note that there is also a binary dx format variant.

orbeckst · 2020-02-19T19:53:11Z

Thanks!

In principle we can write ccp4 ... unless there's a bug in the writer MDAnalysis/GridDataFormats#76

lilyminium · 2020-06-07T11:30:40Z

This has merge conflicts up the wazoo and I'm not sure how many of the notebooks are even accurate to behaviour anymore; it's probably best to separate each one out and merge one-by-one after checking. I do still have them up at my fork for easier viewing.

lilyminium · 2020-06-07T11:32:14Z

I can do this over the next few days myself but help is very welcome.

lilyminium · 2020-12-29T21:15:16Z

Superseded by #125 and #126.

lilyminium added 15 commits December 28, 2019 18:45

added rms

6773912

undid ignore notebooks

d05c657

added note boxes

31aed4d

added psa

48d939d

added trajectory similarity

1f425d2

added rst

43d3a63

reduce menu level

aa5452f

updated hes

2b27a53

added more analysis

cc0f568

started rdf

b986abc

more rdf

116abe5

added gifs

ce03523

wrote script to execute and clean notebooks

9661bb2

reorganised notebooks

e60dc5f

fixed merge conflict

971774f

lilyminium added 3 commits February 7, 2020 16:06

updated trajectories to html

286b07c

refresh transformations notebook

ced30dd

moved transformations back

2b73a88

lilyminium added 4 commits February 10, 2020 12:17

updated rmsf to use equilibrium data

dcd4316

updated dependencies

e8627ec

added hole

a51f97a

updated hole

2bee421

orbeckst reviewed Feb 17, 2020

Merge branch 'master' into distances

fd9596d

lilyminium added 2 commits February 21, 2020 00:13

fixed hole link

dbcbdc9

Merge branch 'distances' of github.com:MDAnalysis/UserGuide into dist…

ad26988

…ances

lilyminium force-pushed the distances branch from f076616 to ad26988 Compare February 20, 2020 13:34

lilyminium added 5 commits February 22, 2020 13:45

started dmap

a434db7

added dmap entry

1b1d66f

saved widget state

90accb4

added plotly js to nbsphinx

975474a

fixed notebook typos

89ca0f3

lilyminium added the version-1.0 Contains 1.0 features label Feb 24, 2020

lilyminium mentioned this pull request Feb 24, 2020

Page on trajectories and on-the-fly transformations #40

Merged

lilyminium added 3 commits February 29, 2020 15:09

updated hole nb

701bfa8

updated pore gif

91a2120

updated pairwise tutorial

fe7fc98

This was referenced Feb 29, 2020

Analysis notebooks for 0.20.1 #52

Merged

NGLView widgets showing up twice #53

Closed

lilyminium added 3 commits March 10, 2020 18:32

added dssp

418a226

added helanal

87c4484

added helanal images

791774f

orbeckst mentioned this pull request Jul 2, 2020

Typo in RMSF docstring MDAnalysis/mdanalysis#2806

Closed

lilyminium closed this Dec 29, 2020
