[WIP] Analysis #38
As always, a built version is at https://lilyminium.github.io/UserGuide/
Regarding PSA: we did a more real-life example in https://www.mdanalysis.org/SPIDAL-MDAnalysis-Midas-tutorial/ and the data set is available as https://www.mdanalysis.org/MDAnalysisData/adk_transitions.html ; @sseyler created a tutorial in 2015, https://github.com/Becksteinlab/PSAnalysisTutorial , but this might be out of date with the current code. By the way, finding pairs is a useful tool once one wants to know why two trajectories differ. I don't suggest that you change the example that you have. But perhaps add links to the other PSA tutorials, with a note that these show more complicated cases and more advanced functions but might be somewhat out of date. If we have someone particularly interested in PSA then they can update the User Guide.
It seems pretty clear that we will go to 1.0.0 and that we're not bothering with 0.21.0 (@richardjgowers ??), so change references from 0.21 to 1.0. In general, I would say you can simply state that the User Guide requires MDAnalysis 1.0 or better. If you link to specific classes such as AverageStructure then they have their own versionadded notes.
I would not say this about the PSF/DCD data from the tests. It's a non-equilibrium trajectory that is biased to make a transition, so computing fluctuations on it seems inappropriate. A better example would be a real AdK equilibrium trajectory, https://www.mdanalysis.org/MDAnalysisData/adk_equilibrium.html . If you want to stick with the test trajectory then I would say that the apparent fluctuations in these domains are high, but in this particular case that is because these domains move from their closed to their open conformation. Note that this trajectory does not sample from the equilibrium distribution, so it's not really meaningful to calculate these fluctuations (but we do it anyway for demonstration purposes).
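To make the point about the biased trajectory concrete, here is a minimal NumPy sketch on made-up coordinates (not the AdK data, and not MDAnalysis's own RMSF class). It assumes one atom that only drifts steadily, as in a closed-to-open transition, and one atom that only oscillates about a fixed mean. The `rmsf` helper and the array layout are illustrative assumptions.

```python
import numpy as np


def rmsf(positions):
    """Per-atom RMSF for an array of shape (n_frames, n_atoms, 3)."""
    mean_pos = positions.mean(axis=0)
    # squared displacement of every atom from its mean position, per frame
    sq_dev = ((positions - mean_pos) ** 2).sum(axis=2)
    return np.sqrt(sq_dev.mean(axis=0))


n_frames = 101
t = np.linspace(0.0, 1.0, n_frames)
positions = np.zeros((n_frames, 2, 3))

# Atom 0 (synthetic): steady 10 Å drift along x, no fluctuation at all
positions[:, 0, 0] = 10.0 * t
# Atom 1 (synthetic): 0.5 Å oscillation about a fixed mean position
positions[:, 1, 0] = 0.5 * np.sin(2 * np.pi * 5 * t)

values = rmsf(positions)
print("drifting atom RMSF:   ", values[0])
print("oscillating atom RMSF:", values[1])
```

The drifting atom ends up with a much larger RMSF than the oscillating one even though it never fluctuates, which is why high values on the transition trajectory should be read as motion between conformations rather than as equilibrium flexibility.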
Yeah, next is 1.0, but I'm gonna convert all 0.21.0 to 1.0, so either works.
…On Fri, Feb 7, 2020 at 18:06, Oliver Beckstein wrote:
https://lilyminium.github.io/UserGuide/examples/analysis/alignment_and_rms/rmsf.html#Plotting-RMSF : "As we can see, the LID and NMP residues indeed move much more compared to the rest of the enzyme."
@orbeckst Thanks for the comments, I wasn't aware of the MDAnalysisData package. This makes things a lot easier.
Should I remove the Minimum version lines from the first cell then? I was also wondering if the Last updated was overkill. Edit: I mostly look at this as a Jupyter notebook where the first cell looks neat, but all the info takes up quite a lot of space in the HTML conversion.
/examples/analysis/polymers_and_membranes/polymer
/examples/analysis/polymers_and_membranes/hole
/examples/analysis/polymers_and_membranes/hole2
The hole2 docs show up with the same title as the hole ones. Perhaps put "Deprecated", "Old", or something else in the title of the old module?
References in the hole notebooks did not resolve, e.g. in the Note:
The classes in MDAnalysis.analysis.hole are wrappers around the HOLE program. Please cite ([smart_pore_1993], [smart_hole_1996]) when using this module in published work.
The hole2 docs still contain references to HOLEtraj:
One of the limitations of the hole program is that it can only accept PDB files. In order to use other formats with hole, or to run hole on trajectories, we can use the hole.HOLEtraj class with an MDAnalysis.Universe
and ht.profiles instead of ha.
The gif of the hole surface is cute but the gA channel is not properly shown in VMD because of the distortions along the normal mode. To get a nice representation do the following:
import MDAnalysis as mda
from MDAnalysis.tests import datafiles as data
# write out the structure without ETA (which elNemo ignored :-p)
u = mda.Universe(data.PDB_HOLE)
u.select_atoms("not resname ETA").write("elNemo_ga.pdb")
Use the above file with
vmd elNemo_ga.pdb path/to/1grm_elNemo_mode7.pdb
and delete the first frame in the trajectory.
We need to remove ETA (the ethanolamine C-terminus) from the original PDB file because the elNemo trajectory does not contain it, so you cannot directly use PDB_HOLE as the "topology" for MULTI_PDB_HOLE.
This should give you a nicer representation of the channel around the pore.
Can you use your new plot_order_parameters(..., aggregator=...) function to do the most common application of a hole trajectory profile, the "average HOLE profile" with "mean(R) ± std(R)" (as at the end of https://github.com/MDAnalysis/binder-notebook/blob/master/notebooks/analysis/hole-basics.ipynb )?
No, but you could plot the standard deviation with fill_between on the axis returned from aggregator=np.mean, or just make the plot yourself with the data returned from over_order_parameters. I could add another plotting function for this?
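As a rough illustration of the "make the plot yourself" route, here is a sketch of a mean(R) ± std(R) band. It does not use the hole API: it assumes the per-frame radii have already been put on a common pore-coordinate grid `z` and stacked into an array of shape (n_frames, n_points), and both helper names are made up for this example.

```python
import numpy as np


def mean_std_profile(radii):
    """Mean and standard deviation over frames for radii of shape (n_frames, n_points)."""
    radii = np.asarray(radii, dtype=float)
    return radii.mean(axis=0), radii.std(axis=0)


def plot_mean_std(ax, z, radii, color="C0"):
    """Draw mean(R) as a line and mean(R) ± std(R) as a shaded band on a matplotlib axis."""
    mean, std = mean_std_profile(radii)
    ax.plot(z, mean, color=color, label="mean(R)")
    ax.fill_between(z, mean - std, mean + std, color=color, alpha=0.3,
                    label="mean(R) ± std(R)")
    ax.set_xlabel("pore coordinate")
    ax.set_ylabel("pore radius")
    ax.legend()
    return ax


# Tiny synthetic example: two frames, two points along the pore
example_mean, example_std = mean_std_profile([[1.0, 2.0], [3.0, 4.0]])
```

With matplotlib available, `fig, ax = plt.subplots()` followed by `plot_mean_std(ax, z, radii)` would draw the band. If the per-frame profiles are sampled at different points along the pore, they would need binning or interpolation onto a shared grid first.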
Thanks @orbeckst. Is this approximately what you were thinking? (Is there a way to use GROMACS to wrap molecules back into the unit cell?)
Those embedded nglviews are awesome!
Awesome, I am so happy to see the transform + density workflow actually working (probably glacially slow but that's just an engineering problem ;-) ). Apparently, nglview can display density data, at least it is mentioned in the API docs. Maybe @arose can give us a hint how to load a DX file ... or directly use a NumPy array with a density?
The EDIT: Gromacs also has the really nice
I think
Thanks! In principle we can write ccp4 ... unless there's a bug in the writer MDAnalysis/GridDataFormats#76
f076616 to ad26988
This has merge conflicts up the wazoo and I'm not sure how many of the notebooks are even accurate to behaviour anymore; it's probably best to separate each one out and merge one-by-one after checking. I do still have them up at my fork for easier viewing.
I can do this over the next few days myself but help is very welcome.
Closes #4, #7, most of #29
This is taking much longer than I thought it would. I would appreciate any feedback on what I have so far, what I should prioritise next, and also any changes I should make before I try to finish the rest of the notebooks.
Every notebook aims to show a useful use case of a class or function in MDAnalysis, point out variables that users will likely need to change, visualise the results, and interpret them in some way.
Notebooks on Analysis modules
match_atoms (new in 0.21.0 or 1.0)
[ ] PSAPair: don't think people will use this much
[ ] Pair: don't think people will use this much
[ ] confdistmatrix: not for casual users
[ ] covariance_matrix: not for casual users
[ ] bootstrap: not for casual users
[ ] hbond_analysis: soon to be deprecated?
[x] HOLE
[x] HOLEtraj
density_from_Universe
[ ] density_from_PDB
Other notebooks added
References and executing the notebooks
In #34 I mentioned that a pro of having examples in Jupyter notebooks is that the examples get checked by executing the notebook. To ensure that the notebooks get checked every time the documentation is built, I've written a script, scripts/clean_example_notebooks.py, that takes in a list of notebooks. For each notebook, it:
This script already successfully fails on the harmonic ensemble clustering notebook. In this notebook, the shorthand references haven't been converted yet.
Caveats
The NGLView widget can only save the molecule data if it is displayed in a browser. These notebooks are not executed in a browser with the script, and there are only a few of them, so right now I just re-run them after executing through the script. If this becomes too annoying I can convert to selenium.
(The script will print which notebooks have NGLView in them and need to be re-executed when it's done.)
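The "which notebooks need re-executing" check could look roughly like the following. This is only a standard-library sketch, not the contents of scripts/clean_example_notebooks.py: it assumes that a notebook needs a manual re-run whenever any code cell mentions nglview, and the function names are invented for illustration.

```python
import json


def uses_nglview(notebook):
    """Return True if any code cell of a parsed .ipynb dict mentions nglview."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # the notebook format allows the source to be a string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        if "nglview" in source.lower():
            return True
    return False


def notebooks_to_rerun(paths):
    """Return the subset of notebook paths that contain NGLView code."""
    rerun = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            if uses_nglview(json.load(handle)):
                rerun.append(path)
    return rerun
```

Calling `notebooks_to_rerun` on the same list of notebooks that was passed to the cleaning script would give the names to print at the end.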
References
nbsphinx can handle citations through sphinxcontrib-bibtex, so I moved the user guide references to a bibtex file, references.bib. Pasting citations gets quite tedious when making notebooks, so I use a shorthand with the pattern '#{bibtex-key}', e.g. #michaud-agrawal_mdanalysis_2011 gets converted to the reference with a bibtex key of michaud-agrawal_mdanalysis_2011. The full HTML in the notebook looks like
<a data-cite="michaud-agrawal_mdanalysis_2011" href="https://doi.org/10.1002/jcc.21787">Michaud-Agrawal *et al.*, 2011</a>
The inline display in the browser looks quite ugly (e.g. first cell here) but I'm not sure how to change that in sphinxcontrib-bibtex.
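For anyone curious how the '#{bibtex-key}' shorthand could be expanded, here is a minimal sketch using only the standard library. It is not the actual conversion code: the `REFERENCES` table is a hypothetical stand-in for whatever gets parsed out of references.bib, and the function name is made up.

```python
import re

# Hypothetical lookup table; in practice this would be built from references.bib.
REFERENCES = {
    "michaud-agrawal_mdanalysis_2011": {
        "href": "https://doi.org/10.1002/jcc.21787",
        "label": "Michaud-Agrawal *et al.*, 2011",
    },
}

# '#' immediately followed by letters, digits, underscores or hyphens
SHORTHAND = re.compile(r"#([\w-]+)")


def expand_citations(text, references=REFERENCES):
    """Replace '#key' with a data-cite link when the key is a known reference."""

    def _replace(match):
        key = match.group(1)
        ref = references.get(key)
        if ref is None:
            # leave issue numbers such as #38 and unknown keys untouched
            return match.group(0)
        return '<a data-cite="{}" href="{}">{}</a>'.format(
            key, ref["href"], ref["label"]
        )

    return SHORTHAND.sub(_replace, text)


converted = expand_citations("MDAnalysis #michaud-agrawal_mdanalysis_2011 is used in #38.")
```

Only keys found in the lookup table are rewritten, so a notebook that still contains an unconverted shorthand after this step points to a key missing from the bibtex file.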
Structure of docs
Right now, the notebooks are all displayed in Examples and in the sidebar of Analysis. It can be a bit distracting to click on a notebook from the main Analysis page, and suddenly realise you've moved to the Examples section in the sidebar. I might fiddle with this later.