adding parts to the users guide regarding the Default guesser and guess_topologyAttributes #213

Open
wants to merge 5 commits into
base: develop
72 changes: 72 additions & 0 deletions doc/source/formats/guessers/default_guesser.rst
@@ -0,0 +1,72 @@
.. -*- coding: utf-8 -*-
.. _default-guesser:

====================
Default guesser
====================

The default guesser class contains generic methods that are not tied to any specific context. It is useful for general-purpose guessing in universes that have no special cases.


.. _guessing-masses:

Masses
======

Atom masses are always guessed for every file format. They are guessed from the ``Atom.atom_type``. This attribute represents a number of different values in MDAnalysis, depending on which file format you used to create your Universe. ``Atom.atom_type`` can be force-field specific atom types, from files that provide this information; or it can be an element, guessed from the atom name. `See further discussion here. <https://github.com/MDAnalysis/mdanalysis/issues/2348>`_


.. important::

When an atom mass cannot be guessed from the atom ``atom_type`` or ``name``, the atom is assigned a mass of 0.0. Masses are guessed atom-by-atom, so even if most atoms have been guessed correctly, it is possible that some have been given masses of 0. It is important to check for non-zero masses before using methods that rely on them, such as :meth:`AtomGroup.center_of_mass`.
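
The check recommended above can be sketched as follows. This is a minimal, library-free illustration: the helper name ``atoms_with_zero_mass`` and the sample data are hypothetical, and plain Python lists stand in for the arrays a Universe would provide.

```python
def atoms_with_zero_mass(names, masses):
    """Return (index, name) pairs for atoms whose guessed mass is 0.0."""
    return [
        (index, name)
        for index, (name, mass) in enumerate(zip(names, masses))
        if mass == 0.0
    ]


# Hypothetical data: the third atom has an unrecognised name,
# so its mass was left at 0.0 while the others were guessed.
names = ["N", "CA", "XX1", "O"]
masses = [14.007, 12.011, 0.0, 15.999]

missing = atoms_with_zero_mass(names, masses)
print(missing)
```

With a real Universe ``u``, the same idea applies to the ``u.atoms.masses`` array, for example by selecting ``u.atoms[u.atoms.masses == 0]`` before calling a mass-dependent method.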


Types
=====

When atom ``atom_type``\ s are guessed, they represent the atom element. Atom types are always guessed from the atom name. MDAnalysis follows biological naming conventions, where atoms named "CA" are much more likely to represent an alpha-carbon than a calcium atom. This guesser is still relatively fragile for non-traditionally biological atom names.
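
To make the "CA" example concrete, here is a deliberately simplified toy, not the MDAnalysis implementation. The lookup tables, the function name ``guess_element_from_name``, and the rule "prefer a common biological one-letter element first" are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny lookup tables.
COMMON_BIO_ELEMENTS = {"C", "H", "N", "O", "P", "S"}
TWO_LETTER_ELEMENTS = {"FE": "Fe", "ZN": "Zn", "MG": "Mg"}


def guess_element_from_name(name):
    """Guess an element symbol from an atom name; return '' if unknown."""
    # Drop digits and punctuation, e.g. "1HB2" -> "HB".
    letters = re.sub(r"[^A-Za-z]", "", name).upper()
    if not letters:
        return ""
    # Biological convention: a leading C/H/N/O/P/S wins, so "CA" is carbon.
    if letters[0] in COMMON_BIO_ELEMENTS:
        return letters[0]
    # Otherwise try a two-letter symbol such as "FE".
    return TWO_LETTER_ELEMENTS.get(letters[:2], "")


print(guess_element_from_name("CA"))    # alpha-carbon, read as carbon
print(guess_element_from_name("1HB2"))  # hydrogen
print(guess_element_from_name("FE2"))   # iron
print(guess_element_from_name("CL1"))   # this toy misreads a chloride as carbon
```

The last call shows why name-based guessing of this kind is fragile outside biological naming conventions: any rule that reads "CA" as carbon needs extra care to avoid misreading other names that start with the same letter.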

Bonds, Angles, Dihedrals, Impropers
====================================

MDAnalysis can guess if bonds exist between two atoms, based on the distance between them. A bond is created if the 2 atoms are within

.. math::

d < f \cdot (R_1 + R_2)

of each other, where :math:`R_1` and :math:`R_2` are the VdW radii
of the atoms and :math:`f` is an ad-hoc *fudge_factor*. This is
the `same algorithm that VMD uses`_.
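
The distance criterion can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the library's code: the function name ``sketch_guess_bonds``, the radii, and the fudge factor of 0.55 are illustrative choices, all atom pairs are tested (no neighbour search), and periodic boundaries are ignored.

```python
from itertools import combinations
from math import dist


def sketch_guess_bonds(positions, types, vdwradii, fudge_factor=0.55):
    """Return index pairs (i, j) with d < fudge_factor * (R_i + R_j)."""
    bonds = []
    for i, j in combinations(range(len(positions)), 2):
        cutoff = fudge_factor * (vdwradii[types[i]] + vdwradii[types[j]])
        if dist(positions[i], positions[j]) < cutoff:
            bonds.append((i, j))
    return bonds


# Illustrative radii in Angstrom and a water-like geometry.
vdwradii = {"O": 1.52, "H": 1.10}
types = ["O", "H", "H"]
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

bonds = sketch_guess_bonds(positions, types, vdwradii)
print(bonds)
```

Here both O-H distances (about 0.96) fall below the O-H cutoff of 0.55 × 2.62 ≈ 1.44, while the H-H distance (about 1.52) exceeds the H-H cutoff of 0.55 × 2.20 = 1.21, so only the two O-H pairs are reported.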

Angles can be guessed from the bond connectivity. MDAnalysis assumes that if atoms 1 & 2 are bonded, and 2 & 3 are bonded, then (1,2,3) must be an angle.

::

1
\
2 -- 3
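
The rule for angles can be written as a short sketch. The function name ``angles_from_bonds`` is hypothetical, and the code only illustrates the rule stated above: every pair of atoms bonded to a common centre atom forms an angle.

```python
from collections import defaultdict
from itertools import combinations


def angles_from_bonds(bonds):
    """Return sorted (first, centre, last) triples implied by the bonds."""
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    angles = set()
    for centre, bonded in neighbours.items():
        # Any two distinct atoms bonded to the same centre define an angle.
        for first, last in combinations(sorted(bonded), 2):
            angles.add((first, centre, last))
    return sorted(angles)


print(angles_from_bonds([(1, 2), (2, 3)]))
```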

Dihedral angles and improper dihedrals can both be guessed from angles. Proper dihedrals are guessed by assuming that if (1,2,3) is an angle, and 3 & 4 are bonded, then (1,2,3,4) must be a dihedral.

::

1 4
\ /
2 -- 3

Likewise, if (1,2,3) is an angle, and 2 & 4 are bonded, then (2, 1, 3, 4) must be an improper dihedral (i.e. the improper dihedral is the angle between the planes formed by (1, 2, 3) and (1, 3, 4)).

::

1
\
2 -- 3
/
4
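
Both dihedral rules can be sketched together. The helper names are hypothetical and the code is an illustration of the rules as stated here, not the library's implementation: a proper dihedral extends an angle at either end, and an improper dihedral adds a further neighbour of the central atom, using the (centre, first, last, extra) ordering from the text. Permutations of the same improper are not merged.

```python
from collections import defaultdict


def _neighbours(bonds):
    """Map each atom to the set of atoms it is bonded to."""
    table = defaultdict(set)
    for a, b in bonds:
        table[a].add(b)
        table[b].add(a)
    return table


def dihedrals_from_angles(angles, bonds):
    """Extend each angle (a, b, c) by an atom bonded to either end."""
    table = _neighbours(bonds)
    found = set()
    for a, b, c in angles:
        for d in table[c]:
            if d not in (a, b):
                quad = (a, b, c, d)
                found.add(min(quad, quad[::-1]))  # one direction per dihedral
        for d in table[a]:
            if d not in (b, c):
                quad = (d, a, b, c)
                found.add(min(quad, quad[::-1]))
    return sorted(found)


def impropers_from_angles(angles, bonds):
    """For each angle (a, b, c), add any other atom d bonded to centre b."""
    table = _neighbours(bonds)
    found = set()
    for a, b, c in angles:
        for d in table[b]:
            if d not in (a, c):
                found.add((b, a, c, d))
    return sorted(found)


print(dihedrals_from_angles([(1, 2, 3)], [(1, 2), (2, 3), (3, 4)]))
print(impropers_from_angles([(1, 2, 3)], [(1, 2), (2, 3), (2, 4)]))
```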

The method available to users is :meth:`AtomGroup.guess_bonds <MDAnalysis.core.groups.AtomGroup.guess_bonds>`, which allows users to pass in a dictionary of van der Waals radii for atom types. This guesses bonds, angles, and dihedrals (but not impropers) for the specified AtomGroup and adds them to the underlying Universe.


.. _`same algorithm that VMD uses`:
http://www.ks.uiuc.edu/Research/vmd/vmd-1.9.1/ug/node26.html
12 changes: 12 additions & 0 deletions doc/source/formats/guessers_list.rst
@@ -0,0 +1,12 @@
.. -*- coding: utf-8 -*-
.. _guessers-list:

====================
Guessers list
====================

.. toctree::
:maxdepth: 1
:glob:

guessers/*
69 changes: 12 additions & 57 deletions doc/source/formats/guessing.rst
@@ -5,67 +5,22 @@
Guessing
====================

When a Universe is created from a Universe, MDAnalysis guesses properties that have not been read from the file. Sometimes these properties are available in the file, but are simply not read by MDAnalysis. For example, :ref:`masses are always guessed <guessing-masses>`.
When a Universe is created from a file, MDAnalysis can guess properties that have not been read from the file. Sometimes these properties are available in the file, but are simply not read by MDAnalysis. For example, masses are always guessed.
The :mod:`~MDAnalysis.guesser` module contains different context-specific guessers. A guesser can be forcefield-specific, like :class:`~MDAnalysis.guesser.default_guesser.MartiniGuesser`, or format-specific, like :class:`~MDAnalysis.guesser.default_guesser.PDBGuesser`.
You can use a guesser either by instantiating it directly or through the :meth:`~MDAnalysis.core.universe.Universe.guess_TopologyAttributes` API of the universe, which guesses various properties for the universe. See :ref:`Guessing topology attributes <guessing-topology-attributes>` for details.

.. _guessing-masses:
.. _available-guessers:

Masses
======
Available guessers
===================

Atom masses are always guessed for every file format. They are guessed from the ``Atom.atom_type``. This attribute represents a number of different values in MDAnalysis, depending on which file format you used to create your Universe. ``Atom.atom_type`` can be force-field specific atom types, from files that provide this information; or it can be an element, guessed from the atom name. `See further discussion here. <https://github.com/MDAnalysis/mdanalysis/issues/2348>`_

Here is a list of the currently available context-specific guessers and the attributes each one can guess:

.. important::

When an atom mass cannot be guessed from the atom ``atom_type`` or ``name``, the atom is assigned a mass of 0.0. Masses are guessed atom-by-atom, so even if most atoms have been guessed correctly, it is possible that some have been given masses of 0. It is important to check for non-zero masses before using methods that rely on them, such as :meth:`AtomGroup.center_of_mass`.
+--------------------------------------------+-----------------------+-----------------------------------------------------------------------------------------------------------------+
| **guesser** | **context** | **to_guess** |
+--------------------------------------------+-----------------------+-----------------------------------------------------------------------------------------------------------------+
| :ref:`DefaultGuesser <default-guesser>` | :code:`default` | masses, atom types, elements, bonds, angles, dihedrals, improper dihedrals, aromaticities, gasteiger charges |
+--------------------------------------------+-----------------------+-----------------------------------------------------------------------------------------------------------------+


Types
=====

When atom ``atom_type``\ s are guessed, they represent the atom element. Atom types are always guessed from the atom name. MDAnalysis follows biological naming conventions, where atoms named "CA" are much more likely to represent an alpha-carbon than a calcium atom. This guesser is still relatively fragile for non-traditionally biological atom names.

Bonds, Angles, Dihedrals, Impropers
====================================

MDAnalysis can guess if bonds exist between two atoms, based on the distance between them. A bond is created if the 2 atoms are within

.. math::

d < f \cdot (R_1 + R_2)

of each other, where :math:`R_1` and :math:`R_2` are the VdW radii
of the atoms and :math:`f` is an ad-hoc *fudge_factor*. This is
the `same algorithm that VMD uses`_.

Angles can be guessed from the bond connectivity. MDAnalysis assumes that if atoms 1 & 2 are bonded, and 2 & 3 are bonded, then (1,2,3) must be an angle.

::

1
\
2 -- 3

Dihedral angles and improper dihedrals can both be guessed from angles. Proper dihedrals are guessed by assuming that if (1,2,3) is an angle, and 3 & 4 are bonded, then (1,2,3,4) must be a dihedral.

::

1 4
\ /
2 -- 3

Likewise, if (1,2,3) is an angle, and 2 & 4 are bonded, then (2, 1, 3, 4) must be an improper dihedral (i.e. the improper dihedral is the angle between the planes formed by (1, 2, 3) and (1, 3, 4))

::

1
\
2 -- 3
/
4

The method available to users is :meth:`AtomGroup.guess_bonds <MDAnalysis.core.groups.AtomGroup.guess_bonds>`, which allows users to pass in a dictionary of van der Waals' radii for atom types. This guesses bonds, angles, and dihedrals (but not impropers) for the specified AtomGroup and adds it to the underlying Universe.


.. _`same algorithm that VMD uses`:
http://www.ks.uiuc.edu/Research/vmd/vmd-1.9.1/ug/node26.html
29 changes: 28 additions & 1 deletion doc/source/universe.rst
@@ -126,12 +126,39 @@ For example, to construct a universe with 6 atoms in 2 residues:

`See this notebook tutorial for more information. <examples/constructing_universe.ipynb>`_

.. _guessing-topology-attributes:

----------------------------
Guessing topology attributes
----------------------------

MDAnalysis can guess two kinds of information. Sometimes MDAnalysis guesses information instead of reading it from certain file formats, which can lead to mistakes such as assigning atoms the wrong element or charge. See :ref:`the available topology parsers <topology-parsers>` for a case-by-case breakdown of which atom properties MDAnalysis guesses for each format. See :ref:`guessing` for how attributes are guessed, and :ref:`topologyattr-defaults` for which attributes have default values.
MDAnalysis has a guesser library that holds various guesser classes. Each guesser class is tailored to a specific context. For example, PDBGuesser is specific to guessing attributes for the PDB file format. See :ref:`guessing` for more details about the available context-aware guessers.
The Universe has the :meth:`~MDAnalysis.core.universe.Universe.guess_TopologyAttributes` API, which can guess an attribute within a specific context, either at universe creation or by calling the API directly.
For example, you can guess the ``elements`` attribute for a PDB file in either of two ways:

.. ipython:: python
:okwarning:

u = mda.Universe(PDB, context='PDB', to_guess=['elements'])

or

.. ipython:: python
:okwarning:

u = mda.Universe(PDB)
u.guess_TopologyAttributes(context='PDB', to_guess=['elements'])

**The following options modify how to guess attribute(s):**

* :code:`context`: the context of the guesser used to guess the attribute. You can pass either a string naming the context (see :ref:`guessing` for the available guessers and their contexts) or an instance of a guesser class. The default value is :code:`default`, which corresponds to the generic :class:`~MDAnalysis.guesser.default_guesser.DefaultGuesser` that is not specific to any context. You can pass a context once, and whenever you call :meth:`~MDAnalysis.core.universe.Universe.guess_TopologyAttributes` again it will assume that you are still using the same context. N.B.: if you never pass a ``context`` to the API, it uses the :class:`~MDAnalysis.guesser.default_guesser.DefaultGuesser`.

* :code:`to_guess`: a list of the attributes to be guessed. An attribute that does not exist in the universe is guessed in full; an attribute that already exists is guessed only partially, by filling in its empty values. Use the plural name of each attribute (``masses``, not ``mass``).
* :code:`force_guess`: a list of the attributes to be force-guessed. An attribute that does not exist in the universe is guessed in full; an attribute that already exists has all of its values overwritten by the guessed ones. Use the plural name of each attribute (``masses``, not ``mass``).
* :code:`**kwargs`: any supplemental data for the :meth:`~MDAnalysis.core.universe.Universe.guess_TopologyAttributes` API that can help in guessing some attributes (e.g. passing ``vdwradii`` for bond guessing).
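
The difference between :code:`to_guess` and :code:`force_guess` can be illustrated with a small library-free sketch. The function ``merge_guessed`` is hypothetical, and using ``None`` to mark an empty value is an assumption made for the illustration; it only mirrors the fill-versus-overwrite behaviour described in the list above.

```python
def merge_guessed(existing, guessed, force=False):
    """Combine values read from a file with guessed values.

    existing: list of values with None for empty entries, or None if the
              attribute is absent from the universe altogether.
    guessed:  list of guessed values, one per atom.
    force:    False mimics to_guess (fill gaps only),
              True mimics force_guess (overwrite everything).
    """
    if existing is None or force:
        return list(guessed)
    return [g if e is None else e for e, g in zip(existing, guessed)]


read_from_file = ["C", None, "O"]   # second element is empty
guessed = ["C", "N", "S"]           # the guess disagrees on the third atom

filled = merge_guessed(read_from_file, guessed)                   # like to_guess
overwritten = merge_guessed(read_from_file, guessed, force=True)  # like force_guess
created = merge_guessed(None, guessed)                            # attribute was absent

print(filled)
print(overwritten)
print(created)
```

In this sketch the partial fill keeps the "O" that was read from the file, while the forced version replaces it with the guessed "S".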

For now, MDAnalysis automatically guesses :code:`types` and :code:`masses` at universe creation, because the :code:`to_guess` parameter defaults to :code:`['types', 'masses']`. This is done using the :class:`~MDAnalysis.guesser.default_guesser.DefaultGuesser`.
You can stop this by passing ``()`` to the ``to_guess`` parameter.

.. _universe-properties:
