Ensemble creation feature (#68)
* first draft of ensemble code

* add ensemble.py

* working version, still working on merging identical nodes

* add merging of identical nodes and some cleaning

* try to add type annotations

* adding ensemble usage example

* update docs for ensembles and more

* further docs

* update example wording
lubbersnick authored Apr 13, 2024
1 parent 604a93c commit 7d3e0f3
Showing 18 changed files with 625 additions and 25 deletions.
25 changes: 25 additions & 0 deletions docs/source/examples/ensembles.rst
@@ -0,0 +1,25 @@
Ensembling Models
#################


The :func:`~hippynn.graphs.make_ensemble` function combines several models into a single ensemble graph.

By default, ensembling is based on the ``db_name`` of the nodes in each input graph.
Nodes which have the same name will be assigned an ensemble node which combines
the different versions of that quantity, and additionally calculates the
mean and standard deviation.

It is easy to make an ensemble from a glob string or a list of directories where
the models are saved::

    from hippynn.graphs import make_ensemble
    model_form = '../../collected_models/quad0_b512_p5_GPU*'
    ensemble_graph, ensemble_info = make_ensemble(model_form)
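
Since a list of directories is also accepted, one way to control exactly which models enter the ensemble is to expand the glob yourself first. The following is a minimal sketch; ``find_model_dirs`` is a hypothetical helper, not part of hippynn, and the final call is left commented out because it assumes the saved models exist at that path.

```python
import glob
import os


def find_model_dirs(pattern):
    """Expand a glob pattern into a sorted list of matching directories.

    Hypothetical helper for illustration; plain files that match the
    pattern are skipped, since each model is saved in a directory.
    """
    return sorted(path for path in glob.glob(pattern) if os.path.isdir(path))


if __name__ == "__main__":
    model_dirs = find_model_dirs("../../collected_models/quad0_b512_p5_GPU*")
    print(f"Found {len(model_dirs)} candidate model directories")
    # With hippynn installed and the models present, the list could be passed on:
    #   from hippynn.graphs import make_ensemble
    #   ensemble_graph, ensemble_info = make_ensemble(model_dirs)
```

Sorting the matches keeps the member order reproducible between runs, which can help when comparing per-member outputs.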

The ensemble graph takes the inputs which are required for all of the models in the ensemble.
The ``ensemble_info`` object reports counts for the inputs and targets of the ensemble,
along with counts of the corresponding quantities across the ensemble members.

A typical use case would be to then build a Predictor or ASE Calculator from the ensemble.
See :file:`examples/ensembling_models.py` for a detailed example.
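
As a rough illustration of what an ensemble node reports, the sketch below combines one scalar prediction per member into ``all``, ``mean``, and ``std`` values using only the Python standard library. The helper ``combine_predictions`` and the sample numbers are hypothetical, and the use of the population standard deviation is an assumption; the text above does not say which convention hippynn uses.

```python
import statistics


def combine_predictions(member_values):
    """Combine one scalar prediction per ensemble member (illustrative only)."""
    values = list(member_values)
    return {
        # Every member's prediction, kept in input order.
        "all": values,
        # Arithmetic mean across members.
        "mean": statistics.fmean(values),
        # Population standard deviation; the sample (n - 1) convention
        # would use statistics.stdev instead. Which one hippynn applies
        # is not stated in the documentation above.
        "std": statistics.pstdev(values),
    }


# Hypothetical energies for one structure from four ensemble members.
combined = combine_predictions([-10.0, -10.5, -9.5, -10.0])
print("mean:", combined["mean"], "std:", combined["std"])
```

A large ``std`` relative to the target accuracy suggests the members disagree on that structure, which is the usual reason for inspecting ensemble spread.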

8 changes: 6 additions & 2 deletions docs/source/examples/index.rst
@@ -3,8 +3,10 @@ Examples

Here are some examples of how to use various features in
``hippynn``. Besides the :doc:`/examples/minimal_workflow` example,
-the examples are just snippets. For fully-fledged examples see the
-``examples`` directory in the repository.
+the examples are just snippets. For runnable example scripts, see
+`the examples at the hippynn github repository`_.

.. _`the examples at the hippynn github repository`: https://github.com/lanl/hippynn/tree/development/examples

.. toctree::
   :maxdepth: 1
@@ -13,6 +15,7 @@ the examples are just snippets. For fully-fledged examples see the
   controller
   plotting
   predictor
   ensembles
   periodic
   forces
   restarting
@@ -21,3 +24,4 @@ the examples are just snippets. For fully-fledged examples see the
   excited_states
   weighted_loss


21 changes: 19 additions & 2 deletions docs/source/index.rst
@@ -12,10 +12,28 @@ What is hippynn?
We aim to provide high-performance modular design so that different
components can be re-used, extended, or added to. You can find more information
at the :doc:`/user_guide/features` page. The development home is located
-at `the hippynn github repository`_.
+at `the hippynn github repository`_, which also contains `many example files`_.

The main components of hippynn are constructing models, loading databases,
training the models to those databases, making predictions on new databases,
and interfacing with other atomistic codes. In particular, we provide interfaces
to `ASE`_ (prediction), `PYSEQM`_ (training/prediction), and `LAMMPS`_ (prediction).
hippynn is also used within `ALF`_ for generating machine learned potentials
along with their training data completely from scratch.

Multiple formats for training data are supported, including
Numpy arrays, the ASE Database, `fitSNAP`_ JSON format, and `ANI HDF5 files`_.

.. _`ASE`: https://wiki.fysik.dtu.dk/ase/
.. _`PYSEQM`: https://github.com/lanl/PYSEQM/
.. _`LAMMPS`: https://www.lammps.org
.. _`fitSNAP`: https://github.com/FitSNAP/FitSNAP
.. _`ANI HDF5 files`: https://doi.org/10.1038/s41597-020-0473-z
.. _`ALF`: https://github.com/lanl/ALF/

.. _`the hippynn github repository`: https://github.com/lanl/hippynn/
.. _`many example files`: https://github.com/lanl/hippynn/tree/development/examples


.. toctree::
   :maxdepth: 1
@@ -27,7 +45,6 @@ at `the hippynn github repository`_.
   hippynn API documentation <api_documentation/hippynn>
   license


Indices and tables
==================

19 changes: 17 additions & 2 deletions docs/source/installation.rst
@@ -43,6 +43,21 @@ Interfacing codes:
Installation Instructions
^^^^^^^^^^^^^^^^^^^^^^^^^

Conda
-----
Install using conda::

    conda install -c conda-forge hippynn

Pip
---
Install using pip::

    pip install hippynn

Install from source:
--------------------

Clone the hippynn_ repository and navigate into it, e.g.::

    $ git clone https://github.com/lanl/hippynn.git
@@ -55,14 +70,14 @@ Clone the hippynn_ repository and navigate into it, e.g.::
out ``cupy`` from the conda_requirements.txt file.

Dependencies using conda
--------------------------
+........................

Install dependencies from conda using recommended channels::

    $ conda install -c pytorch -c conda-forge --file conda_requirements.txt

Dependencies using pip
------------------------
+.......................

Minimum dependencies using pip::

56 changes: 56 additions & 0 deletions examples/ensembling_models.py
@@ -0,0 +1,56 @@
import torch
import hippynn

if torch.cuda.is_available():
    device = 0
else:
    device = 'cpu'

### Building the ensemble requires just one function call.
model_form = '../../collected_models/quad0_b512_p5_GPU*'
ensemble_graph, ensemble_info = hippynn.graphs.make_ensemble(model_form)

# Retrieve the ensemble node which has just been created.
# The name is the prefix 'ensemble_' followed by the db_name shared by the ensemble members (here 'T').
ensemble_energy = ensemble_graph.node_from_name("ensemble_T")

### Building an ASE calculator for the ensemble

import ase.build

from hippynn.interfaces.ase_interface import HippynnCalculator

# The ensemble node has `mean`, `std`, and `all` outputs.
energy_node = ensemble_energy.mean
extra_properties = {"ens_predictions": ensemble_energy.all, "ens_std": ensemble_energy.std}
calc = HippynnCalculator(energy=energy_node, extra_properties=extra_properties)
calc.to(device)

# build something and attach the calculator
molecule = ase.build.molecule("CH4")
molecule.calc = calc

energy_value = molecule.get_potential_energy() # Activate calculation to get results dict

print("Got energy", energy_value)
print("In units of kcal/mol", energy_value / (ase.units.kcal/ase.units.mol))

# Predictions from every ensemble member. The models were trained in kcal/mol, so these values are in kcal/mol as well.
# The name in the results dictionary comes from the key in the 'extra_properties' dictionary.
print("All predictions:", calc.results["ens_predictions"])


### Building a Predictor object for the ensemble
pred = hippynn.graphs.Predictor.from_graph(ensemble_graph)

# get batch-like inputs to the ensemble
z_vals = torch.as_tensor(molecule.get_atomic_numbers()).unsqueeze(0)
r_vals = torch.as_tensor(molecule.positions).unsqueeze(0)

pred.to(r_vals.dtype)
pred.to(device)
# Do some computation
output = pred(Z=z_vals, R=r_vals)
# Print the output of a node using the node or the db_name.
print(output[ensemble_energy.all])
print(output["T_all"])
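
The script above mixes two unit systems: ASE reports energies in eV, while the per-member predictions stay in kcal/mol. A standalone sketch of the conversion factor that ``ase.units.kcal/ase.units.mol`` represents is given below. The constant and function names are illustrative, and the value is computed from the 2019 SI defining constants, so it may differ from ASE's own value in the trailing digits.

```python
# Defining constants of the 2019 SI (exact by definition).
AVOGADRO = 6.02214076e23             # particles per mole
ELEMENTARY_CHARGE = 1.602176634e-19  # joules per electronvolt
JOULE_PER_KCAL = 4184.0              # thermochemical kilocalorie

# One kcal/mol expressed in eV, the energy unit used by ASE.
KCAL_PER_MOL_IN_EV = JOULE_PER_KCAL / AVOGADRO / ELEMENTARY_CHARGE


def ev_to_kcal_per_mol(energy_ev):
    """Convert an energy from eV to kcal/mol."""
    return energy_ev / KCAL_PER_MOL_IN_EV


def kcal_per_mol_to_ev(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to eV."""
    return energy_kcal_per_mol * KCAL_PER_MOL_IN_EV


print("1 eV in kcal/mol:", ev_to_kcal_per_mol(1.0))
```

Applying ``kcal_per_mol_to_ev`` to the per-member predictions would put them on the same scale as the energy returned by the calculator, which makes the two easier to compare.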
2 changes: 2 additions & 0 deletions hippynn/graphs/__init__.py
@@ -27,6 +27,7 @@
from .graph import GraphModule

from .predictor import Predictor
from .ensemble import make_ensemble

__all__ = [
    "get_subgraph",
@@ -39,4 +40,5 @@
    "GraphModule",
    "Predictor",
    "IdxType",
    "make_ensemble",
]