Add entropy and enthalpy #406

jaclark5 · 2024-09-19T13:09:59Z

Add enthalpy and entropy outputs for MBAR and BAR (via 2-state MBAR).

codecov · 2024-09-19T13:16:01Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.80%. Comparing base (9e8139e) to head (d4dbeb0).
Report is 1 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #406      +/-   ##
==========================================
+ Coverage   98.78%   98.80%   +0.01%     
==========================================
  Files          28       28              
  Lines        1982     2008      +26     
  Branches      354      356       +2     
==========================================
+ Hits         1958     1984      +26     
  Misses          2        2              
  Partials       22       22

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…e coverage

xiki-tempula · 2024-09-30T09:04:37Z

src/alchemlyb/estimators/bar_.py

        Parameters
        ----------
        u_nk : DataFrame
            u_nk[n,k] is the reduced potential energy of uncorrelated
            configuration n evaluated at state k.

+        use_mbar : bool, optional, default=False


Sorry do you mind explain to me why do you want to do this in BAR? I think if you want the enthalpic and entropic contributions, you would just run MBAR?

xiki-tempula · 2024-09-30T09:08:47Z

I'm happy with the MBAR changes, it looks simple and good. I kind of not too happy with the BAR one as it looks very complicated and I think if one want to use MBAR to compute entropy and enthalpy.
One shall just use MBAR?

Or one could do a BAR calculation first and then run MBAR again to get entropy and enthalpy. Instead of use BAR and then run MBAR inside BAR to get entropy and enthalpy.

mrshirts · 2024-09-30T15:05:57Z

@jaclark5 and I have been talking over things (she can discuss her side here) and the current thought is that MBAR is better for entropy and enthalpies, though there are some questions about the calculations she is still interested in.

The BAR algorithm is more surely converging since there are better methods for 1-dimensional root finding (I think I coded up false position) than N-dimensional root finding. One can start with BAR, and then use those solutions to initialize 2-state MBAR (should just take 1-2 steps as it locks in numerically, as the values should be the same).

jaclark5 · 2024-09-30T21:01:08Z

So the suggestion is to remove the use_mbar flag, and instead run 2-state mbar only for the entropies / enthalpies? If that's the case, then if calculate_enthalpy_entropy is True then BAR.fit() will run BAR and 2-state MBAR in fit(). I can make that change if it sounds agreeable.

xiki-tempula · 2024-10-01T08:46:42Z

Would this kind of interface work?

bar = BAR()
bar.fit(u_nk)
mbar=MBAR(initial_guess=bar._delta_f) # Need to do some data massage to be able to get this in as initial guess.
mbar.fit(u_nk, compute_entropy_enthalpy=True)

Note that if you just need to use BAR to provide an initial guess for the MBAR, you could just do

mbar=MBAR(initial_guess='bar') 
mbar.fit(u_nk, compute_entropy_enthalpy=True)

jaclark5 · 2024-10-01T12:45:31Z

If you had a populated matrix with all the U(X_i, lambda_j) values, that would yield a different results than 2-state MBAR, which is more akin to BAR. We could include a method that strips u_nk of energy values that are not from adjacent states, then your suggestion would work, but getting an estimate from BAR isn't needed, it can go straight to MBAR.
What if BAR is left alone and has no entropy/enthalpy functionality, but an option is added to the initialization of the MBAR class to strip energy values from nonadjacent states in u_nk; something like 2_state=False by default, which would produce the same result as the current BAR(use_mbar=True)... I think... I would want to test how MBAR handles a sparse matrix. If that isn't equivalent, then a for-loop, like the loop in the BAR class, is needed in the MBAR.fit() method and I think that will look worse, but maybe more appropriate than calling the MBAR class in the BAR class.

Ultimately my motivation is to get enthalpy/entropy values using 2-state MBAR with the simplicity that alchemlyb offers. It turns out that nonadjacent values of U(x_i,lambda_j) in LAMMPS have a numerical issue and I cannot use MBAR, except in the 2-state case. Although it seems that the enthalpy and entropy values for the gromacs benzene example are pretty different for full MBAR vs 2-state MBAR case so I'm not sure if I'll move forward with reporting these values until that is rectified. When/if it is, I would like for my results to be reproducible in some way with this PR.

mrshirts · 2024-10-01T13:46:48Z

I think... I would want to test how MBAR handles a sparse matrix.

It can't, it will give incorrect results. It's not equivalent.

The advantage of a BAR interface to enthalpy-entropy, I see, is when you did NOT collect the full MBAR data, and only have neighboring values.

Although it seems that the enthalpy and entropy values for the gromacs benzene example are pretty different for full MBAR vs 2-state MBAR case so I'

Yeah, I realize this is still pending with me to look at, things have been piling up.

jaclark5 · 2024-10-01T19:19:27Z

Ok, so if there is merit to offering a calculation of 2-state MBAR (which I have need for, and it seems could be more generally used as-well), there are a few options:

Keep the PR as is where the BAR class can contain a 2-state MBAR option within the existing for-loop structure for BAR
Leave the BAR class alone, and and add a 2-state MBAR option to the MBAR class and add the same sort of for-loop structure that is present in the BAR class.
Fill in the blank

xiki-tempula · 2024-10-02T07:42:57Z

For the MBAR part, I think I'm happy with the current state, where a simple call to pymbar gives the entropy and enthalpy.
With regard to the BAR part, I think it is a rather peculiar use case. We might need more discussion for it to be included.

jaclark5 · 2024-11-12T20:09:42Z

Hey @xiki-tempula, I removed the changes to BAR. I was able to resolve my issues with MBAR and LAMMPS. I think you'll find this PR acceptable.

xiki-tempula · 2024-11-13T08:27:18Z

src/alchemlyb/estimators/mbar_.py

@@ -57,6 +57,20 @@ class MBAR(BaseEstimator, _EstimatorMixOut):
        The estimated statistical uncertainty (one standard deviation) in
        dimensionless free energy differences.

+    delta_h_ : DataFrame
+        The estimated dimensionless enthalpy difference between each state.


Is enthalpy dimensionless? Should it be having energy unit?

@xiki-tempula Do you want me to change the docstring? I put this since the existing documentation for free energy is:

delta_f_ : DataFrame The estimated dimensionless free energy difference between each state.

and I'm treating enthalpy and entropy the same as free energy... it seems I should change entropy appropriately though. See the new commit.

Ok, then I think this is fine. I will change this in a later PR.

CHANGES

xiki-tempula · 2024-11-13T08:29:19Z

src/alchemlyb/estimators/mbar_.py


        self._delta_f_.attrs = u_nk.attrs
        self._d_delta_f_.attrs = u_nk.attrs
+        if compute_entropy_enthalpy:


I see that you do add the units back.

xiki-tempula

This is good. Thanks. I'm a bit concerned with the description of dimensionless enthalpy as it has unit on it but I guess it is trivial.

xiki-tempula · 2024-11-13T13:39:26Z

src/alchemlyb/estimators/mbar_.py

+            self._d_delta_h_.attrs = u_nk.attrs
+            self._delta_s_.attrs = u_nk.attrs
+            self._d_delta_s_.attrs = u_nk.attrs
+            self._delta_s_.attrs["energy_unit"] = "k"


Why does entropy has unit of k? I think it shall be u_nk.attrs["energy_unit"] + '/K'?

Personally, I'm fine for this unit to be the same as other energy unit (though I know it is wrong) as it would mean that the unit conversion would work as it is (between J and cal).

Free energy is in units of energy (J) and when scaled by kT, is in units of kT. Similarly entropy is in units of energy/unit temperature (J/K) and so is scaled by k, with the units of k.

Reduced entropy should be unitless, entropy should be J/degree, right? Even if passed through some unitless quantities, probably the code should be clear what is going on.

I think it's important to have the correct units, so I'll look into the unit conversion, I didn't consider that feature.

The way alchemlyb defines the units as u_nk.attrs['energy_unit'] = "kT". Adopting the same treatment for entropy to similarly have the correct scaling quantity, 'k' would be used to follow suit.
If alchemlyb's definition of u_nk.attrs['energy_unit'] = "kT" is not the message that is best to send, that's outside of this PR. Although I like it, I think implies that it's dimensionless while also indicating what quantities should be used to restore the units (although I hope that an average user would know).

The energy_unit, allowed to be kT; kcal/mol and kJ/mol, it is just defaults to kT. I think k is more like a scientific constant for most people so having it as unit is kind of bad.
The unit conversion is supported between these three units, which is just a scaling factor.
So the easiest solution is to just set the entropy to have the same unit as enthalpy.

To adjust the unit conversion functionality to accommodate entropy, should I add the appropriate duplicated function of to_kT as to_k or add if-statements to detect the case where it's entropy? I'm leaning toward the former since running a matrix of entropy values through to_kT() would feel wrong. This would mean I would duplicate to_kcalmol(df, T=None) as to_kcalmolK(df) and to_kJmol(df, T=None) as to_kJmolK(df).

@xiki-tempula entropy can't have the same units at kT, otherwise the existing functions would multiply it by a temperature resulting in S*T instead of S. If that's what you want to do we can and be clear in the documentation.

Although the units would still be incorrect all the time with a label of "entropy", maybe adjusting the attribute to be delta_sT would be best.

So we could have enthalpy in kT, kJ/mol and kcal/mol.
The entropy should then be in k, kJ/mol/T and kcal/mol/T.

In the current implementation we could have a case where enthalpy is kJ/mol and it is labelled correctly as kJ/mol. But we have entropy in kJ/mol/T but is labelled as k.

xiki-tempula · 2024-11-13T20:03:23Z

src/alchemlyb/estimators/mbar_.py

+            self._d_delta_h_ = pd.DataFrame(
+                out["dDelta_u"], columns=self._states_, index=self._states_
+            )
+            self._delta_sT_ = pd.DataFrame(


I'm happy for this value to be s*T. Is the mbar output s or s*T?

Well the mbar output is reduced entropy, so S/k or sT/(kT) so it's consistent.

I know that the unit is the same but I worried about the magnitude.
The G=H-sT so if the Delta_s is s then the Delta_f should be the same as Delta_u-Delta_s * T. If Delta_s is sT, then Delta_f should be Delta_u-Delta_s

Yes I agree, so I'm changing the variable names to make that clear. I'll do whatever the maintainers agree on, maybe @mrshirts or @orbeckst have an opinion?

Summary: Entropy has different units and which need to be handled properly. While the units of enthalpy and free energy are attr['energy_units'] = 'kT', 'kcal/mol' or 'kJ/mol', the equivalent for entropy would then be 'k', 'kcal/mol/K' or 'kJ/mol/K'. The downside of this difference in units would require either having copies of the units.py functions for entropy specifically or add a series of if statements to the functions in units.py to detect and handle entropy units. Also, it's awkward to say that dimensionless entropy has units of 'k'

Alternative: By having the entropy matrix represent S*T (denoted with the variable name delta_sT_) the object can retain the same units as enthalpy and free energy and use the same unit conversion functions, making everything cleaner internally, and avoid the awkward unit "definition" of 'k' for S.

Thanks. I'm merging this PR. I have checked the numerical values and can verify that Delta_s is not s but rather s*T. So it is in the unit of kT, kJ/mol and kcal/mol.

src/alchemlyb/parsing/__init__.py

xiki-tempula

LGTM

* Add enthalpy and entropy calculations * Add enthalpy and entropy calculations * Update tests for coverage * black * Add bootstrapping to BAR().fit(use_mbar=True) and adapt tests for code coverage * Update CHANGES * Update get_group syntax * Update method default in bar when use_mbar==True * Remove enthalpy/entropy for BAR * Address review comments * Update docstrings for units.py for the case of entropy values * Update entropy attribute name from delta_s to delta_sT * Undo irrelevant addition

jaclark5 added 2 commits September 18, 2024 17:24

Add enthalpy and entropy calculations

8b5820b

Add enthalpy and entropy calculations

37d6a17

jaclark5 mentioned this pull request Sep 19, 2024

Enthalpic and Entropic Contributions from BAR choderalab/pymbar#540

Closed

jaclark5 added 6 commits September 19, 2024 09:17

Merge branch 'master' into issue_entropy_enthalpy

f8668ee

Update tests for coverage

f8de2d8

black

ec3d7d8

Add bootstrapping to BAR().fit(use_mbar=True) and adapt tests for cod…

214bbfe

…e coverage

Update CHANGES

ffd7080

Update get_group syntax

5574c3b

orbeckst added enhancement estimators labels Sep 19, 2024

Update method default in bar when use_mbar==True

5e1a393

xiki-tempula reviewed Sep 30, 2024

View reviewed changes

orbeckst assigned xiki-tempula Oct 2, 2024

jaclark5 added 2 commits November 12, 2024 14:32

Merge branch 'master' into issue_entropy_enthalpy

bef7d06

Remove enthalpy/entropy for BAR

74b8341

jaclark5 force-pushed the issue_entropy_enthalpy branch from 163a0ff to 74b8341 Compare November 12, 2024 20:07

xiki-tempula reviewed Nov 13, 2024

View reviewed changes

CHANGES Show resolved Hide resolved

xiki-tempula reviewed Nov 13, 2024

View reviewed changes

xiki-tempula approved these changes Nov 13, 2024

View reviewed changes

Address review comments

eb2fa34

xiki-tempula reviewed Nov 13, 2024

View reviewed changes

jaclark5 added 2 commits November 13, 2024 10:15

Update docstrings for units.py for the case of entropy values

f312b6d

Update entropy attribute name from delta_s to delta_sT

d8850fd

jaclark5 force-pushed the issue_entropy_enthalpy branch from c09fbb1 to d8850fd Compare November 13, 2024 15:34

xiki-tempula reviewed Nov 13, 2024

View reviewed changes

src/alchemlyb/parsing/__init__.py Outdated Show resolved Hide resolved

Undo irrelevant addition

d4dbeb0

xiki-tempula approved these changes Nov 14, 2024

View reviewed changes

xiki-tempula merged commit 968d4f2 into alchemistry:master Nov 14, 2024
4 of 8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add entropy and enthalpy #406

Add entropy and enthalpy #406

jaclark5 commented Sep 19, 2024

codecov bot commented Sep 19, 2024 •

edited

Loading

xiki-tempula Sep 30, 2024

xiki-tempula commented Sep 30, 2024

mrshirts commented Sep 30, 2024

jaclark5 commented Sep 30, 2024

xiki-tempula commented Oct 1, 2024

jaclark5 commented Oct 1, 2024

mrshirts commented Oct 1, 2024

jaclark5 commented Oct 1, 2024

xiki-tempula commented Oct 2, 2024

jaclark5 commented Nov 12, 2024

xiki-tempula Nov 13, 2024

jaclark5 Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 13, 2024

xiki-tempula Nov 13, 2024

xiki-tempula left a comment

xiki-tempula Nov 13, 2024

xiki-tempula Nov 13, 2024 •

edited

Loading

jaclark5 Nov 13, 2024

mrshirts Nov 13, 2024

jaclark5 Nov 13, 2024

jaclark5 Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 13, 2024

jaclark5 Nov 13, 2024

jaclark5 Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 13, 2024

xiki-tempula Nov 13, 2024 •

edited

Loading

jaclark5 Nov 14, 2024

xiki-tempula Nov 14, 2024 •

edited

Loading

jaclark5 Nov 14, 2024

xiki-tempula Nov 14, 2024

xiki-tempula left a comment

Add entropy and enthalpy #406

Add entropy and enthalpy #406

Conversation

jaclark5 commented Sep 19, 2024

codecov bot commented Sep 19, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

xiki-tempula commented Sep 30, 2024

mrshirts commented Sep 30, 2024

jaclark5 commented Sep 30, 2024

xiki-tempula commented Oct 1, 2024

jaclark5 commented Oct 1, 2024

mrshirts commented Oct 1, 2024

jaclark5 commented Oct 1, 2024

xiki-tempula commented Oct 2, 2024

jaclark5 commented Nov 12, 2024

Choose a reason for hiding this comment

jaclark5 Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiki-tempula left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiki-tempula Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jaclark5 Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jaclark5 Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiki-tempula Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiki-tempula Nov 14, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiki-tempula left a comment

Choose a reason for hiding this comment

codecov bot commented Sep 19, 2024 •

edited

Loading

jaclark5 Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 13, 2024 •

edited

Loading

jaclark5 Nov 13, 2024 •

edited

Loading

jaclark5 Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 13, 2024 •

edited

Loading

xiki-tempula Nov 14, 2024 •

edited

Loading