-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add entropy and enthalpy #406
Add entropy and enthalpy #406
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #406 +/- ##
==========================================
+ Coverage 98.78% 98.80% +0.01%
==========================================
Files 28 28
Lines 1982 2008 +26
Branches 354 356 +2
==========================================
+ Hits 1958 1984 +26
Misses 2 2
Partials 22 22 ☔ View full report in Codecov by Sentry. |
src/alchemlyb/estimators/bar_.py
Outdated
Parameters | ||
---------- | ||
u_nk : DataFrame | ||
u_nk[n,k] is the reduced potential energy of uncorrelated | ||
configuration n evaluated at state k. | ||
|
||
use_mbar : bool, optional, default=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry do you mind explain to me why do you want to do this in BAR? I think if you want the enthalpic and entropic contributions, you would just run MBAR?
I'm happy with the MBAR changes, it looks simple and good. I kind of not too happy with the BAR one as it looks very complicated and I think if one want to use MBAR to compute entropy and enthalpy. Or one could do a BAR calculation first and then run MBAR again to get entropy and enthalpy. Instead of use BAR and then run MBAR inside BAR to get entropy and enthalpy. |
@jaclark5 and I have been talking over things (she can discuss her side here) and the current thought is that MBAR is better for entropy and enthalpies, though there are some questions about the calculations she is still interested in. The BAR algorithm is more surely converging since there are better methods for 1-dimensional root finding (I think I coded up false position) than N-dimensional root finding. One can start with BAR, and then use those solutions to initialize 2-state MBAR (should just take 1-2 steps as it locks in numerically, as the values should be the same). |
So the suggestion is to remove the |
Would this kind of interface work?
Note that if you just need to use BAR to provide an initial guess for the MBAR, you could just do
|
If you had a populated matrix with all the U(X_i, lambda_j) values, that would yield a different results than 2-state MBAR, which is more akin to BAR. We could include a method that strips u_nk of energy values that are not from adjacent states, then your suggestion would work, but getting an estimate from BAR isn't needed, it can go straight to MBAR. Ultimately my motivation is to get enthalpy/entropy values using 2-state MBAR with the simplicity that alchemlyb offers. It turns out that nonadjacent values of U(x_i,lambda_j) in LAMMPS have a numerical issue and I cannot use MBAR, except in the 2-state case. Although it seems that the enthalpy and entropy values for the gromacs benzene example are pretty different for full MBAR vs 2-state MBAR case so I'm not sure if I'll move forward with reporting these values until that is rectified. When/if it is, I would like for my results to be reproducible in some way with this PR. |
It can't, it will give incorrect results. It's not equivalent. The advantage of a BAR interface to enthalpy-entropy, I see, is when you did NOT collect the full MBAR data, and only have neighboring values.
Yeah, I realize this is still pending with me to look at, things have been piling up. |
Ok, so if there is merit to offering a calculation of 2-state MBAR (which I have need for, and it seems could be more generally used as-well), there are a few options:
|
For the MBAR part, I think I'm happy with the current state, where a simple call to pymbar gives the entropy and enthalpy. |
163a0ff
to
74b8341
Compare
Hey @xiki-tempula, I removed the changes to BAR. I was able to resolve my issues with MBAR and LAMMPS. I think you'll find this PR acceptable. |
@@ -57,6 +57,20 @@ class MBAR(BaseEstimator, _EstimatorMixOut): | |||
The estimated statistical uncertainty (one standard deviation) in | |||
dimensionless free energy differences. | |||
|
|||
delta_h_ : DataFrame | |||
The estimated dimensionless enthalpy difference between each state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is enthalpy dimensionless? Should it be having energy unit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiki-tempula Do you want me to change the docstring? I put this since the existing documentation for free energy is:
delta_f_ : DataFrame
The estimated dimensionless free energy difference between each state.
and I'm treating enthalpy and entropy the same as free energy... it seems I should change entropy appropriately though. See the new commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, then I think this is fine. I will change this in a later PR.
|
||
self._delta_f_.attrs = u_nk.attrs | ||
self._d_delta_f_.attrs = u_nk.attrs | ||
if compute_entropy_enthalpy: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you do add the units back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good. Thanks. I'm a bit concerned with the description of dimensionless enthalpy
as it has unit on it but I guess it is trivial.
src/alchemlyb/estimators/mbar_.py
Outdated
self._d_delta_h_.attrs = u_nk.attrs | ||
self._delta_s_.attrs = u_nk.attrs | ||
self._d_delta_s_.attrs = u_nk.attrs | ||
self._delta_s_.attrs["energy_unit"] = "k" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does entropy has unit of k? I think it shall be u_nk.attrs["energy_unit"] + '/K'
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'm fine for this unit to be the same as other energy unit (though I know it is wrong) as it would mean that the unit conversion would work as it is (between J and cal).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Free energy is in units of energy (J) and when scaled by kT, is in units of kT. Similarly entropy is in units of energy/unit temperature (J/K) and so is scaled by k, with the units of k.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reduced entropy should be unitless, entropy should be J/degree, right? Even if passed through some unitless quantities, probably the code should be clear what is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important to have the correct units, so I'll look into the unit conversion, I didn't consider that feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way alchemlyb defines the units as u_nk.attrs['energy_unit'] = "kT"
. Adopting the same treatment for entropy to similarly have the correct scaling quantity, 'k' would be used to follow suit.
If alchemlyb's definition of u_nk.attrs['energy_unit'] = "kT"
is not the message that is best to send, that's outside of this PR. Although I like it, I think implies that it's dimensionless while also indicating what quantities should be used to restore the units (although I hope that an average user would know).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The energy_unit
, allowed to be kT; kcal/mol and kJ/mol, it is just defaults to kT. I think k
is more like a scientific constant for most people so having it as unit is kind of bad.
The unit conversion is supported between these three units, which is just a scaling factor.
So the easiest solution is to just set the entropy to have the same unit as enthalpy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To adjust the unit conversion functionality to accommodate entropy, should I add the appropriate duplicated function of to_kT
as to_k
or add if-statements to detect the case where it's entropy? I'm leaning toward the former since running a matrix of entropy values through to_kT()
would feel wrong. This would mean I would duplicate to_kcalmol(df, T=None)
as to_kcalmolK(df)
and to_kJmol(df, T=None)
as to_kJmolK(df)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiki-tempula entropy can't have the same units at kT, otherwise the existing functions would multiply it by a temperature resulting in S*T instead of S. If that's what you want to do we can and be clear in the documentation.
Although the units would still be incorrect all the time with a label of "entropy", maybe adjusting the attribute to be delta_sT
would be best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we could have enthalpy in kT
, kJ/mol
and kcal/mol
.
The entropy should then be in k
, kJ/mol/T
and kcal/mol/T
.
In the current implementation we could have a case where enthalpy is kJ/mol
and it is labelled correctly as kJ/mol
. But we have entropy in kJ/mol/T
but is labelled as k
.
c09fbb1
to
d8850fd
Compare
self._d_delta_h_ = pd.DataFrame( | ||
out["dDelta_u"], columns=self._states_, index=self._states_ | ||
) | ||
self._delta_sT_ = pd.DataFrame( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy for this value to be s*T
. Is the mbar output s
or s*T
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the mbar output is reduced entropy, so S/k or sT/(kT) so it's consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that the unit is the same but I worried about the magnitude.
The G=H-sT so if the Delta_s
is s then the Delta_f
should be the same as Delta_u
-Delta_s
* T
. If Delta_s is sT, then Delta_f
should be Delta_u
-Delta_s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree, so I'm changing the variable names to make that clear. I'll do whatever the maintainers agree on, maybe @mrshirts or @orbeckst have an opinion?
Summary: Entropy has different units and which need to be handled properly. While the units of enthalpy and free energy are attr['energy_units'] = 'kT', 'kcal/mol' or 'kJ/mol'
, the equivalent for entropy would then be 'k', 'kcal/mol/K' or 'kJ/mol/K'
. The downside of this difference in units would require either having copies of the units.py functions for entropy specifically or add a series of if statements to the functions in units.py to detect and handle entropy units. Also, it's awkward to say that dimensionless entropy has units of 'k'
Alternative: By having the entropy matrix represent S*T (denoted with the variable name delta_sT_
) the object can retain the same units as enthalpy and free energy and use the same unit conversion functions, making everything cleaner internally, and avoid the awkward unit "definition" of 'k'
for S.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'm merging this PR. I have checked the numerical values and can verify that Delta_s is not s but rather s*T. So it is in the unit of kT, kJ/mol and kcal/mol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Add enthalpy and entropy calculations * Add enthalpy and entropy calculations * Update tests for coverage * black * Add bootstrapping to BAR().fit(use_mbar=True) and adapt tests for code coverage * Update CHANGES * Update get_group syntax * Update method default in bar when use_mbar==True * Remove enthalpy/entropy for BAR * Address review comments * Update docstrings for units.py for the case of entropy values * Update entropy attribute name from delta_s to delta_sT * Undo irrelevant addition
Add enthalpy and entropy outputs for MBAR and BAR (via 2-state MBAR).