v5.0.0

@joshuaspear joshuaspear released this 01 Mar 10:46
· 62 commits to master since this release
10308ac
  • Correctly implemented per-decision weighted importance sampling
  • Expanded the different types of weights that can be implemented based on:
    • http://proceedings.mlr.press/v48/jiang16.pdf: Per-decision weights are normalised by the average weight at a given timepoint, so the denominator differs between timepoints. This is implemented with WISWeightNorm(avg_denom=True)
    • https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1079&context=cs_faculty_pubs: Per-decision weights are normalised by the sum of discounted weights across all timesteps. This is implemented with WISWeightNorm(discount=discount_value)
    • Combinations of different weights can be easily implemented, for example 'average discounted weights' with WISWeightNorm(discount=discount_value, avg_denom=True); however, these do not necessarily have backing from the literature.
  • EffectiveSampleSize metric optionally returns nan if all weights are 0
  • Bug fixes:
    • Fixed a bug when running on CUDA where tensors were not being moved to the CPU
    • Improved static typing
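
To make the two weight normalisations described above concrete, here is a minimal pure-Python sketch. It is not the library's WISWeightNorm implementation: the function names `average_denominator_norm` and `discounted_sum_norm` are hypothetical, and the sketch assumes equal-length trajectories, non-zero denominators, and one reading of the descriptions in these notes (a per-timestep mean denominator for the first, a single discounted-sum denominator for the second).

```python
from typing import List

# weights[i][t]: cumulative importance ratio of trajectory i at timestep t.
# Assumption: all trajectories have the same length and denominators are non-zero.
Weights = List[List[float]]


def average_denominator_norm(weights: Weights) -> Weights:
    """Divide each weight by the mean weight across trajectories at the same timestep."""
    n_traj = len(weights)
    horizon = len(weights[0])
    # One denominator per timestep, so different timesteps are scaled differently.
    denoms = [sum(w[t] for w in weights) / n_traj for t in range(horizon)]
    return [[w[t] / denoms[t] for t in range(horizon)] for w in weights]


def discounted_sum_norm(weights: Weights, discount: float) -> Weights:
    """Divide every weight by the discounted sum of weights over all trajectories and timesteps."""
    # A single scalar denominator shared by every weight.
    denom = sum(discount ** t * w_t for w in weights for t, w_t in enumerate(w))
    return [[w_t / denom for w_t in w] for w in weights]


# Two trajectories, two timesteps each (hypothetical values).
example = [[1.0, 2.0], [3.0, 6.0]]
avg_normed = average_denominator_norm(example)
disc_normed = discounted_sum_norm(example, discount=0.5)
```

In the first variant the normalised weights at each timestep average to 1 across trajectories; in the second, the discounted normalised weights sum to 1 over the whole dataset. How the library combines the two options when both are set is not shown here.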