Commit

added continuous test cases. Added warnings for untested functionality. updated readme
joshuaspear committed Jun 3, 2024
1 parent d7ecc1a commit 5404dc9
Showing 18 changed files with 69 additions and 20 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -615,4 +615,5 @@ MigrationBackup/

/tmp
/d3rlpy_data
/d3rlpy_logs
/d3rlpy_logs
/propensity_output
15 changes: 6 additions & 9 deletions README.md
@@ -1,13 +1,9 @@
# offline_rl_ope (BETA RELEASE)
# offline_rl_ope

**WARNING**
- All IS methods implemented incorrectly in versions < 6.x
- Per-decision weighted importance sampling was incorrectly implemented in versions < 5.X
- Weighted importance sampling was incorrectly implemented in versions 1.X.X and 2.1.X, 2.2.X
- Unit testing is currently only run under Python 3.11; 3.10 will be supported in the future
- Only 1 dimensional discrete action spaces are currently supported!

**IMPORTANT: THIS IS A BETA RELEASE. FUNCTIONALITY IS STILL BEING TESTED** Feedback/contributions are welcome :)
- Not all functionality has been tested i.e., d3rlpy api and LowerBounds are still in beta

### Testing progress
- [x] components/
@@ -21,11 +17,12 @@
- [x] Metrics
- [x] EffectiveSampleSize.py
- [x] ValidWeightsProp.py
- [ ] PropensityModels
- [x] PropensityModels
- [ ] LowerBounds
- [ ] api/d3rlpy

* Insufficient functionality to test i.e., currently only wrapper classes are implemented for the OPEEstimation/DirectMethod.py
Insufficient functionality to test OPEEstimation/DirectMethod.py, i.e., currently only wrapper classes are implemented


#### Overview
Basic unit testing has been implemented for all the core functionality of the package. The d3rlpy api for importance sampling adds minimal additional functionality and is therefore likely to function as expected; however, no specific unit testing has been implemented!
@@ -34,7 +31,7 @@
* More documentation needs to be added; in the meantime, please refer to examples/ for an illustration of the functionality
* examples/static.py provides an illustration of the package being used for evaluation post training. Whilst the d3rlpy package is used for model training, the script is agnostic to the evaluation model used
* examples/d3rlpy_training_api.py provides an illustration of how the package can be used to obtain incremental performance statistics during the training of d3rlpy models. It provides greater functionality than the native scorer metrics included in d3rlpy
* The current focus has been on discrete action spaces. Continuous action spaces are intended to be addressed at a later date
* For continuous action spaces, only deterministic policies are fully supported. Support for stochastic policies is in development

### Description
* offline_rl_ope aims to provide flexible and efficient implementations of OPE algorithms for use when training offline RL models. The main audience is researchers developing smaller, non-distributed models i.e., those who do not want to use packages such as ray (https://github.com/ray-project/ray).
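The warning block in the README diff above refers to the standard importance-sampling (IS) estimators. As a point of reference only, here is a minimal standalone sketch of ordinary IS, weighted IS, and per-decision weighted IS; it is not offline_rl_ope's implementation, and the function name, argument layout and conventions are illustrative assumptions.

```python
# Illustrative sketch of the IS estimators referenced in the warnings above.
# This is NOT offline_rl_ope's implementation; names and shapes are assumptions.
import numpy as np

def is_estimates(ratios, rewards, gamma=0.99):
    """ratios[i][t] = pi_e(a_t|s_t) / pi_b(a_t|s_t) for trajectory i, timestep t;
    rewards[i][t] is the corresponding reward."""
    n = len(ratios)
    traj_weights = np.array([np.prod(r) for r in ratios])   # w_i = prod_t rho_{i,t}
    returns = np.array(
        [np.sum(gamma ** np.arange(len(rw)) * rw) for rw in rewards]
    )

    vanilla_is = np.mean(traj_weights * returns)                          # ordinary IS
    weighted_is = np.sum(traj_weights * returns) / np.sum(traj_weights)   # WIS

    # Per-decision weighted IS: normalise the cumulative weight at each timestep
    # across the trajectories that are still active at that timestep.
    max_t = max(len(r) for r in ratios)
    pd_wis = 0.0
    for t in range(max_t):
        active = [i for i in range(n) if len(ratios[i]) > t]
        w_t = np.array([np.prod(ratios[i][: t + 1]) for i in active])
        r_t = np.array([rewards[i][t] for i in active])
        pd_wis += (gamma ** t) * np.sum(w_t * r_t) / np.sum(w_t)
    return vanilla_is, weighted_is, pd_wis

# Toy usage with two short trajectories
ratios = [[1.2, 0.8, 1.0], [0.5, 1.5]]
rewards = [[0.0, 0.0, 1.0], [0.0, 1.0]]
print(is_estimates(ratios, rewards))
```

The per-decision form normalises the cumulative weight at each timestep rather than once per trajectory, which is the distinction behind the version-specific warnings above.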
Binary file removed propensity_output/epoch_1_train_preds.pkl
Binary file removed propensity_output/epoch_1_val_preds.pkl
Binary file removed propensity_output/epoch_2_train_preds.pkl
Binary file removed propensity_output/epoch_2_val_preds.pkl
Binary file removed propensity_output/epoch_3_train_preds.pkl
Binary file removed propensity_output/epoch_3_val_preds.pkl
Binary file removed propensity_output/epoch_4_train_preds.pkl
Binary file removed propensity_output/epoch_4_val_preds.pkl
Binary file removed propensity_output/mdl_chkpnt_epoch_1.pt
Binary file removed propensity_output/mdl_chkpnt_epoch_2.pt
Binary file removed propensity_output/mdl_chkpnt_epoch_3.pt
Binary file removed propensity_output/mdl_chkpnt_epoch_4.pt
9 changes: 0 additions & 9 deletions propensity_output/training_metric_df.csv

This file was deleted.

3 changes: 3 additions & 0 deletions src/offline_rl_ope/LowerBounds/__init__.py
@@ -0,0 +1,3 @@
from .. import logger

logger.warn("LowerBound functionality still in beta")
5 changes: 4 additions & 1 deletion src/offline_rl_ope/api/d3rlpy/__init__.py
@@ -1 +1,4 @@
from . import Scorers, Callbacks, Misc
from . import Scorers, Callbacks, Misc
from ... import logger

logger.warn("api/d3rlpy functionality still in beta")
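The two `__init__.py` additions above emit a warning through the package logger whenever the beta modules (LowerBounds and api/d3rlpy) are imported. If those messages are unwanted, they can in principle be filtered with Python's standard logging machinery; the logger name used below is an assumption and should be checked against the package source.

```python
# Hypothetical: raise the threshold of the (assumed) package logger so the
# "still in beta" warnings are suppressed. Verify the actual logger name first.
import logging

logging.getLogger("offline_rl_ope").setLevel(logging.ERROR)
```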
54 changes: 54 additions & 0 deletions tests/base.py
@@ -213,6 +213,60 @@ def __post_init__(self):
}
)


test_action_vals = [
[[0.9], [4], [0.001], [0]],
[[1], [0], [0.9]]
]

test_eval_action_vals = [
[[0.9], [0.9], [0.001], [0]],
[[1], [1], [0.9]]
]


test_configs.update(
{
"continuous_action": TestConfig(
test_state_vals=test_state_vals,
test_action_vals=test_action_vals,
test_action_probs=test_action_probs,
test_eval_action_vals=test_eval_action_vals,
test_eval_action_probs=test_eval_action_probs,
test_reward_values=test_reward_values,
test_dm_s_values=test_dm_s_values,
test_dm_sa_values=test_dm_sa_values
)
}
)


test_action_vals = [
[[0.9,1], [4,0.9], [0.001, 1], [0,-1.2]],
[[1,-0.8], [0,-1], [0.9,1]]
]

test_eval_action_vals = [
[[0.9,1], [1,0.9], [0.001, 1], [0,-1.2]],
[[1,-0.8], [0,-1], [1,1]]
]

test_configs.update(
{
"multi_continuous_action": TestConfig(
test_state_vals=test_state_vals,
test_action_vals=test_action_vals,
test_action_probs=test_action_probs,
test_eval_action_vals=test_eval_action_vals,
test_eval_action_probs=test_eval_action_probs,
test_reward_values=test_reward_values,
test_dm_s_values=test_dm_s_values,
test_dm_sa_values=test_dm_sa_values
)
}
)


test_configs_fmt = [[key,test_configs[key]] for key in test_configs.keys()]
test_configs_fmt_class = [
{"test_conf":test_configs[key]} for key in test_configs.keys()
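The trailing context lines above suggest that the configs, including the new continuous_action and multi_continuous_action entries, are flattened into test_configs_fmt for parametrised tests. A hedged sketch of how that list might be consumed, with a hypothetical import path, test name and assertion that do not appear in the repository:

```python
# Hypothetical usage only: the import path, test name and assertion are
# illustrative and are not taken from the repository's actual test suite.
import pytest

from tests.base import test_configs_fmt  # [[config_name, TestConfig], ...]

@pytest.mark.parametrize("name,conf", test_configs_fmt)
def test_every_trajectory_has_actions(name, conf):
    # Each trajectory in every config should carry at least one action entry.
    assert all(len(traj) > 0 for traj in conf.test_action_vals)
```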
