Cleanup.

MatthewGerber · Jun 25, 2024 · 185f2c6
1 parent 30e734d
commit 185f2c6

Showing 11 changed files with 13 additions and 13 deletions.
diff --git a/docs/ch_Feature_Extractors.md b/docs/ch_Feature_Extractors.md
@@ -10,7 +10,7 @@ A feature extractor for the gridworld. This extractor, being based on the `State
 A feature extractor for the gridworld. This extractor does not interact feature values with actions. Its primary use
     is in state-value estimation (e.g., for the baseline of policy gradient methods).
 ```
-### [rlai.core.environments.gymnasium.CartpoleFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L915)
+### [rlai.core.environments.gymnasium.CartpoleFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L917)
 ```
 A feature extractor for the Gym cartpole environment. This extractor, being based on the
     `StateActionInteractionFeatureExtractor`, directly extracts the fully interacted state-action feature matrix. It
@@ -19,20 +19,20 @@ A feature extractor for the Gym cartpole environment. This extractor, being base
     separate intercept term being present for each state segment and action combination. The function approximator
     should not add its own intercept term.
 ```
-### [rlai.core.environments.gymnasium.ContinuousLunarLanderFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L1456)
+### [rlai.core.environments.gymnasium.ContinuousLunarLanderFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L1458)
 ```
 Feature extractor for the continuous lunar lander environment.
 ```
-### [rlai.core.environments.gymnasium.ContinuousMountainCarFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L1232)
+### [rlai.core.environments.gymnasium.ContinuousMountainCarFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L1234)
 ```
 Feature extractor for the continuous mountain car environment.
 ```
-### [rlai.core.environments.gymnasium.ScaledFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L701)
+### [rlai.core.environments.gymnasium.ScaledFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L703)
 ```
 A feature extractor for continuous Gym environments. Extracts a scaled (standardized) version of the Gym state
     observation.
 ```
-### [rlai.core.environments.gymnasium.SignedCodingFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L783)
+### [rlai.core.environments.gymnasium.SignedCodingFeatureExtractor](https://github.com/MatthewGerber/rlai/tree/master/src/rlai/core/environments/gymnasium.py#L785)
 ```
 Signed-coding feature extractor. Forms a category from the conjunction of all state-feature signs and then places
     the continuous feature vector into its associated category. Works for all continuous-valued state spaces in Gym.

diff --git a/docs/cli_guide.md b/docs/cli_guide.md
@@ -135,6 +135,6 @@ agent for the Gym cartpole (inverted pendulum) environment.
 rlai train --agent rlai.gpi.state_action_value.ActionValueMdpAgent --gamma 1.0 --environment rlai.core.environments.gymnasium.Gym --T 1000 --gym-id CartPole-v1 --render-every-nth-episode 5000 --train-function rlai.gpi.temporal_difference.iteration.iterate_value_q_pi --mode Q_LEARNING --num-improvements 100 --num-episodes-per-improvement 50 --epsilon 0.01 --q-S-A rlai.gpi.state_action_value.tabular.TabularStateActionValueEstimator --continuous-state-discretization-resolution 0.1 --make-final-policy-greedy True --num-improvements-per-plot 100 --save-agent-path ~/Desktop/cartpole_agent.pickle
 ```
 A video should be rendered at the start of training, and a plot will be rendered at the end similar to the following.
-![cartpole](cli-cartpole.png)
+![cartpole](images/cli-cartpole.png)
 Details of training plots like this one are provided in the Case Studies 
 (e.g., [cartpole](case_studies/inverted_pendulum.md)).
diff --git a/docs/cli-cartpole.png → docs/images/cli-cartpole.png b/docs/cli-cartpole.png → docs/images/cli-cartpole.png
diff --git a/docs/gridworld_sgd.png → docs/images/gridworld_sgd.png b/docs/gridworld_sgd.png → docs/images/gridworld_sgd.png
diff --git a/docs/jupyterlab-composer.png → docs/images/jupyterlab-composer.png b/docs/jupyterlab-composer.png → docs/images/jupyterlab-composer.png
diff --git a/docs/jupyterlab-diag.png → docs/images/jupyterlab-diag.png b/docs/jupyterlab-diag.png → docs/images/jupyterlab-diag.png
diff --git a/docs/jupyterlab-running.png → docs/images/jupyterlab-running.png b/docs/jupyterlab-running.png → docs/images/jupyterlab-running.png
diff --git a/docs/jupyterlab_guide.md b/docs/jupyterlab_guide.md
@@ -6,17 +6,17 @@
 A companion JupyterLab notebook is provided to ease the use of RLAI. The goal of the interface is to assist with the 
 composition, execution, and real-time inspection of RLAI commands. The primary composer interface is shown below:
 
-![jupyterlab](jupyterlab-composer.png)
+![jupyterlab](images/jupyterlab-composer.png)
 
 The notebook provides controls for starting, pausing, and resuming the execution of RLAI commands. All plots are
 interactive and support zooming, panning, and axis rescaling. An example is shown below:
 
-![jupyterlab-running](jupyterlab-running.png)
+![jupyterlab-running](images/jupyterlab-running.png)
 
 Certain state-action value function estimators (e.g., the scikit-learn stochastic gradient descent model) support 
 diagnostic plots. An example is shown below:
 
-![jupyterlab-diag](jupyterlab-diag.png)
+![jupyterlab-diag](images/jupyterlab-diag.png)
 
 For single-click access to the notebook, please click below:
 

diff --git a/docs/model_diagnostics_and_interpretation.md b/docs/model_diagnostics_and_interpretation.md
@@ -18,7 +18,7 @@ shown below (see [continuous mountain car](./case_studies/mountain_car_continuou
 * Top right:  Action values.
 * Bottom left:  State-value estimate, which is used as a baseline in the REINFORCE policy gradient algorithm.
 * Bottom right:  Shape parameters `a` and `b` for the beta PDF.
-* 
+
 # Model Coefficient Plots
 Consider the gridworld of Example 4.1 solved with temporal-difference q-learning and stochastic gradient descent based 
 on the four features extracted by 
@@ -30,7 +30,7 @@ rlai train --agent rlai.gpi.state_action_value.ActionValueMdpAgent --gamma 1 --e
 ```
 The above command should generate plots such as the following:
 
-![gridworld-sgd-plot](gridworld_sgd.png)
+![gridworld-sgd-plot](images/gridworld_sgd.png)
 
 As indicated by the title, this figure shows boxplots of model coefficients (y-values) over time (x-values) for each
 feature (row) and action (column). The coefficients quantify the relationships among the feature-action pairs and the
@@ -52,7 +52,7 @@ The [JupyterLab interface](jupyterlab_guide.md) provides detailed instrumentatio
 shown below. This information can be useful when diagnosing convergence and stability issues in state-action value
 function approximation.
 
-![sgd-instrumentation](jupyterlab-diag.png)
+![sgd-instrumentation](images/jupyterlab-diag.png)
 
 The left plot above shows per-iteration averages of return (green), model loss (red), and step size (blue). The right
 plot shows the same variables for a single iteration, so that each time step is visible. The JupyterLab interface allows

diff --git a/src/rlai/figures/Epsilon-greedy, nonstationary bandit.pdf b/src/rlai/figures/Epsilon-greedy, nonstationary bandit.pdf
diff --git a/src/rlai/meta/__init__.py → src/rlai/meta.py b/src/rlai/meta/__init__.py → src/rlai/meta.py
@@ -124,7 +124,7 @@ def main():
     # noinspection PyTypeChecker
     summarize(rlai, chapter_page_descriptions)
 
-    docs_dir = f'{os.path.dirname(__file__)}/../../../docs/'
+    docs_dir = f'{os.path.dirname(__file__)}/../../docs/'
     meta_md_path = f'{docs_dir}links_to_code.md'
 
     ch_num_name = {