From 169adfdaea19b75863cf8c75bee70be7ecae230a Mon Sep 17 00:00:00 2001
From: reinterpretcat <ilya.builuk@gmail.com>
Date: Sun, 5 Nov 2023 19:39:04 +0100
Subject: [PATCH] Update documentation for internals

---
 docs/src/internals/algorithms/heuristics.md | 33 +++++++++++++--------
 docs/src/internals/algorithms/index.md      | 23 +++++++++-----
 docs/src/internals/algorithms/rosomaxa.md   | 13 +++++---
 3 files changed, 45 insertions(+), 24 deletions(-)

diff --git a/docs/src/internals/algorithms/heuristics.md b/docs/src/internals/algorithms/heuristics.md
index bdc841db1..5d179b5f2 100644
--- a/docs/src/internals/algorithms/heuristics.md
+++ b/docs/src/internals/algorithms/heuristics.md
@@ -17,11 +17,15 @@ To build initial solutions to start with, the solver internally can use differen
 
 [Related documentation](https://docs.rs/vrp-core/latest/vrp_core/construction/heuristics/index.html)
 
-Typically, the solver builds four initial solutions and then they are memorized as initial population state by the one
-of the population algorithms:
+To speed up evaluation of job insertions in large tours, the insertion algorithm uses a search optimization to avoid
+greedy evaluation. This is useful for large-scale VRPs, where greedy evaluation significantly increases running time.
 
-- `greedy`: only best solution is kept
-- `elitism`: N best solutions are kept which are met to some diversification criteria
+[Related documentation](https://docs.rs/vrp-core/latest/vrp_core/utils/trait.SelectionSamplingSearch.html)
+
+Typically, the solver builds four initial solutions, which are then stored as the initial state by one of the population algorithms:
+
+- `greedy`: only the best solution is kept
+- `elitism`: n best solutions are kept using some diversification criteria
 - `rosomaxa`: a custom population-based algorithm which focuses on improving exploration/exploitation ratio.
 
 The latter is default, however, others can be used if amount of available CPU is low.
@@ -30,24 +34,27 @@ The latter is default, however, others can be used if amount of available CPU is
 
 ## Searching for better solution: meta heuristics
 
-The goal of metaheuristic (or just heuristic for simplicity) is to refine one (or many) of the known solutions. Currently available heuristics:
+The goal of a metaheuristic (or just heuristic for simplicity) is to refine one (or many) of the known solutions.
+Currently available heuristics:
 
-- `ruin and recreate` principle: ruin parts of solution and recreates them. Key ideas:
+- `ruin and recreate` principle (Adaptive Large Neighborhood Search): ruins parts of a solution and recreates them. Key ideas:
   - use multiple ruin/recreate methods and combine them differently
   - make a larger moves in solution space
 - `local search`: use different local search operators. The main difference from R&R:
-  - try to avoid making a big steps in solution space
+  - avoids making big steps in a solution space
   - target to improve specific aspects in solution
 - `explorative heuristics`: these can be seen as generators for explorative moves in solutions space:
   - `redistribute search`: removes jobs from specific route and prevents their insertion back to it
   - `infeasible search`: allows constraint violations to explore infeasible solutions space. It has recovery step
      to move back to feasible space.
-- `decomposition search`: splits existing solution into multiple smaller ones (e.g. not more than 2-4 routes) and tries
-   to improve them in isolation. Typically, it uses all heuristics just mentioned.
+- `decomposition search` (a kind of divide-and-conquer algorithm): splits an existing solution into multiple smaller ones
+   (e.g. not more than 2-4 routes) and tries to improve them in isolation. Typically, it uses all heuristics just mentioned.
 
-Each heuristic accept one of solution from population (not necessary the best known one) and tries to improve it (or diversify).
-During one of refinement iterations, many solutions are picked at the same time and many heuristics are called then. This step
-is called a `generation`.
+Each heuristic accepts one of the solutions from the population (not necessarily the best known one) and tries to improve (or diversify) it.
+During a refinement iteration, many solutions are picked at the same time and many heuristics are then called in parallel.
+Such an incremental step is called a `generation`. Once it is completed, all found solutions are introduced to the population,
+which decides how to store them. With the elitism/rosomaxa population types, the
+`Non-Dominated Sorting Genetic Algorithm II` is used to order solutions from the Pareto-optimal front.
 
 [Related documentation](https://docs.rs/vrp-core/latest/vrp_core/solver/search/index.html)
 
@@ -72,7 +79,7 @@ termination criteria are supported:
 
 - `time`: stop after some specified amount of seconds
 - `generation`: stop after some specified amount of generations
-- `coefficient variation`: stop if there is no `significant` improvement in specific time or amount of generations
+- `coefficient of variation`: stop if there is no `significant` improvement in specific time or amount of generations
 - `user interrupted` from command line, e.g. by pressing Ctrl + C
 
 Interruption when building initial solutions is supported. Default is 300 seconds or 3000 generations max.
diff --git a/docs/src/internals/algorithms/index.md b/docs/src/internals/algorithms/index.md
index b58f4e052..dd3ac2e46 100644
--- a/docs/src/internals/algorithms/index.md
+++ b/docs/src/internals/algorithms/index.md
@@ -1,14 +1,23 @@
 # Algorithms
 
-This chapter describes some used algorithms (WIP).
+This chapter describes some of the algorithms used.
 
 
 ## References
 
-TODO: An incomplete list of most important references:
+An incomplete list of important references:
 
-- Clarke, G & Wright, JW 1964. Scheduling of vehicles from a Central Depot to a Number of the Delivery Point. Operations Research, 12 (4): 568-581
-- Schrimpf, G., Schneider, K., Stamm-Wilbrandt, H., Dueck, V.: Record Breaking Optimization Results Using the Ruin and Recreate Principle. J. of Computational Physics 159 (2000) 139–171
-- Jan Christiaens, Greet Vanden Berghe: Slack Induction by String Removals for Vehicle Routing Problems
-- Thibaut Vidal: Hybrid Genetic Search for the CVRP: Open-Source Implementation and SWAP* Neighborhood
-- Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband and Zheng Wen: A Tutorial on Thompson Sampling https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf
\ No newline at end of file
+- Clarke, G & Wright, JW 1964: `Scheduling of Vehicles from a Central Depot to a Number of Delivery Points. Operations Research, 12 (4): 568-581`
+- Pisinger, David; Røpke, Stefan: `A general heuristic for vehicle routing problems`
+- Schrimpf, G., Schneider, K., Stamm-Wilbrandt, H., Dueck, V.: `Record Breaking Optimization Results Using the Ruin and Recreate Principle. J. of Computational Physics 159 (2000) 139–171`
+- Jan Christiaens, Greet Vanden Berghe: `Slack Induction by String Removals for Vehicle Routing Problems`
+- Thibaut Vidal: `Hybrid Genetic Search for the CVRP: Open-Source Implementation and SWAP* Neighborhood`
+- Richard F. Hartl, Thibaut Vidal: `Workload Equity in Vehicle Routing Problems: A Survey and Analysis`
+
+- K. Deb; A. Pratap; S. Agarwal; T. Meyarivan: `A fast and elitist multiobjective genetic algorithm: NSGA-II`
+- Damminda Alahakoon, Saman K Halgamuge, Srinivasan Bala: `Dynamic self-organizing maps with controlled growth for knowledge discovery`
+- Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband and Zheng Wen: `A Tutorial on Thompson Sampling` https://web.stanford.edu/~bvr/pubs/TS_Tutorial.pdf
+
+- Florian Arnold, Kenneth Sörensen: `What makes a VRP solution good? The generation of problem-specific knowledge for heuristics`
+- Flavien Lucas, Romain Billot, Marc Sevaux: `A comment on "what makes a VRP solution good? The generation of problem-specific knowledge for heuristics"`
+- Erik Pitzer, Michael Affenzeller: `A Comprehensive Survey on Fitness Landscape Analysis`
\ No newline at end of file
diff --git a/docs/src/internals/algorithms/rosomaxa.md b/docs/src/internals/algorithms/rosomaxa.md
index 1e61333b6..3da559244 100644
--- a/docs/src/internals/algorithms/rosomaxa.md
+++ b/docs/src/internals/algorithms/rosomaxa.md
@@ -12,14 +12,14 @@ The `rosomaxa` algorithm is based on the following key ideas:
 
 * use [Growing Self-Organizing Map](https://en.wikipedia.org/wiki/Growing_self-organizing_map)(GSOM) to cluster discovered solutions and retain good, but different ones
 * choice clustering characteristics which are specific to solution geometry rather to objectives
-* use 2D visualization to analyze and understand algorithm behavior. See an interactive demo [here](https://reinterpretcat.github.io/heuristics/www/)
 * utilize reinforcement learning technics in dynamic hyper-heuristic to choose one of pre-defined meta-heuristics on each solution refinement step.
+* use 2D visualization to analyze and understand algorithm behavior. See an interactive demo [here](https://reinterpretcat.github.io/heuristics/www/)
 
 
 ### Clustering
 
-Solution clustering is preformed by custom implementation of a growing self-organizing map which is a growing variant
-of a self-organizing map. In `rosomaxa`, it has the following characteristics:
+Solution clustering is performed by a custom implementation of GSOM, which is a growing variant of a self-organizing map.
+In `rosomaxa`, it has the following characteristics:
 
 * each node maintains a small population which keeps track of a few solutions selected by elitism approach
 * nodes are created and split based on selected solution characteristics. For VRP domain, they are such as:
@@ -47,6 +47,9 @@ Here:
 * `t_matrix` and `l_matrix` shows how often nodes are updated
 * `objective_0`, `objective_1`, `objective_2`: objective values such as amount of unassigned jobs, tours, and cost
 
+A new experimental visualization tool is part of the repo: `experiments/heuristic-research`.
+An online version is available here: https://reinterpretcat.github.io/heuristics/www/vector.html
+
 
 ### Dynamic hyper-heuristic
 
@@ -55,6 +58,8 @@ with [Thompson sampling](https://en.wikipedia.org/wiki/Thompson_sampling) approa
 This helps to address [exploration-exploitation dilemma](https://en.wikipedia.org/wiki/Exploration-exploitation_dilemma)
 in applying a strategy of picking heuristics.
 
+The implementation can be found [here](https://github.com/reinterpretcat/vrp/blob/master/rosomaxa/src/algorithms/rl/slot_machine.rs).
+
 
 ### Additional used techniques
 
@@ -71,4 +76,4 @@ TODO: describe additional explorative techniques:
 * rebalance GSOM parameters based on search progression
 * analyze "heat" map dynamically to adjust GSOM parameters
 * more fine-grained control of `exploration` vs `exploitation` ratio
-* try to calculate gradients
\ No newline at end of file
+* try to calculate gradients based on GSOM nodes
\ No newline at end of file