Modularise Constraints (#386)
* Start structural rework on constraint objectives.
* Sketch the idea of a unified vector function objective.
* Fix tests.
* A bit of vale.
* Update Changelog.
* Sketch Hessian vector function.
* add type upper bounds to ConstrainedManifoldObjective
* Apply suggestions from code review
* minor improvements

---------

Co-authored-by: Mateusz Baran <mateuszbaran89@gmail.com>
kellertuer and mateuszbaran committed Jun 8, 2024
1 parent 2a7cc0d commit 2943e0d
Showing 70 changed files with 3,633 additions and 1,661 deletions.
45 changes: 25 additions & 20 deletions .vale.ini
@@ -5,42 +5,47 @@ Vocab = Manopt
Packages = Google

[formats]
# code blocks with Julia in Markdown do not yet work well
# so let's not vale qmd files for now
# qmd = md
# code blocks with Julia in Markdown do not yet work well
qmd = md
jl = md

[docs/src/*.md]
BasedOnStyles = Vale, Google

[docs/src/contributing.md]
Google.FirstPerson = No
Google.We = No
BasedOnStyles =

[Changelog.md, CONTRIBUTING.md]
BasedOnStyles = Vale, Google
Google.Will = false ; given the format, a _will_ is really intended here
Google.Headings = false ; some might really have [] in their headers
Google.FirstPerson = false ; we pose a few contribution points as first-person questions

[src/*.md] ; actually .jl
BasedOnStyles = Vale, Google

[src/*.md] ; actually .jl, but I think these are already covered by the section above
[test/*.md] ; actually .jl
BasedOnStyles = Vale, Google

[docs/src/changelog.md]
; ignore since it is derived
BasedOnStyles =

[src/plans/debug.md]
Google.Units = false # to ignore formats= for now.
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n
Google.Units = false ; to ignore formats= for now.
Google.Ellipses = false ; since vale gets confused by the DebugFactory Docstring (line 1066)
TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[test/plans/test_debug.md] #repeat previous until I find out how to combine them
Google.Units = false # to ignore formats= for now.
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[test/*.md] ; see last comment as well
BasedOnStyles = Vale, Google
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[tutorials/*.md] ; actually .qmd for the first, second autogenerated
BasedOnStyles =
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n,```.*```
Google.We = false # For tutorials we want to address the user directly.

[docs/src/tutorials/*.md] # repeat previous until I find out how to combine them, since these are rendered versions of the previous ones
BasedOnStyles = Vale, Google
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n,```.*```
TokenIgnores = (\$+[^\n$]+\$+)
Google.We = false # For tutorials we want to address the user directly.

[docs/src/tutorials/*.md]
; ignore since they are derived files
BasedOnStyles =
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -32,7 +32,7 @@ If you found a bug or want to propose a feature, please open an issue within
### Add a missing method

There are still a lot of methods missing within the optimization framework of `Manopt.jl`, be it functions, gradients, differentials, proximal maps, step size rules, or stopping criteria.
If you notice a method missing and can contribute an implementation, please do so, and the maintainers will try to help with the necessary details.
If you notice a method missing and can contribute an implementation, please do so, and the maintainers try to help with the necessary details.
Even providing a single new method is a good contribution.

### Provide a new algorithm
@@ -77,4 +77,4 @@ Concerning documentation
- provide both mathematical formulae and literature references using [DocumenterCitations.jl](https://juliadocs.org/DocumenterCitations.jl/stable/) and BibTeX where possible
- Always document all input variables and keyword arguments

If you implement an algorithm with a certain application in mind, it would be great if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well.
If you implement an algorithm with a certain numerical example in mind, it would be great if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well.
174 changes: 97 additions & 77 deletions Changelog.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "Manopt"
uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5"
authors = ["Ronny Bergmann <manopt@ronnybergmann.net>"]
version = "0.4.63"
version = "0.4.64"

[deps]
ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4"
@@ -50,7 +50,7 @@ LinearAlgebra = "1.6"
LineSearches = "7.2.0"
ManifoldDiff = "0.3.8"
Manifolds = "0.9.11"
ManifoldsBase = "0.15.9"
ManifoldsBase = "0.15.10"
ManoptExamples = "0.1.4"
Markdown = "1.6"
Plots = "1.30"
11 changes: 0 additions & 11 deletions Readme.md
@@ -97,20 +97,9 @@ If you are also using [`Manifolds.jl`](https://juliamanifolds.github.io/Manifold
as well.
Note that all citations are in [BibLaTeX](https://ctan.org/pkg/biblatex) format.

## Further and Similar Packages & Links

`Manopt.jl` belongs to the Manopt family:

* [www.manopt.org](https://www.manopt.org): the MATLAB version of Manopt, see also their :octocat: [GitHub repository](https://github.com/NicolasBoumal/manopt)
* [www.pymanopt.org](https://www.pymanopt.org): the Python version of Manopt—providing also several AD backends, see also their :octocat: [GitHub repository](https://github.com/pymanopt/pymanopt)

but there are also more packages providing tools on manifolds:

* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax): differential geometry and stochastic dynamics with deep learning
* [Geomstats](https://geomstats.github.io) (Python with several backends): focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats)
* [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch): Riemannian ADAM & SGD. :octocat: [GitHub repository](https://github.com/geoopt/geoopt)
* [McTorch](https://github.com/mctorch/mctorch) (Python & PyTorch): Riemannian SGD, Adagrad, ASA & CG.
* [ROPTLIB](https://www.math.fsu.edu/~whuang2/papers/ROPTLIB.htm) (C++): a Riemannian OPTimization LIBrary :octocat: [GitHub repository](https://github.com/whuang08/ROPTLIB)
* [TF Riemopt](https://github.com/master/tensorflow-riemopt) (Python & TensorFlow): Riemannian optimization using TensorFlow

Did you use `Manopt.jl` somewhere? Let us know! We'd love to collect those here as well.
6 changes: 3 additions & 3 deletions docs/src/about.md
@@ -1,7 +1,7 @@
# About

Manopt.jl inherited its name from [Manopt](https://manopt.org), a Matlab toolbox for optimization on manifolds.
This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/about.html).
This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/).

The following people contributed
* [Constantin Ahlmann-Eltze](https://const-ae.name) implemented the [gradient and differential `check` functions](helpers/checks.md)
@@ -20,7 +20,7 @@ the [GitHub repository](https://github.com/JuliaManifolds/Manopt.jl/)
to clone/fork the repository or open an issue.


# further packages
# Further packages

`Manopt.jl` belongs to the Manopt family:

@@ -29,7 +29,7 @@ to clone/fork the repository or open an issue.

but there are also more packages providing tools on manifolds:

* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax) for differential geometry and stochastic dynamics with deep learning
* [Jax Geometry](https://github.com/ComputationalEvolutionaryMorphometry/jaxgeometry) (Python/Jax) for differential geometry and stochastic dynamics with deep learning
* [Geomstats](https://geomstats.github.io) (Python with several backends) focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats)
* [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch) Riemannian ADAM & SGD. :octocat: [GitHub repository](https://github.com/geoopt/geoopt)
* [McTorch](https://github.com/mctorch/mctorch) (Python & PyTorch) Riemannian SGD, Adagrad, ASA & CG.
2 changes: 1 addition & 1 deletion docs/src/plans/index.md
@@ -45,7 +45,7 @@ The following symbols are used.

Any other lowercase name or letter, as well as single uppercase letters, accesses fields of the corresponding first argument.
For example, `:p` could be used to access the field `s.p` of a state.
This is often, where the iterate is stored, so the recommended way is to use `:Iterate` from above-
This is often where the iterate is stored, so the recommended way is to use `:Iterate` from before.

Since the iterate is often stored in the state's field `s.p`, one _could_ often also access the iterate
with `:p` and similarly the gradient with `:X`.
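
As a hedged sketch of how these symbols are typically used, the following sets up a small made-up cost and prints iteration, cost, and iterate during a gradient descent run; the cost and start point are assumptions for illustration only:

```julia
using Manopt, Manifolds

M = Sphere(2)
A = [2.0 1.0 0.0; 1.0 2.0 1.0; 0.0 1.0 2.0]  # made-up symmetric matrix
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)  # Euclidean gradient projected onto the tangent space
p0 = [1.0, 0.0, 0.0]

# symbols pick debug prints; the trailing integer prints only every 10th iteration
gradient_descent(M, f, grad_f, p0; debug=[:Iteration, " | ", :Cost, " | ", :Iterate, "\n", 10])
```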
56 changes: 39 additions & 17 deletions docs/src/plans/objective.md
@@ -201,37 +201,59 @@ linearized_forward_operator

### Constrained objective

Besides the [`AbstractEvaluationType`](@ref) there is one further property to
distinguish among constraint functions, especially the gradients of the constraints.

```@docs
ConstraintType
FunctionConstraint
VectorConstraint
ConstrainedManifoldObjective
```

The [`ConstraintType`](@ref) is a parameter of the corresponding objective.
It might be beneficial to use the adapted problem to specify different ranges for the gradients of the constraints.

```@docs
ConstrainedManifoldObjective
ConstrainedManoptProblem
```
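
As a minimal sketch, assuming the positional constructor `ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h)` and that `nothing` marks absent equality constraints, a Rayleigh-quotient-like cost with nonnegativity constraints could be set up as follows; all function definitions here are made up for illustration:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = [2.0 1.0 0.0; 1.0 2.0 1.0; 0.0 1.0 2.0]  # made-up symmetric matrix
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)
g(M, p) = -p  # inequality constraints g(p) ≤ 0 enforce p .≥ 0
grad_g(M, p) = [project(M, p, -e) for e in eachcol(Matrix{Float64}(I, 3, 3))]

co = ConstrainedManifoldObjective(f, grad_f, g, grad_g, nothing, nothing)
```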

#### Access functions

```@docs
get_constraints
equality_constraints_length
inequality_constraints_length
get_unconstrained_objective
get_equality_constraint
get_equality_constraints
get_inequality_constraint
get_inequality_constraints
get_grad_equality_constraint
get_grad_equality_constraints
get_grad_equality_constraints!
get_grad_equality_constraint!
get_grad_inequality_constraint
get_grad_inequality_constraint!
get_grad_inequality_constraints
get_grad_inequality_constraints!
get_hess_equality_constraint
get_hess_inequality_constraint
```
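
The flexible indexing from this commit can be sketched as follows, reusing `M` and `co` from the sketch above; the colon and range forms follow the changelog, while the `inequality_constraints_length` signature is an assumption:

```julia
p = [1.0, 0.0, 0.0]
get_inequality_constraint(M, co, p, :)         # all inequality constraint values
get_inequality_constraint(M, co, p, 1)         # just the first value
get_grad_inequality_constraint(M, co, p, 1:2)  # gradients of the first two constraints
inequality_constraints_length(co)              # number of inequality constraints (assumed signature)
```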

### A vectorial cost function

```@docs
Manopt.AbstractVectorFunction
Manopt.AbstractVectorGradientFunction
Manopt.VectorGradientFunction
Manopt.VectorHessianFunction
```


```@docs
Manopt.AbstractVectorialType
Manopt.CoordinateVectorialType
Manopt.ComponentVectorialType
Manopt.FunctionVectorialType
```
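
A short sketch of a standalone vector function with its gradients, assuming a constructor of the form `VectorGradientFunction(f, grad_f, n)` for `n` components; the component functions are made up:

```julia
using Manopt, Manifolds

M = Sphere(2)
g(M, p) = [p[1] - 0.5, p[2] - 0.5]  # made-up vector function M → ℝ²
grad_g(M, p) = [project(M, p, [1.0, 0.0, 0.0]), project(M, p, [0.0, 1.0, 0.0])]

vgf = Manopt.VectorGradientFunction(g, grad_g, 2)
# length(vgf) returns 2 and Manopt.get_value(M, vgf, p) evaluates all components
```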

#### Access functions

```@docs
Manopt.get_value
Manopt.get_value_function
Base.length(::VectorGradientFunction)
```

#### Internal functions

```@docs
Manopt._to_iterable_indices
```

### Subproblem objective
11 changes: 9 additions & 2 deletions docs/src/plans/problem.md
@@ -18,8 +18,15 @@ Usually, such a problem is determined by the manifold or domain of the optimisat
DefaultManoptProblem
```

The exception to these are the primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md)),
which both need two manifolds as their domains, hence there also exists a
For constrained optimisation, there are different possibilities to represent the gradients
of the constraints. This can be done with a

```
ConstrainedManoptProblem
```
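
A hedged sketch of choosing a non-default range for the constraint gradients; the keyword name `gradient_inequality_range` and the `ArrayPowerRepresentation` value are assumptions for illustration, with `NestedPowerRepresentation` as the assumed default:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
f(M, p) = p[3]  # made-up cost
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])
g(M, p) = -p  # inequality constraints p .≥ 0
grad_g(M, p) = [project(M, p, -e) for e in eachcol(Matrix{Float64}(I, 3, 3))]
co = ConstrainedManifoldObjective(f, grad_f, g, grad_g, nothing, nothing)

# assumed keyword name; stores constraint gradients as one array instead of nested ones
cmp = ConstrainedManoptProblem(M, co; gradient_inequality_range=ArrayPowerRepresentation())
```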

The primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md))
both need two manifolds as their domains, hence there also exists a

```@docs
TwoManifoldProblem
4 changes: 2 additions & 2 deletions docs/src/solvers/cma_es.md
@@ -4,7 +4,7 @@
CurrentModule = Manopt
```

The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimzation include [ColuttoFruhaufFuchsScherzer:2010](@cite).
The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimization include [ColuttoFruhaufFuchsScherzer:2010](@cite).
The algorithm is suitable for global optimization.

Covariance matrix transport between consecutive mean points is handled by the `eigenvector_transport!` function, which is based on the idea of transport of matrix eigenvectors.
@@ -19,7 +19,7 @@ cma_es
CMAESState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenBestCostInGenerationConstant
6 changes: 3 additions & 3 deletions docs/src/solvers/convex_bundle_method.md
@@ -1,4 +1,4 @@
# [Convex Bundle Method](@id ConvexBundleMethodSolver)
# Convex bundle method

```@meta
CurrentModule = Manopt
@@ -15,13 +15,13 @@ convex_bundle_method!
ConvexBundleMethodState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenLagrangeMultiplierLess
```

## Debug Functions
## Debug functions

```@docs
DebugWarnIfLagrangeMultiplierIncreases
23 changes: 12 additions & 11 deletions docs/src/solvers/index.md
@@ -6,25 +6,25 @@ CurrentModule = Manopt
```

Optimisation problems can be classified with respect to several criteria.
In the following we provide a grouping of the algorithms with respect to the “information”
available about your optimisation problem
The following list of the algorithms is grouped with respect to the “information”
available about an optimisation problem

```math
\operatorname*{arg\,min}_{p∈\mathbb M} f(p)
```

Within the groups we provide short notes on advantages of the individual solvers, pointing out properties the cost ``f`` should have.
We use 🏅 to indicate state-of-the-art solvers, that usually perform best in their corresponding group and 🫏 for a maybe not so fast, maybe not so state-of-the-art method, that nevertheless gets the job done most reliably.
Within each group, short notes on advantages of the individual solvers and on required properties of the cost ``f`` are provided.
In that list, a 🏅 indicates state-of-the-art solvers that usually perform best in their corresponding group, and a 🫏 marks a maybe-not-so-fast, maybe-not-so-state-of-the-art method that nevertheless gets the job done most reliably.

## Derivative Free
## Derivative free

For derivative-free optimisation only function evaluations of ``f`` are used; see the sketch after this list.

* [Nelder-Mead](NelderMead.md) a simplex-based variant that uses ``d+1`` points, where ``d`` is the dimension of the manifold.
* [Particle Swarm](particle_swarm.md) 🫏 uses the evolution of a set of points, called a swarm, to explore the domain of the cost and find a minimizer.
* [CMA-ES](cma_es.md) uses a stochastic evolutionary strategy to perform minimization robust to local minima of the objective.
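
A sketch of calling two of the derivative-free solvers on a made-up cost; only function evaluations enter:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[1]^2 + (p[2] - 0.5)^2  # made-up cost, evaluated pointwise only

q1 = NelderMead(M, f)      # simplex-based, no derivatives needed
q2 = particle_swarm(M, f)  # swarm-based exploration of the domain
```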

## First Order
## First order

### Gradient

@@ -42,7 +42,7 @@ While the subgradient might be set-valued, the function should provide one of th
* The [Convex Bundle Method](convex_bundle_method.md) (CBM) uses a collection of subgradients from former iterates and iterate candidates to solve a local approximation to `f` in every iteration by solving a quadratic problem in the tangent space.
* The [Proximal Bundle Method](proximal_bundle_method.md) works similar to CBM, but solves a proximal map-based problem in every iteration.

## Second Order
## Second order

* [Adaptive Regularisation with Cubics](adaptive-regularization-with-cubics.md) 🏅 locally builds a cubic model to determine the next descent direction.
* The [Riemannian Trust-Regions Solver](trust_regions.md) builds a quadratic model within a trust region to determine the next descent direction.
Expand All @@ -58,17 +58,18 @@ The following methods require that the splitting, for example into several summa

* [Levenberg-Marquardt](LevenbergMarquardt.md) minimizes the squared norm of ``f: \mathcal M→ℝ^d`` provided the gradients of the component functions, or in other words the Jacobian of ``f``.
* [Stochastic Gradient Descent](stochastic_gradient_descent.md) is based on a splitting of ``f`` into a sum of several components ``f_i`` whose gradients are provided. Steps are performed according to gradients of randomly selected components.
* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth aso the gradient exists, and (locally) convex.
* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth, since their gradients are required to exist, and should be (locally) convex.

### Nonsmooth

If the gradient does not exist everywhere, that is if the splitting yields summands that are nonsmooth, usually methods based on proximal maps are used.

* The [Chambolle-Pock](ChambollePock.md) algorithm uses a splitting ``f(p) = F(p) + G(Λ(p))``,
where ``G`` is defined on a manifold ``\mathcal N`` and we need the proximal map of its Fenchel dual. Both these functions can be non-smooth.
where ``G`` is defined on a manifold ``\mathcal N`` and the proximal map of its Fenchel dual is required.
Both these functions can be non-smooth.
* The [Cyclic Proximal Point](cyclic_proximal_point.md) 🫏 uses proximal maps of the functions from splitting ``f`` into summands ``f_i``.
* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; for each of these we require the gradient of ``g`` and the subgradient of ``h`` to state a sub problem in every iteration to be solved.
* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the correpsonding sub problem is here written in a form that yields the proximal map.
* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; for each of these it is required to have access to the gradient of ``g`` and the subgradient of ``h`` to state a subproblem to be solved in every iteration.
* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the corresponding subproblem is here written in a form that yields the proximal map.
* [Douglas—Rachford](DouglasRachford.md) uses a splitting ``f(p) = F(p) + G(p)`` and their proximal maps to compute a minimizer of ``f``, which can be non-smooth.
* [Primal-dual Riemannian semismooth Newton Algorithm](@ref solver-pdrssn) extends Chambolle-Pock and requires the differentials of the proximal maps additionally.

2 changes: 1 addition & 1 deletion docs/src/solvers/particle_swarm.md
@@ -15,7 +15,7 @@ CurrentModule = Manopt
ParticleSwarmState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenSwarmVelocityLess
2 changes: 1 addition & 1 deletion docs/src/solvers/proximal_bundle_method.md
@@ -1,4 +1,4 @@
# [Proximal Bundle Method](@id ProxBundleMethodSolver)
# Proximal bundle method

```@meta
CurrentModule = Manopt

2 comments on commit 2943e0d

@kellertuer (Member Author)

@JuliaRegistrator register

Release Notes:

Added

  • Remodel the constraints and their gradients into separate `VectorGradientFunction`s
    to reduce code duplication and encapsulate the inner model of these functions and their gradients
  • Introduce a `ConstrainedManoptProblem` to model different ranges for the gradients in the
    new `VectorGradientFunction`s beyond the default `NestedPowerRepresentation`
  • Introduce a `VectorHessianFunction` to also model that one can provide the vector of Hessians
    of the constraints
  • Introduce a more flexible indexing beyond single indexing, to also include arbitrary ranges
    when accessing vector functions and their gradients, and hence also for constraints and
    their gradients.

Changed

  • Remodel `ConstrainedManifoldObjective` to store an `AbstractManifoldObjective`
    internally instead of directly `f` and `grad_f`, also allowing Hessian objectives
    therein and implementing access to this Hessian
  • Fixed a bug where Lanczos produced `NaN`s when started exactly in a minimizer, since the algorithm divides by the gradient norm.

Deprecated

  • Deprecate `get_grad_equality_constraints(M, o, p)`; use `get_grad_equality_constraint(M, o, p, :)`
    with the more flexible indexing instead.
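
A short migration sketch for this deprecation, with `M`, an objective `o`, and a point `p` as assumed context:

```julia
# before (deprecated in this release):
Xs = get_grad_equality_constraints(M, o, p)
# after: a colon selects all constraints via the flexible indexing
Xs = get_grad_equality_constraint(M, o, p, :)
# single indices and ranges work as well
X1 = get_grad_equality_constraint(M, o, p, 1)
```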

@JuliaRegistrator
Registration pull request created: JuliaRegistries/General/108549

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

git tag -a v0.4.64 -m "<description of version>" 2943e0d7098bfef31ef25411ada25fbe90d773b5
git push origin v0.4.64
