Modularise Constraints (#386)
* Start structural rework on constraint objectives.
* Sketch the idea of a unified vector function objective.
* Fix tests.
* A bit of vale.
* Update Changelog.
* Sketch Hessian vector function.
* add type upper bounds to ConstrainedManifoldObjective
* Apply suggestions from code review
* minor improvements

---------

Co-authored-by: Mateusz Baran <mateuszbaran89@gmail.com>
kellertuer and mateuszbaran committed Jun 8, 2024
1 parent 2a7cc0d commit 2943e0d
Showing 70 changed files with 3,633 additions and 1,661 deletions.
45 changes: 25 additions & 20 deletions .vale.ini
@@ -5,42 +5,47 @@ Vocab = Manopt
Packages = Google

[formats]
# code blocks with Julia in Markdown do not yet work well
# so let's not vale qmd files for now
# qmd = md
# code blocks with Julia in Markdown do not yet work well
qmd = md
jl = md

[docs/src/*.md]
BasedOnStyles = Vale, Google

[docs/src/contributing.md]
Google.FirstPerson = No
Google.We = No
BasedOnStyles =

[Changelog.md, CONTRIBUTING.md]
BasedOnStyles = Vale, Google
Google.Will = false ; given the format, a _will_ is really intended here
Google.Headings = false ; some might really have [] in their headers
Google.FirstPerson = false ; we pose a few contribution points as first-person questions

[src/*.md] ; actually .jl
BasedOnStyles = Vale, Google

[src/*.md] ; actually .jl, but I think these are already covered by the section above
[test/*.md] ; actually .jl
BasedOnStyles = Vale, Google

[docs/src/changelog.md]
; ignore since it is derived
BasedOnStyles =

[src/plans/debug.md]
Google.Units = false # to ignore formats= for now.
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n
Google.Units = false ; to ignore formats= for now.
Google.Ellipses = false ; since vale gets confused by the DebugFactory Docstring (line 1066)
TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[test/plans/test_debug.md] #repeat previous until I find out how to combine them
Google.Units = false # to ignore formats= for now.
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[test/*.md] ; see last comment as well
BasedOnStyles = Vale, Google
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n

[tutorials/*.md] ; actually .qmd for the first, second autogenerated
BasedOnStyles =
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n,```.*```
Google.We = false # For tutorials we want to address the user directly.

[docs/src/tutorials/*.md] # repeat previous until I find out how to combine them, since these are rendered versions of the previous ones
BasedOnStyles = Vale, Google
; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks
TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n,```.*```
TokenIgnores = (\$+[^\n$]+\$+)
Google.We = false # For tutorials we want to address the user directly.

[docs/src/tutorials/*.md]
; ignore since they are derived files
BasedOnStyles =
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -32,7 +32,7 @@ If you found a bug or want to propose a feature, please open an issue within
### Add a missing method

There are still a lot of methods missing within the optimization framework of `Manopt.jl`, be it functions, gradients, differentials, proximal maps, step size rules, or stopping criteria.
If you notice a method missing and can contribute an implementation, please do so, and the maintainers will try to help with the necessary details.
If you notice a method missing and can contribute an implementation, please do so, and the maintainers try to help with the necessary details.
Even providing a single new method is a good contribution.

### Provide a new algorithm
@@ -77,4 +77,4 @@ Concerning documentation
- provide both mathematical formulae and literature references using [DocumenterCitations.jl](https://juliadocs.org/DocumenterCitations.jl/stable/) and BibTeX where possible
- Always document all input variables and keyword arguments

If you implement an algorithm with a certain application in mind, it would be great if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well.
If you implement an algorithm with a certain numerical example in mind, it would be great if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well.
174 changes: 97 additions & 77 deletions Changelog.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "Manopt"
uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5"
authors = ["Ronny Bergmann <manopt@ronnybergmann.net>"]
version = "0.4.63"
version = "0.4.64"

[deps]
ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4"
@@ -50,7 +50,7 @@ LinearAlgebra = "1.6"
LineSearches = "7.2.0"
ManifoldDiff = "0.3.8"
Manifolds = "0.9.11"
ManifoldsBase = "0.15.9"
ManifoldsBase = "0.15.10"
ManoptExamples = "0.1.4"
Markdown = "1.6"
Plots = "1.30"
11 changes: 0 additions & 11 deletions Readme.md
@@ -97,20 +97,9 @@ If you are also using [`Manifolds.jl`](https://juliamanifolds.github.io/Manifold
as well.
Note that all citations are in [BibLaTeX](https://ctan.org/pkg/biblatex) format.

## Further and Similar Packages & Links

`Manopt.jl` belongs to the Manopt family:

* [www.manopt.org](https://www.manopt.org): the MATLAB version of Manopt, see also their :octocat: [GitHub repository](https://github.com/NicolasBoumal/manopt)
* [www.pymanopt.org](https://www.pymanopt.org): the Python version of Manopt—providing also several AD backends, see also their :octocat: [GitHub repository](https://github.com/pymanopt/pymanopt)

but there are also more packages providing tools on manifolds:

* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax): differential geometry and stochastic dynamics with deep learning
* [Geomstats](https://geomstats.github.io) (Python with several backends): focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats)
* [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch): Riemannian ADAM & SGD. :octocat: [GitHub repository](https://github.com/geoopt/geoopt)
* [McTorch](https://github.com/mctorch/mctorch) (Python & PyTorch): Riemannian SGD, Adagrad, ASA & CG.
* [ROPTLIB](https://www.math.fsu.edu/~whuang2/papers/ROPTLIB.htm) (C++): a Riemannian OPTimization LIBrary :octocat: [GitHub repository](https://github.com/whuang08/ROPTLIB)
* [TF Riemopt](https://github.com/master/tensorflow-riemopt) (Python & TensorFlow): Riemannian optimization using TensorFlow

Did you use `Manopt.jl` somewhere? Let us know! We'd love to collect those here as well.
6 changes: 3 additions & 3 deletions docs/src/about.md
@@ -1,7 +1,7 @@
# About

Manopt.jl inherited its name from [Manopt](https://manopt.org), a Matlab toolbox for optimization on manifolds.
This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/about.html).
This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/).

The following people contributed
* [Constantin Ahlmann-Eltze](https://const-ae.name) implemented the [gradient and differential `check` functions](helpers/checks.md)
@@ -20,7 +20,7 @@ the [GitHub repository](https://github.com/JuliaManifolds/Manopt.jl/)
to clone/fork the repository or open an issue.


# further packages
# Further packages

`Manopt.jl` belongs to the Manopt family:

@@ -29,7 +29,7 @@ to clone/fork the repository or open an issue.

but there are also more packages providing tools on manifolds:

* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax) for differential geometry and stochastic dynamics with deep learning
* [Jax Geometry](https://github.com/ComputationalEvolutionaryMorphometry/jaxgeometry) (Python/Jax) for differential geometry and stochastic dynamics with deep learning
* [Geomstats](https://geomstats.github.io) (Python with several backends) focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats)
* [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch) Riemannian ADAM & SGD. :octocat: [GitHub repository](https://github.com/geoopt/geoopt)
* [McTorch](https://github.com/mctorch/mctorch) (Python & PyTorch) Riemannian SGD, Adagrad, ASA & CG.
2 changes: 1 addition & 1 deletion docs/src/plans/index.md
@@ -45,7 +45,7 @@ The following symbols are used.

Any other lowercase name or letter, as well as single uppercase letters, accesses fields of the corresponding first argument.
For example, `:p` could be used to access the field `s.p` of a state.
This is often, where the iterate is stored, so the recommended way is to use `:Iterate` from above-
This is often where the iterate is stored, so the recommended way is to use `:Iterate` from before.

Since the iterate is often stored in the state's field `s.p`, one _could_ often also access the iterate
with `:p` and similarly the gradient with `:X`.
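
As a hedged sketch of how these symbols are typically used, the following sets up a small made-up cost and prints iteration, cost, and iterate during a gradient descent run; the cost and start point are assumptions for illustration only:

```julia
using Manopt, Manifolds

M = Sphere(2)
A = [2.0 1.0 0.0; 1.0 2.0 1.0; 0.0 1.0 2.0]  # made-up symmetric matrix
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)  # Euclidean gradient projected onto the tangent space
p0 = [1.0, 0.0, 0.0]

# symbols pick debug prints; the trailing integer prints only every 10th iteration
gradient_descent(M, f, grad_f, p0; debug=[:Iteration, " | ", :Cost, " | ", :Iterate, "\n", 10])
```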
56 changes: 39 additions & 17 deletions docs/src/plans/objective.md
@@ -201,37 +201,59 @@ linearized_forward_operator

### Constrained objective

Besides the [`AbstractEvaluationType`](@ref) there is one further property to
distinguish among constraint functions, especially the gradients of the constraints.

```@docs
ConstraintType
FunctionConstraint
VectorConstraint
ConstrainedManifoldObjective
```

The [`ConstraintType`](@ref) is a parameter of the corresponding objective.
It might be beneficial to use the adapted problem to specify different ranges for the gradients of the constraints.

```@docs
ConstrainedManifoldObjective
ConstrainedManoptProblem
```
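
As a minimal sketch, assuming the positional constructor `ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h)` and that `nothing` marks absent equality constraints, a Rayleigh-quotient-like cost with nonnegativity constraints could be set up as follows; all function definitions here are made up for illustration:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = [2.0 1.0 0.0; 1.0 2.0 1.0; 0.0 1.0 2.0]  # made-up symmetric matrix
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)
g(M, p) = -p  # inequality constraints g(p) ≤ 0 enforce p .≥ 0
grad_g(M, p) = [project(M, p, -e) for e in eachcol(Matrix{Float64}(I, 3, 3))]

co = ConstrainedManifoldObjective(f, grad_f, g, grad_g, nothing, nothing)
```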

#### Access functions

```@docs
get_constraints
equality_constraints_length
inequality_constraints_length
get_unconstrained_objective
get_equality_constraint
get_equality_constraints
get_inequality_constraint
get_inequality_constraints
get_grad_equality_constraint
get_grad_equality_constraints
get_grad_equality_constraints!
get_grad_equality_constraint!
get_grad_inequality_constraint
get_grad_inequality_constraint!
get_grad_inequality_constraints
get_grad_inequality_constraints!
get_hess_equality_constraint
get_hess_inequality_constraint
```
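
The flexible indexing from this commit can be sketched as follows, reusing `M` and `co` from the sketch above; the colon and range forms follow the changelog, while the `inequality_constraints_length` signature is an assumption:

```julia
p = [1.0, 0.0, 0.0]
get_inequality_constraint(M, co, p, :)         # all inequality constraint values
get_inequality_constraint(M, co, p, 1)         # just the first value
get_grad_inequality_constraint(M, co, p, 1:2)  # gradients of the first two constraints
inequality_constraints_length(co)              # number of inequality constraints (assumed signature)
```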

### A vectorial cost function

```@docs
Manopt.AbstractVectorFunction
Manopt.AbstractVectorGradientFunction
Manopt.VectorGradientFunction
Manopt.VectorHessianFunction
```


```@docs
Manopt.AbstractVectorialType
Manopt.CoordinateVectorialType
Manopt.ComponentVectorialType
Manopt.FunctionVectorialType
```
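
A short sketch of a standalone vector function with its gradients, assuming a constructor of the form `VectorGradientFunction(f, grad_f, n)` for `n` components; the component functions are made up:

```julia
using Manopt, Manifolds

M = Sphere(2)
g(M, p) = [p[1] - 0.5, p[2] - 0.5]  # made-up vector function M → ℝ²
grad_g(M, p) = [project(M, p, [1.0, 0.0, 0.0]), project(M, p, [0.0, 1.0, 0.0])]

vgf = Manopt.VectorGradientFunction(g, grad_g, 2)
# length(vgf) returns 2 and Manopt.get_value(M, vgf, p) evaluates all components
```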

#### Access functions

```@docs
Manopt.get_value
Manopt.get_value_function
Base.length(::VectorGradientFunction)
```

#### Internal functions

```@docs
Manopt._to_iterable_indices
```

### Subproblem objective
11 changes: 9 additions & 2 deletions docs/src/plans/problem.md
@@ -18,8 +18,15 @@ Usually, such a problem is determined by the manifold or domain of the optimisat
DefaultManoptProblem
```

The exception to these are the primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md)),
which both need two manifolds as their domains, hence there also exists a
For constrained optimisation, there are different possibilities to represent the gradients
of the constraints. This can be done with a

```
ConstrainedManoptProblem
```
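
A hedged sketch of choosing a non-default range for the constraint gradients; the keyword name `gradient_inequality_range` and the `ArrayPowerRepresentation` value are assumptions for illustration, with `NestedPowerRepresentation` as the assumed default:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
f(M, p) = p[3]  # made-up cost
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])
g(M, p) = -p  # inequality constraints p .≥ 0
grad_g(M, p) = [project(M, p, -e) for e in eachcol(Matrix{Float64}(I, 3, 3))]
co = ConstrainedManifoldObjective(f, grad_f, g, grad_g, nothing, nothing)

# assumed keyword name; stores constraint gradients as one array instead of nested ones
cmp = ConstrainedManoptProblem(M, co; gradient_inequality_range=ArrayPowerRepresentation())
```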

The primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md))
both need two manifolds as their domains, hence there also exists a

```@docs
TwoManifoldProblem
4 changes: 2 additions & 2 deletions docs/src/solvers/cma_es.md
@@ -4,7 +4,7 @@
CurrentModule = Manopt
```

The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimzation include [ColuttoFruhaufFuchsScherzer:2010](@cite).
The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimization include [ColuttoFruhaufFuchsScherzer:2010](@cite).
The algorithm is suitable for global optimization.

Covariance matrix transport between consecutive mean points is handled by the `eigenvector_transport!` function, which is based on the idea of transport of matrix eigenvectors.
@@ -19,7 +19,7 @@ cma_es
CMAESState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenBestCostInGenerationConstant
6 changes: 3 additions & 3 deletions docs/src/solvers/convex_bundle_method.md
@@ -1,4 +1,4 @@
# [Convex Bundle Method](@id ConvexBundleMethodSolver)
# Convex bundle method

```@meta
CurrentModule = Manopt
@@ -15,13 +15,13 @@ convex_bundle_method!
ConvexBundleMethodState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenLagrangeMultiplierLess
```

## Debug Functions
## Debug functions

```@docs
DebugWarnIfLagrangeMultiplierIncreases
23 changes: 12 additions & 11 deletions docs/src/solvers/index.md
@@ -6,25 +6,25 @@ CurrentModule = Manopt
```

Optimisation problems can be classified with respect to several criteria.
In the following we provide a grouping of the algorithms with respect to the “information”
available about your optimisation problem
The following list of the algorithms is grouped with respect to the “information”
available about an optimisation problem

```math
\operatorname*{arg\,min}_{p∈\mathbb M} f(p)
```

Within the groups we provide short notes on advantages of the individual solvers, pointing out properties the cost ``f`` should have.
We use 🏅 to indicate state-of-the-art solvers, that usually perform best in their corresponding group and 🫏 for a maybe not so fast, maybe not so state-of-the-art method, that nevertheless gets the job done most reliably.
Within each group, short notes on advantages of the individual solvers and on required properties of the cost ``f`` are provided.
In that list, a 🏅 indicates state-of-the-art solvers that usually perform best in their corresponding group, and a 🫏 marks a maybe-not-so-fast, maybe-not-so-state-of-the-art method that nevertheless gets the job done most reliably.

## Derivative Free
## Derivative free

For derivative-free optimisation only function evaluations of ``f`` are used; see the sketch after this list.

* [Nelder-Mead](NelderMead.md) a simplex-based variant that uses ``d+1`` points, where ``d`` is the dimension of the manifold.
* [Particle Swarm](particle_swarm.md) 🫏 uses the evolution of a set of points, called a swarm, to explore the domain of the cost and find a minimizer.
* [CMA-ES](cma_es.md) uses a stochastic evolutionary strategy to perform minimization robust to local minima of the objective.
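
A sketch of calling two of the derivative-free solvers on a made-up cost; only function evaluations enter:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[1]^2 + (p[2] - 0.5)^2  # made-up cost, evaluated pointwise only

q1 = NelderMead(M, f)      # simplex-based, no derivatives needed
q2 = particle_swarm(M, f)  # swarm-based exploration of the domain
```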

## First Order
## First order

### Gradient

@@ -42,7 +42,7 @@ While the subgradient might be set-valued, the function should provide one of th
* The [Convex Bundle Method](convex_bundle_method.md) (CBM) uses a collection of subgradients from former iterates and iterate candidates to solve a local approximation to `f` in every iteration by solving a quadratic problem in the tangent space.
* The [Proximal Bundle Method](proximal_bundle_method.md) works similar to CBM, but solves a proximal map-based problem in every iteration.

## Second Order
## Second order

* [Adaptive Regularisation with Cubics](adaptive-regularization-with-cubics.md) 🏅 locally builds a cubic model to determine the next descent direction.
* The [Riemannian Trust-Regions Solver](trust_regions.md) builds a quadratic model within a trust region to determine the next descent direction.
Expand All @@ -58,17 +58,18 @@ The following methods require that the splitting, for example into several summa

* [Levenberg-Marquardt](LevenbergMarquardt.md) minimizes the squared norm of ``f: \mathcal M→ℝ^d`` provided the gradients of the component functions, or in other words the Jacobian of ``f``.
* [Stochastic Gradient Descent](stochastic_gradient_descent.md) is based on a splitting of ``f`` into a sum of several components ``f_i`` whose gradients are provided. Steps are performed according to gradients of randomly selected components.
* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth aso the gradient exists, and (locally) convex.
* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth, since their gradients are required to exist, and should be (locally) convex.

### Nonsmooth

If the gradient does not exist everywhere, that is if the splitting yields summands that are nonsmooth, usually methods based on proximal maps are used.

* The [Chambolle-Pock](ChambollePock.md) algorithm uses a splitting ``f(p) = F(p) + G(Λ(p))``,
where ``G`` is defined on a manifold ``\mathcal N`` and we need the proximal map of its Fenchel dual. Both these functions can be non-smooth.
where ``G`` is defined on a manifold ``\mathcal N`` and the proximal map of its Fenchel dual is required.
Both these functions can be non-smooth.
* The [Cyclic Proximal Point](cyclic_proximal_point.md) 🫏 uses proximal maps of the functions from splitting ``f`` into summands ``f_i``.
* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; for each of these we require the gradient of ``g`` and the subgradient of ``h`` to state a sub problem in every iteration to be solved.
* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the correpsonding sub problem is here written in a form that yields the proximal map.
* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; for each of these it is required to have access to the gradient of ``g`` and the subgradient of ``h`` to state a subproblem to be solved in every iteration.
* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the corresponding subproblem is here written in a form that yields the proximal map.
* [Douglas—Rachford](DouglasRachford.md) uses a splitting ``f(p) = F(p) + G(p)`` and their proximal maps to compute a minimizer of ``f``, which can be non-smooth.
* [Primal-dual Riemannian semismooth Newton Algorithm](@ref solver-pdrssn) extends Chambolle-Pock and requires the differentials of the proximal maps additionally.

2 changes: 1 addition & 1 deletion docs/src/solvers/particle_swarm.md
@@ -15,7 +15,7 @@ CurrentModule = Manopt
ParticleSwarmState
```

## Stopping Criteria
## Stopping criteria

```@docs
StopWhenSwarmVelocityLess
2 changes: 1 addition & 1 deletion docs/src/solvers/proximal_bundle_method.md
@@ -1,4 +1,4 @@
# [Proximal Bundle Method](@id ProxBundleMethodSolver)
# Proximal bundle method

```@meta
CurrentModule = Manopt

2 comments on commit 2943e0d

@kellertuer (Member Author)

@JuliaRegistrator register

Release Notes:

Added

  • Remodel the constraints and their gradients into separate `VectorGradientFunction`s
    to reduce code duplication and encapsulate the inner model of these functions and their gradients
  • Introduce a `ConstrainedManoptProblem` to model different ranges for the gradients in the
    new `VectorGradientFunction`s beyond the default `NestedPowerRepresentation`
  • Introduce a `VectorHessianFunction` to also model that one can provide the vector of Hessians
    of the constraints
  • Introduce a more flexible indexing beyond single indexing, to also include arbitrary ranges
    when accessing vector functions and their gradients, and hence also for constraints and
    their gradients.

Changed

  • Remodel `ConstrainedManifoldObjective` to store an `AbstractManifoldObjective`
    internally instead of directly `f` and `grad_f`, also allowing Hessian objectives
    therein and implementing access to this Hessian
  • Fixed a bug where Lanczos produced `NaN`s when started exactly in a minimizer, since the algorithm divides by the gradient norm.

Deprecated

  • Deprecate `get_grad_equality_constraints(M, o, p)`; use `get_grad_equality_constraint(M, o, p, :)`
    with the more flexible indexing instead.
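
A short migration sketch for this deprecation, with `M`, an objective `o`, and a point `p` as assumed context:

```julia
# before (deprecated in this release):
Xs = get_grad_equality_constraints(M, o, p)
# after: a colon selects all constraints via the flexible indexing
Xs = get_grad_equality_constraint(M, o, p, :)
# single indices and ranges work as well
X1 = get_grad_equality_constraint(M, o, p, 1)
```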

@JuliaRegistrator
Registration pull request created: JuliaRegistries/General/108549

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

git tag -a v0.4.64 -m "<description of version>" 2943e0d7098bfef31ef25411ada25fbe90d773b5
git push origin v0.4.64
