Comparing CG optimization with Optim.jl #219
-
I have the following code, which optimizes a generalized Rosenbrock function using both Manopt and Optim:

```julia
using BenchmarkTools, Profile
using Manifolds, Optim, Manopt

const p = [1.0, 100.0]

function rosenbrock(M::AbstractManifold, x)
    return rosenbrock(x)
end
function rosenbrock(x)
    val = zero(eltype(x))
    for i in 1:(length(x) - 1)
        val += (p[1] - x[i])^2 + p[2] * (x[i + 1] - x[i]^2)^2
    end
    return val
end
function rosenbrock_grad!(M::AbstractManifold, storage, x)
    # the first (Euclidean) part can be computed using AD tools
    rosenbrock_grad!(storage, x)
    # projection is needed because Riemannian optimizers expect
    # Riemannian gradients instead of Euclidean ones.
    return project!(M, storage, x, storage)
end
function rosenbrock_grad!(storage, x)
    storage .= 0.0
    for i in 1:(length(x) - 1)
        storage[i] += -2.0 * (p[1] - x[i]) - 4.0 * p[2] * (x[i + 1] - x[i]^2) * x[i]
        storage[i + 1] += 2.0 * p[2] * (x[i + 1] - x[i]^2)
    end
    return storage
end
function rosenbrock_grad(M, x)
    storage = similar(x)
    return rosenbrock_grad!(M, storage, x)
end
function test_cg()
    n_dims = 5
    M = Euclidean(n_dims)
    x0 = vcat(zeros(n_dims - 1), 1.0)
    x_opt = conjugate_gradient_descent(
        M,
        rosenbrock,
        rosenbrock_grad!,
        x0;
        evaluation=InplaceEvaluation(),
        stepsize=ArmijoLinesearch(M),
        coefficient=HagerZhangCoefficient(),
        stopping_criterion=StopAfterIteration(15),
        return_state=true,
    )
    return x_opt.p
end
function test_cg_optim()
    n_dims = 5
    x0 = vcat(zeros(n_dims - 1), 1.0)
    optimize(rosenbrock, rosenbrock_grad!, x0, ConjugateGradient())
end
```

Optim works fine:

```
julia> test_cg_optim()
 * Status: success

 * Candidate solution
    Final objective value:     4.077585e-18

 * Found with
    Algorithm:     Conjugate Gradient

 * Convergence measures
    |x - x'|               = 5.05e-11 ≰ 0.0e+00
    |x - x'|/|x'|          = 5.05e-11 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.55e-19 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.80e-02 ≰ 0.0e+00
    |g(x)|                 = 8.13e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    389
    f(x) calls:    976
    ∇f(x) calls:   592
```

but Manopt returns NaNs:

```
julia> test_cg()
5-element Vector{Float64}:
NaN
NaN
NaN
NaN
 NaN
```

A couple of leads:
EDIT: fixed the gradient code.
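(For anyone hitting the same thing: a quick way to catch such gradient typos is a finite-difference check against the objective. A minimal sketch using only the definitions from the snippet above; the helper name `check_gradient` and the step size `h = 1e-6` are my own choices:)

```julia
# Compare rosenbrock_grad! against central finite differences of rosenbrock.
# Assumes the definitions from the snippet above; h = 1e-6 is arbitrary.
function check_gradient(x; h=1e-6)
    g = similar(x)
    rosenbrock_grad!(g, x)
    for i in eachindex(x)
        xp = copy(x); xp[i] += h
        xm = copy(x); xm[i] -= h
        fd = (rosenbrock(xp) - rosenbrock(xm)) / (2h)
        @show i, g[i], fd, abs(g[i] - fd)
    end
end

check_gradient(rand(5))
```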
-
BTW, I'm using this branch of Manopt: #212.
-
If Armijo returns near-zero step sizes, that often indicates that the gradient (or the CG direction) is wrong and is no longer a descent direction.
Which line search does Optim use?
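(The descent-direction property is easy to probe directly: a direction `d` at `x` is a descent direction iff `⟨grad f(x), d⟩ < 0`. A minimal sketch using the functions from the first post; the point and direction are just examples, with `-grad` standing in for the CG direction one would actually want to test:)

```julia
using Manifolds, Manopt

# Check whether a candidate direction d is a descent direction at x,
# i.e. whether the Riemannian inner product ⟨grad f(x), d⟩ is negative.
M = Euclidean(5)
x = vcat(zeros(4), 1.0)
grad = rosenbrock_grad(M, x)
d = -grad  # steepest descent here; substitute the CG direction to test it
@show inner(M, x, grad, d) < 0
```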
-
It really seems to be their quite advanced HagerZhang step size, which I do not understand just from the code (and I have not checked the paper; it does check Wolfe conditions, though). So the approach might be worth a closer look.
Note that with debug output the solver returns a debug state, so it is better to use `get_solver_result`. The debug lines yield a bit of information during the iterations; the WolfePowell line search needs this stop check (which is more like a numerical fallback). The last line of debug and the solver result are what I would also expect for a tough example like Rosenbrock (which we definitely should and could add to ManoptExamples).
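(The concrete call referred to above seems to have been lost in this view; a sketch of what such a run could look like, where the debug settings and stopping criterion are my assumptions, not the exact ones from the original snippet:)

```julia
using Manifolds, Manopt

M = Euclidean(5)
x0 = vcat(zeros(4), 1.0)
st = conjugate_gradient_descent(
    M,
    rosenbrock,
    rosenbrock_grad!,
    x0;
    evaluation=InplaceEvaluation(),
    stepsize=WolfePowellLinesearch(M),
    coefficient=HagerZhangCoefficient(),
    stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(1e-8),
    debug=[:Iteration, :Cost, " ", :GradientNorm, "\n"],
    return_state=true,
)
x_opt = get_solver_result(st)  # unwraps the (debug) state to the result point
```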
-
Looking at the original paper of the line search used in Optim.jl – https://www.math.lsu.edu/~hozhang/papers/cg_compare.pdf – the line search is tailored very specifically to CG and is a bit technical, so it is no surprise that Optim.jl is very good with such a specific line search. I think this can be generalised to manifolds “easily”, in the sense that all the assumptions and values they use are easily statable on manifolds (with retractions, inverse retractions, and vector transports where needed), but it is still a bit technical to implement, I think.
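(For reference, the strong Wolfe conditions such a line search enforces can already be stated on a manifold with a retraction and a vector transport. A minimal sketch, not Manopt's internal implementation; the function name `strong_wolfe`, the signatures, and the constants `c1`, `c2` are my assumptions, matching the `f(M, x)` / `grad_f(M, x)` conventions from the first post:)

```julia
using Manifolds

# Strong Wolfe conditions at point x, direction d, step size α:
#   f(R_x(α d)) ≤ f(x) + c1 α ⟨grad f(x), d⟩        (sufficient decrease)
#   |⟨grad f(R_x(α d)), T(d)⟩| ≤ c2 |⟨grad f(x), d⟩| (curvature)
# where R is a retraction and T transports d to the candidate point.
function strong_wolfe(M, f, grad_f, x, d, α; c1=1e-4, c2=0.9)
    gx = grad_f(M, x)
    slope = inner(M, x, gx, d)            # directional derivative at x
    y = retract(M, x, α * d)              # candidate point R_x(α d)
    dy = vector_transport_to(M, x, d, y)  # d transported to T_y M
    sufficient_decrease = f(M, y) <= f(M, x) + c1 * α * slope
    curvature = abs(inner(M, y, grad_f(M, y), dy)) <= c2 * abs(slope)
    return sufficient_decrease && curvature
end
```

On `Euclidean(n)` this reduces to the usual Euclidean conditions, so it can be sanity-checked against the Rosenbrock example above.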
-
Sorry, I've made a typo in the gradient code. By coincidence it was correct at the one point where I checked it. Now it's... better, but not quite fine yet. I have to investigate a bit more before I bother you further 😉.