Robust loss functions for outlier rejection #332

Open
Affie opened this issue Dec 15, 2023 · 8 comments

Comments
@Affie
Contributor

Affie commented Dec 15, 2023

Is it possible to use robust loss functions such as Huber, Tukey, and adaptive losses in Manopt.jl (specifically with RLM)?
Similar to the Ceres solver: http://ceres-solver.org/nnls_modeling.html#lossfunction
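
For reference, in the Ceres convention ρ acts on the squared residual norm s = ‖fᵢ(p)‖². A minimal Julia sketch of the Huber case, with purely illustrative names (not Manopt.jl or Ceres API):

```julia
# Huber loss ρ(s) on the squared residual norm s = ‖fᵢ(p)‖² (Ceres convention);
# `a` is the outlier threshold. Illustrative sketch only.
huber_rho(s, a=1.0) = s <= a^2 ? s : 2a * sqrt(s) - a^2
# Its derivative ρ'(s), needed later to reweight residual and Jacobian blocks.
huber_rho_prime(s, a=1.0) = s <= a^2 ? one(s) : a / sqrt(s)
```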

@mateuszbaran
Member

Not at the moment, but it looks like a relatively simple modification of RLM to support it. The rescaling described in http://ceres-solver.org/nnls_modeling.html#theory looks like it would work for RLM too.
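
A hedged first-order sketch of that rescaling (dropping the ρ''-dependent correction Ceres also describes): scaling a residual block and its Jacobian block by √ρ'(‖fᵢ‖²) makes the ordinary least-squares gradient J̃ᵢᵀf̃ᵢ equal to ρ'(s) Jᵢᵀfᵢ, the gradient of the robustified term. Function names here are placeholders, not Manopt.jl API.

```julia
# First-order robust rescaling of one residual block f_i and its Jacobian J_i.
# rho_prime is ρ', e.g. s -> huber_rho_prime(s, 1.0) from the sketch above.
function rescale_block!(f_i::AbstractVector, J_i::AbstractMatrix, rho_prime)
    s = sum(abs2, f_i)      # squared residual norm ‖fᵢ‖²
    w = sqrt(rho_prime(s))  # weight √ρ'(s): 1 for inliers, < 1 for outliers (Huber)
    f_i .*= w               # rescaled residual f̃ᵢ = w fᵢ
    J_i .*= w               # rescaled Jacobian J̃ᵢ = w Jᵢ, so J̃ᵢᵀ f̃ᵢ = ρ'(s) Jᵢᵀ fᵢ
    return f_i, J_i
end
```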

@mateuszbaran
Member

@kellertuer do you think it would be better to add robustification support to the current implementation of RLM or make a separate implementation?

@kellertuer
Member

kellertuer commented Dec 15, 2023

In the Exact Penalty Method (EPM), this is included in the method itself, so I think that would be fitting here as well. We even have types already for different kinds of relaxations (e.g. Huber).

If we extend those and use them in RLM as well, I think that would be great.

edit: to provide a link, we currently have https://manoptjl.org/stable/solvers/exact_penalty_method/#Manopt.SmoothingTechnique – those could either be extended or combined into a common framework of robustification / smoothing.
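
For context, the smoothing there replaces the nonsmooth max(0, x) that appears in the exact penalty with a 1D surrogate; one common Huber-type (linear-quadratic) form, sketched only for illustration and with a hypothetical name, is:

```julia
# Linear-quadratic (Huber-like) smoothing of max(0, x) with smoothing parameter u > 0;
# illustrative sketch of the kind of 1D function a SmoothingTechnique stands for.
function lq_huber(x, u)
    x <= 0 && return zero(x)     # no penalty on the feasible side
    x <= u && return x^2 / (2u)  # quadratic near the kink at 0
    return x - u / 2             # linear beyond u, joining continuously at x = u
end
```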

@mateuszbaran
Member

Cool 👍. RLM seems to need a somewhat different interface, though.

@kellertuer
Member

Sure, no problem – maybe we can also revise the EPM interface a bit so that we have a common one then.

@mateuszbaran mentioned this issue May 5, 2024
@kellertuer
Member

I revisited this, and for the most general case (outside of RLM) I have no real idea how that could be done, or what that would even mean for an algorithm like Douglas-Rachford, for example.

Within RLM, one could really just store the ρ function mentioned in the docs linked above and apply that element-wise.
It would probably be best to have that as a HessianObjective (though a bit verbose for a 1D function) to store its derivative and second derivative as well, since the Jacobian is affected by this choice through the chain rule.

Ah, I am not so sure we need the second derivative for now? We only use first-order information in RLM at the moment, I think. Then, besides that field, only get_cost and get_jacobian! for the NonlinearLeastSquaresObjective have to be adapted.
And probably a GradientObjective (a function and its derivative in the 1D case) is enough flexibility.

@mateuszbaran
Member

> I revisited this, and for the most general case (outside of RLM) I have no real idea how that could be done, or what that would even mean for an algorithm like Douglas-Rachford, for example.

I don't think it can work for any loss function, only for those without splitting, for nonlinear least squares, and most likely for stochastic optimization, though I couldn't find any papers on what the Euclidean variant would be. Maybe let's tackle each case separately?

> Within RLM, one could really just store the ρ function mentioned in the docs linked above and apply that element-wise.
> It would probably be best to have that as a HessianObjective (though a bit verbose for a 1D function) to store its derivative and second derivative as well, since the Jacobian is affected by this choice through the chain rule.
> Ah, I am not so sure we need the second derivative for now? We only use first-order information in RLM at the moment, I think. Then, besides that field, only get_cost and get_jacobian! for the NonlinearLeastSquaresObjective have to be adapted.
> And probably a GradientObjective (a function and its derivative in the 1D case) is enough flexibility.

As far as I can tell, ρ is not applied elementwise to the output of f but only after the squared norm is computed. Also, maybe it would be most natural to store ρ in NonlinearLeastSquaresObjective.
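
To make that concrete, here is a hedged sketch (stand-in names, not the actual NonlinearLeastSquaresObjective fields) of the robustified cost F(p) = Σᵢ ρ(‖fᵢ(p)‖²) and the chain-rule gradient contribution of one block:

```julia
# Robustified cost: ρ is applied to the squared norm of each residual block,
# not elementwise to the entries of f. `residuals` is a vector of residual blocks.
robust_cost(residuals, ρ) = sum(ρ(sum(abs2, f_i)) for f_i in residuals)

# Chain rule for one block: ∇ₚ ρ(‖fᵢ(p)‖²) = 2 ρ'(‖fᵢ‖²) Jᵢᵀ fᵢ,
# which is where an adapted get_jacobian! / gradient picks up the ρ' factor.
robust_grad_block(f_i, J_i, dρ) = 2 * dρ(sum(abs2, f_i)) * (J_i' * f_i)
```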

In an earlier post you had the idea of combining robust loss functions with SmoothingTechnique; maybe we can do that instead of representing it as a HessianObjective?

@kellertuer
Member

You are right, in full generality that might not be possible, so let's just do it for RLM for now.

Yes, storing ρ in the NonlinearLeastSquaresObjective would be my idea for now as well, but we also have to store its derivative, so that is why I was thinking about storing it as a gradient objective to stay generic.

Yes, the smoothing techniques are basically the same, though there we handle that by storing a symbol (and only support 2 functions). I am not yet sure how well this can be combined, but it would most probably be the EPM that gets adapted to the approach here (since its symbol-based approach is far more restrictive than storing the smoothing function).
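
As a rough illustration of the "store ρ together with its derivative" idea in a gradient-objective spirit (a hypothetical type, not Manopt.jl API), something like the following could be a field of the nonlinear least squares objective and also back a revised EPM smoothing interface:

```julia
# Pair of the 1D robustifier ρ and its derivative ρ' (hypothetical, for illustration).
struct RobustLoss{F,G}
    ρ::F    # loss applied to the squared residual norm s = ‖fᵢ‖²
    dρ::G   # its derivative ρ'(s)
end

# e.g. built from the Huber sketch above:
huber = RobustLoss(s -> huber_rho(s, 1.0), s -> huber_rho_prime(s, 1.0))
```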
