An optimisation method is how we optimise an [[objective function]]. In the context of classification, it is a method to search through a hypothesis class $\mathcal{H}$ for the hypothesis that minimises a chosen loss.
However, using 0-1 loss directly is not always feasible. See [[loss functions and convex optimisation]].
- Draw a 'large enough' set of samples $(X, Y)^n$.
- Output a hypothesis $h \in \mathcal{H}$ which minimises disagreements with $(X, Y)^n$.
The best we can do to learn a classifier from training data is empirical risk minimisation (as opposed to minimising the 'expected risk', which is ideal but not achievable via statistical methods, as discussed in [[objective function]]).
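The procedure above can be sketched in code. This is a minimal illustration, assuming a toy finite hypothesis class of threshold classifiers on $[0, 1]$ (the class, data, and helper names are invented for the example); ERM simply returns the hypothesis with the fewest 0-1 disagreements on the drawn sample.

```python
import random

def make_threshold(t):
    # hypothetical hypothesis: predict 1 iff x >= t
    return lambda x: 1 if x >= t else 0

# finite hypothesis class H: thresholds 0.0, 0.1, ..., 1.0
hypothesis_class = [make_threshold(t / 10) for t in range(11)]

def empirical_risk(h, samples):
    # fraction of sample points the hypothesis misclassifies (0-1 loss)
    return sum(h(x) != y for x, y in samples) / len(samples)

def erm(hypotheses, samples):
    # output the hypothesis in H minimising disagreements with the sample
    return min(hypotheses, key=lambda h: empirical_risk(h, samples))

# toy 'large enough' sample: true labels come from threshold 0.5, no noise
random.seed(0)
samples = [(x, 1 if x >= 0.5 else 0) for x in (random.random() for _ in range(100))]

best = erm(hypothesis_class, samples)
print(empirical_risk(best, samples))
```

Because the true threshold 0.5 is in the class and the sample is noiseless, the minimiser achieves zero empirical risk here; with noisy labels or a misspecified class, the minimum would be strictly positive.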