| Hyper-Ellipsoidal Conjugate Gradient Descent |
|---|
Please report bugs and errors to Roland Vollgraf.
If you use this software, please cite:
| [1] | R. Vollgraf and K. Obermayer, Sparse optimization for second order kernel methods. in IJCNN 2006 Conference Proceedings, in press. (Abstract HTML, FTP PDF) |
Use this function to minimize a differentiable cost function subject to one positive definite quadratic constraint and none or a number of linear orthogonality constraints. The function is particularly suited to perform sparse optimization for second order constrained optimization problems
[w,elg,plg] = HECGD(w, C, D, be, W)
minimizes
w'*D*w + be*sum(abs(w))^2/(w'*w)
s.t.
w'*C*w ==1
w'*C*W ==0
| w | is an Nx1 vector of initial weights |
| C | is a square, symmetric, and positive definite matrix, and |
| D | is an arbitrary square matrix |
| W | is an NxM matrix with M<N (otherwise there would be no solution), or [] if there are no orthogonality constraints |
| be | >=0 weigths the strength of the regularization term |
use
[w,elg,plg] = HECGD(w, C, D, be, W, opt)
to pass a structure of user specified parameters. opt can have the following fields:
| field | admissible values | default meaning | |
| 'Nit' | integer >0 | 100 | maximum number of iterarions |
| 'Nit_inner' | integer >0 | 500 | maximum number of line search iterarions |
| 'Nit_restart' | integer >=0 | 0 | conjugate gradient reset period (0=never) |
| 'reduction' | boolean | false | clamp elements of w to zero that come closer to zero than tol(3). |
| 'tol' | 1x3 double>=0 | [1e-4 1e-4 1e-3] | termination tolerance values The optimization terminates when the changes in w are below tol(1) and the changes in the cost function are below tol(2). The maximum number of iterations is Nit. If a weight falls below tol(3) it is discarded if reduction is activated |
| 'verbose' | boolean | true | print status information |
| 'symmetry' | boolean | true | specifies whether the error function is symmetric on the line search ellipse. This affects the maximum search interval, which is either 2*pi for non-symmetric functions or otherwise pi |
| 'errfun' | function handle | @errsokm | user specified unconstrained error/gradient function. See errsokm for description of the interface |
compute 10 sparse kernel principal components from an NxT data set X
1. compute the TxT kernel matrix between all pairs of data pointsE = errsokm( w, D, be) computes the regularized error function E=w'*D*w+be*sum(abs(w))^2/sum(w^2)
[E,G] = errsokm( w, D, be) also returns the gradient wrt. w