Optimizers
Class BFGS

java.lang.Object
  extended by Optimizers.Optimizer
      extended by Optimizers.BFGS

public abstract class BFGS
extends Optimizer

BFGS minimizer for multivariate functions as described in Numerical Recipes (NR). Code clarity is preferred over efficiency; the code in NR is more compact and consequently more efficient. Here the view is taken that the bulk of the computational effort is spent on evaluations of the function to be minimized.

Warning: the standard BFGS algorithm chooses a step size which finds the local minimum in one step if the function is a parabola. This implementation instead always takes a step of length stepmax in the current direction and then performs a line search backward toward the last point. Why do we do that? The functions we are trying to minimize are of the form f(u)=E[X(u)], where X(u) is a random variable depending on the parameter vector u and E denotes the expectation as usual.

The function f(u) may very well be smooth, but unfortunately we cannot work with the true expectation and instead have to work with the Monte Carlo computed approximation

f_N(u) = (X_1(u) + ... + X_N(u)) / N,

where the X_j(u) are independent observations of X(u). This function will most likely not be smooth as a function of the parameter vector u. Instead, the parameter u will have to move by at least a minimal threshold amount before the finite sample of the X_j(u) registers a difference. We should therefore think of f_N(u) as blocky: instead of a smooth parabola we will have a sequence of concentric flat terraces descending toward the minimum.

Obviously we have to choose a step size which gets us off each terrace. Likewise we have to compute the function gradients with variable increments of sufficient size lest we find ourselves with a zero gradient despite being far from the local minimum.
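To make the terracing concrete, consider the following minimal sketch (everything here is a hypothetical illustration, not part of this class): with an indicator-type payoff X(u), the Monte Carlo average f_N is a genuine staircase in u, flat between the sample points, so a gradient computed with a too-small increment is identically zero.

import java.util.Random;

/** Hypothetical illustration, not part of this class: a Monte Carlo
 *  objective f_N(u) = (X_1(u)+...+X_N(u))/N over a fixed sample. */
class TerracedObjective {

    private final double[] z;   // fixed sample of standard normal draws

    TerracedObjective(int N, long seed) {
        Random rng = new Random(seed);
        z = new double[N];
        for (int j = 0; j < N; j++) z[j] = rng.nextGaussian();
    }

    /** f_N(u) with the indicator payoff X(u) = 1_{u + Z > 0}: a step
     *  function of u which is constant between the sample points -z_j. */
    double f(double u) {
        int count = 0;
        for (double zj : z) if (u + zj > 0) count++;
        return (double) count / z.length;
    }
}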


Field Summary
static double EPS
           
static double EPSG
           
static double EPSX
           
static double KF
           
 
Constructor Summary
BFGS(double[] x, int nVals, double stepmax, double[] h, boolean verbose, boolean fullInitialization)
          Full initialization if the flag fullInitialization is set to true.
 
Method Summary
 void bfgsUpdate()
          The BFGS update of the approximate inverse Hessian.
 double[] getD()
          Current direction.
 double[] getGrad()
          Current gradient.
 double[] getX()
          Current point.
 double[] gradcdF(double[] x)
          Computes the gradient of Optimizer.f(double[]) at the point x by central finite differencing from 2n points (x ± h[j]*e_j).
 double[] gradF(double[] x, double fx)
          Computes the gradient of Optimizer.f(double[]) at the point x by finite differencing against n other points (x + h[j]*e_j).
static void main(java.lang.String[] args)
          Test program.
 double relativeSize(double[] x, double[] y)
          Measure of the size of the vector x relative to the vector y, coordinate by coordinate rather than through (global) norms.
 double[] search()
          Unconstrained search for the minimum of the function Optimizer.f(double[]).
 void setInitialConditions()
          Sets function value, gradient and initial direction by making calls to Optimizer.f(double[]).
 
Methods inherited from class Optimizers.Optimizer
f
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EPSX

public static final double EPSX
See Also:
Constant Field Values

KF

public static final double KF
See Also:
Constant Field Values

EPSG

public static final double EPSG
See Also:
Constant Field Values

EPS

public static final double EPS
See Also:
Constant Field Values
Constructor Detail

BFGS

public BFGS(double[] x,
            int nVals,
            double stepmax,
            double[] h,
            boolean verbose,
            boolean fullInitialization)

Full initialization if the flag fullInitialization is set to true. If this flag is set to false, then all memory is allocated but no calls are made to the function Optimizer.f(double[]) to compute the initial function value and gradient. Use this option if a subclass has to initialize additional structure before it is able to call f. In this case the initialization can be completed by calling setInitialConditions() from the subclass constructor as soon as calls to f can be made.

Parameters:
x - starting point of min search
nVals - maximum number of function evaluations allowed
stepmax - maximum step length in line search
h - directional increments: gradient coordinate j is computed as (f(x+h[j]*e_j)-f(x-h[j]*e_j))/(2*h[j]).
verbose - messages about results
fullInitialization - see above.
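A hypothetical usage sketch: only the constructor signature, search() and the abstract Optimizer.f(double[]) are taken from this page; the anonymous subclass and all parameter values below are illustrative assumptions.

// Illustrative only: parameter values and the test function are assumptions.
double[] x0 = { -1.2, 1.0 };                 // starting point
double[] h  = { 1e-4, 1e-4 };                // directional increments
BFGS minimizer = new BFGS(x0, 500, 0.5, h, true, true) {
    public double f(double[] x) {            // function to minimize
        double a = 1 - x[0], b = x[1] - x[0] * x[0];
        return a * a + 100 * b * b;          // Rosenbrock function
    }
};
double[] xmin = minimizer.search();

If the subclass needs to allocate additional structure before f can be called, pass fullInitialization=false instead and call setInitialConditions() at the end of the subclass constructor.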
Method Detail

getGrad

public double[] getGrad()
Current gradient.


getX

public double[] getX()
Current point.


getD

public double[] getD()
Current direction.


relativeSize

public double relativeSize(double[] x,
                           double[] y)
Measure of the size of the vector x relative to the vector y, computed coordinate by coordinate rather than through (global) norms. Used for termination criteria.
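A plausible sketch of such a coordinate-wise measure (the exact scaling used by this class is an assumption here):

// Sketch: size of x relative to y, coordinate by coordinate. The
// scaling max(|y_j|, 1) guards against division by tiny coordinates;
// the actual implementation may differ in this detail.
public double relativeSize(double[] x, double[] y) {
    double r = 0;
    for (int j = 0; j < x.length; j++) {
        double rj = Math.abs(x[j]) / Math.max(Math.abs(y[j]), 1.0);
        if (rj > r) r = rj;
    }
    return r;
}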


bfgsUpdate

public void bfgsUpdate()
The BFGS update of the approximate inverse Hessian.
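For reference, the textbook form of this update, sketched under the assumption that the step s = x_new - x_old and the gradient change y = grad_new - grad_old are at hand (the helpers dot and multiply are hypothetical, and this need not be this class's exact code):

// H <- H + (1 + y'Hy/sy) * (s s')/sy - (H y s' + s y' H)/sy,  sy = s'y.
static void bfgsUpdate(double[][] H, double[] s, double[] y) {
    double sy = dot(s, y);                   // curvature s'y, must be > 0
    double[] Hy = multiply(H, y);            // H * y
    double yHy = dot(y, Hy);                 // y'H y
    int n = s.length;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            H[i][j] += (1 + yHy / sy) * s[i] * s[j] / sy
                     - (Hy[i] * s[j] + s[i] * Hy[j]) / sy;
}

static double dot(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
}

static double[] multiply(double[][] A, double[] v) {
    double[] w = new double[v.length];
    for (int i = 0; i < v.length; i++) w[i] = dot(A[i], v);
    return w;
}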


gradF

public double[] gradF(double[] x,
                      double fx)

Computes the gradient of Optimizer.f(double[]) at the point x by finite differencing against n other points (x + h[j]*e_j).

The gradient is crucial for this algorithm. If an exact form for the gradient is available, by all means use it and override this method.

The default implementation writes the result into the field gradient and returns a reference to this field. No new memory is allocated. It is not necessary to follow this pattern if the method is overridden.

Parameters:
x - point at which the gradient is computed.
fx - function value f(x).
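A minimal sketch of this one-sided differencing; reusing the known value fx means only n additional calls to f (the loop body is an assumption, while the fields gradient and h are described on this page):

// grad_j ≈ (f(x + h[j]*e_j) - f(x)) / h[j]; O(h) error, n extra calls.
public double[] gradF(double[] x, double fx) {
    for (int j = 0; j < x.length; j++) {
        double xj = x[j];
        x[j] = xj + h[j];                // perturb coordinate j
        gradient[j] = (f(x) - fx) / h[j];
        x[j] = xj;                       // restore coordinate j
    }
    return gradient;                     // reference to the field
}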

gradcdF

public double[] gradcdF(double[] x)

Computes the gradient of Optimizer.f(double[]) at the point x by central finite differencing from 2n points (x ± h[j]*e_j).

The gradient is crucial for this algorithm. If an exact form for the gradient is available, by all means use it and override this method.

The default implementation writes the result into the field gradient and returns a reference to this field. No new memory is allocated. It is not necessary to follow this pattern if the method is overridden.

Parameters:
x - point at which the gradient is computed.
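The analogous sketch for central differencing; twice as many calls to f, but the error is of order h^2 instead of h (again the loop body is an assumption):

// grad_j ≈ (f(x + h[j]*e_j) - f(x - h[j]*e_j)) / (2*h[j]); 2n calls.
public double[] gradcdF(double[] x) {
    for (int j = 0; j < x.length; j++) {
        double xj = x[j];
        x[j] = xj + h[j];
        double fPlus = f(x);
        x[j] = xj - h[j];
        double fMinus = f(x);
        gradient[j] = (fPlus - fMinus) / (2 * h[j]);
        x[j] = xj;                       // restore coordinate j
    }
    return gradient;                     // reference to the field
}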

setInitialConditions

public void setInitialConditions()
Sets function value, gradient and initial direction by making calls to Optimizer.f(double[]). Call this from a subclass constructor if the option fullInitialization==false was chosen in the constructor, to delay this action until more structure has been allocated in the subclass.


search

public double[] search()
Unconstrained search for the minimum of the function Optimizer.f(double[]).

Specified by:
search in class Optimizer
Returns:
the minimizing vector x.
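In outline, one iteration of the search proceeds as sketched below. This is the generic BFGS loop under the step policy described in the class comment; the helper names (lineSearch, negate, nEvals) and the termination test are assumptions, not this class's actual code.

// Generic BFGS loop: direction from the approximate inverse Hessian,
// full step of length stepmax followed by a backward line search,
// gradient refresh, Hessian update, termination test.
public double[] search() {
    while (nEvals < nVals) {
        double[] d = negate(multiply(H, grad));  // d = -H * grad
        lineSearch(d, stepmax);                  // updates x and fx
        grad = gradF(x, fx);                     // gradient at new point
        bfgsUpdate();                            // refresh inverse Hessian
        if (relativeSize(grad, x) < EPSG) break; // gradient small: stop
    }
    return x;
}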

main

public static void main(java.lang.String[] args)

Test program. A very nasty test function consisting of n=dim extremely narrow tubes T_j (tube T_j is centered along the x_j-axis: x_j arbitrary, all other coordinates = 0) intersecting at right angles. The minimizer must crawl along each valley, minimizing each variable separately.

Fine tuning of the directional increments h[j] used to compute the gradient is extremely important. If these are too small, no change in the function will be detected (zero gradient coordinate); if they are too large, the numerical gradient may be so inaccurate that it does not point into a descent direction.

The present example is already nasty enough to display the pathologies. Simply experiment with the parameters.
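One plausible construction of such a function (an illustration in the spirit of the test, not necessarily the function used in main): each g_j penalizes all coordinates except x_j very heavily, so g_j is small only inside a narrow tube along the x_j-axis, and the pointwise minimum glues the tubes together at right angles.

// Hypothetical tube function: minimum 0 at the origin, reachable only
// by crawling along one coordinate axis at a time.
static double tubes(double[] x) {
    final double K = 1e6;                    // wall steepness: tube width ~ 1/sqrt(K)
    double best = Double.POSITIVE_INFINITY;
    for (int j = 0; j < x.length; j++) {
        double g = x[j] * x[j];              // gentle slope along the axis
        for (int i = 0; i < x.length; i++)
            if (i != j) g += K * x[i] * x[i];  // steep walls off the axis
        best = Math.min(best, g);
    }
    return best;
}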