API · RAFF - Robust Algebraic Fitting Function

Summary

There are three main RAFF structures:

main functions: called by user;
auxiliary functions: used internally, but can be modified by the user;
output type: type defined to manipulate output information.

Main functions

RAFF.lmlovo — Function.

lmlovo(model::Function [, x::Vector{Float64} = zeros(n)], data::Array{Float64, 2},
       n::Int, p::Int [; kwargs...])

lmlovo(model::Function, gmodel!::Function [, x::Vector{Float64} = zeros(n)],
       data::Array{Float64,2}, n::Int, p::Int [; MAXITER::Int=200,
       ε::Float64=10.0^-4])

Fit the n-parameter model model to the data given by matrix data. The strategy is based on the LOVO function, which means that only p (0 < p <= rows of data) points are trusted. The Levenberg-Marquardt algorithm is implemented in this version.
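As an illustration of the LOVO idea, the sketch below evaluates the sum of the p smallest squared residuals of a model. It is independent of RAFF: the helper name `lovo_value` is hypothetical, and the real objective may differ, for example by a constant factor.

```julia
# Sum of the p smallest squared residuals of `model` at parameters x.
# Each row of `data` holds the argument in the first columns and the
# observation in the last column.
function lovo_value(model, x, data, p)
    nrows, ncols = size(data)
    residuals2 = [(model(x, data[i, 1:ncols-1]) - data[i, ncols])^2 for i in 1:nrows]
    return sum(partialsort(residuals2, 1:p))
end

line = (x, t) -> x[1] * t[1] + x[2]

# Four points on y = 2t + 1 and one gross outlier in the last row.
data = [0.0 1.0; 1.0 3.0; 2.0 5.0; 3.0 7.0; 4.0 100.0]

trusted = lovo_value(line, [2.0, 1.0], data, 4)   # outlier is not counted
full    = lovo_value(line, [2.0, 1.0], data, 5)   # outlier is counted
```

With p = 4 the outlier does not contribute to the value, while with p = 5 it dominates it.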

Matrix data is the data to be fit. This matrix should be in the form

t11 t12 ... t1N y1
t21 t22 ... t2N y2
:

where N is the dimension of the argument of the model (i.e. dimension of t).
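A minimal sketch of assembling such a matrix for N = 2, using made-up observations. The variable names are illustrative only.

```julia
# Hypothetical observations with a two-dimensional argument t = (t1, t2).
t1 = [0.0, 1.0, 2.0]
t2 = [1.0, 1.0, 3.0]
y  = [1.0, 3.0, 8.0]

# One row per observation: argument columns first, observed value last.
data = hcat(t1, t2, y)

N = size(data, 2) - 1          # dimension of the argument t
t_row2 = view(data, 2, 1:N)    # argument of the second observation, a SubArray
y_row2 = data[2, end]          # observed value of the second observation
```

The `view` call produces a `SubArray`, which is one of the argument types accepted by the model signature.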

If x is provided, then it is used as the starting point.

The signature of function model should be given by

model(x::Vector{Float64}, t::Union{Vector{Float64}, SubArray})

where x is a n-dimensional vector of parameters and t is the argument. If the gradient of the model gmodel!

gmodel!(x::Vector{Float64}, t::Union{Vector{Float64}, SubArray},
        g::Vector{Float64})

is not provided, then the function ForwardDiff.gradient! is called to compute it. Note that this choice has an impact on the computational performance of the algorithm. In addition, if ForwardDiff is being used, then one MUST remove the type annotation of vector x from the model, because ForwardDiff evaluates the model with dual numbers rather than Float64 values.
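The following sketch shows a model written without a type annotation on x, together with a hand-written gradient in the gmodel!(x, t, g) form. The exponential model and the finite-difference helper `fd_gradient` are hypothetical and only serve to check the gradient; nothing here calls RAFF or ForwardDiff.

```julia
# No type annotation on x, so vectors of any number type are accepted.
model(x, t) = x[1] * exp(x[2] * t[1])

# Gradient of the model with respect to x, written in place into g.
function gmodel!(x, t, g)
    g[1] = exp(x[2] * t[1])
    g[2] = x[1] * t[1] * exp(x[2] * t[1])
    return g
end

# Central finite differences, used here only as a sanity check.
function fd_gradient(f, x, t; h=1.0e-6)
    g = zeros(length(x))
    for i in eachindex(x)
        e = zeros(length(x))
        e[i] = h
        g[i] = (f(x .+ e, t) - f(x .- e, t)) / (2h)
    end
    return g
end

x0 = [2.0, 0.5]
t0 = [1.0]
g = zeros(2)
gmodel!(x0, t0, g)
gap = maximum(abs.(g .- fd_gradient(model, x0, t0)))
```

A small value of `gap` suggests that the analytic gradient agrees with the model.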

The optional arguments are

MAXITER: maximum number of iterations
ε: tolerance for the gradient of the function

Returns a RAFFOutput object.


RAFF.raff — Function.

raff(model::Function, data::Array{Float64, 2}, n::Int; MAXMS::Int=1,
     SEEDMS::Int=123456789, initguess=zeros(Float64, n))

raff(model::Function, gmodel!::Function, data::Array{Float64, 2}, n::Int;
     [MAXMS::Int=1, SEEDMS::Int=123456789, initguess=zeros(Float64, n),
      kwargs...])

Robust Algebraic Fitting Function (RAFF) algorithm. This function uses a voting system to automatically find the number of trusted data points to fit the model.

model: function to fit data. Its signature should be given by
```
model(x, t)
```
where x is a n-dimensional vector of parameters and t is the multidimensional argument
gmodel!: gradient of the model function. Its signature should be given by
```
gmodel!(x, t, g)
```
where x is a n-dimensional vector of parameters, t is the multidimensional argument and the gradient is written in g.
data: data to be fit. This matrix should be in the form
```
t11 t12 ... t1N y1
t21 t22 ... t2N y2
:
```
where N is the dimension of the argument of the model (i.e. dimension of t).
n: dimension of the parameter vector in the model function

The optional arguments are

MAXMS: number of multistart points to be used
SEEDMS: integer seed for random multistart points
initguess: a good guess for the starting point and for generating random points in the multistart strategy
ε: gradient stopping criterion passed to lmlovo

Returns a RAFFOutput object with the best parameter found.
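To make the roles of MAXMS, SEEDMS and initguess concrete, here is a sketch of a seeded multistart generator. It is not RAFF's implementation: the helper `multistart_points`, the `spread` keyword and the choice of normal perturbations around initguess are all assumptions.

```julia
using Random

# Hypothetical helper: returns MAXMS starting points, the first one being
# initguess itself and the others random perturbations of it.
function multistart_points(initguess, MAXMS, SEEDMS; spread=10.0)
    rng = MersenneTwister(SEEDMS)
    points = [copy(initguess)]
    for _ in 2:MAXMS
        push!(points, initguess .+ spread .* randn(rng, length(initguess)))
    end
    return points
end

a = multistart_points(zeros(2), 4, 123456789)
b = multistart_points(zeros(2), 4, 123456789)
```

Because the generator is seeded, two calls with the same seed yield the same starting points, which makes runs reproducible.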


RAFF.praff — Function.

praff(model::Function, data::Array{Float64, 2}, n::Int; MAXMS::Int=1,
      SEEDMS::Int=123456789, batches::Int=1, initguess=zeros(Float64, n),
      ε=1.0e-4)

praff(model::Function, gmodel!::Function, data::Array{Float64, 2}, n::Int;
      MAXMS::Int=1, SEEDMS::Int=123456789, batches::Int=1,
      initguess=zeros(Float64, n), ε::Float64=1.0e-4)

Multicore distributed version of RAFF. See the description of the raff function for the main (non-optional) arguments. All the communication is performed by channels.

This function uses all available local workers to run the RAFF algorithm. Note that this function does not use Tasks, so all the parallelism is based on the Distributed package.

The optional arguments are

MAXMS: number of multistart points to be used
SEEDMS: integer seed for random multistart points
batches: size of batches to be sent to each worker
initguess: starting point to be used in the multistart procedure
ε: stopping tolerance

Returns a RAFFOutput object containing the solution.
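The effect of a batch size can be pictured with the sketch below. It assumes that the jobs are candidate values of p and that they are split into contiguous groups; how praff actually forms its batches is not stated here.

```julia
# Hypothetical range of candidate values for p and a batch size of 4.
pvalues = 5:14
batches = 4

# Contiguous groups of at most `batches` values; the last may be shorter.
jobs = collect(Iterators.partition(pvalues, batches))
```

Larger batches mean fewer messages between processes, at the cost of coarser load balancing.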


RAFF.setRAFFOutputLevel — Function.

setRAFFOutputLevel(level::LogLevel)

Set the output level of the raff and praff algorithms to the desired logging level. Options are (from highly verbose to just errors): Logging.Debug, Logging.Info, Logging.Warn and Logging.Error. The package Logging needs to be loaded.

Defaults to Logging.Error.
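The ordering of the levels, and what a minimum level means in practice, can be seen with the standard Logging package alone. This sketch does not call RAFF and says nothing about how RAFF applies the level internally.

```julia
using Logging

# From highly verbose to just errors.
levels = [Logging.Debug, Logging.Info, Logging.Warn, Logging.Error]
ordered = issorted(levels)

# A logger that only lets messages of level Warn or higher through.
buf = IOBuffer()
logger = ConsoleLogger(buf, Logging.Warn)
with_logger(logger) do
    @info "hidden"
    @warn "shown"
end
output = String(take!(buf))
```

Only the warning reaches the buffer, since Info is below the chosen minimum level.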


RAFF.setLMOutputLevel — Function.

setLMOutputLevel(level::LogLevel)

Set the output level of the lmlovo algorithm to the desired logging level. Options are (from highly verbose to just errors): Logging.Debug, Logging.Info, Logging.Warn and Logging.Error. The package Logging needs to be loaded.

Defaults to Logging.Error.


Auxiliary functions

RAFF.eliminate_local_min! — Function.

eliminate_local_min!(sols::Vector{RAFFOutput})

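Check that the function value of a solution found with a smaller value of p is not greater than the value found with a larger one. Since adding trusted points can only increase the optimal LOVO value, a greater value for a smaller p certainly indicates that only a local minimizer was found for that p.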
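The check can be sketched on plain named tuples. The values below are made up, and the sketch only flags suspicious entries; what eliminate_local_min! does with them is not described here.

```julia
# Hypothetical per-p results: number of trusted points p and objective value f.
sols = [(p = 6, f = 0.8), (p = 7, f = 5.2), (p = 8, f = 1.1), (p = 9, f = 1.3)]

# Indices of solutions whose value exceeds that of some solution with larger p.
flagged = [i for i in eachindex(sols)
           if any(s -> s.p > sols[i].p && s.f < sols[i].f, sols)]
```

Here the entry with p = 7 is flagged, because the solution with p = 8 trusts more points and still attains a smaller value.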


RAFF.SortFun! — Function.

This function is an auxiliary function. It finds the p smallest values of vector V and brings them to the first p positions. The indexes associated with the p smallest values are stored in ind.
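A similar selection can be sketched with Julia's built-in `partialsortperm`. Unlike SortFun!, this sketch leaves V untouched and returns the indices instead of moving values in place.

```julia
V = [4.0, 0.5, 9.0, 1.5, 0.1]
p = 3

# Indices of the p smallest values of V, ordered by increasing value.
ind = collect(partialsortperm(V, 1:p))
smallest = V[ind]
```

Only the first p positions are sorted, which avoids the cost of a full sort when p is much smaller than the length of V.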


RAFF.update_best — Function.

update_best(channel::RemoteChannel, bestx::SharedArray{Float64, 1})

Listen to a channel for results found by lmlovo. If there is an improvement for the objective function, the shared array bestx is updated.

Attention: There might be an unstable state if there is a process reading bestx while this function is updating it. This should not be a problem, since it is used as a starting point.

Attention 2: this function is currently out of use.


RAFF.consume_tqueue — Function.

consume_tqueue(bqueue::RemoteChannel, tqueue::RemoteChannel,
               squeue::RemoteChannel, model::Function, gmodel!::Function,
               data::Array{Float64, 2}, n::Int, pliminf::Int,
               plimsup::Int, MAXMS::Int, seedMS::MersenneTwister)

This function represents one worker, which runs lmlovo in a multistart fashion.

It takes a job from the RemoteChannel tqueue and runs the lmlovo function on it. It might run using a multistart strategy, if MAXMS>1. It sends the best results found for each value obtained in tqueue to channel squeue, which will be consumed by the main process. All the other arguments are the same as for the praff function.
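The task queue and solution queue pattern can be sketched with the Distributed standard library. Everything below is a simplified stand-in that runs in a single process: `fake_fit` replaces the call to lmlovo, `consume` is a hypothetical name, and there is no multistart or best-solution queue.

```julia
using Distributed

# Hypothetical job: the fit for a given p is replaced by a cheap stand-in.
fake_fit(p) = (p = p, f = Float64(p)^2)

# Take jobs until the task queue is closed and empty, sending each result
# to the solution queue.
function consume(tqueue, squeue)
    while true
        p = try
            take!(tqueue)
        catch e
            e isa InvalidStateException || rethrow()
            break   # queue is closed and has no jobs left
        end
        put!(squeue, fake_fit(p))
    end
end

tqueue = RemoteChannel(() -> Channel{Int}(8))
squeue = RemoteChannel(() -> Channel{Any}(8))

for p in 1:3
    put!(tqueue, p)
end
close(tqueue)

consume(tqueue, squeue)
results = [take!(squeue) for _ in 1:3]
```

In a real distributed run the consumer would execute on worker processes, and the main process would collect the results from the solution queue.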


RAFF.check_and_close — Function.

check_and_close(bqueue::RemoteChannel, tqueue::RemoteChannel,
                squeue::RemoteChannel, futures::Vector{Future};
                secs::Float64=0.1)

Check if there is at least one worker process in the vector of futures that has not prematurely finished. If there is no alive worker, close the task, solution and best queues (tqueue, squeue and bqueue, respectively).


RAFF.generateTestProblems — Function.

generateTestProblems(datFilename::String, solFilename::String,
                     model::Function, modelStr::String, n::Int,
                     np::Int, p::Int)

Generate random data files for testing fitting problems.

datFilename and solFilename are strings with the name of the files for storing the random data and solution, respectively.
model is the model function and modelStr is a string representing this model function, e.g.
```
 model = (x, t) -> x[1] * t[1] + x[2]
 modelStr = "(x, t) -> x[1] * t[1] + x[2]"
```
where vector x represents the parameters (to be found) of the model and vector t are the variables of the model.
n is the number of parameters
np is the number of points to be generated.
p is the number of trusted points to be used in the LOVO approach.
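A string form of the model is useful because it can be written to a text file and turned back into a function later. The sketch below shows that round trip with standard Julia tools. Whether RAFF reloads models this way is an assumption.

```julia
modelStr = "(x, t) -> x[1] * t[1] + x[2]"

# Parse the string into an expression and evaluate it to get a function.
model = eval(Meta.parse(modelStr))

# invokelatest avoids world-age issues when calling a freshly evaluated function.
value = Base.invokelatest(model, [2.0, 1.0], [3.0])
```

The recovered function evaluates the same expression as the one written directly in code.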


RAFF.get_unique_random_points — Function.

get_unique_random_points(np::Int, npp::Int)

Choose exactly npp unique random points from a set containing np points. This function is similar to rand(vector), but does not allow repetitions.

Return a vector with the selected points.
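One way to obtain such a selection is to truncate a random permutation, as in the sketch below. The helper name `pick_unique` is hypothetical and this is not necessarily how get_unique_random_points is implemented.

```julia
using Random

# Hypothetical helper: npp distinct indices drawn from 1:np.
function pick_unique(rng, np, npp)
    @assert 0 <= npp <= np
    return randperm(rng, np)[1:npp]
end

pts = pick_unique(MersenneTwister(42), 10, 4)
```

Every index appears exactly once in a permutation, so the first npp entries cannot contain repetitions.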


RAFF.generateNoisyData — Function.

generateNoisyData(model::Function, n::Int, np::Int, p::Int;
                  tMin::Float64=-10.0, tMax::Float64=10.0,
                  xSol::Vector{Float64}=10.0 * randn(Float64, n),
                  std::Float64=200.0, outTimes::Float64=7.0)

generateNoisyData(model::Function, n, np, p, tMin::Float64, tMax::Float64)

generateNoisyData(model::Function, n::Int, np::Int, p::Int,
                  xSol::Vector{Float64}, tMin::Float64, tMax::Float64)

Randomly generate a one-dimensional data fitting problem.

This function receives a model(x, t) function, the number of parameters n, the number of points np to be generated and the number of trusted points p.

If the n-dimensional vector xSol is provided, then the exact solution will not be randomly generated. The interval [tMin, tMax] for generating the values to evaluate model can also be provided.

It returns a tuple (data, xSol, outliers) where

data: (np x 2) array, where each row contains t and model(xSol, t).
xSol: n-dimensional vector with the exact solution.
outliers: the outliers of this data set
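The shape of such a generator can be sketched as follows. This is not RAFF's code: the helper `noisy_data_sketch`, the equally spaced values of t, the normal noise of size std on trusted points and the displacement of exactly outTimes * std on outliers are all assumptions made for illustration.

```julia
using Random

# Hypothetical generator returning (data, xSol, outliers).
function noisy_data_sketch(model, xSol, np, p; tMin=-10.0, tMax=10.0,
                           std=200.0, outTimes=7.0, rng=MersenneTwister(0))
    t = collect(range(tMin, tMax; length=np))
    # The np - p points that will not follow the model.
    outliers = sort(randperm(rng, np)[1:(np - p)])
    data = zeros(np, 2)
    for i in 1:np
        y = model(xSol, t[i])
        if i in outliers
            y += outTimes * std * rand(rng, [-1.0, 1.0])   # assumed gross displacement
        else
            y += std * randn(rng)                          # assumed normal noise
        end
        data[i, 1] = t[i]
        data[i, 2] = y
    end
    return data, xSol, outliers
end

line = (x, t) -> x[1] * t[1] + x[2]
data, xSol, outliers = noisy_data_sketch(line, [2.0, 1.0], 20, 18; std=1.0)
```

With 20 points and 18 trusted ones, two rows are marked as outliers and lie far from the line compared with the noise level.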


Output type

RAFF.RAFFOutput — Type.

This type defines the output of the RAFF algorithm.

RAFFOutput(status::Int, solution::Vector{Float64}, iter::Int,
           p::Int, f::Float64, outliers::Vector{Int})

where

status: is 1 if converged and 0 if not
solution: vector with the parameters of the model
iter: number of iterations up to convergence
p: number of trusted points
f: the residual value
outliers: the possible outliers detected by the method, for the given p

RAFFOutput()

Creates a null version of output, equivalent to RAFFOutput(0, [], -1, 0, Inf, [])

RAFFOutput(p::Int)

Creates a null version of output for the given p.
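To show how these fields and null constructors might be used together, the sketch below defines a stand-in type with the same field names. `FitOutput` is hypothetical and is not RAFF's definition; the sample results are made up.

```julia
# Hypothetical stand-in mirroring the fields listed above.
struct FitOutput
    status::Int
    solution::Vector{Float64}
    iter::Int
    p::Int
    f::Float64
    outliers::Vector{Int}
end

# Null versions: not converged, empty solution, infinite residual.
FitOutput(p::Int) = FitOutput(0, Float64[], -1, p, Inf, Int[])
FitOutput() = FitOutput(0)

results = [FitOutput(1, [2.0, 1.0], 12, 8, 0.3, [4, 9]),
           FitOutput(9),
           FitOutput(1, [1.9, 1.2], 20, 10, 4.1, Int[])]

# Keep converged results and pick the one with the smallest residual value.
converged = filter(r -> r.status == 1, results)
best = converged[argmin([r.f for r in converged])]
```

Null entries carry status 0 and an infinite f, so they are easy to filter out or to use as placeholders before a fit has run.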
