| Title: | Automated Parameter Estimation for Complex Models |
|---|---|
| Description: | General optimisation and specific tools for the parameter estimation (i.e. calibration) of complex models, including stochastic ones. It implements generic functions that can be used for fitting any type of models, especially those with non-differentiable objective functions, with the same syntax as 'stats::optim()'. It supports multiple phases estimation (sequential parameter masking), constrained optimization (bounding box restrictions) and automatic parallel computation of numerical gradients. Some common maximum likelihood estimation methods and automated construction of the objective function from simulated model outputs is provided. See <https://roliveros-ramos.github.io/calibrar/> for more details. |
| Authors: | Ricardo Oliveros-Ramos [aut, cre] (ORCID: <https://orcid.org/0000-0002-8069-2101>) |
| Maintainer: | Ricardo Oliveros-Ramos <[email protected]> |
| License: | GPL-2 |
| Version: | 0.9.0.9011 |
| Built: | 2026-05-22 09:20:39 UTC |
| Source: | https://github.com/roliveros-ramos/calibrar |
Automated Calibration for Complex Models
calibrar package: Automated Calibration for Complex Models
This package allows the parameter estimation (i.e. calibration) of complex models, including stochastic ones. It implements generic functions that can be used for fitting any type of models, especially those with non-differentiable objective functions, with the same syntax as base::optim. It supports multiple phases estimation (sequential parameter masking), constrained optimization (bounding box restrictions) and automatic parallel computation of numerical gradients. Some common maximum likelihood estimation methods and automated construction of the objective function from simulated model outputs is provided. See <https://roliveros-ramos.github.io/calibrar/> for more details.
Ricardo Oliveros-Ramos Maintainer: Ricardo Oliveros-Ramos <[email protected]>
calibrar: an R package for the calibration of ecological models (Oliveros-Ramos and Shin 2014)
Useful links:
Report bugs at https://github.com/roliveros-ramos/calibrar/issues
## Not run: require(calibrar) set.seed(880820) path = NULL # NULL to use the current directory # create the demonstration files demo = calibrar_demo(model="PoissonMixedModel", L=5, T=100) # get calibration information calibrationInfo = calibration_setup(file=demo$path) # get observed data observed = calibration_data(setup=calibrationInfo, path=demo$path) # read forcings for the model forcing = read.csv(file.path(demo$path, "master", "environment.csv"), row.names=1) # Defining 'runModel' function runModel = function(par, forcing) { output = calibrar:::.PoissonMixedModel(par=par, forcing=forcing) # adding gamma parameters for penalties output = c(output, list(gammas=par$gamma)) return(output) } # real parameters cat("Real parameters used to simulate data\n") print(demo$par) # objective functions obj = calibration_objFn(model=runModel, setup=calibrationInfo, observed=observed, forcing=forcing) cat("Starting calibration...\n") control = list(weights=calibrationInfo$weights, maxit=3.6e5) # control parameters cat("Running optimization algorithms\n", "\t", date(), "\n") cat("Running optim AHR-ES\n") ahr = calibrate(par=demo$guess, fn=obj, lower=demo$lower, upper=demo$upper, control=control) summary(ahr) ## End(Not run)## Not run: require(calibrar) set.seed(880820) path = NULL # NULL to use the current directory # create the demonstration files demo = calibrar_demo(model="PoissonMixedModel", L=5, T=100) # get calibration information calibrationInfo = calibration_setup(file=demo$path) # get observed data observed = calibration_data(setup=calibrationInfo, path=demo$path) # read forcings for the model forcing = read.csv(file.path(demo$path, "master", "environment.csv"), row.names=1) # Defining 'runModel' function runModel = function(par, forcing) { output = calibrar:::.PoissonMixedModel(par=par, forcing=forcing) # adding gamma parameters for penalties output = c(output, list(gammas=par$gamma)) return(output) } # real parameters cat("Real parameters used to simulate data\n") print(demo$par) # objective functions obj = calibration_objFn(model=runModel, setup=calibrationInfo, observed=observed, forcing=forcing) cat("Starting calibration...\n") control = list(weights=calibrationInfo$weights, maxit=3.6e5) # control parameters cat("Running optimization algorithms\n", "\t", date(), "\n") cat("Running optim AHR-ES\n") ahr = calibrate(par=demo$guess, fn=obj, lower=demo$lower, upper=demo$upper, control=control) summary(ahr) ## End(Not run)
Get an specific argument from the command line
.get_command_argument( x, argument, prefix = "--", default = FALSE, verbose = FALSE ).get_command_argument( x, argument, prefix = "--", default = FALSE, verbose = FALSE )
x |
The command line arguments, from |
argument |
The name of the argument. |
prefix |
The prefix to any argument of interest, the default is "–" |
default |
Default value to return is argument is missing, default to FALSE. |
verbose |
Boolean, if TRUE, shows a warning when the parameter is not found. |
The value of the argument, assumed to be followed after '=' or, TRUE if nothing but the argument was found. If the argument is not found, FALSE is returned.
.get_command_argument(commandArgs(), "interactive") .get_command_argument(commandArgs(), "RStudio") .get_command_argument(commandArgs(), "RStudio", prefix="") .get_command_argument(commandArgs(), "vanilla") .get_command_argument("--control.file=baz.txt", "control.file").get_command_argument(commandArgs(), "interactive") .get_command_argument(commandArgs(), "RStudio") .get_command_argument(commandArgs(), "RStudio", prefix="") .get_command_argument(commandArgs(), "vanilla") .get_command_argument("--control.file=baz.txt", "control.file")
File is expected to have lines of the form 'key SEP value' where key is the name of the parameter, SEP a separator (can be '=' ',', ';') and value the value of the parameter itself. The SEP for each line is determined and parameters values are returned as a list.
.read_configuration( file, recursive = TRUE, keep.names = TRUE, conf.key = NULL, ... ).read_configuration( file, recursive = TRUE, keep.names = TRUE, conf.key = NULL, ... )
file |
File to read the configuration |
recursive |
Should 'conf.key' keys be read as additional configuration files? Default is TRUE. |
keep.names |
Should names be kept as they are? By default, are converted to lower case. |
conf.key |
String indicating the leading key to find an additional configuration file. |
... |
Additional arguments, not currently in use. |
This function performs the optimization of a function using the Adaptative Hierarchical Recombination Evolutionary Strategy (AHR-ES, Oliveros & Shin, 2015).
ahres( par, fn, gr = NULL, ..., lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )ahres( par, fn, gr = NULL, ..., lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )
par |
A numeric vector or list. The length of |
fn |
The objective function to be minimised. It should accept a parameter vector (or list, depending on the wrapper) as first argument and return either a scalar value (single-objective) or a numeric vector (multi-objective). |
gr |
A function computing the gradient of |
... |
Additional arguments passed to |
lower |
Lower bounds for parameters. One value or a vector of the same length as
|
upper |
Upper bounds for parameters. One value or a vector of the same length as
|
active |
Boolean vector of the same length as |
control |
A list of control options. Common options include |
hessian |
Logical. Should a numerically differentiated Hessian matrix be returned? Currently not implemented. |
parallel |
Logical. Enable parallel computation (e.g., for replicated evaluations
and/or numerical gradients) using up to |
A list with components:
The best set of parameters found.
The value of fn corresponding to par.
A two-element integer vector giving the number of calls to fn and
gr respectively. This excludes those calls needed to compute the Hessian, if
requested, and any calls to fn to compute a finite-difference approximation to the
gradient.
An integer code. 0 indicates successful completion.
A character string giving any additional information returned by the
optimizer, or NULL.
Only if argument hessian is TRUE. A symmetric matrix giving an
estimate of the Hessian at the solution found. Note that this is the Hessian of the
unconstrained problem even if the box constraints are active.
Ricardo Oliveros-Ramos
Other optimisers:
calibrate(),
optim2(),
optimh()
## Not run: ahres(par=rep(1, 5), fn=sphereN)## Not run: ahres(par=rep(1, 5), fn=sphereN)
Creates demo files able to be processed for a full calibration using the calibrar package
calibrar_demo(path = NULL, model = NULL, ...)calibrar_demo(path = NULL, model = NULL, ...)
path |
Path to create the demo files |
model |
Model to be used in the demo files, see details. |
... |
Additional parameters to be used in the construction of the demo files. |
Current implemented models are:
Poisson Autoregressive Mixed model for the dynamics of a population in different sites:
where is the size of the population in site at year ,
is the value of an environmental variable in site at year .
The parameters to estimate were , , and , the
random effects for each year, , and the initial
population at each site . We assumed that the observations
follow a Poisson distribution with mean .
Lotka Volterra Predator-Prey model. The model is defined by a system of ordinary differential equations for the abundance of prey $N$ and predator $P$:
The parameters to estimate are the prey’s growth rate , the predator’s
mortality rate , the carrying capacity of the prey and
and for the predation interaction. Uses deSolve package
for numerical solution of the ODE system.
Susceptible-Infected-Recovered epidemiological model. The model is defined by a system of ordinary differential equations for the number of susceptible $S$, infected $I$ and recovered $R$ individuals:
The parameters to estimate are the average number of contacts per person per
time and the instant probability of an infectious individual
recovering . Uses deSolve package for numerical solution of the ODE system.
Stochastic Individual Based Model for Lotka-Volterra model. Uses ibm package for the simulation.
A list with the following elements:
path |
Path were the files were saved |
par |
Real value of the parameters used in the demo |
setup |
Path to the calibration setup file |
guess |
Values to be provided as initial guess to the calibrate function |
lower |
Values to be provided as lower bounds to the calibrate function |
upper |
Values to be provided as upper bounds to the calibrate function |
phase |
Values to be provided as phases to the calibrate function |
constants |
Constants used in the demo, any other variable not listed here. |
value |
NA, set for compatibility with summary methods. |
time |
NA, set for compatibility with summary methods. |
counts |
NA, set for compatibility with summary methods. |
Ricardo Oliveros–Ramos
Oliveros-Ramos and Shin (2014)
## Not run: summary(ahr) set.seed(880820) path = NULL # NULL to use the current directory # create the demonstration files demo = calibrar_demo(path=path, model="PredatorPrey", T=100) # get calibration information calibration_settings = calibration_setup(file = demo$setup) # get observed data observed = calibration_data(setup = calibration_settings, path=demo$path) # Defining 'run_model' function run_model = calibrar:::.PredatorPreyModel # real parameters cat("Real parameters used to simulate data\n") print(unlist(demo$par)) # parameters are in a list # objective functions obj = calibration_objFn(model=run_model, setup=calibration_settings, observed=observed, T=demo$T) obj2 = calibration_objFn(model=run_model, setup=calibration_settings, observed=observed, T=demo$T, aggregate=TRUE) cat("Starting calibration...\n") cat("Running optimization algorithms\n", "\t") cat("Running optim AHR-ES\n") ahr = calibrate(par=demo$guess, fn=obj, lower=demo$lower, upper=demo$upper, phases=demo$phase) summary(ahr) ## End(Not run)## Not run: summary(ahr) set.seed(880820) path = NULL # NULL to use the current directory # create the demonstration files demo = calibrar_demo(path=path, model="PredatorPrey", T=100) # get calibration information calibration_settings = calibration_setup(file = demo$setup) # get observed data observed = calibration_data(setup = calibration_settings, path=demo$path) # Defining 'run_model' function run_model = calibrar:::.PredatorPreyModel # real parameters cat("Real parameters used to simulate data\n") print(unlist(demo$par)) # parameters are in a list # objective functions obj = calibration_objFn(model=run_model, setup=calibration_settings, observed=observed, T=demo$T) obj2 = calibration_objFn(model=run_model, setup=calibration_settings, observed=observed, T=demo$T, aggregate=TRUE) cat("Starting calibration...\n") cat("Running optimization algorithms\n", "\t") cat("Running optim AHR-ES\n") ahr = calibrate(par=demo$guess, fn=obj, lower=demo$lower, upper=demo$upper, phases=demo$phase) summary(ahr) ## End(Not run)
calibrate() minimises an objective function fn for the calibration of
complex (possibly expensive and stochastic) models. It supports sequential calibration
in multiple phases (progressively activating parameters), replicated evaluations for
stochastic objectives, restartable runs, and parallel execution patterns for
computationally intensive models.
calibrate( par, fn, gr, ..., method, lower, upper, phases, control, hessian, replicates, parallel ) ## Default S3 method: calibrate( par, fn, gr = NULL, ..., method = NULL, lower = NULL, upper = NULL, phases = NULL, control = list(), hessian = FALSE, replicates = 1, parallel = FALSE )calibrate( par, fn, gr, ..., method, lower, upper, phases, control, hessian, replicates, parallel ) ## Default S3 method: calibrate( par, fn, gr = NULL, ..., method = NULL, lower = NULL, upper = NULL, phases = NULL, control = list(), hessian = FALSE, replicates = 1, parallel = FALSE )
par |
A numeric vector or list. The length of |
fn |
The objective function to be minimised. It should accept a parameter vector (or list, depending on the wrapper) as first argument and return either a scalar value (single-objective) or a numeric vector (multi-objective). |
gr |
A function computing the gradient of |
... |
Additional arguments passed to |
method |
Optimisation method(s) to be used. Can be a single method name or a
vector of method names (e.g., one per phase). If |
lower |
Lower bounds for parameters. One value or a vector of the same length as
|
upper |
Upper bounds for parameters. One value or a vector of the same length as
|
phases |
Optional integer vector of the same length as |
control |
A list of control options. Common options include |
hessian |
Logical. Should a numerically differentiated Hessian matrix be returned? Currently not implemented. |
replicates |
Integer or integer vector controlling the number of replicate
evaluations of |
parallel |
Logical. Enable parallel computation (e.g., for replicated evaluations
and/or numerical gradients) using up to |
Sequential phases. When phases is provided, parameters are activated
progressively across phases. In phase , parameters with phases <= k are
active and estimated, while all others are held fixed. If phases is omitted, all
parameters are estimated in a single phase.
Replicated evaluations. The argument replicates controls the number of
replicate evaluations of fn per phase. It can be a single integer (applied to all
phases) or a vector of length equal to the number of phases. Replication is useful for
stochastic models to reduce Monte Carlo noise and/or to target robust parameter sets.
Default method selection. If method is not provided, calibrate()
defaults to "Rvmmin" when all(replicates == 1) (deterministic objective),
and to "AHR-ES" otherwise (replicated/stochastic objective).
Multi-objective outputs. The objective function fn may return a scalar
(single-objective) or a numeric vector of objective components (multi-objective).
Multi-objective optimisation is currently supported only by methods in
multiMethods (e.g., "AHR-ES"). For single-objective methods, fn must be
scalar; alternatively, objective components can be aggregated into a scalar by defining
an aggregated objective in calibration_objFn(aggregate = TRUE).
Aggregation. When fn is created via calibration_objFn(), metadata
such as the number of components (nvar) and component weights (weights) can
be stored as attributes. If aggregate = TRUE, fn is scalar and can be used
with any method. If aggregate = FALSE (vector output), only multi-objective methods
can use it directly.
Parallel execution and run directories. If parallel = TRUE,
calibrate() may distribute replicate evaluations and/or finite-difference
gradient computations across multiple cores. The number of cores is controlled by
control$ncores. For file-based or externally executed models, control$master
and control$run can be used to manage a master/template directory and per-run
working directories (created if needed).
Restart. Partial results can be written and used to resume a calibration via
control$restart.file. Restart functionality is currently available only for
"AHR-ES", "Rvmmin", and "hjn".
An object of class "calibrar.results" with components:
Best parameter values found (returned with the same structure as input par).
Objective value at par.
Number of calls to fn and gr (where applicable).
Convergence code returned by the underlying optimiser.
Additional information returned by the optimiser, if any.
The optimisation method used in the final phase.
The objective function.
Logical vector indicating which parameters were active (estimated).
Elapsed time for the full calibration run.
Tracing information and per-phase outputs (when available).
For smooth deterministic objectives, gradient-based methods such as "Rvmmin",
"L-BFGS-B", "LBFGSB3", or "nlminb" are often efficient (with box
constraints when needed). For noisy or stochastic objectives (typically when
replicates > 1), heuristic/global methods such as "AHR-ES" are generally
more appropriate. Not all methods are directly comparable (local vs.\ global,
deterministic vs.\ stochastic, scalar vs.\ vector-valued objectives); method choice
should reflect the structure of fn and the presence of constraints.
"SANN" is included for compatibility with stats::optim(), but it is highly
sensitive to tuning and often performs poorly on continuous problems under default
settings. For stochastic objectives, "AHR-ES" is the intended default within
calibrate().
Ricardo Oliveros-Ramos
calibration_setup, calibration_data,
calibration_objFn
Other optimisers:
ahres(),
optim2(),
optimh()
calibrate(par=rep(NA, 5), fn=sphereN) ## Not run: calibrate(par=rep(NA, 5), fn=sphereN, replicates=3) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=3, lower=-5, upper=5) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=3, lower=-5, upper=5, phases=c(1,1,1,2,3)) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=c(1,1,4), lower=-5, upper=5, phases=c(1,1,1,2,3)) ## End(Not run)calibrate(par=rep(NA, 5), fn=sphereN) ## Not run: calibrate(par=rep(NA, 5), fn=sphereN, replicates=3) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=3, lower=-5, upper=5) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=3, lower=-5, upper=5, phases=c(1,1,1,2,3)) calibrate(par=rep(0.5, 5), fn=sphereN, replicates=c(1,1,4), lower=-5, upper=5, phases=c(1,1,1,2,3)) ## End(Not run)
Create a list with the observed data with the information provided by its main argument.
calibration_data(setup, path = ".", file = NULL, verbose = TRUE, ...)calibration_data(setup, path = ".", file = NULL, verbose = TRUE, ...)
setup |
A data.frame with the information about the calibration,
normally created with the |
path |
Path to the directory to look up for the data. Paths in setup are considered relatives to this path. |
file |
Optional file to save the created object (as an 'rds' file.) |
verbose |
If TRUE, detailed messages of the process are printed. |
... |
Additional arguments to |
A list with the observed data needed for a calibration, to be used
in combination with the calibration_objFn.
Ricardo Oliveros-Ramos
calibration_objFn, calibration_setup.
Create a new function, to be used as the objective function in the calibration, given a function to run the model within R, observed data and information about the comparison with data.
calibration_objFn(model, setup, observed, aggFn = NULL, aggregate = FALSE, ...)calibration_objFn(model, setup, observed, aggFn = NULL, aggregate = FALSE, ...)
model |
Function to run the model and produce a list of outputs. |
setup |
A data.frame with the information about the calibration,
normally created with the |
observed |
A list of the observed variables created with the
function |
aggFn |
A function to aggregate |
aggregate |
boolean, if TRUE, a scalar value is returned using the
|
... |
More arguments passed to the |
A function, integrating the simulation of the model and the comparison with observed data.
Ricardo Oliveros-Ramos
calibration_data, calibration_setup.
calibrar package.A wrapper for read.csv checking column names and data types
for the table with the calibration information.
calibration_setup(file, control = list(), ...)calibration_setup(file, control = list(), ...)
file |
The file with the calibration information, see details. |
control |
Control arguments for generating the setup. See details. |
... |
Additional arguments to |
A data.frame with the information for the calibration of a
model, to be used with the calibration_objFn
and calibration_data.
Ricardo Oliveros-Ramos
calibration_objFn, calibration_data.
Calculate a discretization of the 2D Gaussian Kernel
gaussian_kernel(par, lower, upper, n = 10, checkSymmetry = TRUE, ...)gaussian_kernel(par, lower, upper, n = 10, checkSymmetry = TRUE, ...)
par |
A list, including the mean and covariance matrix. |
lower |
A vector, indicating the lower bound for the calculation. |
upper |
A vector, indicating the upper bound for the calculation. |
n |
The number of cells for each dimension, can be one or two numbers. |
checkSymmetry |
TRUE by default, checks if the covariance matrix is symmetric. |
... |
Additional arguments, currently not used. |
A list, with 'x', 'y' and 'z' components.
This function calculates the gradient of a function, numerically, including the possibility of doing it in parallel.
gradient(fn, x, method, control, parallel, ...)gradient(fn, x, method, control, parallel, ...)
fn |
The function to calculate the gradient. |
x |
The value to compute the gradient at. |
method |
The method used. Currently implemented: central, backward, forward and Richardson. See details. |
control |
A list of control arguments. |
parallel |
Boolean, should numerical derivatives be calculated in parallel? |
... |
Additional arguments to be passed to |
The gradient of fn at x.
gradient(fn=function(x) sum(x^3), x=0)gradient(fn=function(x) sum(x^3), x=0)
Compute an objective value (error, negative log-likelihood, and/or penalty) given observed data ('obs') and model outputs ('sim'). This function is a thin dispatcher: it resolves 'FUN' via [match.fun()] and returns 'FUN(obs = obs, sim = sim, ...)'.
objFn(obs, sim, FUN, ...) fitness(obs, sim, FUN, ...)objFn(obs, sim, FUN, ...) fitness(obs, sim, FUN, ...)
obs |
Observed data as expected by 'FUN'. Typically a numeric vector, matrix, or array. Missing values ('NA') are permitted; see *Missing values* below. |
sim |
Simulated data matching 'obs', in the sense expected by 'FUN'. For most pointwise criteria (e.g. 'norm2', 'lnorm2', 'pois'), 'sim' should have the same shape as 'obs'. For composition criteria (e.g. 'multinom'), 'obs' and 'sim' are expected to be matrices with rows representing samples and columns representing classes. |
FUN |
Objective function to apply. Can be:
The function must accept arguments named 'obs' and 'sim' (additional arguments may be supplied via '...') and must return a single numeric scalar. |
... |
Additional arguments forwarded to 'FUN'. |
The returned value is intended to be **minimised** by an optimiser: lower values indicate a better match between 'sim' and 'obs' (or a lower penalty).
A numeric scalar: the value of 'FUN(obs = obs, sim = sim, ...)'. By convention this is an objective to be **minimised**.
**Minimisation**: all provided objectives are formulated so that lower values indicate a better fit (or weaker penalty).
**Shapes**: 'objFn()' does not reshape or recycle data. It is the caller's responsibility to supply 'obs' and 'sim' in a compatible form for the chosen 'FUN'.
**Scalar output**: 'FUN' should return a length-one numeric value. Returning vectors is not supported by 'objFn()' itself (although higher level workflows may aggregate vector-valued objectives elsewhere).
Most built-in objectives stop if 'all(is.na(obs))'. Otherwise, they ignore missing values using 'na.rm = TRUE' inside 'sum()'/'mean()'. This means:
partial 'NA's in 'obs' are ignored;
'NA's in 'sim' will also be dropped from sums when 'na.rm = TRUE' (possibly masking simulation failures if not checked upstream).
Some objectives impose additional constraints:
**Poisson** ('pois'): uses 'log(sim)'; 'sim' must be strictly positive wherever 'obs > 0' (otherwise '-Inf' may occur). Consider flooring the simulated intensity, e.g. 'sim <- pmax(sim, 1e-12)'.
**Log-scale** ('lnorm2', 'lnorm3', 'lnorm4', 'lnorm4b'): apply 'log(x + tiny)'; both 'obs + tiny' and 'sim + tiny' must be positive. If your data may contain zeros, 'tiny' should be chosen accordingly.
**Compositions** ('multinom'): expects matrix inputs and uses row sums to convert counts/weights into proportions. See details below.
The following functions are available with their current behaviour. All are formulated as objectives to be minimised.
norm2Sum of squared errors on the original scale:
Typical use: continuous observations with approximately additive errors.
lnorm2Sum of squared errors on the log scale:
Arguments: 'tiny' (default '1e-2'), added before the log to avoid 'log(0)'. Typical use: positive-valued data with multiplicative (lognormal) error structure.
lnorm3Log-scale squared error with an estimated multiplicative scaling factor
. Internally:
compute element-wise ratios 'ratio <- obs/sim';
set 'NaN' ratios to 'NA';
estimate 'q <- mean(ratio, na.rm = TRUE)';
return .
Typical use: positive data where an overall multiplicative bias (scale mismatch) is expected and should not be fully penalised.
lnorm4 / lnorm4b
Extensions of 'lnorm3' that add a penalty term to discourage extreme
values of the scaling factor (or extreme per-observation ratios).
They rely on the helper rangeq():
compute 'ratio <- obs/sim', estimate 'q <- mean(ratio, na.rm=TRUE)';
compute a penalty using parameters 'b' and 'c':
when 'dump = TRUE' (used by 'lnorm4'), or
when 'dump = FALSE' (used by 'lnorm4b'), where 'n' is the number of non-missing ratios.
add the penalty to the 'lnorm3'-style objective.
Arguments: 'tiny', 'b' (default '1'), 'c' (default '2'). Typical use: log-scale fitting where scale drift must be controlled.
poisPoisson negative log-likelihood (up to constants):
Typical use: counts (or count-like rates) with Poisson observation error. Note: 'sim' must be positive where 'obs > 0'.
multinomA composition (multinomial-like) objective operating on matrices. Inputs are expected as 'obs' and 'sim' matrices with:
rows = samples (e.g. time steps, hauls, sites),
columns = classes (e.g. age/size bins, categories).
Internal steps (high-level):
Let be the number of classes ('A <- ncol(sim)').
Rows of 'sim' that are all zeros (excluding rows that are all 'NA') are replaced by '1' on that row (interpreted as a uniform prior).
Row sums are used to compute proportions:
, (row-wise).
Rows with 'sum(sim) == 0' are set to 'NA' for numerical convenience.
Rows with 'sum(obs) == 0' are set to 'NA' (interpreted as “no proportion data available”).
A variance term 'sigma2' and a small stabiliser 'tiny' are used to define an objective that penalises discrepancies between 'Pobs' and 'Psim'.
Arguments: 'size' (default '20') and 'tiny' (default '1e-3'). Interpretation: 'size' plays the role of an effective sample size (larger values typically increase the weight of the composition fit).
normp / re
Pure penalty on simulated values:
This ignores 'obs' and can be used as a regulariser, or when 'sim' represents a residual vector or deviates already computed upstream. 're' is an alias of 'normp'.
penaltyScaled quadratic penalty:
Arguments: 'n' (default '100'). This assumes a fixed sample size and can be used to put the penalty on a comparable scale across datasets.
You can supply any custom function via 'FUN' provided it:
accepts arguments named 'obs' and 'sim' (plus optional '...');
returns a length-one numeric scalar to be minimised;
defines its own parameter checks and missing-value policy.
match.fun
## Basic squared-error objective obs <- c(1, 2, 3, NA, 5) sim <- c(1.2, 1.9, 2.7, 4.0, 5.1) objFn(obs, sim, FUN = "norm2") ## Log-scale objective (positive data) obs <- c(0.1, 1, 10) sim <- c(0.2, 0.9, 11) objFn(obs, sim, FUN = lnorm2, tiny = 1e-2) ## Poisson objective (counts) with flooring for numerical safety obs <- c(0, 3, 10, 2) sim <- c(0, 2.5, 9.8, 1.9) sim <- pmax(sim, 1e-12) objFn(obs, sim, FUN = "pois") ## Composition objective (matrices: rows = samples, cols = classes) obs <- rbind(c(10, 5, 0), c( 0, 0, 0), # interpreted as “no composition data” c( 2, 1, 7)) sim <- rbind(c( 9, 6, 1), c( 0, 0, 0), # replaced internally by sim+1 on that row c( 1, 2, 6)) objFn(obs, sim, FUN = "multinom", size = 20, tiny = 1e-3) ## Custom objective function my_obj <- function(obs, sim, ...) { if (all(is.na(obs))) stop("All observed values are NA.") sum(abs(obs - sim), na.rm = TRUE) # L1 error } objFn(obs = c(1, 2, NA), sim = c(1.1, 1.7, 3), FUN = my_obj)## Basic squared-error objective obs <- c(1, 2, 3, NA, 5) sim <- c(1.2, 1.9, 2.7, 4.0, 5.1) objFn(obs, sim, FUN = "norm2") ## Log-scale objective (positive data) obs <- c(0.1, 1, 10) sim <- c(0.2, 0.9, 11) objFn(obs, sim, FUN = lnorm2, tiny = 1e-2) ## Poisson objective (counts) with flooring for numerical safety obs <- c(0, 3, 10, 2) sim <- c(0, 2.5, 9.8, 1.9) sim <- pmax(sim, 1e-12) objFn(obs, sim, FUN = "pois") ## Composition objective (matrices: rows = samples, cols = classes) obs <- rbind(c(10, 5, 0), c( 0, 0, 0), # interpreted as “no composition data” c( 2, 1, 7)) sim <- rbind(c( 9, 6, 1), c( 0, 0, 0), # replaced internally by sim+1 on that row c( 1, 2, 6)) objFn(obs, sim, FUN = "multinom", size = 20, tiny = 1e-3) ## Custom objective function my_obj <- function(obs, sim, ...) { if (all(is.na(obs))) stop("All observed values are NA.") sum(abs(obs - sim), na.rm = TRUE) # L1 error } objFn(obs = c(1, 2, NA), sim = c(1.1, 1.7, 3), FUN = my_obj)
optim2() provides a unified interface to multiple deterministic and stochastic
optimisation methods, combining the algorithms available through stats::optim()
with a small set of additional solvers accessed via calibrar's internal dispatcher.
optim2( par, fn, gr = NULL, ..., method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", "Brent", "nlm", "nlminb", "Rcgmin", "Rvmmin", "hjn", "spg", "LBFGSB3", "AHR-ES"), lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )optim2( par, fn, gr = NULL, ..., method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", "Brent", "nlm", "nlminb", "Rcgmin", "Rvmmin", "hjn", "spg", "LBFGSB3", "AHR-ES"), lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )
par |
A numeric vector or list. The length of |
fn |
The objective function to be minimised. It should accept a parameter vector (or list, depending on the wrapper) as first argument and return either a scalar value (single-objective) or a numeric vector (multi-objective). |
gr |
A function computing the gradient of |
... |
Additional arguments passed to |
method |
Optimisation method(s) to be used. Can be a single method name or a
vector of method names (e.g., one per phase). If |
lower |
Lower bounds for parameters. One value or a vector of the same length as
|
upper |
Upper bounds for parameters. One value or a vector of the same length as
|
active |
Boolean vector of the same length as |
control |
A list of control options. Common options include |
hessian |
Logical. Should a numerically differentiated Hessian matrix be returned? Currently not implemented. |
parallel |
Logical. Enable parallel computation (e.g., for replicated evaluations
and/or numerical gradients) using up to |
Methods. The current selection includes
(i) base stats::optim() methods, (ii) additional gradient-based solvers from external
packages, and (iii) heuristic/global methods.
Comparability. Not all methods are directly comparable: some are deterministic local optimisers (e.g., quasi-Newton), others are derivative-free local searches, and others are stochastic/global heuristics. Method choice should reflect the objective function (smooth vs. non-smooth, deterministic vs. stochastic) and the presence of constraints.
Control arguments. optim2() standardises a set of common control arguments
(e.g., iteration limits, tolerances, tracing, and scaling) where supported, while still
allowing method-specific control parameters to be passed through to the underlying solver.
Parallel numerical gradients. When analytic gradients are not provided,
optim2() can compute finite-difference numerical gradients and distribute function
evaluations across multiple cores, which can substantially reduce wall-clock time for
expensive objective functions.
A list with components:
The best set of parameters found.
The value of fn corresponding to par.
A two-element integer vector giving the number of calls to fn and
gr respectively. This excludes those calls needed to compute the Hessian, if
requested, and any calls to fn to compute a finite-difference approximation to the
gradient.
An integer code. 0 indicates successful completion.
A character string giving any additional information returned by the
optimizer, or NULL.
Only if argument hessian is TRUE. A symmetric matrix giving an
estimate of the Hessian at the solution found. Note that this is the Hessian of the
unconstrained problem even if the box constraints are active.
For smooth deterministic objectives, quasi-Newton methods (e.g., "BFGS" or
"L-BFGS-B" with box constraints) are often efficient when gradients are available or
can be reliably approximated. For box-constrained problems, consider "L-BFGS-B" or
the extended bounded solvers (e.g., "Rvmmin", "spg"). For noisy or stochastic
objectives, heuristic/global methods (e.g., "AHR-ES") may be more appropriate.
"SANN" is included for compatibility with stats::optim(), but it is highly
sensitive to tuning and often performs poorly on continuous problems under default
settings. For noisy or rugged objective functions, "AHR-ES" is generally the
recommended heuristic alternative within optim2().
Ricardo Oliveros-Ramos
Other optimisers:
ahres(),
calibrate(),
optimh()
optim2(par=rep(NA, 5), fn=sphereN)optim2(par=rep(NA, 5), fn=sphereN)
General-purpose optimization using heuristic algorithms
optimh( par, fn, gr = NULL, ..., method = c("AHR-ES", "Nelder-Mead", "SANN", "hjn", "bobyqa", "CMA-ES", "genSA", "DE", "soma", "genoud", "PSO", "hybridPSO", "mads", "hjk", "hjkb", "nmk", "nmkb"), lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )optimh( par, fn, gr = NULL, ..., method = c("AHR-ES", "Nelder-Mead", "SANN", "hjn", "bobyqa", "CMA-ES", "genSA", "DE", "soma", "genoud", "PSO", "hybridPSO", "mads", "hjk", "hjkb", "nmk", "nmkb"), lower = -Inf, upper = +Inf, active = NULL, control = list(), hessian = FALSE, parallel = FALSE )
par |
A numeric vector or list. The length of |
fn |
The objective function to be minimised. It should accept a parameter vector (or list, depending on the wrapper) as first argument and return either a scalar value (single-objective) or a numeric vector (multi-objective). |
gr |
Function to compute the gradient of |
... |
Additional arguments passed to |
method |
Optimisation method(s) to be used. Can be a single method name or a
vector of method names (e.g., one per phase). If |
lower |
Lower bounds for parameters. One value or a vector of the same length as
|
upper |
Upper bounds for parameters. One value or a vector of the same length as
|
active |
Boolean vector of the same length as |
control |
A list of control options. Common options include |
hessian |
Logical. Should a numerically differentiated Hessian matrix be returned? Currently not implemented. |
parallel |
Logical. Enable parallel computation (e.g., for replicated evaluations
and/or numerical gradients) using up to |
A list with components:
The best set of parameters found.
The value of fn corresponding to par.
A two-element integer vector giving the number of calls to fn and
gr respectively. This excludes those calls needed to compute the Hessian, if
requested, and any calls to fn to compute a finite-difference approximation to the
gradient.
An integer code. 0 indicates successful completion.
A character string giving any additional information returned by the
optimizer, or NULL.
Only if argument hessian is TRUE. A symmetric matrix giving an
estimate of the Hessian at the solution found. Note that this is the Hessian of the
unconstrained problem even if the box constraints are active.
Ricardo Oliveros-Ramos
Other optimisers:
ahres(),
calibrate(),
optim2()
optim2(par=rep(NA, 5), fn=sphereN)optim2(par=rep(NA, 5), fn=sphereN)
This function calculates the Euclidian distance from a point to the origin after a random displacement of its position.
sphereN(x, sd = 0.1, aggregate = TRUE)sphereN(x, sd = 0.1, aggregate = TRUE)
x |
The coordinates of the point |
sd |
The standard deviation of the noise
to be added to the position of |
aggregate |
If |
The distance from the point x to the origin after a random
displacement.
Ricardo Oliveros–Ramos
sphereN(rep(0, 10))sphereN(rep(0, 10))
Predict time-varying parameters using splines.
spline_par(par, n, knots = NULL, periodic = FALSE, period = NULL)spline_par(par, n, knots = NULL, periodic = FALSE, period = NULL)
par |
Values at knots |
n |
Number of points. Time (independent variable) is assumed to be between 0 and n with length(par) equidistant points (including 0 and n). |
knots |
Position of knots. Default, is length(x) equidistant points between 0 and 1. Always are re-scaled to 0 to 1. |
periodic |
boolean, is the spline periodic? |
period |
If periodic is TRUE, it specify the time period. |
A list with the interpolates values as 'x' and 'time'.