modMCMC {FME}                                              R Documentation
Description:

Performs a Markov Chain Monte Carlo simulation, using an adaptive Metropolis (AM) algorithm and including a delayed rejection (DR) procedure.
Usage:

modMCMC(f, p, ..., jump = NULL, lower = -Inf, upper = +Inf, prior = NULL,
        var0 = NULL, wvar0 = NULL, n0 = NULL, niter = 1000,
        outputlength = niter, burninlength = 0, updatecov = niter,
        covscale = 2.4^2/length(p), ntrydr = 1, drscale = NULL,
        verbose = TRUE)

## S3 method for class 'modMCMC':
summary(object, remove = NULL, ...)

## S3 method for class 'modMCMC':
pairs(x, Full = FALSE, which = 1:ncol(x$pars), remove = NULL,
      nsample = NULL, ...)

## S3 method for class 'modMCMC':
hist(x, Full = FALSE, which = 1:ncol(x$pars), remove = NULL, ask = NULL, ...)

## S3 method for class 'modMCMC':
plot(x, Full = FALSE, which = 1:ncol(x$pars), trace = TRUE, remove = NULL,
     ask = NULL, ...)
Arguments:

f: the function to be evaluated, with first argument the vector of
   parameters which should be varied. It should return either the model
   residuals, an element of class modCost (as returned by a call to
   modCost), or -2*log(likelihood). The latter is equivalent to the
   sum-of-squares function when using a Gaussian likelihood and prior.
p: initial values for the parameters to be optimized over.

...: additional arguments passed to function f or to the methods.
jump: jump length, either a number, a vector with length equal to the
   total number of parameters, a covariance matrix, or a function that
   takes as input the current values of the parameters and produces as
   output the perturbed parameters. See details.
prior: prior probability of the parameters, either a function that is
   called as prior(p) or NULL; in the latter case a non-informative
   prior is used (i.e. all parameters are equally likely).
var0: initial model variance; if NULL, it is assumed that the model
   variance is 1 and that the return element from f is
   -2*log(probability); else it is assumed that the return element from
   f contains the model residuals or an instance of class modCost -
   see details.
wvar0: "weight" for the initial model variance - see details.

n0: parameter used for weighing the initial model variance; if NULL,
   it is estimated as n0 = wvar0*n, where n is the number of
   observations. See details.
lower: lower bounds on the parameters; for unbounded parameters set
   equal to -Inf.

upper: upper bounds on the parameters; for unbounded parameters set
   equal to Inf.
niter: number of iterations for the MCMC.

outputlength: number of iterations kept in the output; should be less
   than or equal to niter.
updatecov: number of iterations after which the parameter covariance
   matrix is (re)evaluated, based on the parameters kept thus far, and
   used to update the MCMC jumps.

covscale: scale factor for the parameter covariance matrix, used to
   perform the MCMC jumps.

burninlength: number of initial iterations to be removed from output.
ntrydr: maximal number of tries for the delayed rejection procedure.
   It is generally not a good idea to set this to too large a value.

drscale: for each try during delayed rejection, the Cholesky
   decomposition of the proposal matrix is scaled with this amount;
   if NULL, it is assumed to be c(0.2, 0.25, 0.333, 0.333, ...).

verbose: if TRUE: prints extra output.
object: an object of class modMCMC.

x: an object of class modMCMC.

Full: if TRUE, then not only the parameters will be plotted, but also
   the function value and (if appropriate) the model variance(s).

which: the name or the index of the parameters that should be plotted.
   Default = all parameters. If Full = TRUE, setting which = NULL will
   plot only the function value and the model variance.

trace: if TRUE, adds a smoothed line to the plot.
remove: a list with indices of the runs that should be removed (e.g.
   to remove runs during burnin).

nsample: the number of xy pairs to be plotted on the upper panel in
   the pairs plot. When NULL, all xy pairs are plotted. Set to a lower
   number in case the graph becomes too dense (and the exported picture
   too large). This does not affect the histograms on the diagonal
   (which are estimated using all MCMC draws).

ask: logical; if TRUE, the user is asked before each plot; if NULL,
   the user is only asked if more than one page of plots is necessary
   and the current graphics device is interactive; see par(ask = .) and
   dev.interactive.
Note that arguments after ... must be matched exactly.

Details:

The R-function f is called as f(p, ...). It should return either -2
times the log likelihood of the model (one value), the residuals
between model and data, or an element of class modCost (as created by
function modCost).
In the latter two cases, it is assumed that the prior distribution for
the parameters is either non-informative or Gaussian; if Gaussian, it
can be treated as a sum of squares (SS). If the measurement function is
defined as:

   y = f(p) + xi,   xi ~ N(0, σ^2)

where xi is the measurement error, assumed to be normally distributed,
then the posterior for the parameters is estimated as:

   prob(p | y, σ^2) ~ exp(-0.5 * (SS(p)/σ^2 + SSpri(p)))

where σ^2 is the error variance, and SS is the sum-of-squares function
SS(p) = sum((y_i - f(p))^2). If non-informative priors are used, then
SSpri(p) = 0.

The error variance σ^2 is considered a nuisance parameter. A prior
distribution for it should be specified, and its posterior distribution
is estimated.
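To make this concrete, here is a minimal sketch (not taken from the package) in which yobs, mod and sigma are hypothetical observations, a toy model and a known error standard deviation; under these assumptions, a function f returning the residuals and one returning -2*log(likelihood) (up to an additive constant) target the same posterior:

yobs  <- c(1.1, 1.9, 3.2)                 # hypothetical observations
mod   <- function(p) p * (1:3)            # hypothetical one-parameter model
sigma <- 0.1                              # assumed known error sd

fResid  <- function(p) yobs - mod(p)                      # option 1: f returns residuals
fLogLik <- function(p) sum((yobs - mod(p))^2) / sigma^2   # option 2: -2*log(likelihood) + constant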
If wvar0 or n0 is > 0, then the variances are sampled as conjugate
priors from the inverse gamma distribution with parameters var0 and
n0 = wvar0*n. Larger values of wvar0 keep these samples closer to var0.
Thus, at each step, the reciprocal of the error variance (σ^-2) is
sampled from a gamma distribution:

   prob(σ^-2 | y, p) ~ Gam(0.5*(n0 + n), 0.5*(n0*var0 + SS(p)))

where n is the number of data points, n0 = n*wvar0, the first argument
of the gamma distribution is the shape parameter and the second the
rate parameter.
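Written out in R, this conjugate update looks as follows; this is an illustration of the sampling step described above, not the package source, and all numeric values are hypothetical:

n     <- 25                                   # hypothetical number of data points
wvar0 <- 1                                    # weight of the prior variance
n0    <- wvar0 * n
var0  <- 0.01                                 # prior mean for the error variance
SS    <- 0.30                                 # sum of squares at the current parameters

prec   <- rgamma(1, shape = 0.5*(n0 + n), rate = 0.5*(n0*var0 + SS))
sigma2 <- 1/prec                              # sampled error variance σ^2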
The prior parameters (var0 and wvar0) are the prior mean for σ^2 and
the prior accuracy. Setting wvar0 equal to 1 gives equal weight to the
prior and the current value. If wvar0 is 0, the error variances are
assumed to be fixed.
var0 estimates the variance of the measured components. If independent
estimates are not available, these variances can be obtained from the
mean squares of the fitted residuals (e.g. as reported in modFit); see
the examples. (Note that this is not truly independent information.)
var0 is either one value, a value for each observed variable, or a
value for each observed data point.

When var0 is not NULL, f is assumed to return the model residuals OR
an instance of class modCost.

When var0 = NULL, f should return either -2*log(probability of the
model) or an instance of class modCost.
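As a sketch of the two calling conventions (reusing the hypothetical fLogLik, fResid and sigma defined in the sketch above):

M1 <- modMCMC(f = fLogLik, p = 1, niter = 1000)   # var0 = NULL: f returns -2*log(probability)
M2 <- modMCMC(f = fResid,  p = 1, niter = 1000,
              var0 = sigma^2, wvar0 = 0)          # f returns residuals; error variance kept fixed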
modMCMC implements the Metropolis-Hastings method. The proposal
distribution, which is used to generate new parameter values, is the
(multidimensional) Gaussian density distribution, with standard
deviation given by jump.

jump can be either one value, a vector of length = number of
parameters, or a parameter covariance matrix (nrow = ncol = number of
parameters). The jump parameter thus determines how much the new
parameter set will deviate from the old one.
If jump is one value, or a vector, then the new parameter values are
generated by sampling a normal distribution with standard deviation
equal to jump. A larger value will lead to larger jumps in the
parameter space, but acceptance of new points can get very low. Smaller
jump lengths increase the acceptance rate, but the algorithm may move
too slowly, and too many runs may be needed to scan the parameter
space.

If jump is NULL, then the jump length is taken as 10% of the parameter
values as given in p.
jump can also be a proposal covariance matrix. In this case, the new
parameter values are generated by sampling a multidimensional normal
distribution. It can be efficient to initialise jump using the
parameter covariance resulting from fitting the model (e.g. using
modFit) - see the examples.

Finally, jump can also be an R-function that takes as input the
current values of the parameters and returns the new parameter values.
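For instance, a hypothetical jump function could add a symmetric Gaussian perturbation to the current parameter values; this is a sketch, not part of the package, and the step size 0.1 is an arbitrary choice:

jumpFun <- function(p) p + rnorm(length(p), mean = 0, sd = 0.1)   # symmetric random-walk proposal

## e.g.:  MCMC <- modMCMC(f = fLogLik, p = 1, jump = jumpFun, niter = 1000)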
Two methods are implemented to increase the number of accepted runs.

The first is the adaptive Metropolis method; setting updatecov smaller
than niter triggers this functionality. In this case, every updatecov
iterations, the jump covariance matrix is re-estimated from the
covariance matrix of the saved parameter values. The covariance matrix
is scaled with 2.4^2/npar, where npar is the number of parameters,
unless covscale has been given a different value. Thus,

   Jump = (cov(p1, p2, ..., pn) + diag(npar)*1e-16) * 2.4^2/npar

where the small number 1e-16 is added to the diagonal of the covariance
matrix to prevent it from becoming singular.
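In R notation, the update sketched above amounts to the following (illustration only; pars stands for the matrix of saved parameter values, one row per kept iteration, and is filled with random numbers here):

pars <- matrix(rnorm(200), ncol = 2)                      # hypothetical saved parameters
npar <- ncol(pars)
Jump <- (cov(pars) + diag(npar)*1e-16) * 2.4^2/npar       # updated proposal covariance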
Note that a problem of adapting the proposal distribution using the
MCMC results so far is that standard convergence results do not apply.
One solution is to use adaptation only for the burn-in period and to
discard the part of the chain where adaptation has been used. Thus,
when using updatecov with a positive value of burninlength, the
proposal distribution is only updated during burnin. If burninlength
= 0 though, the updates occur throughout the entire simulation.
When using the adaptive Metropolis method, it is best to start with a
small value of the jump length.

The second method is delayed rejection: upon rejection of a candidate
parameter set, a new candidate is proposed, for at most ntrydr trials.
Setting ntrydr equal to 1 (the default) switches off delayed rejection.
During the delayed rejection procedure, new parameters are generated
from the last accepted value by scaling the jump covariance matrix with
a factor as specified in drscale. The acceptance probability of this
new set depends on the candidates so far proposed and rejected, in such
a way that reversibility of the Markov chain is preserved. See Haario
et al. (2005, 2006) for more details.
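A hypothetical call that switches on delayed rejection, reusing fLogLik from the sketch above; with drscale = NULL, the default scaling factors c(0.2, 0.25, 0.333, ...) are used:

MCMCdr <- modMCMC(f = fLogLik, p = 1, niter = 5000, ntrydr = 3)
MCMCdr$count          # dr_steps reports how often delayed rejection was invoked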
Convergence of the MCMC chain can be checked via plot, which plots for
each iteration the values of all parameters and, if Full is TRUE, the
function value (SS) and (if appropriate) the modelled variance. If
converged, there should be no visible drift.
In addition, the methods from package coda become available by making
the object returned by modMCMC of class mcmc, as used in the methods
of coda. For instance, if the object MCMCres is returned by modMCMC,
then as.mcmc(MCMCres$pars) will make an instance of class mcmc, usable
by coda.
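As a sketch, assuming MCMCres is produced as in the hypothetical examples above, some coda diagnostics could be applied as follows:

library(coda)
MCMCres <- modMCMC(f = fLogLik, p = 1, niter = 2000)   # reusing the hypothetical fLogLik
mc <- as.mcmc(MCMCres$pars)
densplot(mc)                                           # posterior density per parameter
effectiveSize(mc)                                      # effective sample size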
The burninlength is the number of initial steps that is not included
in the output. It can be useful if the initial value of the parameters
is far from the optimal value. Starting the MCMC with the best-fit
parameter set will alleviate the need for using burninlength.
Value:

A list of class modMCMC containing the results as returned from the
Markov chain. This includes the following:

pars: an array with dimension (outputlength, length(p)), containing
   the parameters of the MCMC at each iteration that is kept.

SS: vector with the sum-of-squares function, one value for each row in
   pars.

naccepted: the number of accepted runs.

sig: the sampled error variance σ^2, a matrix with one row for each
   row in pars.

bestpar: the parameter set that gave the highest probability.

bestfunp: the function value corresponding to bestpar.

prior: the parameter prior, one value for each row in pars.

count: information about the MCMC chain: the number of delayed
   rejection steps (dr_steps), the number of alpha steps (Alfasteps),
   the number of accepted runs (num_accepted) and the number of times
   the proposal covariance matrix has been updated (num_covupdate).

settings: the settings for the error covariance calculation, i.e. the
   arguments var0, n0 and N (the number of data points).
The list returned by modMCMC has methods for the generic functions
summary, plot and pairs - see note.
Note:

The pairs method presets the arguments gap = 0, upper.panel = NA, and
diag.panel of the underlying pairs function.
It is also possible to use the methods from the coda package, e.g.
densplot. To do that, the modMCMC object first has to be converted to
an mcmc object; see the examples for an application.
Author(s):

Karline Soetaert <k.soetaert@nioo.knaw.nl>

Marko Laine <Marko.Laine@fmi.fi>
References:

Laine, M., 2008. Adaptive MCMC Methods with Applications in Environmental and Geophysical Models. Finnish Meteorological Institute Contributions No. 69, ISBN 978-951-697-662-7, Finnish Meteorological Institute, Helsinki.

Haario, H., Saksman, E. and Tamminen, J., 2001. An adaptive Metropolis algorithm. Bernoulli 7, 223-242.

Haario, H., Laine, M., Mira, A. and Saksman, E., 2006. DRAM: Efficient adaptive MCMC. Statistics and Computing 16(4), 339-354.

Haario, H., Saksman, E. and Tamminen, J., 2005. Componentwise adaptation for high dimensional MCMC. Computational Statistics 20(2), 265-274.

Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B., 2004. Bayesian Data Analysis, second edition. Chapman and Hall, Boca Raton.
See Also:

modFit for constrained model fitting.
Examples:

## =======================================================================
## Sampling a 3-dimensional normal distribution,
## =======================================================================
# mean = 1:3, sd = 0.1
# f returns -2*log(probability) of the parameter values
NN <- function(p) {
  mu <- c(1, 2, 3)
  -2*sum(log(dnorm(p, mean = mu, sd = 0.1)))    # -2*log(probability)
}

# simple Metropolis-Hastings
MCMC <- modMCMC(f = NN, p = 0:2, niter = 10000, outputlength = 1000,
                jump = 0.5)

# More accepted values by updating the jump covariance matrix...
MCMC <- modMCMC(f = NN, p = 0:2, niter = 10000, updatecov = 100,
                outputlength = 1000, jump = 0.5)
summary(MCMC)

plot(MCMC)      # check convergence
pairs(MCMC)

## =======================================================================
## test 2: sampling a 3-D normal distribution, larger standard deviation...
## noninformative priors, lower and upper bounds imposed on parameters
## =======================================================================
NN <- function(p) {
  mu <- c(1, 2, 2.5)
  -2*sum(log(dnorm(p, mean = mu, sd = 0.5)))    # -2*log(probability)
}

MCMC2 <- modMCMC(f = NN, p = 1:3, niter = 3000, burninlength = 500,
                 updatecov = 10, jump = 0.5,
                 lower = c(0, 2, 1), upper = c(1, 3, 3))
plot(MCMC2)
hist(MCMC2, breaks = 20)

# Compare output of p3 with theoretical distribution
hist(MCMC2, which = "p3", breaks = 20)
lines(seq(1, 3, 0.1),
      dnorm(seq(1, 3, 0.1), mean = 2.5, sd = 0.5)/pnorm(3, 2.5, 0.5))
summary(MCMC2)

# functions from package coda...
library(coda)
cumuplot(as.mcmc(MCMC2$pars))
summary(as.mcmc(MCMC2$pars))
raftery.diag(MCMC2$pars)

## =======================================================================
## test 3: sampling a log-normal distribution, log mean = 1:4, log sd = 1
## =======================================================================
NL <- function(p) {
  mu <- 1:4
  -2*sum(log(dlnorm(p, mean = mu, sd = 1)))     # -2*log(probability)
}

MCMCl <- modMCMC(f = NL, p = log(1:4), niter = 10000, outputlength = 1000,
                 jump = 5)
plot(MCMCl)     # bad convergence
cumuplot(as.mcmc(MCMCl$pars))

MCMCl <- modMCMC(f = NL, p = log(1:4), niter = 10000, outputlength = 1000,
                 jump = 2^(2:5))
plot(MCMCl)     # better convergence but CHECK it!
pairs(MCMCl)
colMeans(log(MCMCl$pars))
apply(log(MCMCl$pars), 2, sd)   # columnwise sd (sd() no longer accepts matrices)

MCMCl <- modMCMC(f = NL, p = rep(1, 4), niter = 5000, outputlength = 1000,
                 jump = 5, updatecov = 100)
plot(MCMCl)
colMeans(log(MCMCl$pars))
apply(log(MCMCl$pars), 2, sd)

## =======================================================================
## Fitting a Monod (Michaelis-Menten) function to data
## =======================================================================
# the observations
#---------------------
Obs <- data.frame(x = c(   28,   55,    83,   110,   138,   225,   375),  # mg COD/l
                  y = c(0.053, 0.06, 0.112, 0.105, 0.099, 0.122, 0.125))  # 1/hour

plot(Obs, pch = 16, cex = 2, xlim = c(0, 400), ylim = c(0, 0.15),
     xlab = "mg COD/l", ylab = "1/hr", main = "Monod")

# the Monod model
#---------------------
Model <- function(p, x) data.frame(x = x, N = p[1]*x/(x + p[2]))

# Fitting the model to the data
#---------------------
# define the residual function
Residuals <- function(p) (Obs$y - Model(p, Obs$x)$N)   # model residuals

# use modFit to find parameters
P <- modFit(f = Residuals, p = c(0.1, 1))

# plot best-fit model
x <- 0:375
lines(Model(P$par, x))

# summary of fit
sP <- summary(P)
sP[]
print(sP)

# Running an MCMC
#---------------------
# estimate parameter covariances
# (to efficiently generate new parameter values)
Covar <- sP$cov.scaled * 2.4^2/2

# the model variance
s2prior <- sP$modVariance

# set nprior = 0 to avoid updating model variance
MCMC <- modMCMC(f = Residuals, p = P$par, jump = Covar, niter = 10000,
                var0 = s2prior, wvar0 = NULL, updatecov = 100)

plot(MCMC, Full = TRUE)
pairs(MCMC)

# function from the coda package
raftery.diag(as.mcmc(MCMC$pars))
cor(MCMC$pars)

cov(MCMC$pars)   # covariances by MCMC
sP$cov.scaled    # covariances by Hessian of fit

x  <- 1:400
SR <- summary(sensRange(parInput = MCMC$pars, func = Model, x = x))
plot(SR, xlab = "mg COD/l", ylab = "1/hr", main = "Monod")
points(Obs, pch = 16, cex = 1.5)