lagsarlm {spdep}    R Documentation

Spatial simultaneous autoregressive lag model estimation

Description

Maximum likelihood estimation of spatial simultaneous autoregressive lag and mixed models of the form:

y = rho W y + X beta + e

where rho is found first by optimize(), and beta and the other parameters are then found by generalized least squares (a one-dimensional search using optim() performs badly on some platforms). In the mixed model, the spatially lagged independent variables are added to X. Because the data generation process for this model induces feedback loops, the fitted coefficients should be interpreted using impact measures.
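
The two-step search described above can be sketched in base R, without spdep. This is an illustration under stated assumptions, not the code used by lagsarlm(): the grid weights, the simulated data and the names conc_ll, rho_hat and beta_hat are invented for the example, and the concentrated log likelihood is written in its usual textbook form, with least squares applied to the spatially filtered response y - rho W y.

```r
## Base-R sketch only; every object name here is illustrative.
set.seed(1)

## Row-standardised rook-contiguity weights on a 7 x 7 grid
coords <- expand.grid(x = 1:7, y = 1:7)
B <- (as.matrix(dist(coords, method = "manhattan")) == 1) * 1
d <- rowSums(B)
W <- B / d                      # row i is divided by its neighbour count d[i]
n <- nrow(W)

## W is similar to the symmetric matrix D^(-1/2) B D^(-1/2),
## so its eigenvalues can be taken from that symmetric matrix
ev <- eigen(B / sqrt(outer(d, d)), symmetric = TRUE, only.values = TRUE)$values

## Simulate y = rho W y + X beta + e
X <- cbind(1, rnorm(n))
rho_true <- 0.5
y <- as.vector(solve(diag(n) - rho_true * W, X %*% c(1, 2) + rnorm(n)))
Wy <- as.vector(W %*% y)
qrX <- qr(X)

## Log likelihood concentrated with respect to beta and s2
conc_ll <- function(rho) {
  res <- qr.resid(qrX, y - rho * Wy)   # residuals of the filtered response
  s2 <- sum(res^2) / n                 # ML residual variance
  sum(log(1 - rho * ev)) - (n / 2) * (log(2 * pi) + log(s2) + 1)
}

## Step 1: one-dimensional search for rho
opt <- optimize(conc_ll, interval = c(-1, 0.999), maximum = TRUE,
                tol = .Machine$double.eps^0.5)
rho_hat <- opt$maximum

## Step 2: coefficients given rho_hat
beta_hat <- qr.coef(qrX, y - rho_hat * Wy)
```

At rho = 0 the Jacobian term vanishes and conc_ll(0) reduces to the log likelihood of the ordinary linear model, which gives a quick check on the function.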

Usage

lagsarlm(formula, data = list(), listw, 
        na.action, type="lag", method="eigen", quiet=TRUE, 
        zero.policy=FALSE, interval=c(-1,0.999), tol.solve=1.0e-10, 
        tol.opt=.Machine$double.eps^0.5, withLL=FALSE, 
        fdHess=NULL, optimHess=FALSE, trs=NULL, 
        searchInterval=FALSE)

Arguments

formula a symbolic description of the model to be fit. The details of model specification are given in the documentation for lm()
data an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called.
listw a listw object created for example by nb2listw
na.action a function (default options("na.action")), can also be na.omit or na.exclude with consequences for residuals and fitted values - in these cases the weights list will be subsetted to remove NAs in the data. It may be necessary to set zero.policy to TRUE because this subsetting may create no-neighbour observations. Note that only weights lists created without using the glist argument to nb2listw may be subsetted.
type default "lag", may be set to "mixed"; when "mixed", the lagged intercept is dropped for spatial weights style "W", that is row-standardised weights, but otherwise included
method "eigen" (default) - the Jacobian is computed as the product of (1 - rho*eigenvalue) using eigenw, and "spam" or "Matrix" for strictly symmetric weights lists of styles "B" and "C", or made symmetric by similarity (Ord, 1975, Appendix C) if possible for styles "W" and "S", using code from the spam or Matrix packages to calculate the determinant.
quiet default=TRUE; if FALSE, reports function values during optimization.
zero.policy if TRUE assign zero to the lagged value of zones without neighbours, if FALSE (default) assign NA - causing lagsarlm() to terminate with an error
interval search interval for autoregressive parameter when not using method="eigen"; default is c(-1,0.999); method="Matrix" will attempt to search for an appropriate interval
tol.solve the tolerance for detecting linear dependencies in the columns of matrices to be inverted - passed to solve() (default=1.0e-10). This may be used if necessary to extract coefficient standard errors (for instance lowering to 1e-12), but errors in solve() may constitute indications of poorly scaled variables: if the variables have scales differing much from the autoregressive coefficient, the values in this matrix may be very different in scale, and inverting such a matrix is analytically possible by definition, but numerically unstable; rescaling the RHS variables alleviates this better than setting tol.solve to a very small value
tol.opt the desired accuracy of the optimization - passed to optimize() (default=square root of double precision machine tolerance)
withLL default FALSE; if TRUE, calculate likelihood ratio statistics for right hand side variables when using sparse matrix methods, in addition to approximating the coefficient covariance matrix with a numerical Hessian
fdHess default NULL, then set to (method != "eigen") internally; use fdHess to compute an approximate Hessian using finite differences when using sparse matrix methods; may be used to make a coefficient covariance matrix when the number of observations is large; may be turned off to save resources if need be, but required for impact measures
optimHess default FALSE, in which case fdHess from nlme is used; if TRUE, optim is used to calculate the Hessian at the optimum
trs default NULL, if given, a vector of powered spatial weights matrix traces output by trW; when given, insert the asymptotic analytical values into the numerical Hessian instead of the approximated values; may be used to get around some problems raised when the numerical Hessian is poorly conditioned, generating NaNs in subsequent operations; the use of trs is recommended
searchInterval Default FALSE; when the Matrix method is used, a search may be made to approximate the ends of the line search interval.
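
The eigenvalue form of the Jacobian mentioned under method, and the similarity transformation for row-standardised weights, can be checked numerically on a toy example. The sketch below uses only base R; the six-node path graph and all object names are assumptions made for illustration, and it does not reproduce what eigenw or the sparse matrix methods do internally.

```r
## Symmetric binary weights for a path graph on 6 nodes (illustrative)
n <- 6
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
rho <- 0.3

## Jacobian term as the sum of log(1 - rho * eigenvalue) ...
ev_B <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
logdet_eigen_B <- sum(log(1 - rho * ev_B))
## ... against the log determinant of (I - rho B) computed directly
logdet_direct_B <- as.numeric(determinant(diag(n) - rho * B,
                                          logarithm = TRUE)$modulus)

## Range of rho for which every factor (1 - rho * eigenvalue) stays positive
rho_range_B <- c(1 / min(ev_B), 1 / max(ev_B))

## Row-standardised weights are not symmetric, but W = D^(-1) B is similar
## to D^(-1/2) B D^(-1/2), which is symmetric and has the same eigenvalues
d <- rowSums(B)
W <- B / d
ev_W <- eigen(B / sqrt(outer(d, d)), symmetric = TRUE, only.values = TRUE)$values
logdet_eigen_W <- sum(log(1 - rho * ev_W))
logdet_direct_W <- as.numeric(determinant(diag(n) - rho * W,
                                          logarithm = TRUE)$modulus)
```

For row-standardised weights the largest eigenvalue is 1, so the upper end of the feasible range for rho is 1; this is consistent with a default search interval that stops just short of 1.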

Details

The asymptotic standard error of rho is only computed when method="eigen", because the full matrix operations involved would be costly for the large n typically associated with the choice of method="spam" or "Matrix". The same applies to the coefficient covariance matrix. Taken as the asymptotic matrix from the literature, it is typically badly scaled: the elements involving rho are very small, while other parts of the matrix can be very large (often differing by many orders of magnitude). It is often necessary either to set the tol.solve argument to a smaller value than the default, or to centre the RHS variables or reduce their range.
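
The scaling problem can be reproduced with solve() alone. The sketch below is a base-R illustration with made-up data; it inverts a simple cross-product matrix rather than the asymptotic covariance matrix built inside lagsarlm(), so it only shows the general mechanism by which a badly scaled variable trips the tolerance check.

```r
x <- 1:10

## One regressor measured in very large units: X'X is badly scaled
X_bad <- cbind(1, x * 1e7)
bad <- tryCatch(solve(crossprod(X_bad), tol = 1e-10),
                error = function(e) conditionMessage(e))
## 'bad' now holds the error message from solve() instead of an inverse

## The same regressor rescaled: the inverse is computed without complaint
X_ok <- cbind(1, x)
inv_ok <- solve(crossprod(X_ok), tol = 1e-10)
```

Rescaling leaves the fitted model unchanged apart from the units of the coefficient, which is why it is usually preferable to lowering the tolerance.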

Versions of the package from 0.4-38 include numerical Hessian values where asymptotic standard errors are not available. This change has been introduced to permit the simulation of distributions for impact measures. Likelihood ratio test output for right hand side variables may be obtained in addition by setting withLL=TRUE. The warnings made above with regard to variable scaling also apply in this case.
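
How a numerical Hessian yields an approximate coefficient covariance matrix can be sketched with optimHess() from base R. To keep the example short and checkable, the likelihood below is that of an ordinary Gaussian linear model (in effect rho fixed at 0), not the lag model itself; the data and the names negll, vc_num and vc_ana are assumptions made for illustration.

```r
set.seed(123)
n <- 50
X <- cbind(1, rnorm(n))
y <- as.vector(X %*% c(1, 2) + rnorm(n))

## Negative Gaussian log likelihood in the parameters (b0, b1, s2)
negll <- function(par) {
  res <- y - X %*% par[1:2]
  s2 <- par[3]
  0.5 * n * log(2 * pi * s2) + sum(res^2) / (2 * s2)
}

## Closed-form ML estimates, so no optimiser run is needed here
b_hat <- unname(qr.coef(qr(X), y))
s2_hat <- sum((y - X %*% b_hat)^2) / n
par_hat <- c(b_hat, s2_hat)

## Numerical Hessian at the optimum; its inverse approximates the covariance
H <- optimHess(par_hat, negll)
vc_num <- solve(H)
se_num <- sqrt(diag(vc_num))

## Analytical ML covariance of the coefficients, for comparison
vc_ana <- s2_hat * solve(crossprod(X))
```

In this well-scaled case the coefficient block of vc_num agrees closely with vc_ana; with poorly scaled variables the numerical Hessian can be badly conditioned, which is the situation the warnings above refer to.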

Note that the fitted() function for the output object assumes that the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). This differs from other software, including GeoDa, which does not use knowledge of the response variable in making predictions for the fitting data.
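
The decomposition into trend, signal and noise can be written out directly. The sketch below uses base R with arbitrary data and treats rho as already estimated; the value 0.4 and all object names are hypothetical. The reduced-form prediction at the end is included only as one common way of predicting without the observed response, and is not claimed to match what any particular package does.

```r
set.seed(42)
n <- 6

## Row-standardised weights for a path graph (illustrative)
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

X <- cbind(1, rnorm(n))
y <- rnorm(n)
rho <- 0.4                                # hypothetical fitted value

Wy <- as.vector(W %*% y)
beta <- qr.coef(qr(X), y - rho * Wy)      # coefficients given rho

trend  <- as.vector(X %*% beta)           # X beta
signal <- rho * Wy                        # uses the observed response
fitted_vals <- trend + signal
noise  <- y - fitted_vals                 # residuals

## Prediction that does not use the observed y: (I - rho W)^(-1) X beta
reduced_form <- as.vector(solve(diag(n) - rho * W, trend))
```

By construction trend + signal + noise reproduces y exactly, whereas the reduced-form values generally differ from fitted_vals.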

Value

A list object of class sarlm

type "lag" or "mixed"
rho simultaneous autoregressive lag coefficient
coefficients GLS coefficient estimates
rest.se asymptotic standard errors if ase=TRUE, otherwise approximate numerical Hessian-based values
LL log likelihood value at computed optimum
s2 GLS residual variance
SSE sum of squared GLS errors
parameters number of parameters estimated
lm.model the lm object returned when estimating for rho=0
method the method used to calculate the Jacobian
call the call used to create this object
residuals GLS residuals
lm.target the lm object returned for the GLS fit
fitted.values the response variable minus the residuals
se.fit Not used yet
formula model formula
ase TRUE if method="eigen"
LLs if ase=FALSE and withLL=TRUE (for method="spam" or "Matrix"), the log likelihood values of models estimated dropping each of the independent variables in turn, used in the summary function as a substitute for variable coefficient significance tests
rho.se if ase=TRUE, the asymptotic standard error of rho, otherwise approximate numerical Hessian-based value
LMtest if ase=TRUE, the Lagrange Multiplier test for the absence of spatial autocorrelation in the lag model residuals
resvar the asymptotic coefficient covariance matrix for (s2, rho, B)
zero.policy zero.policy for this model
aliased the aliased explanatory variables (if any)
listw_style the style of the spatial weights used
interval the line search interval used to find rho
fdHess the numerical Hessian-based coefficient covariance matrix for (rho, B) if computed
optimHess if TRUE and fdHess returned, optim used to calculate Hessian at optimum
insert if TRUE and fdHess returned, the asymptotic analytical values are inserted into the numerical Hessian instead of the approximated values, and its size increased to include the first row/column for sigma2
LLNullLlm Log-likelihood of the null linear model
na.action (possibly) named vector of excluded or omitted observations if non-default na.action argument used


The internal sar.lag.mixed.* functions return the value of the log likelihood function at rho.

Author(s)

Roger Bivand Roger.Bivand@nhh.no, with thanks to Andrew Bernat for contributions to the asymptotic standard error code.

References

Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion.
Ord, J. K. 1975 Estimation methods for models of spatial interaction, Journal of the American Statistical Association, 70, 120-126.
Anselin, L. 1988 Spatial econometrics: methods and models, Kluwer, Dordrecht.
Anselin, L. 1995 SpaceStat, a software program for the analysis of spatial data, version 1.80, Regional Research Institute, West Virginia University, Morgantown, WV (www.spacestat.com).
Anselin, L., Bera, A. K. 1998 Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah, A., Giles, D. E. A. (eds) Handbook of applied economic statistics, Marcel Dekker, New York, pp. 237-289.
Cressie, N. A. C. 1993 Statistics for spatial data, Wiley, New York.

See Also

lm, errorsarlm, eigenw, predict.sarlm, impacts.sarlm, residuals.sarlm

Examples

data(oldcol)
COL.lag.eig <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), method="eigen", quiet=FALSE)
summary(COL.lag.eig, correlation=TRUE)
COL.lag.eig$fdHess
COL.lag.eig$resvar
W <- as(as_dgRMatrix_listw(nb2listw(COL.nb)), "CsparseMatrix")
trMatc <- trW(W, type="mult")
COL.lag.eig1 <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), fdHess=TRUE, trs=trMatc)
COL.lag.eig1$fdHess
system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="Matrix", quiet=FALSE))
summary(COL.lag.M)
impacts.sarlm(COL.lag.M, listw=nb2listw(COL.nb))
system.time(COL.lag.M <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="Matrix", quiet=FALSE, withLL=TRUE))
summary(COL.lag.M)
system.time(COL.lag.sp <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb), method="spam", quiet=FALSE))
summary(COL.lag.sp)
COL.lag.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"))
summary(COL.lag.B, correlation=TRUE)
COL.mixed.B <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="B"), type="mixed", tol.solve=1e-9)
summary(COL.mixed.B, correlation=TRUE)
COL.mixed.W <- lagsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
 nb2listw(COL.nb, style="W"), type="mixed")
summary(COL.mixed.W, correlation=TRUE)
NA.COL.OLD <- COL.OLD
NA.COL.OLD$CRIME[20:25] <- NA
COL.lag.NA <- lagsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
 nb2listw(COL.nb), na.action=na.exclude, tol.opt=.Machine$double.eps^0.4)
COL.lag.NA$na.action
COL.lag.NA
resid(COL.lag.NA)
data(boston)
gp2mM <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) +
 I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT),
 data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix")
summary(gp2mM)
W <- as(as_dgRMatrix_listw(nb2listw(boston.soi)), "CsparseMatrix")
trMatb <- trW(W, type="mult")
gp2mMi <- lagsarlm(log(CMEDV) ~ CRIM + ZN + INDUS + CHAS + I(NOX^2) +
 I(RM^2) + AGE + log(DIS) + log(RAD) + TAX + PTRATIO + B + log(LSTAT),
 data=boston.c, nb2listw(boston.soi), type="mixed", method="Matrix",
 trs=trMatb)
summary(gp2mMi)

[Package spdep version 0.4-50 Index]