Mort1Dsmooth {MortalitySmooth} | R Documentation |
Returns an object of class Mort1Dsmooth: a P-spline smooth of the
input data, with the degree of the B-splines and the order of the
difference penalty fixed by the user. Specifically tailored to
mortality data.
Mort1Dsmooth(x, y, offset, w, overdispersion=FALSE, ndx = floor(length(x)/5), deg = 3, pord = 2, lambda = NULL, df = NULL, method = 1, control = list())
x |
Values of the predictor variable. There must be at least
2 ndx + 1 of them. |
y |
Vector of response counts. y must be a
vector of the same length as x. |
offset |
This can be used to specify an a priori known component
to be included in the linear predictor during fitting. This should be
NULL or a numeric vector of length either one or equal to the
number of cases. |
w |
An optional vector of weights to be used in the fitting
process. This should be NULL or a numeric vector of length
equal to the number of cases. |
overdispersion |
Logical: whether to account for overdispersion in
the smoothing parameter selection criterion. See
Details. Default: FALSE. |
ndx |
Number of internal knots -1. Default:
floor(length(x)/5). |
deg |
Degree of the B-splines. Default: 3. |
pord |
Order of differences. Default: 2. |
lambda |
Smoothing parameter (optional). |
df |
A number which specifies the degrees of freedom (optional). |
method |
The method for controlling the amount of
smoothing. method = 1 (default) adjusts the smoothing parameter
so that the BIC is minimized. method = 2 adjusts lambda
so that the AIC is minimized. method = 3 uses the value
supplied for lambda. method = 4 adjusts lambda so
that the degrees of freedom are equal to the supplied df. |
control |
A list of control parameters. See Details. |
The method fits a P-spline model with equally spaced B-splines along
x. The response variable must consist of Poisson-distributed counts,
though overdispersion can be accounted for. An offset can be provided.
Weights can also be provided; otherwise the default is that all
weights are one.
The method produces results similar to those of the function
smooth.spline, but the smoothing function is a B-spline smooth with a
discrete penalty applied directly to the differences of the B-spline
coefficients. The user can set the order of the differences, the
degree of the B-splines and their number. Nevertheless, the smoothing
parameter lambda is the main tool for tuning the trade-off between
smoothness and fidelity to the data.
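The two ingredients described above, an equally spaced B-spline basis and a difference penalty on its coefficients, can be sketched in base R. This is a minimal illustration and not the package's internal code: the helper name bspline_basis and the 1% padding of the range of x are assumptions of the sketch.

```r
library(splines)  # shipped with base R

# Equally spaced B-spline basis with ndx intervals and degree deg.
# Hypothetical helper; the 1% padding keeps every x strictly inside
# the range covered by the knots.
bspline_basis <- function(x, ndx, deg) {
  pad <- 0.01 * diff(range(x))
  xl <- min(x) - pad
  dx <- (max(x) + pad - xl) / ndx
  knots <- xl + dx * (-deg:(ndx + deg))
  splineDesign(knots, x, ord = deg + 1)
}

x    <- 1950:2006
ndx  <- floor(length(x) / 5)  # documented default for ndx
deg  <- 3
pord <- 2

B <- bspline_basis(x, ndx, deg)               # one column per B-spline
D <- diff(diag(ncol(B)), differences = pord)  # difference operator
P <- crossprod(D)                             # penalty matrix t(D) %*% D

dim(B)  # length(x) rows, ndx + deg columns
dim(D)  # ndx + deg - pord rows
```

In a sketch of this kind the penalty term is lambda times the quadratic form of P in the coefficients, so coefficients that lie on a polynomial of degree pord - 1 are not penalized at all.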
print.Mort1Dsmooth, summary.Mort1Dsmooth, plot.Mort1Dsmooth,
predict.Mort1Dsmooth and residuals.Mort1Dsmooth methods are available
for this function.
Four methods for choosing the amount of smoothing are available; by
default the smoothing parameter that minimizes the BIC is selected.
Minimization of the AIC is also possible. BIC will give smoother
outcomes than AIC, especially for large sample sizes. Alternatively
the user can directly provide the smoothing parameter (method=3) or
the degrees of freedom to be used in the model (method=4). Note that
Mort1Dsmooth uses approximate degrees of freedom, so method=4 will
produce fitted values whose degrees of freedom are only close to the
value provided in df. The tolerance level can be set via the TOL2
component of control.
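Selection of lambda by an information criterion can be sketched as a plain grid search on the log scale. The code below is an illustration under stated assumptions, not the package's algorithm: the helpers bspline_basis and fit_pspline are hypothetical, the data are simulated, AIC and BIC are taken as deviance plus 2 or log(n) times the effective dimension, and the grid uses base 10 with a step of 0.1.

```r
library(splines)  # shipped with base R

# Hypothetical helper: equally spaced B-spline basis (1% range padding assumed).
bspline_basis <- function(x, ndx, deg) {
  pad <- 0.01 * diff(range(x))
  xl <- min(x) - pad
  dx <- (max(x) + pad - xl) / ndx
  knots <- xl + dx * (-deg:(ndx + deg))
  splineDesign(knots, x, ord = deg + 1)
}

# Hypothetical helper: penalized IRLS for Poisson counts at a fixed lambda.
fit_pspline <- function(y, B, P, lambda, offset = 0, max_it = 50, tol = 1e-6) {
  eta <- log(y + 1)
  for (it in seq_len(max_it)) {
    mu <- exp(eta)
    z <- eta - offset + (y - mu) / mu   # working response
    a <- solve(crossprod(B, mu * B) + lambda * P, crossprod(B, mu * z))
    eta_new <- drop(B %*% a) + offset
    converged <- max(abs(eta_new - eta)) < tol
    eta <- eta_new
    if (converged) break
  }
  mu <- exp(eta)
  BtWB <- crossprod(B, mu * B)
  ed <- sum(diag(solve(BtWB + lambda * P, BtWB)))  # trace of the hat matrix
  dev <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
  list(eta = eta, mu = mu, ed = ed, dev = dev)
}

# Simulated death counts with hypothetical exposures and rates.
set.seed(1)
x <- 1950:2006
exposure <- rep(20000, length(x))
rate <- exp(-2.5 - 0.02 * (x - 1950) + 0.1 * sin((x - 1950) / 6))
y <- rpois(length(x), exposure * rate)

B <- bspline_basis(x, ndx = floor(length(x) / 5), deg = 3)
P <- crossprod(diff(diag(ncol(B)), differences = 2))

log_lambda <- seq(-2, 6, by = 0.1)
fits <- lapply(10^log_lambda, function(l) {
  fit_pspline(y, B, P, l, offset = log(exposure))
})
dev <- sapply(fits, function(f) f$dev)
ed  <- sapply(fits, function(f) f$ed)
aic <- dev + 2 * ed
bic <- dev + log(length(y)) * ed

i_aic <- which.min(aic)
i_bic <- which.min(bic)
c(lambda_aic = 10^log_lambda[i_aic], lambda_bic = 10^log_lambda[i_bic])
c(ed_aic = ed[i_aic], ed_bic = ed[i_bic])
```

Because log(n) exceeds 2 for more than seven observations, the BIC minimizer on a common grid never has a larger effective dimension than the AIC minimizer.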
Note that the 'ultimate' smoothing with a very large lambda will make
the fitted linear predictor approach a polynomial of degree pord - 1
(a straight line for the default pord = 2).
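The limiting behaviour for very large lambda can be checked numerically. The sketch below uses hypothetical helpers (bspline_basis, fit_eta), deterministic made-up counts and a second-order difference penalty; it measures how far the fitted linear predictor is from the best-fitting straight line.

```r
library(splines)  # shipped with base R

# Hypothetical helper: equally spaced B-spline basis (1% range padding assumed).
bspline_basis <- function(x, ndx, deg) {
  pad <- 0.01 * diff(range(x))
  xl <- min(x) - pad
  dx <- (max(x) + pad - xl) / ndx
  knots <- xl + dx * (-deg:(ndx + deg))
  splineDesign(knots, x, ord = deg + 1)
}

# Hypothetical helper: penalized IRLS for Poisson counts, returns the
# fitted linear predictor at a fixed lambda.
fit_eta <- function(y, B, P, lambda, max_it = 50, tol = 1e-6) {
  eta <- log(y + 1)
  for (it in seq_len(max_it)) {
    mu <- exp(eta)
    z <- eta + (y - mu) / mu   # working response
    a <- solve(crossprod(B, mu * B) + lambda * P, crossprod(B, mu * z))
    eta_new <- drop(B %*% a)
    converged <- max(abs(eta_new - eta)) < tol
    eta <- eta_new
    if (converged) break
  }
  eta
}

x <- 1:60
y <- round(exp(3 + 0.6 * sin(x / 8)))  # clearly non-linear on the log scale
B <- bspline_basis(x, ndx = 12, deg = 3)
P <- crossprod(diff(diag(ncol(B)), differences = 2))  # pord = 2

# Largest distance between the fitted linear predictor and a straight line.
line_gap <- function(lambda) {
  eta <- fit_eta(y, B, P, lambda)
  max(abs(resid(lm(eta ~ x))))
}

gap_small <- line_gap(1)    # follows the wiggles in the data
gap_large <- line_gap(1e8)  # numerically a straight line
c(gap_small = gap_small, gap_large = gap_large)
```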
The argument overdispersion can be set to TRUE when smoothing
parameter selection has to account for the possible presence of
over(under)dispersion. Mortality data often present overdispersion,
also known in demography as heterogeneity. Duplicates in insurance
data can lead to overdispersed data, too. Smoothing parameter
selection may be affected by this phenomenon. When
overdispersion=TRUE, the function uses a penalized quasi-likelihood
method to include an overdispersion parameter (psi2) in the fitting
procedure. With this approach the variance is assumed equal to the
expected value multiplied by the parameter psi2. See reference. Note
that including the overdispersion parameter within the estimation
might lead to the selection of a higher lambda, and thus to smoother
outcomes. When overdispersion=FALSE (default value), or when method=3
or method=4, psi2 is estimated after the smoothing parameter has been
chosen. An overdispersion parameter larger (smaller) than 1 may be a
sign of overdispersion (underdispersion).
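A rough feel for what a dispersion parameter measures can be obtained with a plain Poisson GLM from base R. The sketch below divides the deviance by the residual degrees of freedom, which is one common quasi-likelihood estimator; whether Mort1Dsmooth computes psi2 with exactly this formula is an assumption, and the simulated data and the helper name dispersion are hypothetical.

```r
set.seed(42)
n  <- 200
x  <- seq(0, 1, length.out = n)
mu <- exp(4 + 0.5 * x)

y_pois <- rpois(n, mu)                   # variance equal to the mean
y_over <- rnbinom(n, size = 5, mu = mu)  # variance = mu + mu^2 / size

# Hypothetical helper: deviance over residual degrees of freedom
# from an ordinary Poisson regression of y on x.
dispersion <- function(y) {
  fit <- glm(y ~ x, family = poisson)
  deviance(fit) / df.residual(fit)
}

psi2_pois <- dispersion(y_pois)  # close to 1
psi2_over <- dispersion(y_over)  # well above 1
c(psi2_pois = psi2_pois, psi2_over = psi2_over)
```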
The control argument is a list that can supply any of the
following components:
MON: Logical. If TRUE, tracing information on the progress of the
fitting is produced. Default: FALSE.
TOL1: The absolute convergence tolerance. Default: 1e-06.
TOL2: Difference between two adjacent smoothing parameters in the
(pseudo) grid search, on the log scale. Useful only when method is
equal to 1, 2 or 4. Default: 0.1.
MAX.IT: The maximum number of iterations. Default: 50.
The arguments MON, TOL1 and MAX.IT are kept fixed throughout the
(pseudo) grid search when method is equal to 1, 2 or 4. The function
cleversearch from the package svcm is employed to speed up the grid
search.
The function is specifically tailored to smoothing mortality data in a
one-dimensional setting. In such a case the argument x would be
either the ages or the years under study, and the death counts would
be the argument y. In a Poisson regression setting applied to actual
death counts, the offset will be the logarithm of the exposure
population. See the example at the end of this page.
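The role of the offset can be illustrated with an ordinary Poisson regression from base R, without any smoothing. In the sketch below the ages, exposures and death rates are made up; with offset = log(exposure) the model describes death rates, and the fitted log-rates are recovered by subtracting the offset from the linear predictor.

```r
age      <- 60:89
exposure <- seq(50000, 21000, by = -1000)  # hypothetical person-years
rate     <- exp(-5 + 0.09 * (age - 60))    # hypothetical death rates
deaths   <- round(exposure * rate)

# Poisson regression of death counts with log-exposure as offset.
fit <- glm(deaths ~ I(age - 60), family = poisson, offset = log(exposure))
coef(fit)  # close to the intercept and slope used to build the rates

# The linear predictor includes the offset; removing it gives log-rates.
log_rate <- predict(fit, type = "link") - log(exposure)
fitted_rate <- fitted(fit) / exposure
```

The same subtraction is what turns fitted counts from a smoothed model into fitted rates on the log scale.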
An object of the class Mort1Dsmooth with components:
coefficients |
vector of fitted (penalized) B-splines coefficients. |
residuals |
the deviance residuals. |
fitted.values |
vector of fitted counts. |
linear.predictor |
vector of fitted linear predictor. |
leverage |
diagonal of the hat-matrix. |
df |
effective dimension. |
deviance |
Poisson Deviance. |
aic |
Akaike's Information Criterion. |
bic |
Bayesian Information Criterion. |
psi2 |
Overdispersion parameter. |
lambda |
the selected (given) smoothing parameter lambda. |
call |
the matched call. |
n |
number of observations. |
tolerance |
the tolerance level used. |
ndx |
the number of internal knots -1. |
deg |
degree of the B-splines. |
pord |
order of difference. |
x |
values of the predictor variable. |
y |
set of counts response variable values. |
offset |
vector of the offset. |
w |
vector of weights used in the model. |
Carlo G Camarda
Eilers, P. H. C. and Marx, B. D. (1996). Flexible Smoothing with B-splines and Penalties. Statistical Science, 11, 89-121.
predict.Mort1Dsmooth, plot.Mort1Dsmooth.
library(MortalitySmooth)

# selected data
years <- 1950:2006
death <- selectHMDdata("Japan", "Deaths", "Females",
                       ages = 80, years = years)
exposure <- selectHMDdata("Japan", "Exposures", "Females",
                          ages = 80, years = years)

# various fits
# default using Bayesian Information Criterion
fitBIC <- Mort1Dsmooth(x=years, y=death, offset=log(exposure))
fitBIC
summary(fitBIC)
# subjective choice of the smoothing parameter lambda
fitLAM <- Mort1Dsmooth(x=years, y=death, offset=log(exposure),
                       lambda=10000, method=3)

# plot
plot(years, log(death/exposure),
     main="Mortality rates, log-scale. Japanese females, age 80, 1950:2006")
lines(years, log(fitted(fitBIC)/exposure), col=2, lwd=2)
lines(years, log(fitted(fitLAM)/exposure), col=3, lwd=2)
legend("topright", c("Actual", "BIC", "lambda=10000"),
       col=1:3, lwd=c(1,2,2), lty=c(-1,1,1), pch=c(1,-1,-1))

# about Extra-Poisson variation (overdispersion)
# checking the presence of overdispersion
fitBIC$psi2 # considerably larger than 1
# fitting accounting for overdispersion
fitBICover <- Mort1Dsmooth(x=years, y=death, offset=log(exposure),
                           overdispersion=TRUE)
# difference in the selected smoothing parameters
fitBIC$lambda; fitBICover$lambda
# plotting both situations
plot(fitBICover)
lines(years, log(fitBIC$fitted) - fitBIC$offset, col=4, lwd=2, lty=2)