phmm {phmm} | R Documentation |
Fits a proportional hazards regression model incorporating random effects. The function implements an EM algorithm using Markov Chain Monte Carlo (MCMC) at the E-step as described in Vaida and Xu (2000).
phmm(formula, random, data, subset, na.action, Sigma, varcov, NINIT, VARSTART, MAXSTEP, CONVERG, Gbs, Gbsvar, verbose)
formula |
model formula for the fixed part of the model, i.e. the
part that specifies beta'x[ij]. See
survreg for further details. Intercept is implicitely
included in the model by estimation of the error distribution. As a
consequence -1 in the model formula does not have any effect
on the model. The left-hand side of the formula must be an object
created using Surv .
If random is used then the formula must contain
an identification of clusters in the form cluster(id) , where
id is a name of the variable that determines clusters.
|
random |
formula for the random part of the model, i.e. the
part that specifies w[ij]'b[i]. If omitted,
no random part is included in the model. E.g. to specify the model with a
random intercept, say random=~1 .
|
data |
optional data frame in which to interpret the variables occuring in the formulas. |
subset |
subset of the observations to be used in the fit. |
na.action |
function to be used to handle any NA s in the
data. The user is discouraged to change a default value
na.fail . |
Sigma |
initial covariance matrix for the random effects. Defaults to "identity". |
varcov |
constraint on Sigma . Currently only "diagonal" is supported. |
NINIT |
number of starting values supplied to Adaptive Rejection Metropolis Sampling (ARMS) algorithm. |
VARSTART |
starting value of the variances of the random effects. |
MAXSTEP |
number of EM iterations. |
CONVERG |
iteration after which Gibbs sampling size changes from Gbs to Gbsvar. |
Gbs |
initial Gibbs sampling size (until CONVERG iterations). |
Gbsvar |
Gibbs sampling size after CONVERG iterations. |
verbose |
Set to TRUE to print EM steps. |
The proportional hazards model with mixed effects is equipped to handle clustered survival data. The model generalizes the usual frailty model by allowing log-linearl multivariate random effects. The software can only handle random effects from a multivariate normal distribution. Maximum likelihood estimates of the regression parameters and variance components is gotten by EM algorithm, with Markov chain Monte Carlo (MCMC) used in the E-step.
Care must be taken to ensure the MCMC-EM algorithm has converged, as the algorithm stops after MAXSTEP iterations. No convergence criteria is implemented. It is advised to plot the estimates at each iteration using the plot.phmm
method. For more on MCMC-EM convergence see Booth & Hobart (1999).
The function produces an object of class "phmm" consisting of:
steps |
a matrix of estimates at each EM step; |
bhat |
empirical Bayes estimates of expectation of random effects; |
sdbhat |
empirical Bayes estimates of standard deviation of random effects; |
coef |
the final parameter estimates for the fixed effects; |
var |
the estimated variance-covariance matrix; |
loglik |
a vector of length four with the conditional log-likelihood and marginal log-likelihood estimated by Laplace approximation, reciprocal importance sampling, and bridge sampling (only implemented for nreff < 3); |
lambda |
the estimated baseline hazard; |
Lambda |
the estimated cumulative baseline hazard. |
Gilks WR & Wild P. (1992) Adaptive rejection sampling for Gibbs sampling. Applied Statistics 41, pp 337-348.
Vaida F & Xu R. 2000. "Proportional hazards model with random effects", Statistics in Medicine, 19:3309-3324.
Gamst A, Donohue M, & Xu R (2009). Asymptotic properties and empirical evaluation of the NPMLE in the proportional hazards mixed-effects model. Statistica Sinica, 19, 997.
Xu R, Gamst A, Donohue M, Vaida F, & Harrington DP. 2006. Using Profile Likelihood for Semiparametric Model Selection with Application to Proportional Hazards Mixed Models. Harvard University Biostatistics Working Paper Series, Working Paper 43.
Booth JG & Hobert JP. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society, Series B 1999; 61:265-285.
N <- 100 B <- 100 n <- 50 nclust <- 5 clusters <- rep(1:nclust,each=n/nclust) beta0 <- c(1,2) set.seed(13) #generate phmm data set Z <- cbind(Z1=sample(0:1,n,replace=TRUE), Z2=sample(0:1,n,replace=TRUE), Z3=sample(0:1,n,replace=TRUE)) b <- cbind(rep(rnorm(nclust),each=n/nclust),rep(rnorm(nclust),each=n/nclust)) Wb <- matrix(0,n,2) for( j in 1:2) Wb[,j] <- Z[,j]*b[,j] Wb <- apply(Wb,1,sum) T <- -log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb) C <- runif(n,0,1) time <- ifelse(T<C,T,C) event <- ifelse(T<=C,1,0) mean(event) phmmdata <- data.frame(Z) phmmdata$cluster <- clusters phmmdata$time <- time phmmdata$event <- event fit.phmm <- phmm(Surv(time, event)~Z1+Z2+cluster(cluster), ~-1+Z1+Z2, phmmdata, Gbs = 100, Gbsvar = 1000, VARSTART = 1, NINIT = 10, MAXSTEP = 100, CONVERG=90) summary(fit.phmm) plot(fit.phmm)