--- title: "Getting Started" output: bookdown::html_document2: toc: true number_sections: FALSE toc_float: true toc_depth: 3 code_folding: show theme: united highlight: pygments df_print: paged link-citations: true pkgdown: as_is: true vignette: > %\VignetteIndexEntry{getting-started} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(panelPomp) ``` ## Introduction Panel data arise when time series are measured on a collection of units. When the time series for each unit is modeled as a partially observed Markov process (POMP) the collection of these models is a PanelPOMP. The `panelPomp` package provides facilities for inference on panel data using PanelPOMP models. Monte Carlo methods used for POMP models require adaptation for PanelPOMP models due to the higher dimensionality of panel data. This package builds on the functionality and tools of the popular [`pomp` R package](https://kingaa.github.io/pomp/), providing a computationally efficient framework that can be used to simulate from, fit, and diagnose PanelPOMP models. As such, a basic working knowledge of the `pomp` package is recommended. Here we cover some of the necessary basics. See [Getting Started With `pomp`](https://kingaa.github.io/pomp/vignettes/getting_started.html) for an introductory Vignette to the `pomp` package. ## Mathematical Notation Before discussing features of the `panelPomp` package, we describe a mathematical notation that is helpful in communicating details about `panelPomp` code and models. The general scope of the `panelPomp` package requires notation concerning random variables and their densities in arbitrary spaces. The notation below allows us to talk about these things using the language of mathematics, enabling precise description of models and algorithms. Units of the panel can be identified with numeric labels $\{1,2,\dots,U\}$, which we also write as $1:U$. Let $N_u$ be the number of measurements collected on unit $u$, allowing for the possibility that a different number of observations are collected for each unit. The data from the entire panel are written as $y^*_{1:U,1:N_u} = \{y^*_{1,1}, y^*_{2,1},\dots, y^*_{U, 1}, y^*_{1,2}\ldots, y^*_{u,N_u}\}$ where the $n^{th}$ observation from unit $u$ (denoted as $y^*_{u,n}$) is collected at time $t_{u,n}$ with $t_{u,1})`. ### Constructing `panelPomp` objects The fundamental mathematical functions that define a PanelPOMP model are made available via the `pomp` objects in the `unit_objects` slot. As such, constructing a `panelPomp` object is simple if you are already familiar with constructing `pomp` objects, or if you already have access to `pomp` objects. **Here we describe how to create a `panelPomp` object if the unit-specific `pomp` objects are already created**. In the next sub-section, we give a brief demonstration of how to construct a `panelPomp` object from scratch, including each individual `pomp` object. Here, we construct a `panelPomp` object representing a panel of stochastic Gompertz population models with log-normal measurement error. The latent state process is defined as: $$ X_{n + 1} = K^{1-S}X_n^S\epsilon_n, $$ where $S = \exp^{-r}$ and the $\epsilon_n$ are i.i.d. lognormal random variables with variance $\sigma^2$. The measurement model for the observed variables $Y_n$ are distributed as: $$ Y_n \sim \text{Lognormal}\big(\log X_n, \tau \big). $$ The parameters of this model are: - Per-capita growth rate $r$. - The carrying capacity $K$. - The process noise standard deviation $\sigma$. - The measurement error standard deviation $\tau$. - The initial condition $X_0$. This particular model class has a constructor function `gompertz()` from the `pomp` package. Here, we create 5 unique instances of this model, and use these instances to create a single `panelPomp` object: ```{r createPpomp1} mod1 <- pomp::gompertz() # Using default values mod2 <- pomp::gompertz(K = 2, r = 0.01) # Overwriting some of the defaults mod3 <- pomp::gompertz(K = 1.5, sigma = 0.15, X_0 = 5) mod4 <- pomp::gompertz(K = 1.5, r = 0.05, X_0 = 5) mod5 <- pomp::gompertz(K = 5, sigma = 0.08) panelMod1 <- panelPomp( object = list(mod1, mod2, mod3, mod4, mod5) ) ``` One important thing to note above the above construction is that each individual model already has parameter values present. When this is the case, the `panelPomp()` constructor sets all parameters to be unit specific: ```{r showMod1} print(specific(panelMod1)) print(shared(panelMod1)) ``` In the `panelMod1` object, all five parameters are listed as *unit specific*. Notably, because $\tau$ was not modified in any of the unit specific objects, it has the same value across all five units. In such cases, it might make sense to list parameters that have the same value across all units as *shared* parameters, which can be done in the model constructor: ```{r createPpomp2} panelMod2 <- panelPomp( object = list(mod1, mod2, mod3, mod4, mod5), shared = c("tau" = 0.1) ) specific(panelMod2) shared(panelMod2) ``` In this case we did not need to explicitly specify unit-specific parameters; if parameter values are present in the unit `pomp` objects that comprise the panel, parameters are assumed to be unit-specific unless otherwise specified. However, it is possible to explicitly provide a matrix of unit specific parameters in the constructor, if desired. This is especially important if the individual `pomp` objects that make up the panel have missing parameter values. Unit-specific parameters can be expressed in two ways: as a matrix with rows corresponding to parameter values and columns the corresponding unit (as seen above), or as a named numeric vector that follows the convention `[]`: ```{r unitSpecific-vector} specific(panelMod2, format = 'vector') ``` It is often convenient to modify which parameters are shared and which are unit-specific on existing `panelPomp` objects rather than creating new objects from scratch. This can be done with the `shared<-` and `specific<-` setter functions: ```{r setterFuns} shared(panelMod2) <- c('r' = 0.05, 'sigma' = 0.1) specific(panelMod2) <- c('tau[unit1]' = 0.11, 'tau[unit4]' = 0.09) print(shared(panelMod2)) print(specific(panelMod2)) ``` Notice above that if a shared parameter (`tau`) is changed to a unit-specific parameter and not all values of the unit-specific parameter are explicitly provided, the parameters that are not specified default to the original shared value. The unit-specific setter function also works in matrix format: ```{r } specific(panelMod2) <- matrix( data = rbind(c(1.24, 1.78), c( 2, 3)), nrow = 2, dimnames = list( param = c("K", "X_0"), unit = c('unit2', 'unit4') ) ) specific(panelMod2) ``` Neither the `shared<-` nor the `specific<-` setter functions allow a user to add new parameters (or unit names) that are not already part of the model: ```{r errorShared, error=TRUE} shared(panelMod2) <- c("foo" = 1) ``` ```{r errorSpecific, error=TRUE} specific(panelMod2) <- c("tau[unit6]" = 1) ```