Statistical Approaches for Hidden Variables in Ecology - Nathalie Peyrard - Page 23
1.2.1.2. Inference
Using the model defined by [1.1], inference is used for two purposes:
– Estimation of positions: in this case, inference is used to determine the distribution of actual positions based on observations, that is, for 0 ≤ t ≤ n, the distribution of the random variable Zt|Y0:n. This distribution is known as the smoothing distribution.
– Estimation of parameters: to estimate the unknown parameters in the model (which, in the majority of cases, correspond to the two variance–covariance matrices, Σm and Σo).
With known parameters and for any 0 ≤ t ≤ n, the distribution of Zt|Y0:n is Gaussian. The mean and the variance–covariance matrix of this distribution can be calculated explicitly. This step is carried out using Kalman smoothing, which will not be described in detail here; interested readers may wish to consult Tusell (2011). It is important to note that the explicit nature of this solution is exceptional in the context of latent variable models, and is a result of the Gaussian linear formulation of model [1.1].
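Kalman smoothing itself is not described here; purely as a rough illustration of the filter-plus-backward-pass structure, the sketch below implements a Rauch-Tung-Striebel smoother for a linear Gaussian model of the form of [1.1]. The function name, the intercepts μ and ν, and the initial condition (z0, P0) are our assumptions, not taken from the text:

```python
import numpy as np

def kalman_smoother(y, mu, A, nu, B, Sm, So, z0, P0):
    """RTS smoother for the linear Gaussian model
    Z_t = mu + A Z_{t-1} + eps_t,  eps_t ~ N(0, Sm),
    Y_t = nu + B Z_t     + eta_t,  eta_t ~ N(0, So).
    Returns the means and covariances of the smoothing
    distributions Zt | Y0:n, which are Gaussian."""
    n, d = len(y), len(z0)
    zf = np.zeros((n, d)); Pf = np.zeros((n, d, d))   # filtered moments
    zp = np.zeros((n, d)); Pp = np.zeros((n, d, d))   # predicted moments
    z, P = z0, P0
    for t in range(n):                  # forward pass (Kalman filter)
        if t > 0:                       # prediction step
            z = mu + A @ z
            P = A @ P @ A.T + Sm
        zp[t], Pp[t] = z, P
        S = B @ P @ B.T + So            # innovation covariance
        K = P @ B.T @ np.linalg.inv(S)  # Kalman gain
        z = z + K @ (y[t] - (nu + B @ z))   # update step
        P = P - K @ B @ P
        zf[t], Pf[t] = z, P
    zs = zf.copy(); Ps = Pf.copy()      # backward (smoothing) pass
    for t in range(n - 2, -1, -1):
        J = Pf[t] @ A.T @ np.linalg.inv(Pp[t + 1])  # smoother gain
        zs[t] = zf[t] + J @ (zs[t + 1] - zp[t + 1])
        Ps[t] = Pf[t] + J @ (Ps[t + 1] - Pp[t + 1]) @ J.T
    return zs, Ps
```

Every quantity in both passes is an explicit matrix operation, which is the "exceptional" explicitness the text refers to: no simulation or numerical integration is needed.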
In practice, the parameter θ = {μ, A, ν, B, Σm, Σo} is unknown. In a frequentist context, the natural aim is to identify the parameter that maximizes the likelihood associated with observations Y0:n:

L(θ) = pθ(y0:n) = ∫ pθ(z0:n, y0:n) dz0:n,
where p is a generic notation for probability density. Here, evaluating the likelihood involves an integral in very high dimension, since all of the hidden states must be integrated out. However, given a known sequence of real positions Z0:n, we would have an explicit expression for the full log-likelihood:

log pθ(z0:n, y0:n) = log pθ(z0) + Σt=1..n log pθ(zt|zt−1) + Σt=0..n log pθ(yt|zt). [1.2]
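In fact, for this Gaussian linear model the high-dimensional integral never has to be computed directly: the Kalman filter delivers the observed-data log-likelihood as a by-product, through the so-called prediction-error decomposition. The sketch below illustrates this; the function name and the initial condition (z0, P0) are our assumptions:

```python
import numpy as np

def log_likelihood(y, mu, A, nu, B, Sm, So, z0, P0):
    """Observed-data log-likelihood log p_theta(y0:n), accumulated as a
    sum of Gaussian log-densities of the one-step prediction errors."""
    z, P, ll = z0, P0, 0.0
    for t in range(len(y)):
        if t > 0:                        # prediction step
            z = mu + A @ z
            P = A @ P @ A.T + Sm
        e = y[t] - (nu + B @ z)          # innovation (prediction error)
        S = B @ P @ B.T + So             # innovation covariance
        ll += -0.5 * (len(e) * np.log(2 * np.pi)
                      + np.log(np.linalg.det(S))
                      + e @ np.linalg.solve(S, e))
        K = P @ B.T @ np.linalg.inv(S)   # Kalman gain
        z = z + K @ e                    # update step
        P = P - K @ B @ P
    return ll
```

This routine is what a numerical optimizer would call repeatedly if one chose to maximize the likelihood directly rather than via EM.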
As all of the densities in this model are Gaussian, maximization of this full log-likelihood would be simple. The expectation–maximization (EM) algorithm exploits this full likelihood in order to maximize the likelihood of the observations. Starting from an initial parameter value θ(0), the algorithm produces a sequence of estimates θ(k) as follows:
– Step E calculates:

Q(θ | θ(k)) = Eθ(k)[ log pθ(Z0:n, Y0:n) | Y0:n = y0:n ]; [1.3]
– Step M takes:

θ(k+1) = argmaxθ Q(θ | θ(k)).
The sequence θ(k) converges to a local maximum of the likelihood (Dempster et al. 1977). Equation [1.3] consists of calculating the expectation of [1.2] with respect to the distribution of the missing data given the observations, that is, the distribution of Z0:n | {Y0:n = y0:n}, under the current parameter value θ(k). The smoothing distribution for this parameter must, therefore, be calculated as part of this step; this is done using Kalman smoothing. An explicit solution is then obtained in step M thanks to the Gaussian linear nature of the problem.
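To make the two steps concrete, the sketch below runs EM for the simplest instance of a model like [1.1]: a scalar random walk observed in noise (A = B = 1, μ = ν = 0), where only the two variances are unknown. The E step is a scalar Kalman filter and smoother; the M step updates the variances in closed form from the smoothed moments. The function name, the fixed initial prior N(0, 10), and the iteration count are our assumptions:

```python
import numpy as np

def em_variances(y, n_iter=50, s_m=1.0, s_o=1.0):
    """EM estimation of the two variances of the scalar model
    Z_t = Z_{t-1} + eps_t, eps_t ~ N(0, s_m);
    Y_t = Z_t     + eta_t, eta_t ~ N(0, s_o)."""
    n = len(y)
    for _ in range(n_iter):
        # --- E step: filter + smoother under the current (s_m, s_o) ---
        zf = np.zeros(n); Pf = np.zeros(n)   # filtered moments
        zp = np.zeros(n); Pp = np.zeros(n)   # predicted moments
        z, P = 0.0, 10.0                     # fixed vague prior (assumption)
        for t in range(n):
            if t > 0:
                P = P + s_m                  # prediction step (z unchanged)
            zp[t], Pp[t] = z, P
            K = P / (P + s_o)                # Kalman gain
            z, P = z + K * (y[t] - z), (1 - K) * P
            zf[t], Pf[t] = z, P
        zs, Ps = zf.copy(), Pf.copy()
        C = np.zeros(n - 1)                  # Cov(Z_t, Z_{t+1} | y0:n)
        for t in range(n - 2, -1, -1):
            J = Pf[t] / Pp[t + 1]            # smoother gain
            zs[t] = zf[t] + J * (zs[t + 1] - zp[t + 1])
            Ps[t] = Pf[t] + J ** 2 * (Ps[t + 1] - Pp[t + 1])
            C[t] = J * Ps[t + 1]
        # --- M step: closed-form updates from the smoothed moments ---
        s_m = np.mean((zs[1:] - zs[:-1]) ** 2 + Ps[1:] + Ps[:-1] - 2 * C)
        s_o = np.mean((y - zs) ** 2 + Ps)
    return s_m, s_o
```

Each M-step update is simply the expectation, under the smoothing distribution, of the empirical variance that would be used if the hidden trajectory were observed, which is exactly the role equation [1.3] plays in the general case.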