Distributional assumptions in Poisson LRA

We are interested in fitting the model: \( \newcommand\ml{\mathbf{L}} \newcommand\mf{\mathbf{F}} \newcommand\mx{\mathbf{X}} \newcommand\vs{\mathbf{s}} \newcommand\mm{\mathbf{M}} \newcommand\mphi{\boldsymbol{\Phi}} \)

\begin{align*} x_{ij} \mid s_i, \lambda_{ij} &\sim \operatorname{Poisson}(s_i \lambda_{ij})\\ \lambda_{ij} \mid \ml, \mf &= h^{-1}((\ml\mf)_{ij}) \end{align*}

where \(i = 1, \ldots, n\), \(j = 1, \ldots, p\), \(h\) is a link function, \(\ml\) is an \(n \times K\) matrix, and \(\mf\) is a \(p \times K\) matrix, so that \((\ml\mf)_{ij}\) is shorthand for \((\ml\mf^\top)_{ij} = \sum_k l_{ik} f_{jk}\). In this model, we make a structural (low rank) assumption on \(\lambda_{ij}\) but not a distributional assumption. This is analogous to choosing \(g = \delta_\mu\) (a point mass) in distribution deconvolution, and assuming that variation in the data is only introduced by the measurement process.
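To make the generative model concrete, here is a minimal simulation sketch in Python/NumPy. It assumes a log link (\(h = \log\)), which the text does not fix, and the problem sizes, scales, and variable names are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem size: n samples, p genes, rank K
n, p, K = 100, 50, 3

L = rng.normal(scale=0.5, size=(n, K))  # n x K loadings
F = rng.normal(scale=0.5, size=(p, K))  # p x K factors
s = rng.uniform(500, 1500, size=n)      # size factors s_i

# Assumed log link: lambda_ij = exp((L F^T)_ij)
lam = np.exp(L @ F.T)

# x_ij | s_i, lambda_ij ~ Poisson(s_i * lambda_ij)
x = rng.poisson(s[:, None] * lam)
```

Here all variation in `x` around `s[:, None] * lam` comes from the Poisson measurement process, since `lam` is a deterministic function of `L` and `F`.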

One way we can add a distributional assumption to this model is to introduce a random effect:

\begin{align*} \lambda_{ij} \mid \mu_{ij}, u_{ij} &= \mu_{ij} u_{ij}\\ \mu_{ij} &= h^{-1}((\ml\mf)_{ij})\\ u_{ij} &\sim \operatorname{Gamma}(\phi_{ij}, \phi_{ij}) \end{align*}

Under this assumption (with the Gamma distribution parameterized by shape and rate), \(E[u_{ij}] = 1\), \(V[u_{ij}] = 1 / \phi_{ij}\), and \(\lambda_{ij} \mid \mu_{ij}, \phi_{ij} \sim \operatorname{Gamma}(\phi_{ij}, \phi_{ij} / \mu_{ij})\), which has mean \(\mu_{ij}\) and variance \(\mu_{ij}^2 / \phi_{ij}\). (This is the same way we derive the Gamma prior on latent gene expression in distribution deconvolution of a single gene.)
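These moments can be checked by Monte Carlo. The sketch below uses hypothetical values of \(\mu\) and \(\phi\) for a single entry. Note that NumPy parameterizes the Gamma distribution by shape and scale, so a rate \(b\) corresponds to `scale=1 / b`.

```python
import numpy as np

rng = np.random.default_rng(1)

mu, phi = 2.0, 4.0   # hypothetical values for one entry (i, j)
n_draws = 200_000

# u ~ Gamma(shape=phi, rate=phi), i.e. scale = 1 / phi
u = rng.gamma(shape=phi, scale=1 / phi, size=n_draws)

# lambda = mu * u should match Gamma(shape=phi, rate=phi / mu)
lam = mu * u
lam_direct = rng.gamma(shape=phi, scale=mu / phi, size=n_draws)

print(u.mean(), u.var())                    # approximately 1 and 1 / phi
print(lam.mean(), lam.var())                # approximately mu and mu^2 / phi
print(lam_direct.mean(), lam_direct.var())  # should agree with the line above
```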

Under this model, \(\lambda_{ij}\) follows a "multivariate" Gamma distribution (the \(\mu_{ij}\) are dependent) centered around some low-dimensional manifold. (For comparison, in probabilistic PCA data are generated by picking a latent point on the manifold and adding isotropic Gaussian noise in the measurement process.) We can make various simplifications, such as assuming \(\phi_{ij} = \phi_j\) or even \(\phi_{ij} = \phi\), or assume more complicated structured random effects such as \(\phi_{ij} = \alpha_i \beta_j\). This assumption has been previously proposed, but not derived in this manner (Gouvert et al. 2018).

We could go further and assume:

\[ \lambda_{ij} \mid \pi_{ij}, \mu_{ij}, \phi_{ij} \sim \pi_{ij} \delta_0(\cdot) + (1 - \pi_{ij}) \operatorname{Gamma}(\phi_{ij}, \phi_{ij} / \mu_{ij}) \]

where the \(\pi_{ij}\) also have some structure (e.g. constant, gene-specific, label-specific). This assumption has also been previously proposed, but not derived in this manner (Risso et al. 2018, Lopez et al. 2018).

We are interested in the practical problem: given estimates of \(\mu_{ij}, \phi_{ij}\) (and any other assumed parameters), how do we estimate \(\lambda_{ij}\)? By analogy to distribution deconvolution, we can treat \(p(\lambda_{ij})\) as a prior and get the posterior \(p(\lambda_{ij} \mid \cdot)\) directly. The posterior is available in closed form in the case of a Gamma prior, because the Gamma distribution is conjugate to the Poisson likelihood:

\begin{align*} p(\lambda_{ij} \mid \mu_{ij}, \phi_{ij}) &= \operatorname{Gamma}(\phi_{ij}, \phi_{ij} / \mu_{ij})\\ p(\lambda_{ij} \mid \mx, \vs, \mm, \mphi) &= p(\lambda_{ij} \mid x_{ij}, s_i, \mu_{ij}, \phi_{ij})\\ &\propto p(x_{ij} \mid s_i, \lambda_{ij}) p(\lambda_{ij} \mid \mu_{ij}, \phi_{ij})\\ &\propto \lambda_{ij}^{x_{ij}} \exp(-s_i \lambda_{ij}) \lambda_{ij}^{\phi_{ij} - 1}\exp(-\phi_{ij} \lambda_{ij} / \mu_{ij})\\ &\propto \lambda_{ij}^{x_{ij} + \phi_{ij} - 1} \exp(-\lambda_{ij}(s_i + \phi_{ij} / \mu_{ij}))\\ &= \operatorname{Gamma}(x_{ij} + \phi_{ij}, s_i + \phi_{ij} / \mu_{ij})\\ E[\lambda_{ij} \mid \mx, \cdot] &= \frac{x_{ij} + \phi_{ij}}{s_i + \phi_{ij} / \mu_{ij}} \end{align*}
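The posterior mean is a weighted average of the naive estimate \(x_{ij} / s_i\) and the prior mean \(\mu_{ij}\), with weight \(w_{ij} = s_i / (s_i + \phi_{ij} / \mu_{ij})\) on the naive estimate. A minimal sketch in Python/NumPy follows; the function name and the example values are hypothetical.

```python
import numpy as np

def gamma_posterior_mean(x, s, mu, phi):
    """Posterior mean of lambda under a Gamma(phi, phi / mu) prior
    (shape, rate) and a Poisson(s * lambda) likelihood.

    Arguments broadcast against each other, so x can be an n x p matrix
    and s an n x 1 column of size factors.
    """
    return (x + phi) / (s + phi / mu)

# Hypothetical single entry
x, s, mu, phi = 3, 1000.0, 0.002, 2.0
post = gamma_posterior_mean(x, s, mu, phi)

# Same quantity written as shrinkage of x / s towards mu
w = s / (s + phi / mu)
shrunk = w * (x / s) + (1 - w) * mu

print(post, shrunk)
```

As \(\phi_{ij} \to \infty\) the weight goes to zero and the estimate collapses to \(\mu_{ij}\), recovering the model without a random effect. As \(\phi_{ij} \to 0\) the estimate approaches \(x_{ij} / s_i\).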

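If we assume \(p(\lambda_{ij})\) is a mixture of a point mass on zero with weight \(\pi_{ij}\) and a Gamma distribution, then the posterior will also be a point-Gamma mixture, but with an updated weight \(\tilde\pi_{ij}\) on the point mass. The point mass can only generate \(x_{ij} = 0\), so \(\tilde\pi_{ij} = 0\) whenever \(x_{ij} > 0\). The posterior mean is

\[ E[\lambda_{ij} \mid \mx, \cdot] = (1 - \tilde\pi_{ij}) \frac{x_{ij} + \phi_{ij}}{s_i + \phi_{ij} / \mu_{ij}}, \qquad \tilde\pi_{ij} = \begin{cases} \dfrac{\pi_{ij}}{\pi_{ij} + (1 - \pi_{ij}) \left(\frac{\phi_{ij}}{\phi_{ij} + s_i \mu_{ij}}\right)^{\phi_{ij}}} & x_{ij} = 0\\ 0 & x_{ij} > 0 \end{cases} \]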

This is derived by introducing an indicator variable \(z_{ij}\) denoting whether \(\lambda_{ij}\) was drawn from the Gamma distribution, and marginalizing over \(z_{ij}\). (The posterior given \(z_{ij} = 0\) is a point mass on zero.)
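A minimal sketch of this marginalization in Python/NumPy. Because the point mass can only produce \(x_{ij} = 0\), the posterior probability that \(z_{ij} = 0\) is zero when \(x_{ij} > 0\). When \(x_{ij} = 0\), it is \(\pi_{ij} / (\pi_{ij} + (1 - \pi_{ij}) q_{ij})\), where \(q_{ij} = (\phi_{ij} / (\phi_{ij} + s_i \mu_{ij}))^{\phi_{ij}}\) is the marginal (negative binomial) probability of observing zero under the Gamma component. The function name and example values are hypothetical.

```python
import numpy as np

def point_gamma_posterior_mean(x, s, mu, phi, pi):
    """Posterior mean of lambda under the prior
    pi * delta_0 + (1 - pi) * Gamma(phi, phi / mu)  (shape, rate)
    and a Poisson(s * lambda) likelihood."""
    x = np.asarray(x, dtype=float)
    # log P(x = 0 | Gamma component), the negative binomial zero probability
    log_q = phi * (np.log(phi) - np.log(phi + s * mu))
    # Posterior weight on the point mass; zero whenever x > 0
    w0 = np.where(x == 0, pi / (pi + (1 - pi) * np.exp(log_q)), 0.0)
    # Given z = 1 the posterior is Gamma(x + phi, s + phi / mu);
    # given z = 0 it is a point mass on zero and contributes nothing
    return (1 - w0) * (x + phi) / (s + phi / mu)

# Hypothetical entries: one observed zero and one non-zero count
print(point_gamma_posterior_mean([0, 3], s=1000.0, mu=0.002, phi=2.0, pi=0.5))
```

For an observed zero, the estimate is shrunk further towards zero than under the Gamma prior alone. For a non-zero count, the two estimates coincide.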

Author: Abhishek Sarkar

Created: 2019-09-07 Sat 22:01
