Hierarchical Bayesian Models
with generative & physical components
for inference with corrupted data




Cosmic Connections: A ML X Astrophysics Symposium
at Simons Foundation, NYC, May 23rd


Benjamin Remy with Francois Lanusse and Jean-Luc Starck




Gravitational lensing

Galaxy shapes as estimators for gravitational shear
$$ e = \gamma + e_i \qquad \mbox{ with } \qquad e_i \sim \mathcal{N}(0, I)$$
  • Standard methods try to measure the ellipticity $e$ of galaxies as an estimator for the gravitational shear $\gamma$
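As a minimal sketch of why ellipticity works as a shear estimator: if $e = \gamma + e_i$ with zero-mean intrinsic ellipticities, averaging many galaxies recovers $\gamma$. The shear value, the intrinsic scatter `sigma_e`, and the sample size below are illustrative assumptions, not numbers from the talk.

```python
import numpy as np

def estimate_shear(e):
    """Average observed ellipticities of shape (N, 2) into a shear estimate.

    Returns the mean ellipticity and its standard error per component.
    """
    e = np.asarray(e, dtype=float)
    gamma_hat = e.mean(axis=0)
    stderr = e.std(axis=0, ddof=1) / np.sqrt(e.shape[0])
    return gamma_hat, stderr

rng = np.random.default_rng(0)
true_gamma = np.array([0.02, -0.01])  # assumed shear, for illustration only
sigma_e = 0.26                        # assumed intrinsic ellipticity scatter
e_obs = true_gamma + sigma_e * rng.standard_normal((100_000, 2))

gamma_hat, stderr = estimate_shear(e_obs)
```

The estimate is only as good as the ellipticity measurement itself, which is the issue the following slides address.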

Measuring galaxy ellipticity



Moments-based method

$\begin{align} F &= \sum G(x, y)I(x, y)\\ \sigma^2 &= \sum G(x, y)^2 \sigma^2(I(x, y)) \\ T &= \sum G(x, y)I(x, y)(x^2 + y^2) \\ M_1 &= \sum G(x, y)I(x, y)(x^2 - y^2) \\ M_2 &= \sum G(x, y)I(x, y)2xy \\ \color{#B1CB11}{e_1} &\color{#B1CB11}{=} \color{#B1CB11}{M_1 / T}\\ \color{#B1CB11}{e_2} &\color{#B1CB11}{=} \color{#B1CB11}{M_2 / T} \\ S/N &= F / \sigma(F) \\ \end{align}$
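The moment sums above can be sketched in a few lines of NumPy. This assumes a circular Gaussian weight $G$ centered on the stamp and uncorrelated pixel noise; the function name `weighted_moments` and the test image are hypothetical.

```python
import numpy as np

def weighted_moments(image, noise_var, weight_sigma):
    """Weighted moments of a postage stamp, following the sums on the slide.

    G is assumed to be a circular Gaussian centered on the stamp.
    noise_var is the per-pixel noise variance (scalar or array).
    """
    ny, nx = image.shape
    y, x = np.mgrid[:ny, :nx].astype(float)
    x -= (nx - 1) / 2.0
    y -= (ny - 1) / 2.0
    G = np.exp(-0.5 * (x**2 + y**2) / weight_sigma**2)

    F = np.sum(G * image)
    var_F = np.sum(G**2 * noise_var)
    T = np.sum(G * image * (x**2 + y**2))
    M1 = np.sum(G * image * (x**2 - y**2))
    M2 = np.sum(G * image * 2 * x * y)
    return {"e1": M1 / T, "e2": M2 / T, "snr": F / np.sqrt(var_F)}

# Toy elliptical Gaussian, elongated along x (illustrative only)
n = 41
yy, xx = np.mgrid[:n, :n].astype(float)
xx -= (n - 1) / 2.0
yy -= (n - 1) / 2.0
image = np.exp(-0.5 * (xx**2 / 4.0**2 + yy**2 / 2.0**2))

m = weighted_moments(image, noise_var=1.0, weight_sigma=6.0)
```

For this stamp `e1` is positive and `e2` vanishes by symmetry. Note that the measured value depends on the choice of weight, so it is not the intrinsic ellipticity of the profile.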

Measuring galaxy ellipticity

Model fitting

$\begin{align} I(x, y) &= F \cdot \mathcal{N}\left( 0, \color{#B1CB11}{\Sigma} \right)\\ \color{#B1CB11}{(e_1, e_2)} &\color{#B1CB11}{=} \color{#B1CB11}{f(\Sigma)} \end{align}$
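The slide leaves $f(\Sigma)$ unspecified. One common choice, consistent with the moment ratios $M_1/T$ and $M_2/T$, is sketched below; treat this convention as an assumption rather than the definition used in the talk.

```python
import numpy as np

def ellipticity_from_covariance(Sigma):
    """Map a 2x2 covariance matrix to (e1, e2).

    Assumed convention: e1 = (Sxx - Syy) / (Sxx + Syy), e2 = 2 Sxy / (Sxx + Syy).
    """
    Sigma = np.asarray(Sigma, dtype=float)
    Sxx, Sxy, Syy = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]
    T = Sxx + Syy
    return (Sxx - Syy) / T, 2.0 * Sxy / T

# A Gaussian twice as wide along x as along y (illustrative)
e1, e2 = ellipticity_from_covariance([[16.0, 0.0], [0.0, 4.0]])
```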

What about real galaxies?

The ellipticity is not necessarily a well-defined quantity for arbitrary galaxies,
leading to bias in cosmic shear estimation... Need for calibration

Let us build a probabilistic model of galaxy images

$\longrightarrow$ $g_\theta$ $\longrightarrow$ shear $\gamma$ $\longrightarrow$ PSF $\longrightarrow$ Noise
Probabilistic model
$$ x \sim ? $$
$$ x \sim \mathcal{N}(z, \Sigma) \quad z \sim ? $$
latent $z$ is a denoised galaxy image
$$ x \sim \mathcal{N}(\Pi \ast z, \Sigma) \quad z \sim ? $$
latent $z$ is a deconvolved and denoised galaxy image
$\begin{align} x \sim \mathcal{N}(\Pi \ast (z \otimes \gamma), \Sigma) \quad & z \sim ? \\ & \gamma \sim \mathcal{N}(0, .05) \end{align}$
latent $z$ is an unsheared, deconvolved, and denoised galaxy image
$\begin{align} x \sim \mathcal{N}(\Pi \ast (g_\theta(z) \otimes \gamma), \Sigma) \quad & z \sim \mathcal{N}(0, \mathbf{I}) \\ & \gamma \sim \mathcal{N}(0, .05) \end{align}$
latent $z$ are morphological parameters
$\theta$ are global parameters of the model



$\Longrightarrow$ Decouples the morphology model from the observing conditions.
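A minimal sketch of this forward model, under stated assumptions: a circular Gaussian profile stands in for the learned $g_\theta$, the shear is applied as a linear coordinate distortion, the PSF is a Gaussian applied by FFT (circular) convolution, and the noise is white. All function names and numerical values are hypothetical.

```python
import numpy as np

def grid(n):
    """Pixel coordinates centered on an n x n stamp."""
    y, x = np.mgrid[:n, :n].astype(float)
    c = (n - 1) / 2.0
    return x - c, y - c

def g(z, x, y):
    """Stand-in morphology model: circular Gaussian with z = (flux, size)."""
    flux, size = z
    img = np.exp(-0.5 * (x**2 + y**2) / size**2)
    return flux * img / img.sum()

def forward_model(z, gamma, psf):
    """Noise-free mean image: PSF convolved with the sheared profile."""
    n = psf.shape[0]
    x, y = grid(n)
    g1, g2 = gamma
    # Evaluate the profile at sheared coordinates (lensing Jacobian, no convergence)
    xs = (1 - g1) * x - g2 * y
    ys = -g2 * x + (1 + g1) * y
    sheared = g(z, xs, ys)
    # Circular convolution with the centered, unit-sum PSF
    kernel = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(sheared) * kernel))

def simulate(z, gamma, psf, noise_sigma, rng):
    """Draw x ~ N(PSF * sheared profile, noise_sigma^2 I)."""
    mean = forward_model(z, gamma, psf)
    return mean + noise_sigma * rng.standard_normal(mean.shape)

n = 33
x, y = grid(n)
psf = np.exp(-0.5 * (x**2 + y**2) / 1.5**2)
psf /= psf.sum()

rng = np.random.default_rng(0)
z_true, gamma_true = (100.0, 3.0), (0.05, 0.0)
mean_image = forward_model(z_true, gamma_true, psf)
x_obs = simulate(z_true, gamma_true, psf, noise_sigma=0.05, rng=rng)
```

Swapping `psf` or `noise_sigma` changes the observing conditions without touching `g`, which is the decoupling stated above.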

Bayesian modeling of cosmic shear

We aim to model the posterior distribution $p(\gamma|\mathcal{D})$

$\begin{align} p(\gamma|\mathcal{D}) &= \int p(\gamma, z, \Pi|\mathcal{D}) ~dz~d\Pi \\ \end{align}$
$\begin{align} ~~~~~~~~~~~&\propto \int \color{#B1CB11}{\underbrace{p(\mathcal{D}|\gamma, z, \Pi)}_{\text{likelihood}}} \underbrace{p(\gamma)p(z)p(\Pi)}_{\text{priors}} ~dz~d\Pi \end{align}$

The likelihood $\color{#B1CB11}{p(\mathcal{D}|\gamma, z, \Pi)}$ is naturally built from the forward model
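For Gaussian pixel noise, the likelihood reduces to a chi-square between the data and the forward-model prediction. The sketch below assumes a diagonal covariance $\Sigma = \sigma^2 \mathbf{I}$; `mean` is a placeholder for the forward-model image, here a toy array rather than an actual galaxy model.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, noise_sigma):
    """log N(x | mean, noise_sigma^2 I), summed over pixels."""
    resid = (x - mean) / noise_sigma
    norm = x.size * np.log(noise_sigma * np.sqrt(2.0 * np.pi))
    return -0.5 * np.sum(resid**2) - norm

rng = np.random.default_rng(1)
mean = np.linspace(0.0, 1.0, 64).reshape(8, 8)  # toy model prediction
data = mean + 0.1 * rng.standard_normal(mean.shape)

ll_good = gaussian_log_likelihood(data, mean, 0.1)
ll_bad = gaussian_log_likelihood(data, mean + 1.0, 0.1)  # offset model
```

The offset model yields a much lower log-likelihood, which is what drives the posterior toward the correct parameters.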

Joint inference using a parametric model for the morphology

Let's assume that $g(z)$ is a Sérsic model, i.e. $z = \{n, r_\text{hlr}, F, e_1, e_2, s_x, s_y\}$ and $$g(z) = F \times I_0 \exp \left( -b_n \left[\left( \frac{r}{r_\text{hlr}}\right)^{\frac{1}{n}} -1\right] \right)$$
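The radial Sérsic profile can be sketched as follows. The approximation $b_n \approx 2n - 1/3$ is an assumption made here for brevity (it is a standard leading-order approximation, not necessarily what the authors used), and the ellipticity and centroid shift $(e_1, e_2, s_x, s_y)$ are omitted.

```python
import numpy as np

def sersic_bn(n):
    """Leading-order approximation to b_n (assumed, not exact)."""
    return 2.0 * n - 1.0 / 3.0

def sersic_profile(r, n, r_hlr, F=1.0, I0=1.0):
    """Circular Sersic surface brightness at radius r."""
    r = np.asarray(r, dtype=float)
    return F * I0 * np.exp(-sersic_bn(n) * ((r / r_hlr) ** (1.0 / n) - 1.0))

radii = np.linspace(0.0, 10.0, 101)
exponential = sersic_profile(radii, n=1.0, r_hlr=2.0)   # disk-like
de_vaucouleurs = sersic_profile(radii, n=4.0, r_hlr=2.0)  # bulge-like
```

By construction the profile equals $F \times I_0$ at $r = r_\text{hlr}$ for any index $n$.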
The joint inference of $p(z, \gamma | \mathcal{D})$ leads to a biased posterior...

Marginal shear posterior $p(\gamma|\mathcal{D})$

Maximum a posteriori fit and residuals

...due to model misspecification $\longrightarrow$ Let's learn a more flexible $g_\theta$

Key concept: generative modeling


  • The goal of generative modeling is to learn an implicit distribution $\mathbb{P}$ from which the training set $X = \{x_0, x_1, \ldots, x_n \}$ is drawn.

  • Usually, this means building a parametric model $\mathbb{P}_\theta$ that tries to be close to $\mathbb{P}$.


True $\mathbb{P}$

Samples $x_i \sim \mathbb{P}$

Model $\mathbb{P}_\theta$


  • Once trained, you can typically sample from $\mathbb{P}_\theta$ and/or evaluate the likelihood $p_\theta(x)$.
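As a toy illustration of the two operations named above (sampling and likelihood evaluation), the sketch below fits a diagonal Gaussian $\mathbb{P}_\theta$ by maximum likelihood. The class `GaussianModel` is hypothetical and far simpler than a deep generative model, but the interface is the same.

```python
import numpy as np

class GaussianModel:
    """Diagonal Gaussian P_theta with theta = (mu, sigma)."""

    def fit(self, X):
        # Maximum-likelihood estimates of the parameters
        self.mu = X.mean(axis=0)
        self.sigma = X.std(axis=0)
        return self

    def sample(self, n, rng):
        return self.mu + self.sigma * rng.standard_normal((n,) + self.mu.shape)

    def log_prob(self, x):
        u = (x - self.mu) / self.sigma
        return np.sum(-0.5 * u**2 - np.log(self.sigma) - 0.5 * np.log(2 * np.pi), axis=-1)

rng = np.random.default_rng(0)
true_mu, true_sigma = np.array([1.0, -2.0]), np.array([0.5, 2.0])
X = true_mu + true_sigma * rng.standard_normal((5000, 2))  # training set

model = GaussianModel().fit(X)
new_samples = model.sample(10, rng)
logp = model.log_prob(X[:5])
```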

Learning from corrupted data

Lanusse et al. 2020
Inference network $q_\phi(z|x)$: $x \longrightarrow z \sim q_\phi(z|x)$
Generator network $p_\theta(x|z)$: $z \longrightarrow g_\theta(z) \longrightarrow g_\theta(z) \ast \Pi \longrightarrow x' \sim p_\theta(x|z, \Pi, \Sigma)$
Optimized by maximizing the ELBO

$\log p(x) \geq \mathbb{E}_{z\sim q_\phi(z|x)} \left[ \log p_\theta(x|z, \Pi, \Sigma) \right] - \mathbb{D}_\text{KL}(q_\phi(z|x) \| p(z))$
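A minimal sketch of the ELBO as the expected log-likelihood minus the KL divergence to the prior, assuming a diagonal Gaussian $q_\phi$ and a standard normal prior $p(z)$. The callable `log_likelihood` stands in for $\log p_\theta(x|z, \Pi, \Sigma)$; the scalar toy model below is an assumption chosen because its evidence is known in closed form.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo(x, mu, log_var, log_likelihood, rng, n_samples=64):
    """Monte Carlo estimate of E_q[log p(x|z)] - KL(q || p)."""
    eps = rng.standard_normal((n_samples,) + mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps  # reparameterized samples from q
    expected_ll = np.mean([log_likelihood(x, zi) for zi in z])
    return expected_ll - kl_to_standard_normal(mu, log_var)

def toy_log_likelihood(x, z):
    """Toy likelihood: x ~ N(z, 1) with a one-dimensional latent."""
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - z[0]) ** 2

rng = np.random.default_rng(0)
x_obs = 1.0
# Use q equal to the prior: a valid but loose variational distribution
bound = elbo(x_obs, np.zeros(1), np.zeros(1), toy_log_likelihood, rng, n_samples=4000)
# Exact evidence for this toy model: x ~ N(0, 2)
log_evidence = -0.5 * np.log(4.0 * np.pi) - x_obs**2 / 4.0
```

Here `bound` falls below `log_evidence`; the gap shrinks as $q_\phi$ approaches the true posterior.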

A generative model for galaxy morphologies

The Bayesian view of the problem: $$ p(z | x ) \propto p_\theta(x | z, \Sigma, \mathbf{\Pi}) p(z)$$ where:
  • $p( z | x )$ is the posterior
  • $p( x | z )$ is the data likelihood, which contains the physics
  • $p( z )$ is the prior
Figure panels: Data $x_n$ · Truth $x_0$ · Posterior samples $g_\theta(z)$ · $\mathbf{P} (\Pi \ast g_\theta(z))$ · Median · Data residuals $x_n - \mathbf{P} (\Pi \ast g_\theta(z))$ · Standard deviation
$\Longrightarrow$ Uncertainties are fully captured by the posterior.

Joint inference using a generative model for the morphology

Remy, Lanusse, Starck (2022)
Let's use the learned $g_\theta(z)$

The joint inference of $p(z, \gamma | \mathcal{D})$ leads to an unbiased posterior!


Marginal shear posterior $p(\gamma|\mathcal{D})$

Maximum a posteriori fit and residuals

Takeaway message


  • Ellipticity is not a well-defined quantity for arbitrary galaxies $\rightarrow$ bias in shear estimation

  • Forward modeling allows us to decouple morphology from observing conditions

    • Deep generative models can be used to provide a flexible light-profile model

    • Explicit likelihood, uses all of our physical knowledge
      $ + $ Our method can be applied to varying PSFs, noise levels, or even different instruments!

$\Longrightarrow$ Joint inference of morphology and shear leads to an unbiased marginal shear posterior


Opening remarks

  • Latent Variable Models are no longer SOTA $\longrightarrow$ diffusion models
  • Training a score-based / diffusion model from corrupted data remains a challenge!

Thank you!