Cosmic Connections 2023

Hierarchical Bayesian Models
with generative & physical components
for inference with corrupted data

Cosmic Connections: A ML X Astrophysics Symposium
at Simons Foundation, NYC, May 23rd

Benjamin Remy with Francois Lanusse and Jean-Luc Starck

slides at b-remy.github.io/talks/NYC2023

Gravitational lensing

Galaxy shapes as estimators for gravitational shear

$$ e = \gamma + e_i \qquad \mbox{ with } \qquad e_i \sim \mathcal{N}(0, I)$$

Standard methods try to measure the ellipticity $e$ of galaxies as an estimator for the gravitational shear $\gamma$
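To make the estimator concrete, here is a minimal sketch that simulates $e = \gamma + e_i$ and recovers the shear by averaging observed ellipticities. The shear value, the per-component scatter `sigma_e`, and the number of galaxies are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

true_shear = np.array([0.03, -0.01])  # assumed (gamma_1, gamma_2)
sigma_e = 0.26                        # assumed intrinsic scatter per component
n_gal = 100_000                       # assumed number of galaxies

# e = gamma + e_i, with e_i drawn from a zero-mean Gaussian
intrinsic = rng.normal(0.0, sigma_e, size=(n_gal, 2))
observed = true_shear + intrinsic

# The sample mean estimates the shear; its error shrinks as 1/sqrt(N)
shear_hat = observed.mean(axis=0)
shear_err = observed.std(axis=0, ddof=1) / np.sqrt(n_gal)
print(shear_hat, shear_err)
```

Because the intrinsic ellipticities average to zero, the mean ellipticity is an estimate of $\gamma$ as long as $e$ itself is measured without bias.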

Measuring galaxy ellipticity

Moments-based method

$\begin{align} F &= \sum G(x, y)I(x, y)\\ \sigma^2 &= \sum G(x, y)^2 \sigma^2(I(x, y)) \\ T &= \sum G(x, y)I(x, y)(x^2 + y^2) \\ M_1 &= \sum G(x, y)I(x, y)(x^2 - y^2) \\ M_2 &= \sum G(x, y)I(x, y)2xy \\ \color{#B1CB11}{e_1} &\color{#B1CB11}{=} \color{#B1CB11}{M_1 / T}\\ \color{#B1CB11}{e_2} &\color{#B1CB11}{=} \color{#B1CB11}{M_2 / T} \\ S/N &= F / \sigma(F) \\ \end{align}$
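The sums above can be sketched in NumPy as follows. This assumes a circular Gaussian weight $G$ centered on the stamp and a known per-pixel noise variance; the function name `weighted_moments` and the test galaxy are hypothetical.

```python
import numpy as np

def weighted_moments(image, noise_var, weight_sigma):
    """Weighted moments of a postage stamp, with the origin at the stamp center."""
    ny, nx = image.shape
    y, x = np.mgrid[:ny, :nx].astype(float)
    x -= (nx - 1) / 2
    y -= (ny - 1) / 2
    # Assumed weight function G: circular Gaussian
    G = np.exp(-0.5 * (x**2 + y**2) / weight_sigma**2)
    F = np.sum(G * image)
    var_F = np.sum(G**2 * noise_var)
    T = np.sum(G * image * (x**2 + y**2))
    M1 = np.sum(G * image * (x**2 - y**2))
    M2 = np.sum(G * image * 2 * x * y)
    return {"e1": M1 / T, "e2": M2 / T, "snr": F / np.sqrt(var_F)}

# Toy galaxy: elliptical Gaussian elongated along x
n = 33
c = np.arange(n) - (n - 1) / 2
xx, yy = np.meshgrid(c, c)
galaxy = np.exp(-0.5 * (xx**2 / 4.0**2 + yy**2 / 2.0**2))
m = weighted_moments(galaxy, noise_var=0.01, weight_sigma=5.0)
print(m)
```

For this toy galaxy the elongation along $x$ gives a positive $e_1$ and an $e_2$ consistent with zero. The measured values depend on the choice of weight function.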

Measuring galaxy ellipticity

Model fitting

$\begin{align} I(x, y) &= F \cdot \mathcal{N}\left( 0, \color{#B1CB11}{\Sigma} \right)\\ \color{#B1CB11}{(e_1, e_2)} &\color{#B1CB11}{=} \color{#B1CB11}{f(\Sigma)} \end{align}$
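The slide leaves $f(\Sigma)$ unspecified. One common choice, assumed here for illustration, mirrors the moment ratios $M_1/T$ and $M_2/T$ by using the entries of the fitted covariance matrix.

```python
import numpy as np

def ellipticity_from_covariance(cov):
    """Map a 2x2 covariance matrix to (e1, e2).

    Assumed convention: e1 = (Sxx - Syy) / (Sxx + Syy), e2 = 2 Sxy / (Sxx + Syy).
    """
    cov = np.asarray(cov, dtype=float)
    trace = cov[0, 0] + cov[1, 1]
    e1 = (cov[0, 0] - cov[1, 1]) / trace
    e2 = 2.0 * cov[0, 1] / trace
    return e1, e2

# Elongated along x, then the same shape rotated by 45 degrees
e_axis = ellipticity_from_covariance([[4.0, 0.0], [0.0, 1.0]])
e_diag = ellipticity_from_covariance([[2.5, 1.5], [1.5, 2.5]])
print(e_axis, e_diag)
```

Other ellipticity conventions exist, so the mapping should match whatever definition the shear estimator expects.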

What about real galaxies?

The ellipticity is not necessarily a well-defined quantity for arbitrary galaxies,
leading to bias in cosmic shear estimation... Need for calibration

Let us build a probabilistic model of galaxy images

$\longrightarrow$ $g_\theta$ $\longrightarrow$ shear $\gamma$ $\longrightarrow$ PSF $\longrightarrow$ Noise

Probabilistic model

$$ x \sim ? $$

$$ x \sim \mathcal{N}(z, \Sigma) \quad z \sim ? $$
latent $z$ is a denoised galaxy image

$$ x \sim \mathcal{N}(\Pi \ast z, \Sigma) \quad z \sim ? $$
latent $z$ is a deconvolved and denoised galaxy image

$\begin{align} x \sim \mathcal{N}(\Pi \ast (z \otimes \gamma), \Sigma) \quad & z \sim ? \\ & \gamma \sim \mathcal{N}(0, .05) \end{align}$
latent $z$ is an unsheared, deconvolved, and denoised galaxy image

$\begin{align} x \sim \mathcal{N}(\Pi \ast (g_\theta(z) \otimes \gamma), \Sigma) \quad & z \sim \mathcal{N}(0, \mathbf{I}) \\ & \gamma \sim \mathcal{N}(0, .05) \end{align}$
latent $z$ are morphological parameters
$\theta$ are global parameters of the model

$\Longrightarrow$ Decouples the morphology model from the observing conditions.
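The chain morphology, shear, PSF, noise can be sketched as a small simulator. This is only an illustration under stated assumptions: a circular Gaussian profile stands in for $g_\theta(z)$, shear is applied through the linear lensing coordinate map with zero convergence, and the PSF convolution is a circular FFT convolution. All function names are hypothetical.

```python
import numpy as np

def grid(n):
    c = np.arange(n) - (n - 1) / 2
    return np.meshgrid(c, c)  # x, y

def gaussian_profile(x, y, flux, size):
    # Stand-in for g_theta(z): circular Gaussian light profile
    return flux * np.exp(-0.5 * (x**2 + y**2) / size**2) / (2 * np.pi * size**2)

def forward_model(z, gamma, psf, noise_sigma, rng):
    n = psf.shape[0]
    x, y = grid(n)
    g1, g2 = gamma
    # Shear: evaluate the unsheared profile at the lensing-mapped coordinates
    xs = (1 - g1) * x - g2 * y
    ys = -g2 * x + (1 + g1) * y
    sheared = gaussian_profile(xs, ys, z["flux"], z["size"])
    # PSF convolution (circular, via FFT); the PSF is centered on the stamp
    psf_ft = np.fft.fft2(np.fft.ifftshift(psf))
    convolved = np.fft.ifft2(np.fft.fft2(sheared) * psf_ft).real
    # Additive Gaussian pixel noise
    return convolved + rng.normal(0.0, noise_sigma, size=convolved.shape)

n = 33
x, y = grid(n)
psf = gaussian_profile(x, y, 1.0, 1.5)
psf /= psf.sum()
rng = np.random.default_rng(0)
z = {"flux": 100.0, "size": 3.0}
clean = forward_model(z, (0.05, 0.0), psf, 0.0, rng)
noisy = forward_model(z, (0.05, 0.0), psf, 0.1, rng)
```

Swapping `psf` or `noise_sigma` changes the observing conditions without touching the morphology model, which is the decoupling the slide points to.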

Bayesian modeling of cosmic shear

We aim to model the posterior distribution $p(\gamma|\mathcal{D})$

$\begin{align} p(\gamma|\mathcal{D}) &= \int p(\gamma, z, \Pi|\mathcal{D}) ~dz~d\Pi \\ \end{align}$

$\begin{align} ~~~~~~~~~~~&= \int \color{#B1CB11}{\underbrace{p(\mathcal{D}|\gamma, z, \Pi)}_{\text{likelihood}}} \underbrace{p(\gamma)p(z)p(\Pi)}_{\text{priors}} ~dz~d\Pi \end{align}$

The likelihood $\color{#B1CB11}{p(\mathcal{D}|\gamma, z, \Pi)}$ is naturally built from the forward model
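As a toy illustration of the marginalization, the sketch below evaluates a Gaussian likelihood on a grid and sums out a nuisance latent. The linear forward model (offset `z`, slope `gamma`), the prior on `z`, and the reading of $0.05$ as a standard deviation are assumptions made for the example; the PSF $\Pi$ is treated as fixed rather than marginalized.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 50)
gamma_true, z_true, noise_sigma = 0.03, 1.0, 0.1

# Hypothetical forward model: data = z + gamma * t + Gaussian noise
data = z_true + gamma_true * t + rng.normal(0.0, noise_sigma, t.size)

def log_normal(x, mean, sigma):
    return -0.5 * ((x - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

gamma_grid = np.linspace(-0.2, 0.2, 201)
z_grid = np.linspace(0.5, 1.5, 201)
G, Z = np.meshgrid(gamma_grid, z_grid, indexing="ij")

# log p(D | gamma, z): the likelihood follows directly from the forward model
model = Z[..., None] + G[..., None] * t
log_like = log_normal(data, model, noise_sigma).sum(axis=-1)

# Add log-priors (assumed): gamma ~ N(0, 0.05), z ~ N(1, 0.5)
log_joint = log_like + log_normal(G, 0.0, 0.05) + log_normal(Z, 1.0, 0.5)

# Marginalize over z by summing along the z axis, then normalize
joint = np.exp(log_joint - log_joint.max())
marginal = joint.sum(axis=1)
marginal /= marginal.sum()
gamma_map = gamma_grid[np.argmax(marginal)]
print(gamma_map)
```

Grid evaluation only works for a handful of latent dimensions; for image-sized latents the integral has to be handled by sampling or variational methods.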

Joint inference using a parametric model for the morphology

Let's assume that $g(z)$ is a Sérsic model, i.e. $z = \{n, r_\text{hlr}, F, e_1, e_2, s_x, s_y\}$ and $$g(z) = F \times I_0 \exp \left( -b_n \left[\left( \frac{r}{r_\text{hlr}}\right)^{\frac{1}{n}} -1\right] \right)$$
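A minimal sketch of the radial profile above. It assumes the common approximation $b_n \approx 2n - 1/3$, which is reasonable for $n \gtrsim 0.5$, and leaves out the ellipticity and centroid shift parameters; the function name is hypothetical.

```python
import numpy as np

def sersic_profile(r, n, r_hlr, flux=1.0, i0=1.0):
    """Circular Sersic profile evaluated at radius r."""
    b_n = 2.0 * n - 1.0 / 3.0  # assumed approximation to b_n
    return flux * i0 * np.exp(-b_n * ((r / r_hlr) ** (1.0 / n) - 1.0))

r = np.linspace(0.1, 10.0, 100)
exponential_disk = sersic_profile(r, n=1.0, r_hlr=2.0)
de_vaucouleurs = sersic_profile(r, n=4.0, r_hlr=2.0)
```

With this parameterization the profile equals $F \times I_0$ at $r = r_\text{hlr}$ for any index $n$.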

The joint inference of $p(z, \gamma | \mathcal{D})$ leads to a biased posterior...

Marginal shear posterior $p(\gamma|\mathcal{D})$

Maximum a posteriori fit and residuals

...due to model misspecification $\longrightarrow$ Let's learn a more flexible $g_\theta$

Key concept: generative modeling

The goal of generative modeling is to learn an implicit distribution $\mathbb{P}$ from which the training set $X = \{x_0, x_1, \ldots, x_n \}$ is drawn.

Usually, this means building a parametric model $\mathbb{P}_\theta$ that tries to be close to $\mathbb{P}$.

True $\mathbb{P}$

Samples $x_i \sim \mathbb{P}$

Model $\mathbb{P}_\theta$

Once trained, you can typically sample from $\mathbb{P}_\theta$ and/or evaluate the likelihood $p_\theta(x)$.
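The simplest possible instance of this idea is a one-dimensional Gaussian $\mathbb{P}_\theta$ fitted by maximum likelihood. The "true" distribution and all names below are assumptions chosen for illustration; real generative models replace the Gaussian with a neural network.

```python
import numpy as np

rng = np.random.default_rng(2)

# Training set drawn from a stand-in "true" distribution P
X = rng.normal(loc=1.5, scale=0.7, size=5000)

# Fit P_theta = N(mu, sigma^2) by maximum likelihood
mu_hat = X.mean()
sigma_hat = X.std()  # ddof=0 is the maximum likelihood estimate

def log_p_theta(x):
    """Evaluate the model log-likelihood log p_theta(x)."""
    return -0.5 * ((x - mu_hat) / sigma_hat) ** 2 - np.log(sigma_hat * np.sqrt(2 * np.pi))

# Once fitted, the model supports both sampling and likelihood evaluation
new_samples = rng.normal(mu_hat, sigma_hat, size=1000)
print(mu_hat, sigma_hat, log_p_theta(1.5))
```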

Learning from corrupted data

Lanusse et al. 2020

Inference network $q_\phi(z|x)$: $x \longrightarrow z \sim q_\phi(z|x)$

Generator network $p_\theta(x|z)$: $z \longrightarrow g_\theta(z) \longrightarrow g_\theta(z) \ast \Pi \longrightarrow x' \sim p_\theta(x|z, \Pi, \Sigma)$

Optimized by maximizing the ELBO

$\log p(x) \geq \mathbb{E}_{z\sim q_\phi(z|x)} \left[ \log p_\theta(x|z, \Pi, \Sigma) \right] - \mathbb{D}_\text{KL}\left(q_\phi(z|x) \,\|\, p(z)\right)$
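A Monte Carlo estimate of the ELBO, expected log-likelihood minus the KL term, can be sketched as below. It assumes a diagonal Gaussian $q_\phi$, a standard normal prior on $z$, and Gaussian pixel noise; `decode` is a hypothetical stand-in for the full physical generator (for example $\Pi \ast g_\theta(z)$), here a simple linear map.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) in closed form."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)

def gaussian_log_likelihood(x, mean, noise_sigma):
    resid = (x - mean) / noise_sigma
    return -0.5 * np.sum(resid**2) - x.size * np.log(noise_sigma * np.sqrt(2 * np.pi))

def elbo_estimate(x, mu, log_var, decode, noise_sigma, rng, n_samples=256):
    # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I)
    eps = rng.normal(size=(n_samples, mu.size))
    z = mu + np.exp(0.5 * log_var) * eps
    log_like = np.array([gaussian_log_likelihood(x, decode(zk), noise_sigma) for zk in z])
    return log_like.mean() - kl_to_standard_normal(mu, log_var)

# Toy check: x = a * z + noise, where log p(x) is known analytically
a, noise_sigma = 1.0, 0.5
x_obs = np.array([1.0])
decode = lambda zk: a * zk  # hypothetical linear generator
rng = np.random.default_rng(3)
elbo = elbo_estimate(x_obs, np.zeros(1), np.zeros(1), decode, noise_sigma, rng, n_samples=4000)
evidence_var = a**2 + noise_sigma**2
log_evidence = -0.5 * x_obs[0] ** 2 / evidence_var - 0.5 * np.log(2 * np.pi * evidence_var)
print(elbo, log_evidence)
```

In training, the same quantity would be computed with an automatic differentiation framework so that gradients flow to both $\phi$ and $\theta$.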

A generative model for galaxy morphologies

The Bayesian view of the problem: $$ p(z | x ) \propto p_\theta(x | z, \Sigma, \mathbf{\Pi}) p(z)$$ where:

$p( z | x )$ is the posterior
$p( x | z )$ is the data likelihood, contains the physics
$p( z )$ is the prior

Data
$x_n$

Truth
$x_0$

Posterior samples
$g_\theta(z)$

$\mathbf{P} (\Pi \ast g_\theta(z))$

Median

Data residuals
$x_n - \mathbf{P} (\Pi \ast g_\theta(z))$

Standard Deviation

$\Longrightarrow$ Uncertainties are fully captured by the posterior.

Joint inference using a generative model for the morphology

Remy, Lanusse, Starck (2022)

Let's use the learned $g_\theta(z)$

The joint inference of $p(z, \gamma | \mathcal{D})$ leads to an unbiased posterior!

Marginal shear posterior $p(\gamma|\mathcal{D})$

Maximum a posteriori fit and residuals

Takeaway message

Ellipticity is not a well-defined quantity for arbitrary galaxies $\rightarrow$ bias in shear estimation

Forward modeling allows us to decouple morphology from observing conditions
- Deep generative models can be used to provide a flexible light profile model
- Explicit likelihood, uses all of our physical knowledge
  $ + $ Our method can be applied to varying PSFs, noise levels, or even different instruments!

$\Longrightarrow$ Joint inference of morphology and shear leads to an unbiased marginal shear posterior

Opening remarks

Latent Variable Models are no longer SOTA $\longrightarrow$ diffusion models
Training a score-based / diffusion model from corrupted data remains a challenge!

Thank you!
