ABSTRACT

Non-Gaussian statistics of the projected weak lensing field are powerful estimators that can outperform the constraining power of the two-point functions in inferring cosmological parameters. This is because these estimators extract the non-Gaussian information contained in the small scales. However, fully leveraging the statistical precision of such estimators is hampered by theoretical uncertainties, such as those arising from baryonic physics. Moreover, as non-Gaussian estimators mix different scales, there exists no natural cut-off scale below which baryonic feedback can be completely removed. We therefore present a Bayesian solution for accounting for baryonic feedback uncertainty in weak lensing non-Gaussianity inference. Our solution implements Bayesian model averaging (BMA), a statistical framework that accounts for model uncertainty and combines the strengths of different models to produce more robust and reliable parameter inferences. We demonstrate the effectiveness of this approach in a Stage IV convergence peak count analysis, including three baryonic feedback models. We find that the resulting BMA posterior distribution safeguards parameter inference against biases due to baryonic feedback, and therefore provides a robust framework for obtaining accurate cosmological constraints at Stage IV precision under model uncertainty scenarios.

1 INTRODUCTION

The formation of cosmic structures is determined by gravity and the expansion history of the Universe. In the late Universe, structure growth has evolved into the non-linear regime, resulting in matter being distributed as a non-Gaussian random field. Weak gravitational lensing is particularly affected by such non-linearities, as its effects are driven by the total matter distribution. Consequently, estimators that capture the non-Gaussian features in the lensing field are valuable tools for extracting additional cosmological information contained in the small (non-linear) scales.

In recent years, Stage III weak lensing surveys such as the Kilo Degree Survey1 (KiDS; Kuijken et al. 2015; Asgari et al. 2021), the Dark Energy Survey2 (Abbott et al. 2016; Troxel et al. 2018; Amon et al. 2022; Secco et al. 2022), and the Hyper Suprime Cam3 (HSC; Aihara et al. 2018; Mandelbaum et al. 2018) have implemented inference from non-Gaussian estimators. These analyses show that such estimators can tighten cosmological constraints compared to inference of the power spectrum alone (e.g. Martinet et al. 2018, 2021; Shan et al. 2018; Gatti et al. 2020; Zürcher et al. 2022; Liu et al. 2023; Thiele et al. 2023; Cheng et al. 2024; Grandón et al. 2024; Harnois-Déraps et al. 2024; Marques et al. 2024).Among the most studied non-Gaussian statistics, we find Minkowski functionals (Kratochvil et al. 2012; Marques et al. 2019; Parroni et al. 2020; Grewal et al. 2022), peak counts (Liu et al. 2015; Kacprzak et al. 2016; Li et al. 2019; Ajani et al. 2020; Harnois-Déraps et al. 2021; Ayçoberry et al. 2023; Davies et al. 2024), minimum counts (Coulton et al. 2020; Marques et al. 2024), the one-point probability density function (Liu & Madhavacheril 2019; Barthelemy et al. 2020, 2024; Thiele, Hill & Smith 2020; Boyle et al. 2021; Giblin, Cai & Harnois-Déraps 2023; Castiblanco et al. 2024), scattering transform coefficients (Cheng et al. 2020; Cheng & Ménard 2021; Valogiannis & Dvorkin 2022), and starlet |$\ell _1$| norm (Ajani, Starck & Pettorino 2021; Ajani et al. 2023). Some of the non-Gaussian estimators have also been studied at Stage IV precision, for mock data of Euclid (Laureijs et al. 2011) and Vera Rubin Observatory Legacy Survey of Space and Time (Ivezić et al. 2019). These cosmological forecasts predict that a joint analysis of non-Gaussian statistics and two-point functions can improve the constraints on cosmological parameters by a factor of 2–3 compared to using the two-point function alone (Euclid Collaboration 2023). However, a full Bayesian parameter inference analysis is needed to study cosmological constraints from noisy real data and the parameter biases that can arise due to unmodelled systematic effects.

In this context, baryonic feedback is one of the most important astrophysical systematics in weak lensing analysis. It describes how cosmic matter fields are subject not only to gravitational collapse but also to matter redistribution by stellar and galactic processes from intermediate to small scales. These processes include supernova feedback, star formation, gas cooling, and active galactic nucleus (AGN) feedback, among others. Given the complexity in the modelling of baryonic physics, many hydrodynamic simulation suites are required. They differ in many aspects, including specific calibration strategies of sub-grid parameters, box size, and resolution4 (Schaye et al. 2010; van Daalen et al. 2011; Dubois et al. 2014; Vogelsberger et al. 2014; Khandai et al. 2015; Hellwing et al. 2016; McCarthy et al. 2017, 2018, 2023; Peirani et al. 2017; Springel et al. 2018).

Results from Chisari et al. (2019) and van Daalen, McCarthy & Schaye (2020) reveal a significant discrepancy in the amplitude of the effects of baryonic feedback across hydrodynamic simulations (and the scales at which these effects become important). This is because baryonic feedback encapsulates a list of complicated processes and to date no consensus on a single outstanding model has been reached. Accounting for a multitude of baryonic feedback models in a non-Gaussianity analysis is hence paramount, if the inferred primary cosmological parameters are to be unbiased. In Grandón et al. (2024), we show the impact of baryons on several non-Gaussian estimators based on Subaru HSC Y1 mock data. We demonstrate that unmodelled baryonic physics leads to |${\sim} 1\sigma$| biases on the structure growth parameter |$S_8$| when including the smallest scales. However, Semboloni, Hoekstra & Schaye (2013), Coulton et al. (2020), and Martinet et al. (2021) show that this effect is forecasted to be severe at Stage IV precision.

In this paper, our goal is to establish a Bayesian framework that safeguards the inference against confusion between baryonic feedback models. We aim to preserve accuracy in our cosmological constraints while retaining most of the cosmological information contained in our non-Gaussian estimator, fully leveraging Stage IV statistical power. In this paper, we focus on the peak counts of the convergence field. We account for the uncertainty in the baryonic feedback modelling by implementing Bayesian model averaging (BMA), a statistical framework that produces robust predictions for model parameters by combining each model’s posterior distributions weighted by its relative probabilities of having generated the data.

Our paper is structured as follows. In Section 2, we present our data analysis set-up, including the mock weak lensing maps based on N-body simulations and the hydrodynamic simulations. We also present our strategy to account for the influence of baryons on non-Gaussian statistics. In Section 3, we introduce the formulation of BMA and how it can be applied to baryonic feedback models. In Section 4, we present our likelihood and the analysis set-up at Stage IV precision. We finally show our results in Section 5 followed by the conclusions in Section 6.

2 DATA ANALYSIS SET-UP

In this paper, the aim is to infer primary cosmological parameters |$\boldsymbol{\theta }$| from observed maps of weak gravitational lensing. The weak lensing convergence fields |$\kappa (\vartheta , \varphi)$| are mapped as a function of the celestial coordinates |$\vartheta ,\varphi$|⁠. They can be gained from observations of sheared galaxies with algorithms such as Almanac (Loureiro et al. 2022; Sellentin et al. 2023), or competing mass-mapping algorithms (Kaiser & Squires 1993; Bartelmann 1995; Squires & Kaiser 1996; Pires et al. 2020; Fiedorowicz et al. 2022; Boruah, Fiedorowicz & Rozo 2024).

Weak lensing convergence maps |$\kappa (\vartheta , \varphi)$| are non-Gaussian random fields. Therefore, estimators beyond the power spectrum are needed to maximize the information extracted from the non-Gaussianity contained in such maps. Technically, there are infinitely many non-Gaussianity estimators. In this paper, we study the peak counts of lensing convergence |$\kappa$| maps (Jain & Van Waerbeke 2000; van Waerbeke 2000). As shown in Yang et al. (2011) and Liu & Haiman (2016), the peak heights in convergence maps are direct tracers of massive haloes and projection of smaller haloes along the line of sight, which are sensitive to cosmological parameters. To measure the peaks, we count the number of pixels in convergence maps whose values are larger than the eight neighbouring pixels. This results in the distribution of local maxima in a convergence map, as a function of |$\kappa$|⁠.

Our simulated convergence maps are based on the Scinet LIghtCone Simulations SLICS (Harnois-Déraps et al. 2018) and cosmo-SLICS N-body simulations (Harnois-Déraps, Giblin & Joachimi 2019). These 100 deg|$^{2}$| maps mimic Stage IV properties, such as shape noise and source redshift distribution. More details on the simulations and map production can be found in Grandón et al. (2024). We consider five tomographic redshift bins in the ranges |$0.25\lt z_1\lt 0.75$|⁠, |$0.75\lt z_2\lt 1.25$|⁠, |$1.25\lt z_3\lt 1.75$|⁠, |$1.75\lt z_4\lt 2.25$|⁠, and |$2.25\lt z_5\lt 2.75$|⁠, with number densities |$n_{\text{gal}}={18.27, 14.58, 7.89, 4, \ \mathrm{ and} \ 1.89}$| arcmin|$^{-2}$|⁠, respectively. Then, we include shape noise to our maps by adding to each pixel a value drawn from a Gaussian distribution centred at 0 with variance

(1)

where |$A_{\text{pix}}$| is the solid angle per pixel, and we adopt |$\sigma _{e}=0.26$| for the mean intrinsic ellipticity. Finally, we apply a Gaussian kernel to smooth the maps with variances of 2 and 5 arcmin. Varying the smoothing scale allows us to suppress noise and exploit different features of the data. We choose 2 arcmin as this value is above the resolution of our maps while maintaining the information of the small scales. Larger smoothing scales remove small structures, thereby reducing the effect of baryons to some extent. Therefore, our analysis consists of multiple configurations, where peak counts are measured in 10|$\kappa$| bins, obtained for the five tomographic bins and smoothing scales.

2.1 Baryon correction modelling

To date, there is no analytical expression to include baryonic effects in the modelling of the peak counts. Therefore, we instead introduce its effects into the dark matter-only estimators as a correction factor obtained from hydrodynamic simulations. We build convergence maps based on the BAryons and HAloes of MAssive Systems (BAHAMAS) simulations (McCarthy et al. 2017, 2018) at the Wilkinson Microwave Anisotropy Probe 9 yr cosmology (Hinshaw et al. 2013).In particular, we include BAHAMAS runs with AGN heating temperature raised and lowered by 0.2 dex with respect to the fiducial value. We refer to these models as ‘high-AGN’ (stronger feedback), ‘fiducial-AGN’, and ‘low-AGN’ (lower feedback). We also include the dark matter-only counterpart, which we denote as ‘DMO’. For each of these models, we have 10 000 realizations obtained from 25 independent light-cones, with 400 realizations each generated through random rotations and shifts of the potential planes. To include the Stage IV properties, we follow the same methodology described for the SLICS and cosmo-SLICS convergence maps.

The correction factor B is obtained for the three baryonic feedback scenarios. The elements |$B_i$| of this factor are computed as follows:

(2)

where the angular brackets represent an average over 10 000 realizations. The factor |$\langle x_i^{\text{B}} \rangle$| denotes the average peak counts measured in hydrodynamic maps, while |$\langle x_i^{\text{DMO}} \rangle$| denotes the average peak counts from the corresponding dark matter-only maps. We opt for this approach instead of introducing estimators obtained from BAHAMAS mocks directly as any slight discrepancy between mocks cancels out to leading order when computing the ratio in equation (2).

We show the impact of baryons on the peak counts for the five tomographic bins and smoothing scales in Appendix  A. From Fig. A1, we see that the high-AGN model produces the strongest effects on the peak counts, especially for the high |$\kappa$| regime. This result is consistent with previous analyses presented in Coulton et al. (2020), Broxterman et al. (2024), Grandón et al. (2024), and Osato, Liu & Haiman (2021) based on convergence maps with different redshift bins, noise properties, and hydrodynamic simulation. Studies of baryonic feedback on the peak counts based on the baryonic correction model also show the same overall effect (Yang et al. 2013; Weiss et al. 2019). Therefore, baryons can introduce biases in the cosmological parameters if their effects are not modelled correctly. This effect is less significant when increasing the smoothing scale, due to the removal of small-scale structures where baryons become important, and potentially a loss of precision in cosmological constraints.

To introduce the effect of baryons on the likelihood mean |$\mu (\theta)$|⁠, we impose the product

(3)

where we assume that the fractional impact of baryonic feedback on the peak counts is cosmology-independent (Broxterman et al. 2024). We present more details on the correction factor and its implementation in the likelihood in Section 4.

3 BAYESIAN MODEL AVERAGING AND POSTERIOR SET-UP

The standard practice in statistical inference for cosmology consists of assuming the existence of a cosmological model that could have generated the data, and then estimating the model parameters based on the observed data. The best-fitting parameter values are thus conditioned on the chosen model. When multiple competing theoretical models exist, we can advance further and select the model that is favoured by the data according to some criteria, such as the evidence ratio in the Bayesian framework. Therefore, we draw conclusions assuming the selected model to be true. However, like many other pre-specified models, this model may still be an approximation. This raises the question of how to address the fact that we select a model from a range of competing candidate models. A way to propagate this model uncertainty is to implement BMA. The BMA address the model uncertainty by performing an average of candidate models, producing more robust predictions (see Hoeting et al. 1999; Hinne et al. 2020 for a review).5 In the average, each model posterior probability is weighted by its Bayesian evidence.

The BMA posterior corresponds to

(4)

where |$\mathcal {P}(M_k|\boldsymbol{x})$| is the Bayesian evidence of model |$M_k$| and |$\mathcal {P}(\boldsymbol{\theta }|\boldsymbol{x},M_k)$| is the posterior of each model |$M_k$|⁠. The pieces of evidence are given by

(5)

where |$\boldsymbol{\theta }_k$| are all |$r_k$| parameters that model |$M_k$| uses. As model |$M_k$| might use more parameters than the primary cosmological parameters, |$r_k$| can differ per model. In order to arrive at a model-averaged posterior for parameters |$\boldsymbol{\theta }$|⁠, the vectors |$\boldsymbol{\theta }_k$|⁠, however, include all parameters of |$\boldsymbol{\theta }$|⁠. The term |$\mathcal {L}$| in equation (5) is the likelihood, and |$\pi$| is the prior on the model parameters |$\boldsymbol{\theta }_k$| in model |$M_k$|⁠.

The evidence expresses the total probability that a model |$M_k$| has generated the data |$\boldsymbol{x}$|at all. As can be seen, the evidence integrates over all parameters of model |$M_k$| and weighs their contribution to the total evidence by the likelihood and the prior probability. If model |$M_1$| has a larger evidence than model |$M_2$| in light of the data |$\boldsymbol{x}$|⁠, then model |$M_1$| is more likely to be the model to have generated the data. By evaluating pieces of evidence, one can therefore rank models by their relative probability to have generated the data.

The complexity of baryonic physics in the large-scale structure has led to multiple approaches to model its effects. Therefore, the BMA is a natural solution to address this model uncertainty in baryonic physics for weak lensing inference. We implement the BMA in the analysis of non-Gaussianity estimators so that the data can determine which baryonic feedback model is the most likely.

Finally, we compute the Bayes factor between models. Given two competing models |$M_j$| and |$M_k$|⁠, the ratio of the pieces of evidence corresponds to (Kass & Raftery 1995; Jeffreys 1998)

(6)

with models having the same prior probability.

4 PARAMETER INFERENCE

This section describes the set-up of our posterior. We first estimate the covariance matrix from the SLICS simulations. It is computed as follows:

(7)

where |$N_r=953$| is the number of realizations per estimator |$\boldsymbol{x}$| at the fiducial cosmology and |$\bar{\boldsymbol{x}}$| is the mean for the estimators. We assume a Stage IV survey with a sky coverage of 18 000 deg|$^{2}$|⁠, and thus we rescale the covariance matrix as |${\sf {C}}= (100/18\,000){\sf {C}}$|⁠. For estimated covariance matrices, we adopt the likelihood function presented in Sellentin & Heavens (2016). This corresponds to the modified t-distribution

(8)

where |$\mu _B$| corresponds to the mean including the effect of baryons in equation (3). To obtain |$\mu (\boldsymbol{\theta })$|⁠, we model the peak counts for arbitrary cosmologies by training a Gaussian process emulator, implemented in scikit-learn6 (Pedregosa et al. 2011). Our training set consists of the 26 cosmo-SLICS cosmologies in the parameter space of |$\Omega _m$|⁠, w, and |$S_8$|⁠, for which we calculate the peak counts. We implement a radial basis function kernel and check its accuracy using a leave-one-out cross-validation test. The emulator errors are below |$1\sigma$| uncertainty of the survey; however, the training of emulators is still an open challenge for Stage IV precision. We further describe the emulator challenges for various non-Gaussian statistics and the effects of the error propagation (e.g. Grandón & Sellentin 2022; Harnois-Déraps et al. 2024) in Grandón et al. (in preparation). Through this paper, we implement flat prior probability for the parameters given by |$-1.80 \lt w \lt -0.70$|⁠, |$0.63 \lt S_8 \lt 0.89$|⁠, and|$0.1 \lt \Omega _m \lt 0.55$|⁠.

Our estimators are obtained from gravity-only simulations. Hence, to include baryonic effects in the theory |$\boldsymbol{\mu }_B$|⁠, we infuse the effect of baryons into the dark matter-only mean, as presented in equation (2).

4.1 Data vector set-up

To emulate a real science case at the Stage IV precision, we infuse the baryons on the data vector as well. First, we assume that this data vector is drawn from a Gaussian |$\mathcal {G}$| as

(9)

where |${\sf {C}}$| is the data covariance matrix and the data’s expectation value equals |$\langle \boldsymbol{x}\rangle = \boldsymbol{\mu }(\boldsymbol{\theta }_t)$|⁠, i.e. a parametric mean evaluated at the position of the true cosmological parameters |$\boldsymbol{\theta }_t$|⁠. As real data contain baryonic feedback, we correct for the baryons into the data vectors following the same procedure as for the mean.

We explicitly write out the baryonic feedback model M from which these data arise. M can be ‘high AGN’, ‘fid AGN’, ‘low AGN’, or ‘DMO’. Our data |$\boldsymbol{x}$| and mean |$\boldsymbol{\mu }$| always contain the same non-Gaussianity estimators, and the mean is always evaluated for each of the four models. Obviously, if a data vector |$\boldsymbol{x}$| stems in reality from model |$M = \rm{'high\ AGN'}$|⁠, but is then fitted with a mean |$\boldsymbol{\mu }(\boldsymbol{\theta })$| from model |$M = \rm{'low\ AGN'}$|⁠, then the inferred parameters |$\boldsymbol{\theta }$| will be biased. This bias ensues from the incorrect baryon model being chosen. This is a significant concern of actual data analysis where the true impact of baryons on the large-scale structure is far from sufficiently understood. To include baryonic physics into the data vector, we consider two cases for |$\boldsymbol{x}$|⁠: (1) a data vector with fiducial AGN-like baryons and (2) a data vector with model misspecification. The details on how to generate such cases are detailed below.

4.1.1 Case 1: data vector with fiducial AGN

Our first case considers parameter inference with the data vector corresponding to |$\boldsymbol{x}_{\mathrm{o}}= \boldsymbol{x}_iB_i$|⁠, where |$B_i$| is the correction factor derived from the fiducial-AGN model. We repeat our parameter inference with this data vector, but with the corrected mean varying the model M.

4.1.2 Case 2: model misspecification

If one proposes multiple models for fitting the data, then it may happen that none of them is the model that generated the data. This situation is called model misspecification. We imitate this situation by generating a data vector |$\boldsymbol{x}_o$| as

(10)

with |$\lambda =0.5$|⁠. Hence, the correct mean of this data vector is

(11)

To demonstrate model misspecification, we purposefully fit the data with the four means of our original four models.7

We sample the posterior of the cosmological parameters |$\Omega _m$|⁠, |$S_8$|⁠, and w with multinest (Feroz, Hobson & Bridges 2009) as implemented by pymultinest (Buchner et al. 2014). multinest reports the Bayesian evidence alongside the parameter constraints. We run our analysis for the two data vectors’ cases considered and two smoothing scales. To study the model preferred by the data, we calculate the logarithm of the Bayes factor.

5 RESULTS

5.1 Case 1

We report our results and logarithm of the Bayes factors in Table 1 and Fig. 1. Here, the data vector stems from the fiducial-AGN model (case 1) for 2 and 5 arcmin smoothing scales. In Fig. 1, we demonstrate how to remove the bias from the cosmological inference. The four posteriors in open contours depict the posteriors of each individual model, where three of them are biased posteriors with respect to the true cosmology (in dashed black lines). The biases arise from fitting three incorrect models to these data (DMO, high AGN, and low AGN), followed by the true baryonic feedback model centred at the true cosmology. The biases in the inferred cosmological parameters are statistically significant and reach |$\sim \!\! -4\sigma$| for |$\Omega _m$| and |${\sim} 3\sigma$| for w and |$S_8$| when the modelling in the mean is incorrect. From the Bayes ratio in Table 1, we see that the data have discriminating power between these different baryonic models. The logarithm of the Bayes ratio ranges from 6 up to 12.35 when compared to the fid-AGN model. According to Jeffreys’s scale, this indicates decisive evidence in favour of this baryonic feedback model. We therefore evaluated BMA posterior from equation (4) and display it in filled purple contours. The model-averaged (the BMA) posterior is centred at the true cosmology, and thus removes biases in the cosmological constraints. In this particular case, it is also almost identical to the posterior of the correct fid-AGN baryon model. This shows that for strongly constraining data, the analysis succeeds in identifying the preferred model. The other competing models are then downweighed due to their inferior evidence, as seen in Table 1.

Cosmological constraints of $S_8$, w, and $\Omega _m$ based on the peak counts with ‘Case 1’ fid-AGN data vector. Left: Results obtained for convergence maps smoothed with a Gaussian kernel of 2 arcmin smoothing scale. Right: Results with a Gaussian kernel of 5 arcmin smoothing scale. Dashed line denotes the input true cosmology. The three baryonic feedback models and dark matter-only model are depicted with dashed contour lines. The BMA result, shown with solid contour lines, combines all baryonic feedback models and successfully recovers the true cosmological parameters. The contours show 68% and 95% credible intervals.
Figure 1.

Cosmological constraints of |$S_8$|⁠, w, and |$\Omega _m$| based on the peak counts with ‘Case 1’ fid-AGN data vector. Left: Results obtained for convergence maps smoothed with a Gaussian kernel of 2 arcmin smoothing scale. Right: Results with a Gaussian kernel of 5 arcmin smoothing scale. Dashed line denotes the input true cosmology. The three baryonic feedback models and dark matter-only model are depicted with dashed contour lines. The BMA result, shown with solid contour lines, combines all baryonic feedback models and successfully recovers the true cosmological parameters. The contours show 68% and 95% credible intervals.

Table 1.

Logarithm of Bayes factor for the baryonic feedback models considered in this work. This table shows the results obtained from the fiducial-AGN data vector (case 1).

 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|6.23|$-$|12.35|$-$|1.99
BAHAMAS low-AGN6.230|$-$|6.114.23
BAHAMAS fid-AGN12.356.11010.35
BAHAMAS high-AGN1.99|$-$|4.23|$-$|10.350
  5 arcmin  
DM0|$-$|0.71|$-$|2.91|$-$|0.93
BAHAMAS low-AGN0.710|$-$|2.21|$-$|0.22
BAHAMAS fid-AGN2.912.2101.98
BAHAMAS high-AGN0.930.22|$-$|1.980
 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|6.23|$-$|12.35|$-$|1.99
BAHAMAS low-AGN6.230|$-$|6.114.23
BAHAMAS fid-AGN12.356.11010.35
BAHAMAS high-AGN1.99|$-$|4.23|$-$|10.350
  5 arcmin  
DM0|$-$|0.71|$-$|2.91|$-$|0.93
BAHAMAS low-AGN0.710|$-$|2.21|$-$|0.22
BAHAMAS fid-AGN2.912.2101.98
BAHAMAS high-AGN0.930.22|$-$|1.980
Table 1.

Logarithm of Bayes factor for the baryonic feedback models considered in this work. This table shows the results obtained from the fiducial-AGN data vector (case 1).

 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|6.23|$-$|12.35|$-$|1.99
BAHAMAS low-AGN6.230|$-$|6.114.23
BAHAMAS fid-AGN12.356.11010.35
BAHAMAS high-AGN1.99|$-$|4.23|$-$|10.350
  5 arcmin  
DM0|$-$|0.71|$-$|2.91|$-$|0.93
BAHAMAS low-AGN0.710|$-$|2.21|$-$|0.22
BAHAMAS fid-AGN2.912.2101.98
BAHAMAS high-AGN0.930.22|$-$|1.980
 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|6.23|$-$|12.35|$-$|1.99
BAHAMAS low-AGN6.230|$-$|6.114.23
BAHAMAS fid-AGN12.356.11010.35
BAHAMAS high-AGN1.99|$-$|4.23|$-$|10.350
  5 arcmin  
DM0|$-$|0.71|$-$|2.91|$-$|0.93
BAHAMAS low-AGN0.710|$-$|2.21|$-$|0.22
BAHAMAS fid-AGN2.912.2101.98
BAHAMAS high-AGN0.930.22|$-$|1.980

From Table 1 for 5 arcmin, we see that there is also decisive evidence in favour of the fid-AGN model when it is compared to the other models. The 5 arcmin posterior results correspond to the right corner plot in Fig. 1. We observe that all baryonic feedback models are less biased compared to the 2 arcmin smoothing scale. This is because, as we increase the smoothing scale, we lose precision and we partially remove the imprints of baryons at the smallest scales of the maps. This is supported by the ratio of the peak counts with and without baryonic physics in Fig. A1, where the impact of baryons is less significant for the last three tomographic bins. This, in turn, makes the evidence of the individual feedback models closer to each other. This results in a wider BMA posterior contour (contour in filled purple), as seen in Fig. 1. We can therefore obtain accurate cosmological results, though with some loss of precision due to the propagation of baryonic feedback model uncertainty. Still, this allows us to make statistically robust claims about the inferred parameters.

5.2 Case 2

Fig. 2 shows our results for the model misspecification case, where the baryons in the data vector do not correspond to any of the baryonic feedback models. In this case, the data then have less preference for a single model, and instead the pieces of evidence of the two closest models dominate. This is indicated by the results in Table 2, both for 2 and 5 arcmin smoothing scales. The posterior gained from BMA hence combines the two closest posteriors, as seen in Fig. 2. Multimodality in the posterior is an expected feature in BMA, and we here observe it in the resulting posterior for |$\Omega _m$| for 2 arcmin result. As can be seen, although the true cosmology was excluded by the four biased posteriors, it is correctly assigned posterior credibility by the model-averaged posterior. This multimodality is not seen in the 5 arcmin smoothing scale, as all contours are less biased with respect to the true cosmology. For 2 arcmin, the marginal distributions of |$S_8$| and w for the BMA exhibit a higher concentration of probability mass in the right tail of the distribution, due to the influence of the fid-AGN model in the model average.

Same as Fig. 1 for the model misspecification data vector. In this case, none of the baryonic feedback models provides an accurate approximation of the influence on baryons as presented in the data vector. This results in the bias observed for all models. However, the BMA result successfully recovers the true cosmological parameters by combining the four posteriors.
Figure 2.

Same as Fig. 1 for the model misspecification data vector. In this case, none of the baryonic feedback models provides an accurate approximation of the influence on baryons as presented in the data vector. This results in the bias observed for all models. However, the BMA result successfully recovers the true cosmological parameters by combining the four posteriors.

Table 2.

Same as Table 1, but for the case 2 of model misspecification data vector.

 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|4.77|$-$|4.5410.90
BAHAMAS low-AGN4.7700.2315.68
BAHAMAS fid-AGN4.54|$-$|0.23015.45
BAHAMAS high-AGN|$-$|10.90|$-$|15.68|$-$|15.450
  5 arcmin  
DM0|$-$|1.56|$-$|2.690.73
BAHAMAS low-AGN1.560|$-$|1.132.29
BAHAMAS fid-AGN2.691.1303.42
BAHAMAS high-AGN|$-$|0.73|$-$|2.29|$-$|3.420
 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|4.77|$-$|4.5410.90
BAHAMAS low-AGN4.7700.2315.68
BAHAMAS fid-AGN4.54|$-$|0.23015.45
BAHAMAS high-AGN|$-$|10.90|$-$|15.68|$-$|15.450
  5 arcmin  
DM0|$-$|1.56|$-$|2.690.73
BAHAMAS low-AGN1.560|$-$|1.132.29
BAHAMAS fid-AGN2.691.1303.42
BAHAMAS high-AGN|$-$|0.73|$-$|2.29|$-$|3.420
Table 2.

Same as Table 1, but for the case 2 of model misspecification data vector.

 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|4.77|$-$|4.5410.90
BAHAMAS low-AGN4.7700.2315.68
BAHAMAS fid-AGN4.54|$-$|0.23015.45
BAHAMAS high-AGN|$-$|10.90|$-$|15.68|$-$|15.450
  5 arcmin  
DM0|$-$|1.56|$-$|2.690.73
BAHAMAS low-AGN1.560|$-$|1.132.29
BAHAMAS fid-AGN2.691.1303.42
BAHAMAS high-AGN|$-$|0.73|$-$|2.29|$-$|3.420
 DMBAHAMAS low-AGNBAHAMAS fid-AGNBAHAMAS high-AGN
  2 arcmin  
DM0|$-$|4.77|$-$|4.5410.90
BAHAMAS low-AGN4.7700.2315.68
BAHAMAS fid-AGN4.54|$-$|0.23015.45
BAHAMAS high-AGN|$-$|10.90|$-$|15.68|$-$|15.450
  5 arcmin  
DM0|$-$|1.56|$-$|2.690.73
BAHAMAS low-AGN1.560|$-$|1.132.29
BAHAMAS fid-AGN2.691.1303.42
BAHAMAS high-AGN|$-$|0.73|$-$|2.29|$-$|3.420

6 CONCLUSIONS

This paper presents a Bayesian solution to safeguard parameter inference against biases resulting from baryonic feedback not being correctly modelled in the weak lensing non-Gaussian statistics. Our solution consists of BMA, a statistical framework that proposes a posterior distribution combining the individual posteriors of multiple competing models that could have generated the observed data. Therefore, instead of comparing models by means of model selection criteria, the BMA enables more robust predictions by averaging all models, and hence propagating this model uncertainty. The BMA posterior is presented in equation (4).

In this paper, we focus on three baryonic feedback models that impact the non-Gaussian estimators and hence the inference of cosmological parameters |$\Omega _m$|⁠, w, and |$S_8$|⁠. We perform a tomographic analysis of the convergence peak counts at Stage IV precision.

Our results are shown in Figs 1 and 2. Fig. 1 corresponds to the resulting posteriors when the data vector is corrected by the fiducial-AGN model, which also corresponds to one of the theory models. Fig. 2 corresponds to our results when the data vector is none of the baryonic feedback models, and hence represents a model misspecification case. As can be seen from the Bayes factors in Tables 1 and 2, the cosmological data at Stage IV precision can distinguish between different feedback models, making them more or less a good fit. This means that the data will suppress bad models, and hence the largest evidence is for the correct model. By averaging the posteriors obtained from all baryonic feedback models, weighted by their evidence, the resulting BMA posterior correctly finds the true cosmology within the 68 per cent credible interval. This demonstrates that fitting a multitude of baryon models to the data, and having the data select the best models, is a solid technique to accomplish accuracy in the inference from non-Gaussianity estimators.

ACKNOWLEDGEMENTS

We thank Ian McCarthy and the BAHAMAS simulation team for making their simulations publicly available. We would like to thank Andrew Jaffe, Alan Heavens, Joachim Harnois-Déraps, Jia liu, Javier Silva-Lafaurie, Kutay Nazli, and Tatiana M. Rodriguez for useful discussions.

DATA AVAILABILITY

The data underlying this article will be shared on reasonable request to the corresponding author.

Footnotes

4

Hydrodynamic simulations also differ in initial conditions, hydrodynamic solvers, and number of sub-grid parameters.

5

For previous applications of BMA in the study of cosmological models, we refer the reader to Liddle et al. (2006), Parkinson & Liddle (2010), Vardanyan, Trotta & Silk (2011), Paradiso, McGee & Percival (2024a), and Paradiso et al. (2024b).

7

We refer the reader to Porqueres et al. (2023) for model misspecification in intrinsic alignment studies.

REFERENCES

Abbott
T.
et al. ,
2016
,
MNRAS
,
460
,
1270

Aihara
H.
et al. ,
2018
,
PASJ
,
70
,
S4

Ajani
V.
,
Peel
A.
,
Pettorino
V.
,
Starck
J.-L.
,
Li
Z.
,
Liu
J.
,
2020
,
Phys. Rev. D
,
102
,
103531

Ajani
V.
,
Starck
J.-L.
,
Pettorino
V.
,
2021
,
A&A
,
645
,
L11

Ajani
V.
,
Harnois-Déraps
J.
,
Pettorino
V.
,
Starck
J.-L.
,
2023
,
A&A
,
672
,
L10

Amon
A.
et al. ,
2022
,
Phys. Rev. D
,
105
,
023514

Asgari
M.
et al. ,
2021
,
A&A
,
645
,
A104

Ayçoberry
E.
et al. ,
2023
,
A&A
,
671
,
A17

Bartelmann
M.
,
1995
,
A&A
,
303
,
643

Barthelemy
A.
,
Codis
S.
,
Uhlemann
C.
,
Bernardeau
F.
,
Gavazzi
R.
,
2020
,
MNRAS
,
492
,
3420

Barthelemy
A.
,
Halder
A.
,
Gong
Z.
,
Uhlemann
C.
,
2024
,
J. Cosmol. Astropart. Phys.
,
2024
,
060

Boruah
S. S.
,
Fiedorowicz
P.
,
Rozo
E.
,
2024
,
Phys. Rev. D
,
110
,
023524

Boyle
A.
,
Uhlemann
C.
,
Friedrich
O.
,
Barthelemy
A.
,
Codis
S.
,
Bernardeau
F.
,
Giocoli
C.
,
Baldi
M.
,
2021
,
MNRAS
,
505
,
2886

Broxterman
J. C.
et al. ,
2024
,
MNRAS
,
529
,
2309

Buchner
J.
et al. ,
2014
,
A&A
,
564
,
A125

Castiblanco
L.
,
Uhlemann
C.
,
Harnois-Déraps
J.
,
Barthelemy
A.
,
2024
,
Open J. Astrophys
,
7
,
59

Cheng
S.
,
Ménard
B.
,
2021
,
MNRAS
,
507
,
1012

Cheng
S.
,
Ting
Y.-S.
,
Ménard
B.
,
Bruna
J.
,
2020
,
MNRAS
,
499
,
5902

Cheng
S.
,
Marques
G. A.
,
Grandón
D.
,
Thiele
L.
,
Shirasaki
M.
,
Ménard
B.
,
Liu
J.
,
2024
,
preprint
()

Chisari
N. E.
et al. ,
2019
,
Open J. Astrophys.
,
2
,
4

Coulton
W. R.
,
Liu
J.
,
McCarthy
I. G.
,
Osato
K.
,
2020
,
MNRAS
,
495
,
2531

Davies
C. T.
,
Harnois-Déraps
J.
,
Li
B.
,
Giblin
B.
,
Hernández-Aguayo
C.
,
Paillas
E.
,
2024
,
MNRAS
,
533
,
3546

Dubois
Y.
et al. ,
2014
,
MNRAS
,
444
,
1453

Euclid Collaboration
,
2023
,
A&A
,
675
,
A120

Feroz
F.
,
Hobson
M. P.
,
Bridges
M.
,
2009
,
MNRAS
,
398
,
1601

Fiedorowicz
P.
,
Rozo
E.
,
Boruah
S. S.
,
Chang
C.
,
Gatti
M.
,
2022
,
MNRAS
,
512
,
73

Gatti
M.
et al. ,
2020
,
MNRAS
,
498
,
4060

Giblin
B.
,
Cai
Y.-C.
,
Harnois-Déraps
J.
,
2023
,
MNRAS
,
520
,
1721

Grandón
D.
,
Sellentin
E.
,
2022
,
Open J. Astrophys
,
5
,
12

Grandón
D.
,
Marques
G. A.
,
Thiele
L.
,
Cheng
S.
,
Shirasaki
M.
,
Liu
J.
,
2024
,
Phys. Rev. D
,
110
,
103539

Grewal
N.
,
Zuntz
J.
,
Tröster
T.
,
Amon
A.
,
2022
,
Open J. Astrophys
,
5
,
13

Harnois-Déraps
J.
et al. ,
2018
,
MNRAS
,
481
,
1337

Harnois-Déraps
J.
,
Giblin
B.
,
Joachimi
B.
,
2019
,
A&A
,
631
,
A160

Harnois-Déraps
J.
,
Martinet
N.
,
Castro
T.
,
Dolag
K.
,
Giblin
B.
,
Heymans
C.
,
Hildebrandt
H.
,
Xia
Q.
,
2021
,
MNRAS
,
506
,
1623

Harnois-Déraps
J.
et al. ,
2024
,
MNRAS
,
534
,
3305

Hellwing
W. A.
,
Schaller
M.
,
Frenk
C. S.
,
Theuns
T.
,
Schaye
J.
,
Bower
R. G.
,
Crain
R. A.
,
2016
,
MNRAS
,
461
,
L11

Hinne
M.
,
Gronau
Q.
,
van den Bergh
D.
,
Wagenmakers
E.-J.
,
2020
,
Adv. Methods Pract. Psychol. Sci.
,
3
,
251524591989865

Hinshaw
G.
et al. ,
2013
,
ApJS
,
208
,
19

Hoeting
J. A.
,
Madigan
D.
,
Raftery
A. E.
,
Volinsky
C. T.
,
1999
,
Stat. Sci.
,
14
,
382

Ivezić
Ž.
et al. ,
2019
,
ApJ
,
873
,
111

Jain
B.
,
Van Waerbeke
L. V.
,
2000
,
ApJ
,
530
,
L1

Jeffreys
H.
,
1998
,
Theory of Probability, International Series of Monographs on Physics
.
Clarendon Press
,
University of California
, 0198531931, 9780198531937,

Kacprzak
T.
et al. ,
2016
,
MNRAS
,
463
,
3653

Kaiser
N.
,
Squires
G.
,
1993
,
ApJ
,
404
,
441

Kass
R. E.
,
Raftery
A. E.
,
1995
,
J. Am. Stat. Assoc.
,
90
,
773

Khandai
N.
,
Di Matteo
T.
,
Croft
R.
,
Wilkins
S.
,
Feng
Y.
,
Tucker
E.
,
DeGraf
C.
,
Liu
M.-S.
,
2015
,
MNRAS
,
450
,
1349

Kratochvil
J. M.
,
Lim
E. A.
,
Wang
S.
,
Haiman
Z.
,
May
M.
,
Huffenberger
K.
,
2012
,
Phys. Rev. D
,
85
,
103513

Kuijken
K.
et al. ,
2015
,
MNRAS
,
454
,
3500

Laureijs
R.
et al. ,
2011
,
preprint
()

Li
Z.
,
Liu
J.
,
Matilla
J. M. Z.
,
Coulton
W. R.
,
2019
,
Phys. Rev. D
,
99
,
063527

Liddle
A. R.
,
Mukherjee
P.
,
Parkinson
D.
,
Wang
Y.
,
2006
,
Phys. Rev. D
,
74
,
123506

Liu
J.
,
Haiman
Z.
,
2016
,
Phys. Rev. D
,
94
,
043533

Liu
J.
,
Madhavacheril
M. S.
,
2019
,
Phys. Rev. D
,
99
,
083508

Liu
J.
,
Petri
A.
,
Haiman
Z.
,
Hui
L.
,
Kratochvil
J. M.
,
May
M.
,
2015
,
Phys. Rev. D
,
91
,
063507

Liu
X.
,
Yuan
S.
,
Pan
C.
,
Zhang
T.
,
Wang
Q.
,
Fan
Z.
,
2023
,
MNRAS
,
519
,
594

Loureiro
A.
,
Whiteway
L.
,
Sellentin
E.
,
Lafaurie
J. S.
,
Jaffe
A. H.
,
Heavens
A. F.
,
2022
,
Open J. Astrophys.
,
6
,
2023

McCarthy
I. G.
,
Schaye
J.
,
Bird
S.
,
Le Brun
A. M. C.
,
2017
,
MNRAS
,
465
,
2936

McCarthy
I. G.
,
Bird
S.
,
Schaye
J.
,
Harnois-Déraps
J.
,
Font
A. S.
,
Van Waerbeke
L.
,
2018
,
MNRAS
,
476
,
2999

McCarthy
I. G.
et al. ,
2023
,
MNRAS
,
526
,
5494

Mandelbaum
R.
et al. ,
2018
,
PASJ
,
70
,
S25

Marques
G. A.
,
Liu
J.
,
Matilla
J. M. Z.
,
Haiman
Z.
,
Bernui
A.
,
Novaes
C. P.
,
2019
,
J. Cosmol. Astropart. Phys.
,
2019
,
019

Marques
G. A.
et al. ,
2024
,
MNRAS
,
528
,
4513

Martinet
N.
et al. ,
2018
,
MNRAS
,
474
,
712

Martinet
N.
,
Castro
T.
,
Harnois-Déraps
J.
,
Jullo
E.
,
Giocoli
C.
,
Dolag
K.
,
2021
,
A&A
,
648
,
A115

Osato
K.
,
Liu
J.
,
Haiman
Z.
,
2021
,
MNRAS
,
502
,
5593

Paradiso
S.
,
McGee
G.
,
Percival
W. J.
,
2024a
,
J. Cosmol. Astropart. Phys.
,
2024
,
021

Paradiso
S.
,
DiMarco
M.
,
Chen
M.
,
McGee
G.
,
Percival
W. J.
,
2024b
,
MNRAS
,
528
,
1531

Parkinson
D.
,
Liddle
A. R.
,
2010
,
Phys. Rev. D
,
82
,
103533

Parroni
C.
,
Cardone
V. F.
,
Maoli
R.
,
Scaramella
R.
,
2020
,
A&A
,
633
,
A71

Pedregosa
F.
et al. ,
2011
,
J. Mach. Learn. Res.
,
12
,
2825

Peirani
S.
et al. ,
2017
,
MNRAS
,
472
,
2153

Pires
S.
et al. ,
2020
,
A&A
,
638
,
A141

Porqueres
N.
,
Heavens
A.
,
Mortlock
D.
,
Lavaux
G.
,
Makinen
T. L.
,
2023
,
preprint
()

Schaye
J.
et al. ,
2010
,
MNRAS
,
402
,
1536

Secco
L. F.
et al. ,
2022
,
Phys. Rev. D
,
105
,
023515

Sellentin
E.
,
Heavens
A. F.
,
2016
,
MNRAS
,
456
,
L132

Sellentin
E.
,
Loureiro
A.
,
Whiteway
L.
,
Lafaurie
J. S.
,
Balan
S. T.
,
Olamaie
M.
,
Jaffe
A. H.
,
Heavens
A. F.
,
2023
,
Open J. Astrophys
,
6
,
31

Semboloni
E.
,
Hoekstra
H.
,
Schaye
J.
,
2013
,
MNRAS
,
434
,
148

Shan
H.
et al. ,
2018
,
MNRAS
,
474
,
1116

Springel
V.
et al. ,
2018
,
MNRAS
,
475
,
676

Squires
G.
,
Kaiser
N.
,
1996
,
ApJ
,
473
,
65

Thiele
L.
,
Hill
J. C.
,
Smith
K. M.
,
2020
,
Phys. Rev. D
,
102
,
123545

Thiele
L.
,
Marques
G. A.
,
Liu
J.
,
Shirasaki
M.
,
2023
,
Phys. Rev. D
,
108
,
123526

Troxel
M. A.
et al. ,
2018
,
Phys. Rev. D
,
98
,
043528

Valogiannis
G.
,
Dvorkin
C.
,
2022
,
Phys. Rev. D
,
105
,
103534

van Daalen
M. P.
,
Schaye
J.
,
Booth
C. M.
,
Dalla Vecchia
C.
,
2011
,
MNRAS
,
415
,
3649

van Daalen
M. P.
,
McCarthy
I. G.
,
Schaye
J.
,
2020
,
MNRAS
,
491
,
2424

van Waerbeke
L.
,
2000
,
MNRAS
,
313
,
524

Vardanyan
M.
,
Trotta
R.
,
Silk
J.
,
2011
,
MNRAS
,
413
,
L91

Vogelsberger
M.
et al. ,
2014
,
Nature
,
509
,
177

Weiss
A. J.
,
Schneider
A.
,
Sgier
R.
,
Kacprzak
T.
,
Amara
A.
,
Refregier
A.
,
2019
,
J. Cosmol. Astropart. Phys.
,
2019
,
011

Yang
X.
,
Kratochvil
J. M.
,
Wang
S.
,
Lim
E. A.
,
Haiman
Z.
,
May
M.
,
2011
,
Phys. Rev. D
,
84
,
043529

Yang
X.
,
Kratochvil
J. M.
,
Huffenberger
K.
,
Haiman
Z.
,
May
M.
,
2013
,
Phys. Rev. D
,
87
,
023511

Zürcher
D.
et al. ,
2022
,
MNRAS
,
511
,
2075

APPENDIX A

The impact of baryons on the peak counts is presented in Fig. A1. The grey shaded region corresponds to Stage IV 1|$\sigma$| uncertainty, obtained from the diagonal of the data covariance matrix in equation (7). We find that baryonic feedback reduces the number of peaks up to 10 per cent for large |$\kappa$|⁠, and two out of three of our baryonic feedback models produce effects that exceed the error budget of the survey. Therefore, neglecting the effects of baryons can lead to statistically significant biases in cosmological constraints, as confirmed by our results presented in Figs 1 and 2.

The impact of baryonic feedback on the peak counts for smoothing scales of 2 arcmin (top) and 5 arcmin (bottom). The panels from left to right show the results for the tomographic bins. The grey shaded region indicates the survey $1\sigma$ uncertainty.
Figure A1.

The impact of baryonic feedback on the peak counts for smoothing scales of 2 arcmin (top) and 5 arcmin (bottom). The panels from left to right show the results for the tomographic bins. The grey shaded region indicates the survey |$1\sigma$| uncertainty.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.