Piotr Oleśkiewicz, Carlton M Baugh, Sensitivity analysis of a galaxy formation model, Monthly Notices of the Royal Astronomical Society, Volume 493, Issue 2, April 2020, Pages 1827–1841, https://doi.org/10.1093/mnras/stz3560
ABSTRACT
We present the first application of a variance-based sensitivity analysis (SA) to a model that aims to predict the evolution and properties of the whole galaxy population. SA is a well-established technique in other quantitative sciences, but is a relatively novel tool for the evaluation of astrophysical models. We perform a multiparameter exploration of the GALFORM semi-analytic galaxy formation model, to compute how sensitive the present-day K-band luminosity function is to varying different model parameters. The parameter space is scanned using a low-discrepancy sampling technique proposed by Saltelli. We first demonstrate the usefulness of the SA approach by varying just two model parameters, one that controls supernova feedback and the other the heating of gas by active galactic nuclei. The SA results match our physical intuition regarding how these parameters affect the predictions for different parts of the galaxy luminosity function. We then use SA to compute Sobol’ sensitivity indices varying seven model parameters, connecting the variance in the model output to the variance in the input parameters. The sensitivity is computed in luminosity bins, allowing us to probe the origin of the model predictions in detail. We discover that the SA correctly identifies the least important and most important parameters. Moreover, the SA also captures the combined responses of varying multiple parameters at the same time. Our study marks a much-needed step away from the traditional 'one-at-a-time' parameter variation often used in this area and improves the transparency of multiparameter models of galaxy formation.
1 INTRODUCTION
Galaxy formation is a complex process, which we are only now starting to understand through a combination of observations, numerical simulations, and analytical modelling. Two main theoretical techniques are used to model the formation and evolution of galaxies: semi-analytical modelling (SAM) and hydrodynamic simulations (for a review, see Somerville & Davé 2015). SAMs use physically motivated, simplified mathematical relations to describe the evolution of baryons in growing dark matter haloes (Baugh 2006; Benson 2010). Hydrodynamic simulations, on the other hand, tend to make fewer assumptions and approximations than SAMs and solve the fluid equations governing the dynamics of baryons. Nevertheless, in hydrodynamic simulations many processes, such as star formation, remain 'subgrid' due to the finite numerical resolution of the simulation and our inability to write down the precise equations describing some processes (Crain et al. 2015; Ludlow, Schaye & Bower 2019). In the absence of a complete mathematical description, physical processes are described in both SAMs and hydrodynamic simulations by approximate equations that contain parameters. Values have to be chosen for these parameters to specify a model. Here, we present a new application of an established statistical method to assess the impact of changes in model parameters on the output of a model.
The past few years have seen tremendous breakthroughs in the hydrodynamic simulation of galaxy formation for significant galaxy populations in cosmological volumes (Schaye et al. 2014; Vogelsberger et al. 2014; Pillepich et al. 2018). Nevertheless, SAMs remain an attractive and valuable complement to hydrodynamical simulations due to their flexibility and speed. These properties of SAMs mean that they can be used to build intuition about physical processes, by running thorough investigations of the impact of varying model parameters [e.g. see the comprehensive exploration of perturbations around the fiducial model presented by Lacey et al. (2016)]. Also, SAMs remain the method of choice to populate large-volume N-body simulations using a physical galaxy formation model: the fiducial simulation volumes used in SAMs are around 100 times bigger than those used in the current state-of-the-art hydrodynamical simulations. The predictions of SAMs have reached an impressive level of maturity through careful comparisons between the predictions of different groups and techniques (e.g. Contreras et al. 2013; Knebe et al. 2015; Guo et al. 2016; Mitchell et al. 2018).
Nevertheless, some scepticism remains regarding SAMs, much of which can be traced to the way in which the model parameters are set. Traditionally, models have been calibrated by developing physical intuition about how the model responds to changes in selected parameter values, such as those which control the mass loading of winds driven by supernovae, and then varying one parameter at a time to home in on a best-fitting model. Often the quality of the model reproduction of the calibration data is judged by eye and compromises are made in order to match multiple data sets; these steps are hard to quantify and therefore difficult to reproduce. The 'best-fitting' model is reported as a single choice of parameter set that defines the model. The primary motivation for producing a single model is the desire to build mock catalogues for galaxy surveys (Baugh 2013). However, users often want to know the uncertainty on the model predictions and how the predictions respond to changes in the input parameters.
The range of processes modelled by SAMs lends them the flexibility to predict a wide variety of observables, but at the cost of having to specify a number of parameters, which complicates model optimization or calibration. A number of techniques have been devised to reduce the complexity or dimensionality of the parameter space and to perform efficient searches of the parameter space: principal component analysis (hereafter PCA; Benson & Bower 2010), Bayesian emulators (Bower et al. 2010; Gómez et al. 2012), particle swarm optimization (hereafter PSO; Ruiz et al. 2015), Markov Chain Monte Carlo (Henriques et al. 2009; Lu et al. 2011, 2012; Henriques et al. 2013; Mutch, Poole & Croton 2013; Martindale et al. 2017), and Latin-hypercube sampling (Bower et al. 2010; Rodrigues, Vernon & Bower 2017).
Here, we apply sensitivity analysis (SA) to quantify the dependence of the model output on the variation in the values of the model input parameters. The analysis of Gómez et al. (2014) using the ChemTreeN SAM of Tumlinson (2009) is similar in scope to our work. They use an analysis of variance (ANOVA) technique for variance decomposition instead of sensitivity indices, and Gaussian processes for model fitting. Here, we use the galform SAM effectively as a black-box model, and evaluate the sensitivity of the model outputs to the variation of the input parameters. A SAM is an ideal candidate for sensitivity analysis, as the interactions between parameters are complex enough to produce black-box-like behaviour (the model becomes easier to experiment with than to understand; Golovin et al. 2017); however, many parameters have a natural physical interpretation, and hence it is straightforward to develop intuition about how sensitive the model outputs should be to changing the inputs. Many parameters also have either physically motivated bounds, or at least a plausible range of possible values.
A criticism often aimed at SAMs is that they contain too many free parameters. This is usually rebuffed with the insistence that the parameters are physical, not statistical. Model fitting alone is therefore insufficient for interpreting how well a SAM is performing. A different research question, one this study tries to address, is how sensitive the model is to the parameter variation – in other words, how well do we understand the impact of the physical processes and their interactions on the model predictions?
Sensitivity analysis (Fisher 1918; Sobol’ 1993, 2001; Saltelli et al. 2010) is an area of statistical modelling that analyses how the variance of the output of a model is affected by variance in the model inputs. It is closely related to uncertainty analysis and model optimization, and can be used to test the robustness of the model predictions to uncertainty in the input parameters, quantify dependence of the outputs of a model on different parameters, identify model non-linearities, and guide subsequent model optimization. This addresses a common criticism of black-box models, namely that after adding sufficiently many free parameters they can be fine-tuned to match any observations, and provide a single set of predictions. While model optimization can be used to compute confidence intervals, SA is uniquely positioned to quantify model responses and the relative importance of the inputs. This addresses the complaint about SAMs listed above, that providing a spread of model predictions is preferable to fitting to the observations. Using SA, we will be able to not only tell how much model predictions vary for individual outputs but also quantify how much of this variance can be attributed to individual model inputs (or their combinations).
There are several SA techniques, not all of which are suitable for analysing non-linear models with a high-dimensional parameter space. With a few exceptions,1 SA is done in three stages:
sampling of the parameter space
model evaluation in the parameter space
computation of sensitivity indices
Here, we use a variance-based SA, which adopts the improvement introduced by Saltelli et al. (2017) over the Sobol’ indices. Variance-based methods aim to decompose the variance of the model output into the contributions from individual parameter variances, as well as the combined variances of the interactions of multiple combinations of parameters changing at once. In order to avoid a computational penalty for evaluating all possible parameter combinations, input parameters are treated as probability distributions, and the sensitivity of the model output is estimated approximately. Moreover, a number of numerical optimizations have been introduced into the sampling and index calculation techniques, to improve the convergence of the indices and average over the values that are too difficult to compute efficiently.
This work diverges from previous studies in three important ways: first, we narrow the scope of this investigation to computing only sensitivity indices, and we do not attempt to provide the best-fitting values for a galaxy formation model. We believe that SA is not the best tool for this task, as it investigates model responses at the extreme values of input parameters, and often for unusual combinations of inputs, where the model no longer reproduces the observable values. Second, we do not limit ourselves to measuring responses of the model to individual parameters and their linear combinations. Instead, we use sensitivity indices to capture both individual and combined impacts of parameters. Lastly, this study focuses exclusively on one observable, the K-band luminosity function (LF), calculated using the galform SAM, and probes how this specific model reacts to changes in the input parameters. Our scope is narrower, but also deeper than any previous study in this area.
The layout of this paper is as follows. In Section 2, we set out the theoretical background, introducing the galform model and, for completeness, giving the equations for the processes that we vary (Section 2.1). We then discuss variance-based sensitivity analysis (Section 2.2), the concept of low-discrepancy sampling (Section 2.3), the exploration of parameter space using Saltelli sampling (Section 2.4), and we define the sensitivity indices (Section 2.5) and illustrate these ideas with a toy model (Section 2.6). Our results using galform are presented in Section 3 and our conclusions are given in Section 4.
2 THEORETICAL BACKGROUND
Here, we set out the theoretical ideas used in the paper. Section 2.1 gives a brief overview of the galform semi-analytical model, introducing the processes that are varied in the sensitivity analysis. Section 2.2 introduces variance-based sensitivity analysis, Section 2.3 discusses the sampling of a model parameter space, and Section 2.4 covers Saltelli sampling. Section 2.5 defines the sensitivity indices and Section 2.6 illustrates their use with a toy model. Section 2.7 discusses the use of galform output in the sensitivity analysis.
2.1 GALFORM
galform is a SAM that aims to predict the properties of galaxies starting from dark matter halo merger histories that are typically extracted from an N-body simulation (Cole et al. 2000; Baugh 2006; Bower et al. 2006; Lacey et al. 2016). galform models the processes that shape the galaxy population using a set of physically motivated, non-linear differential equations that track the exchange of mass, energy, and angular momentum between the different components of a galaxy. The processes modelled are
the merger histories of dark matter haloes
the heating and cooling of gas and the formation of galactic discs
quiescent star formation in galactic discs
bursts of star formation triggered by galaxy mergers or dynamically unstable discs
feedback driven by supernovae (SNe), which can eject cold gas from a galaxy
heating by an active galactic nucleus (AGN), which can prevent gas cooling
chemical enrichment of stars and gas
These processes are in many cases modelled by equations that contain parameters. A galform model corresponds to a set of parameters whose values have been chosen so that the model reproduces a particular set of observations. Some of these parameters govern different choices for processes in the model, such as the radial density profile assumed for the hot gas within a halo or the stellar initial mass function (IMF) that describes the number of stars of different masses produced in episodes of star formation. For example, the Gonzalez-Perez et al. (2014) model assumes a universal, solar neighbourhood IMF whereas the Lacey et al. (2016) model invokes a top-heavy IMF in bursts of star formation and a solar neighbourhood IMF in quiescent star formation. Even though these two models are implemented in the same N-body simulation, the choices made regarding the IMF and the slightly different emphasis on which observations the model should reproduce most closely mean that there are several differences in the values of the parameters that define these galaxy formation models.
Here, we use the recalibration of the Gonzalez-Perez et al. (2014) model introduced by Baugh et al. (2019) for the Planck Millennium N-body simulation, which we refer to as GP14.PMILL. The Planck Millennium N-body simulation (hereafter the PMILL simulation) adopts the Planck cosmology (Planck Collaboration XVI et al. 2014; see Table 1) and has superior mass resolution and halo merger histories that are better sampled in time compared with earlier N-body simulations into which galform was implemented. Below, we review the processes that we vary in the sensitivity analysis. A full description of galform can be found in Lacey et al. (2016).
Table 1. Planck Collaboration XVI et al. (2014) cosmology used in the P-Millennium simulation; the last two rows give the simulation box length and the number of particles used.

| Parameter | Value |
| --- | --- |
| ΩΛ | 0.693 |
| ΩM | 0.307 |
| Ωbaryon | 0.04825 |
| h | 0.6777 |
| σ8 | 0.8288 |
| n | 0.967 |
| L [h⁻¹ Mpc] | 542.16 |
| NP | 5040³ |
2.1.1 Star formation rate
2.1.2 Supernova feedback
2.1.3 AGN feedback
2.1.4 Disc instabilities
The quantity Fdisc measures the contribution of disc self-gravity to its circular velocity, with larger values corresponding to less self-gravity and so greater disc stability. Efstathiou et al. (1982) found a stability threshold Fstab ≈ 1.1 for a family of exponential stellar disc models. Note that a completely self-gravitating stellar disc would have Fdisc = 0.61, which is therefore the minimum value allowed for this parameter.
2.1.5 Parameter selection
We consider the relative importance of the processes described in Sections 2.1.1–2.1.4 by performing an SA on the parameters that describe these phenomena. The parameters and the ranges over which they are varied are listed in Table 2. In some instances, the parameter range is reasonably well defined, such as fstab, as discussed above in Section 2.1.4. In other cases, the choice of range of parameter values is less well defined. For example, using simple conservative arguments, γSN could take on values of 1 and 2 in the momentum and energy conserving phases of the wind evolution (Ostriker & McKee 1988; Lagos, Lacey & Baugh 2013). Numerical simulations of winds have suggested different values of γSN. The other parameters defining the galform model beyond those listed in Table 2 are held fixed.
Table 2. The model parameters varied in the sensitivity analysis and the ranges over which they are varied (see Section 2.1.5).

| Process | Parameter | Min. | Max. |
| --- | --- | --- | --- |
| Star formation | νSF [Gyr⁻¹] | 0.2 | 1.2 |
| Supernova feedback | γSN | 1.0 | 4.0 |
| | αret | 0.2 | 1.2 |
| | Vhot,disc [km s⁻¹] | 100 | 550 |
| | Vhot,burst [km s⁻¹] | 100 | 550 |
| AGN feedback | αcool | 0.2 | 1.2 |
| Disc instabilities | fstab | 0.61 | 1.1 |
2.1.6 Model output
After the formation and evolution of galaxies are calculated over the merger history of the dark matter haloes in the PMILL simulation, galaxy luminosities can be obtained from the predicted star formation rate and metallicity of the stars produced using a stellar population synthesis model. Dust extinction is calculated in post-processing, based on the size and gas metallicity of each galaxy (Gonzalez-Perez et al. 2014; Lacey et al. 2016). The model output that we focus on here is the K-band luminosity function at z = 0.
2.2 Variance-based sensitivity analysis
The SA method we use here closely follows those used by Sobol’ (2001) and Saltelli et al. (2017), which are designed to decompose variance in the model output into the variances of the input parameters and their interactions using as few model evaluations as possible.
Many SA approaches suffer from a number of shortcomings, which make them unsuitable for analysing non-linear models. By non-linear models, we mean here those that are characterized by interactions between the inputs2 and which therefore cannot be analysed effectively using regression or one-at-a-time (OAT) parameter variation techniques (Morris 1991).
Unlike other methods, variance-based SA allows a full exploration of the input space, and therefore accounts for the interactions between parameters and non-linear responses of the model. It follows that variance-based methods are able to evaluate the total-effect indices (see below) and rank the parameters in order of their influence on the output (Chan, Saltelli & Tarantola 1997; Sobol’ 2001; Saltelli et al. 2010).
Finally, we note that all SA methods assume that the model inputs are independent, which might not hold in general for complex models. For instance, correlations between inputs, or unphysical combinations of their values, cannot be recognized by SA techniques. Similarly, variance-based SA currently assumes that the model output is a scalar. This means that the model outputs are independent of one another; for example, in the case of the luminosity function, the model prediction in a luminosity bin is considered to be independent of the results in other bins and from other outputs, e.g. other galaxy statistics. Even if the output of the model is multidimensional, and even if it is correlated across one or more dimensions,3 each of the outputs must be analysed in isolation from the others. Unfortunately, at the time of writing, there are no well-established techniques that quantify or alleviate these two shortcomings. However, these limitations do not apply to galform: the input parameters can all be varied independently and freely across the entire parameter space, and our outputs will be quantized and analysed independently.
2.3 Sampling parameter space
Sampling the high-dimensional parameter space of a complex model requires a trade-off between the accuracy of the sampling and computational expense. The accuracy of the sampling describes how well the space is probed – have any potentially interesting regions of the parameter space been overlooked because too few points have been sampled or because the method used has left gaps in the space?
The accuracy of a sampling scheme can be assessed formally in terms of its 'discrepancy'. The lowest discrepancy sampling possible is a regular grid. However, this is subject to 'aliasing' or a lack of resolution due to the fixed gaps in the parameter space between the model evaluations; interesting model behaviour could be hidden in the unsampled parts of the parameter space. The convergence of the exploration of the parameter space is slow with a regular grid. The aliasing can be reduced and the convergence rate sped up by using a random sequence to sample the parameter space, which leads to a higher density of sampling in some parts of parameter space compared to a regular grid. The drawback in this case is that some regions of the parameter space will be more sparsely sampled than they were using a regular grid. A random sequence is formally described as the highest discrepancy sampling. Ideally, for a fixed number of sampling points, we want to strike a balance between avoiding the regular sampling achieved using a grid and leaving big gaps unsampled in the parameter space, as happens with random sampling.
Several quasi-random techniques have been proposed to generate sequences that approach this ideal of 'low-discrepancy' sampling, and which also ensure fast convergence of the uncertainties in the sensitivity indices. A quasi-random sequence is one designed to generate points in d-dimensions, which appear random but which are generated deterministically to have certain desired properties. Unlike pseudo-random and truly random sequences, successive points in a quasi-random sequence fill the gaps left by the previous points in the parameter space. The 'random' part of the name is technically a misnomer, as the sequence is fully deterministic, but yields a uniform distribution when projected on to any dimension of the parameter space.
A quasi-random sequence can be designed to minimize its discrepancy. For a low-discrepancy sequence, all of its subsequences also have low discrepancy. If a given sequence is uniform, its discrepancy tends to zero as its length increases. For these reasons, quasi-random low-discrepancy sequences are used to maintain a balance between rapid convergence of numerical algorithms, a thorough coverage of the parameter space, and a high uniformity of the resulting sample along all dimensions of the parameter space (Press et al. 2007, section 7.8). Quasi-random sequences are therefore an attractive replacement for pseudo-random sequences in many applications that require high-quality sampling.
Sampling based on low-discrepancy sequences, such as the recurrent additive sequence (Ulam 1960), Halton sequence (Halton 1964), Latin hypercube (Stein 1987), or Sobol’ sequence (Sobol’ 1967; Levitan et al. 1988), can be used in numerical integration and model optimization and have been shown to outperform schemes based on truly random, or pseudo-random number generators, while achieving significantly faster convergence rates (Sobol’ 1993). The advantage of these sequences over truly random and pseudo-random sequences can be attributed to the fact that the low-discrepancy property guarantees gapless sampling over the entire parameter space.
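To make the notion of discrepancy concrete, the sketch below (an illustration added here, not part of the original analysis) compares the discrepancy of a pseudo-random sample with that of a Sobol’ sample of the same size; it assumes the scipy.stats.qmc module is available.

```python
# Illustrative sketch (assumes scipy >= 1.7): measure how uniformly a
# pseudo-random and a Sobol' quasi-random sample cover the unit square.
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(0)
pseudo = rng.random((256, 2))                       # pseudo-random points in [0, 1)^2
sobol = qmc.Sobol(d=2, scramble=False).random(256)  # Sobol' low-discrepancy points

# Lower discrepancy means more uniform, gap-free coverage of the space.
print("pseudo-random:", qmc.discrepancy(pseudo))
print("Sobol':       ", qmc.discrepancy(sobol))
```

For the Sobol’ sample the reported discrepancy is markedly lower than for the pseudo-random one, reflecting the gap-filling behaviour described above.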
The low-discrepancy quasi-random sequence typically used in SA is the Sobol’ sequence (Sobol’ 1967). It can be efficiently calculated, and produces a sample that quickly converges to the correct set of sensitivity indices, as verified by checking against analytically calculated values for test models. Even though it is impossible to estimate the required number of model evaluations prior to running the SA, there exists a natural convergence criterion – the sum of the first-order indices, defined in equation (11), has to add up to unity. Moreover, even if the SA did not converge after the initial run, additional evaluations can be easily added (see the example in the next subsection).
2.4 Saltelli sequence sampling
The Sobol’ sequence was originally proposed as a method of improving the convergence of numerical integration (Sobol’ 1967). Antonov & Saleev (1979) developed an efficient computational method to implement Sobol’ sampling. Saltelli et al. (2010) combined multiple Sobol’ sequences to further reduce the number of points required for the estimation of the sensitivity indices, improving the convergence rate.
Hereafter, we refer to the Sobol’ sequence as an N by d matrix, where N is the number of points sampled in a d-dimensional parameter space.
A visual impression of the different sampling approaches is given by Fig. 1, which shows five commonly used types of sampling: OAT, uniform pseudo-random number generator, uniform grid sampling, a two-dimensional Sobol’ sequence, and Saltelli sampling. The OAT approach is often used with far fewer evaluations than shown here, which makes it computationally cheaper than the other approaches. The drawback of this method is clear from the vast areas of the parameter space that are left unexplored. This problem is only exacerbated on increasing the dimensionality of the parameter space. The pseudo-random number generation suffers from poor convergence, as randomness often results in over- and undersampling of many regions. The Sobol’ and Saltelli sequences uniformly sample the parameter space and achieve the low-discrepancy target at a reasonable computational cost.
Figure 1. Comparison of selected parameter space sampling strategies. Each panel contains 400 points sampled between [0,1] in the X0 and X1 dimensions using different methods, as labelled in each panel.
2.5 Sensitivity indices
It should now be obvious from the definition of the model variance why the OAT methods are inappropriate for complex models – they do not consider the full contribution to the model variance given by equation (9) (which averages over all values of the other inputs, instead of being measured only at a designated slice, as shown in the relevant panel of Fig. 1), nor does OAT treat the combined variance of two (equation 12) or more variables correctly.
Higher order effects can also be calculated in simpler analyses, such as ANOVA (Fisher 1918), high-dimensional model representations (Sobol’ 1993), or derivative-based methods. However, the total indices are a unique feature of the variance-based SA, and are a major advantage of this methodology, as they allow for a direct comparison of the linear and non-linear impacts of the input parameters.
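For reference, the first-order and total-order Sobol’ indices of an input Xi, for a model Y = f(X1, …, Xd) with independent inputs, are conventionally defined as follows; this standard form is assumed here to correspond to the quantities referred to as equations (11) and (15):

$$
S_i = \frac{\mathrm{Var}_{X_i}\!\left[\,\mathbb{E}_{X_{\sim i}}(Y \mid X_i)\,\right]}{\mathrm{Var}(Y)},
\qquad
S_{T_i} = 1 - \frac{\mathrm{Var}_{X_{\sim i}}\!\left[\,\mathbb{E}_{X_i}(Y \mid X_{\sim i})\,\right]}{\mathrm{Var}(Y)},
$$

where X∼i denotes all inputs except Xi. Si measures the fraction of the output variance attributable to varying Xi alone, whereas STi also includes every interaction term involving Xi; for a purely additive model the Si sum to unity, consistent with the convergence criterion quoted in Section 2.3.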
2.6 Illustrative sensitivity analysis of a toy model
The performance of the sensitivity analysis estimator can be demonstrated using a toy model. The Ishigami function is an example of such a model, and is commonly used to test the predictions of sensitivity analysis because it contains non-linear interacting terms. Nevertheless, the sensitivity indices can be calculated analytically and compared to the estimated values.
The SA was carried out by running the model on inputs generated by a three-dimensional Sobol’ sequence for 500 realizations, which resulted in 4000 = 500 × (2 + 2 × 3) values of Xi (as explained in Section 2.4). Next, equation (18) was evaluated at each Xi point, giving a vector Y of length 4000. Finally, the vector Y was analysed using the SALib Python package (Herman & Usher 2017).
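As a concrete, minimal sketch of the procedure just described: the Ishigami function is conventionally defined as f(X) = sin X1 + a sin²X2 + b X3⁴ sin X1 with a = 7, b = 0.1 and each Xi drawn uniformly from [−π, π], which we take to be the form of equation (18). The workflow can then be reproduced with SALib roughly as follows (exact function names may vary between SALib versions; the published repository linked at the end of this subsection is the authoritative version of this calculation).

```python
# Minimal sketch of the toy-model sensitivity analysis using SALib
# (Herman & Usher 2017).
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol
from SALib.test_functions import Ishigami

problem = {
    "num_vars": 3,
    "names": ["X1", "X2", "X3"],
    "bounds": [[-np.pi, np.pi]] * 3,
}

# 500 base samples over 3 parameters give 500 * (2 + 2*3) = 4000 points.
X = saltelli.sample(problem, 500, calc_second_order=True)
Y = Ishigami.evaluate(X)      # evaluate the toy model at every sampled point

Si = sobol.analyze(problem, Y, calc_second_order=True)
print(Si["S1"], Si["ST"])     # first- and total-order indices (cf. Fig. 3)
```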
The evaluations of equation (18) are shown in Fig. 2, and the first- and total-order sensitivity indices of the three input parameters are shown in Fig. 3. It is interesting to draw some qualitative observations from Fig. 2:
varying X1 and X2 in isolation results in large changes in Y; this is reflected by large values for S1 and S2.
varying X1 and X2 together has a large effect on Y; this is reflected by large values for ST1 and ST2.
varying X3 for mid-range values produces little effect, but varying other parameters at extreme X3 values produces a large change in Y; correspondingly, S3 is nearly zero, but ST3, which captures the global response of Y to X3, is larger.
S1 is negative, despite being defined in terms of non-zero variance (equation 11); this is the result of using a numerical approximation instead of an analytical formula to estimate S1; however, note that the confidence interval includes the origin, and so the value of S1 is consistent with zero.

Figure 2. The 4000 evaluations of the Ishigami function (equation 18). On-diagonal histograms show the distribution of the Xi parameters. Off-diagonal scatter plots show pairs of parameters and are colour-coded by the value of the output Y.

Figure 3. First-order (red) and total (blue) sensitivity indices of the three input parameters of the Ishigami function, with 1σ confidence bars (black).
A more complete SA would involve computing second-order indices, and comparing sensitivity indices for different versions of the Ishigami function, such as with different a, b parameters, or over different Xi ranges. However, this more complete analysis is beyond the scope of this section, as it is only meant for demonstration purposes. The source code used to reproduce this analysis has been made public: https://github.com/oleskiewicz/sensitivity/releases/tag/v1.0.
2.7 GALFORM output used in the sensitivity analysis
When applying SA to a model with a multidimensional output, it is necessary to select the most interesting outputs manually. The Sobol’ index method assumes, and can only be calculated for, separate one-dimensional output vectors Y. From the formal standpoint, this is problematic as the sensitivity indices contain no information about any correlations between various model outputs. However, in practice, one could perform model runs that follow the Saltelli sampling and then carry out separate sensitivity analyses for any desired number of model outputs, since running the model is more time consuming than calculating the Sobol’ indices.
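To make this concrete, the hypothetical sketch below shows how a single set of Saltelli-sampled model runs can be reused for separate per-output analyses; the array Y_bins, of shape (number of runs, number of output bins), is an assumed container for the collected model outputs and is not part of GALFORM or SALib.

```python
# Hypothetical sketch: reuse one Saltelli-sampled set of model runs to analyse
# each scalar output (e.g. each luminosity bin) separately, as described above.
# `problem` must be the dictionary used to generate the Saltelli sample, and
# `calc_second_order` must match the choice made at the sampling stage.
from SALib.analyze import sobol

def per_output_indices(problem, Y_bins, calc_second_order=True):
    """Return first- and total-order Sobol' indices for every output column."""
    indices = []
    for j in range(Y_bins.shape[1]):
        Si = sobol.analyze(problem, Y_bins[:, j],
                           calc_second_order=calc_second_order)
        indices.append({"bin": j, "S1": Si["S1"], "ST": Si["ST"]})
    return indices
```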
Here, we focus on the prediction of GALFORM for the K-band LF at z = 0, calculated as described in Section 2.1.6. We have chosen to consider this statistic due to the well-understood influence of the model parameters on the form of the luminosity function [see the extensive discussion in Lacey et al. (2016)]. Varying the parameters around the values used in the fiducial model shows that the bright and faint ends of the luminosity function are regulated by different physical processes. Therefore, the sensitivity indices can readily be checked for errors, and we are able to quantify our intuition regarding the relative importance of the different feedback modes for the abundance of galaxies at different luminosities.
We have elected to perform the analysis on the model output values normalized by the observational data (see equation 20) instead of on the model output itself. In this way, the values we analyse are close to those typically used for model optimization and have a reduced dynamic range. Analysing a SAM independently of the observational constraints, while interesting in its own right, is outside the scope of this work.
3 RESULTS
3.1 Sensitivity analysis experiments
We have carried out two separate sensitivity analyses using GALFORM: (i) 600 GALFORM model runs varying two parameters (αcool and γSN), and (ii) 1600 GALFORM model runs varying seven parameters (see Table 2).
For both series of runs, an SA was carried out on the K-band LF at z = 0, with two different binnings of the LF used to compute the sensitivity indices, as explained below.
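The quoted totals are consistent with 100 base samples per experiment, since Saltelli sampling with second-order terms requires N × (2d + 2) model runs: 100 × (2 × 2 + 2) = 600 and 100 × (2 × 7 + 2) = 1600. A hedged sketch of how the seven-parameter design could be generated with SALib, using the ranges of Table 2, is given below; the actual set-up used to drive GALFORM may differ in detail.

```python
# Hedged sketch (not the authors' pipeline): build the Saltelli design for the
# seven GALFORM parameters of Table 2.  With 100 base samples and second-order
# terms this yields 100 * (2*7 + 2) = 1600 parameter sets, matching the number
# of runs quoted above.
from SALib.sample import saltelli

problem = {
    "num_vars": 7,
    "names": ["nu_SF", "gamma_SN", "alpha_ret",
              "V_hot_disc", "V_hot_burst", "alpha_cool", "f_stab"],
    "bounds": [
        [0.2, 1.2],    # nu_SF [Gyr^-1]
        [1.0, 4.0],    # gamma_SN
        [0.2, 1.2],    # alpha_ret
        [100, 550],    # V_hot,disc [km/s]
        [100, 550],    # V_hot,burst [km/s]
        [0.2, 1.2],    # alpha_cool
        [0.61, 1.1],   # f_stab
    ],
}

X = saltelli.sample(problem, 100, calc_second_order=True)
print(X.shape)   # (1600, 7): one row of parameter values per GALFORM run
```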

Figure 4. The K-band LF at z = 0 in the AB magnitude system. Grey lines represent 10 GALFORM model realizations randomly chosen from the 1600-run series. The black line represents the observational data from Driver et al. (2012). The black vertical line is drawn at L = L*, and separates the bright and faint ends of the LF.
This coarse analysis is quantitatively identical to measuring the LF using only two broad luminosity bins. This exercise has two goals: (i) to verify that SA produces explainable results that can be interpreted in accordance with our physical intuition about the galaxy formation model, and (ii) to check the convergence of the sensitivity indices and their confidence intervals, which can be estimated as explained in Section 2.2.
3.2 Feedback processes and the luminosity function
The first series of runs, which analysed the effects of changing two of the parameters that specify different feedback processes in galform, αcool and γSN, was carried out to verify the usefulness of the SA and to evaluate its effectiveness against our physical intuition regarding the expected impact on the LF of varying these model parameters. Only two parameters were varied, to speed up the analysis and allow for an easier interpretation of the results (see Table 2 for the range of parameter values considered). Recall that γSN controls the mass loading of SNe-driven winds and αcool determines the halo mass above which AGN heating shuts down the cooling flow.
Fig. 5 shows the first- and total-order sensitivity indices calculated from 600 galform model runs for the coarse-binned analysis of the luminosity function using two bins, one fainter and one brighter than L*, as prescribed by equation (19). The results are striking, but not unexpected: it is clear that γSN is the dominant parameter of the two in shaping the model output for galaxies fainter than L* (and hence that such galaxies are mainly affected by SNe feedback), and that both parameters have similar significance for galaxies brighter than L* (albeit with αcool slightly more important), so bright galaxies are affected by both SNe feedback and AGN heating. Moreover, S1 and ST are comparable in all cases, which means that the model response to varying these parameters is mostly linear.

Figure 5. Sensitivity indices for the first series of runs, varying two GALFORM parameters, αcool and γSN, for the K-band luminosity function measured in two coarse luminosity bins. The colours of the bars indicate the different indices: first-order (blue; equation 11) and total-order (red; equation 15) for a given variable. The left-hand panel shows indices for L < L*, and the right-hand panel for L > L* (see equation 19). The black bars show the 1σ confidence interval for the sensitivity indices.
Fig. 6 shows the convergence of the indices from Fig. 5 as a function of the number of samples N. The indices do not change substantially after 100 galform runs.

Figure 6. Convergence of the first- and total-order sensitivity indices for the first series of runs, varying two GALFORM parameters, αcool and γSN, as a function of the number of samples. The sensitivity indices in this case are computed in each of two broad luminosity bins, covering, respectively, the faint and bright ends of the luminosity function. Individual subplots show the results for the faint and bright ends of the K-band LF (columns, labelled at the top), and for the two parameters (rows, labelled on the right). Solid lines correspond to the values of the indices, and the shaded regions to the 1σ confidence band of the values, both colour-coded by the order of the indices as labelled in the legend.
Fig. 7 shows the first- and total-order sensitivity indices for the fine-grained analysis of the LF using multiple luminosity bins, with equation (20) as the model output. We can see that L* nearly coincides with the bin at which AGN heating starts to become important, which explains the results shown in Fig. 5. We also learn that while SNe feedback does not interact with AGN heating at the faint end of the LF, the two processes are strongly coupled at the bright end.

Figure 7. First- and total-order sensitivity indices (equations 11 and 15) for the first series of runs, varying two GALFORM parameters, αcool and γSN. Bottom panel: the K-band luminosity function at z = 0, as in Fig. 4; grey lines correspond to 10 randomly chosen runs; the black line is the observational data from Driver et al. (2012); the dashed vertical line corresponds to L*. Top panels: first- and total-order (as labelled on the right) sensitivity indices of the two variables (y-axis) for 18 individual magnitude bins (x-axis), colour-coded by value between 0 (not sensitive) and 1 (most sensitive), as labelled by the colour bar at the top.
We did not consider the best-fitting model for this two-parameter case, since we perform a rudimentary estimate of the best-fitting parameter set in the next section, when varying more galform model parameters at the same time.
3.3 Sensitivity analysis over a multidimensional parameter space
The design of the second experiment, in which seven galform parameters are varied simultaneously (Table 2), is inspired by the work on parameter optimization using Bayesian emulators by Bower et al. (2010) and Rodrigues et al. (2017). For comparison, we use the same parameter ranges adopted in their studies. This exercise requires significantly more model realizations than the first one, since we sample a higher dimensional parameter space and aim to observe interactions between more parameters.

Figure 8. The parameter space of the second galform experiment, in which seven parameters are varied over 1600 realizations of the model. On-diagonal histograms show the nearly uniform distribution of the individual parameters, as expected for Saltelli sampling. Off-diagonal scatter plots show the parameter space for pairs of parameters, colour-coded by the goodness-of-fit χ2 (equation 21) of the model prediction for the K-band luminosity function, using 18 luminosity bins (Fig. 4), to the observational estimate from Driver et al. (2012); blue points correspond to runs with low values of χ2, as labelled by the colour bar.
Fig. 9 shows the first- and total-order sensitivity indices for the coarse-binned analysis of the LF (equation 19). Since sensitivity indices are derived from the normalized variance (equations 11 and 15), the values should always be between 0 and 1, which they are (including the confidence interval). Similarly to Fig. 5, in Fig. 9 we see two different types of behaviour of the galform model: the faint end is dominated by SNe feedback, while the bright end has a mixed, non-linear response to many parameters. Interestingly, while AGN feedback (via αcool) has the highest first-order sensitivity index (S1) for the bright end of the LF, the total-order indices (ST) of SNe feedback processes dominate. Of particular interest are the fstab and vhot,burst parameters. These parameters have nearly zero first-order response indices (which means that their impact cannot be detected by an OAT analysis), but their combined higher order responses are significant.

Figure 9. Sensitivity indices for the second series of 1600 galform runs, varying seven model parameters (Table 2), computed using the coarse two-bin description of the luminosity function. The bar colours indicate the values of the different indices: the first-order index (S1, red; equation 11) and the total-order index (ST, blue; equation 15) for each parameter. The left-hand panel shows indices for galaxies in the luminosity bin fainter than L*, and the right-hand panel for galaxies in the bin brighter than L* (equation 19). The black bars show the 1σ confidence intervals for the sensitivity indices.
It is instructive to see the origin of the values reported in Fig. 9, by inspecting how the sensitivity changes bin-by-bin (equation 20) in Fig. 10. The results are consistent with Section 3.2, and together provide an interpretation of the behaviour of the galform model. Moreover, displaying the model output together with model sensitivity can be of use when manually tweaking the model, allowing for a fine, manual control over the precise details of the LF (or, indeed, other outputs).

Figure 10. First- and total-order sensitivity indices (equations 11 and 15) for the second series of runs, varying seven GALFORM parameters (Table 2). Bottom panel: the K-band luminosity function at z = 0, as in Fig. 4; grey lines show 10 randomly chosen galform models; the black line connects the observational data from Driver et al. (2012); the dashed vertical line corresponds to L*; the solid red line shows the best-fitting model. Top panels: first- and total-order (as labelled on the right) sensitivity indices of the seven variables (y-axis) for individual magnitude bins (x-axis), colour-coded by value between 0 (not sensitive) and 1 (most sensitive), as labelled by the colour bar at the top.
Finally, we note that Fig. 10 also shows the LF for the best-fitting model, as determined by the smallest value of equation (21). This can be considered an additional benefit of running SA – requiring so many model realizations naturally finds one that is likely to be close to a global optimum. The best-fitting parameter values are reported in Table 3. Note that the values diverge from those reported in Lacey et al. (2016), due to the different fitting method and the fact that this study only considered the K-band LF, whereas Lacey et al. (2016) took into account multiple observations in a manual parameter tuning. Of particular interest is the value of Vhot,disc, which is over 20 per cent larger than in the previous calibration of this galform model. We attribute this difference to the fact that, as discussed in Section 3.3 and shown in Fig. 9, the combined total-order sensitivity index of Vhot,disc outweighs its first-order index for both ends of the K-band LF. This suggests that the optimal value of this parameter could be missed by OAT model fitting. The differences in the other parameter values are not as significant as they might seem – the variables with the highest sensitivity match the previously reported values closely (e.g. γSN is within 7 per cent), and the variables with low sensitivity that diverge by a larger margin do not, by definition of the sensitivity indices, have a significant impact on the K-band LF.
Table 3. The best-fitting galform parameters found in this work, in relation to Driver et al. (2012).

| Parameter | Value |
| --- | --- |
| νSF [Gyr⁻¹] | 0.46 |
| γSN | 3.45 |
| αreheat | 0.74 |
| Vhot,disc [km s⁻¹] | 332.69 |
| Vhot,burst [km s⁻¹] | 392.90 |
| αcool | 0.58 |
| fstab | 0.77 |
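As a final illustrative sketch, selecting the best-fitting realization amounts to evaluating the goodness-of-fit of every run and taking the minimum. The χ² form used below is a generic assumption that may differ in detail from equation (21), and lf_model, lf_obs, and lf_err are hypothetical arrays holding the binned model predictions, the observed LF, and its uncertainties.

```python
# Hypothetical sketch: pick the best-fitting run by minimizing a chi-squared
# statistic summed over luminosity bins.  `lf_model` has shape
# (n_runs, n_bins); `lf_obs` and `lf_err` have shape (n_bins,).
import numpy as np

def best_fit_run(lf_model, lf_obs, lf_err):
    """Return the index of the run closest to the observations and its chi2."""
    chi2 = np.sum(((lf_model - lf_obs) / lf_err) ** 2, axis=1)
    best = int(np.argmin(chi2))
    return best, chi2[best]
```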
4 CONCLUSIONS
We have used variance-based sensitivity analysis to analyse the sensitivity of the K-band luminosity function predicted using the galform semi-analytical model of galaxy formation to the variation of the model parameters. We have shown that sensitivity analysis is a useful tool, which goes beyond simple model fitting and OAT parameter variation, and we have demonstrated that it can be applied to a challenging problem in computational astrophysics. Variance-based sensitivity analysis is perhaps particularly useful for the semi-analytical modelling of galaxy formation, due to the computational expense of searching a multidimensional parameter space and the non-linearity of the model. These features have led some to view such models as black boxes. Part of the aim of the sensitivity analysis presented here is to make the behaviour of the model and how it responds to parameter changes more transparent.
In its present form, sensitivity analysis can only deal with one-dimensional outputs of a model, which on the one hand means that it cannot be used to resolve correlations in model outputs (such as between the predictions in different bins of the luminosity function or between the luminosity function in different bands; see Benson 2014), yet on the other hand this feature gives the scientist performing the study unlimited flexibility in choosing and parametrizing the outputs they find the most important. Here, we have elected to perform the sensitivity analysis using the model predictions in luminosity bins cast in terms of the difference between the computed and measured K-band luminosity function at z = 0. Our motivation for this was that by choosing an established observable with a well-understood connection to the underlying physical processes and their description in terms of galform parameters, we could make a convincing case for the usefulness of the sensitivity analysis.
With this in mind, future work on SA might want to examine the variance of the outputs of the semi-analytic model alone, independently of the corresponding measured observable values. There are three main reasons for such an approach: (i) Using the full dynamic range of the predictions: normalizing the model output by observations flattens the dynamic range, and while SA works equally well for small and large values, by only analysing a flat version of the model predictions we effectively take the regions in which the model gives a flat or steep response (for instance, the faint and bright ends of the LF, respectively) and make them look the same. (ii) Independence of post-processing: by comparing to data, we had to make a choice about the norm of the discrepancy between the model output and observations – do we retain the sense of the discrepancy or square it? A different SA study could have chosen differently, altering the results. By analysing model outputs independently of the observations, these choices are no longer necessary. (iii) Data independence: SA results could change if a different data set is used with the same model.
Moreover, the K-band luminosity function is just one possible output and there are many others, which a successful semi-analytic model should reproduce accurately. Analysing all of these is outside the scope of this study, but we hope to have shown that SA is a promising avenue of research.
Finally, we note that while correctly estimating model sensitivity can be useful in guiding model optimization and improving the physical interpretation of the parameters of the galaxy formation models, one must remember that even the most rigorous sensitivity analysis can only provide the answers with regard to the model, not the underlying physical system itself (Taleb & Douady 2013). Therefore, the relationship between the structure of the model and that of the physical system remains open to discussion.
ACKNOWLEDGEMENTS
This work was supported by the Science and Technology Facilities Council (ST/L00075X/1). PO acknowledges an STFC studentship funded by STFC grant ST/N50404X/1. The inspiration for this project came from a placement by PO at Atom Bank, which was supported by the STFC/Durham University Centre for Doctoral Training in Data Intensive Science, supported by ST/P006744/1. This work used the DiRAC Data Centric system at Durham University, operated by the Institute for Computational Cosmology on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). This equipment was funded by BIS National E-infrastructure capital grant ST/K00042X/1, STFC capital grants ST/H008519/1 and ST/K00087X/1, STFC DiRAC Operations grant ST/K003267/1, and Durham University. DiRAC is part of the National E-infrastructure.
Footnotes
1. Some methods, such as Gaussian processes, use parameter exploration to simultaneously measure model sensitivity and maximize goodness-of-fit for model output(s).
2. Interactions between inputs occur, for example, when varying two or more input parameters produces a significantly different response from the model than would be expected from summing the change produced by varying the parameters independently.
3. We know that this is the case in galaxy formation models because if the luminosity function changes in a given bin, this will lead, for example, to a change in the luminosity–circular velocity relation. Benson (2014) argued that correlations between bins in the observed luminosity function are important in setting model parameters.
4. In this case, the whole is literally more than the sum of the parts.