Evidence for galaxy assembly bias in BOSS CMASS redshift-space galaxy correlation function

Yuan, Sihan; Hadzhiyska, Boryana; Bose, Sownak; Eisenstein, Daniel J; Guo, Hong

doi:10.1093/mnras/stab235

ABSTRACT

Building accurate and flexible galaxy–halo connection models is crucial in modelling galaxy clustering on non-linear scales. Recent studies have found that halo concentration by itself cannot capture the full galaxy assembly bias effect and that the local environment of the halo can be an excellent indicator of galaxy assembly bias. In this paper, we propose an extended halo occupation distribution (HOD) model that includes both a concentration-based assembly bias term and an environment-based assembly bias term. We use this model to achieve a good fit (χ²/degrees of freedom = 1.35) on the 2D redshift-space two-point correlation function (2PCF) of the Baryon Oscillation Spectroscopic Survey (BOSS) CMASS galaxy sample. We find that the inclusion of both assembly bias terms is strongly favoured by the data and the standard five-parameter HOD model is strongly rejected. More interestingly, the redshift-space 2PCF drives the assembly bias parameters in a way that preferentially assigns galaxies to lower mass haloes. This results in galaxy–galaxy lensing predictions that agree with the observations to within 1σ, alleviating the perceived tension between galaxy clustering and lensing. We also showcase a consistent 3σ–5σ preference for a positive environment-based assembly bias that persists over variations in the fit. We speculate that the environmental dependence might be driven by underlying processes such as mergers and feedback, but might also be indicative of larger halo boundaries such as the splashback radius. Regardless, this work highlights the importance of building flexible galaxy–halo connection models and demonstrates the extra constraining power of the redshift-space 2PCF.

gravitational lensing: weak, methods: analytical, methods: statistical, galaxies: haloes, dark matter, large-scale structure of Universe

1 INTRODUCTION

In the standard framework of structure formation in a Λ cold dark matter (ΛCDM) universe, galaxies are predicted to form and evolve in dark matter haloes (White & Rees 1978). While the distribution and structure of dark matter haloes are directly tied to the underlying cosmology, the distribution of galaxies is tied to it less directly. To extract cosmological information and understand galaxy formation from observed galaxy clustering statistics, it is critical to correctly model the connection between galaxies and their underlying dark matter haloes. The most popular model of the galaxy–halo connection is the halo occupation distribution (HOD) model (e.g. Peacock & Smith 2000; Scoccimarro et al. 2001; Berlind & Weinberg 2002; Zheng et al. 2005; Zheng, Coil & Zehavi 2007). The HOD makes the assumption that all galaxies live inside dark matter haloes, and expresses the average number of galaxies contained within an individual halo as a function of the halo mass. More specifically, the simplest formulation of the HOD assumes that galaxy occupation is determined solely by halo mass, an assumption that rests on the long-standing and widely accepted theoretical prediction that halo mass is the attribute that most strongly correlates with the halo abundance, the halo clustering, and the properties of the galaxies residing in the halo (White & Rees 1978; Blumenthal et al. 1984).

However, Gao, Springel & White (2005), Zentner et al. (2005), Wechsler et al. (2006), Gao & White (2007), Croton, Gao & White (2007), and Li, Mo & Gao (2008) showed that at fixed halo mass, halo clustering also depends on secondary halo properties that correlate with halo assembly history, an effect known as halo assembly bias. Additionally, a series of studies employing hydrodynamical simulations and semi-analytic models have found clear evidence that galaxy occupation correlates with secondary halo properties beyond just halo mass (e.g. Zhu et al. 2006; Artale et al. 2018; Zehavi et al. 2018; Bose et al. 2019; Contreras et al. 2019; Hadzhiyska et al. 2020; Xu, Zehavi & Contreras 2021). This phenomenon is commonly known as galaxy assembly bias (or assembly bias hereafter) to differentiate it from halo assembly bias. Wechsler & Tinker (2018) offer a more rigorous definition of assembly bias as: at fixed halo mass, the galaxy properties or number of galaxies within dark matter haloes may depend on secondary halo properties that themselves show a halo assembly bias signature. Ignoring the effects of assembly bias has been shown to introduce significant errors in inferring galaxy–halo connection models and bias galaxy formation models (Pujol & Gaztañaga 2014; Zentner, Hearin & van den Bosch 2014; Lange et al. 2019).

Several studies have implemented ways to incorporate a secondary dependence into the HOD formalism (e.g. Paranjape et al. 2015; Hearin et al. 2016; McEwen & Weinberg 2018; Yuan, Eisenstein & Garrison 2018; Walsh & Tinker 2019; Wibking et al. 2019; Xu et al. 2021). The different methodologies can be summarized into two approaches. One approach is to assign galaxies to haloes according to a secondary property within each mass bin, such as the decorated HOD framework proposed in Hearin et al. (2016). The other approach is to introduce a secondary dependence on the existing HOD parameter set, but where the definition of halo mass itself is modified by a dependence on the secondary parameter, such as in Yuan et al. (2018) and Walsh & Tinker (2019). Zentner et al. (2019) applied the decorated HOD framework and showed the first tentative observational evidence for galaxy assembly bias in the Sloan Digital Sky Survey (SDSS) Data Release 7 (DR7) main galaxy sample.

Thus far, halo concentration, which is a measure of how centrally peaked the density profile of the halo is, has been regarded as the standard secondary parameter to use in assembly bias studies. This choice is largely motivated by N-body simulations that show that halo concentration, along with several other parameters that correlate with the halo assembly history, is a good predictor of halo assembly bias (Wechsler et al. 2006; Croton et al. 2007; Mao, Zentner & Wechsler 2018). However, recent studies that have systematically compared various secondary dependencies for assembly bias in hydrodynamical simulations and semi-analytic models have found that halo concentration only accounts for part of the actual assembly bias (Hadzhiyska et al. 2020; Xu et al. 2021). Both studies find that the local environment of the halo at the present day is an excellent predictor of assembly bias, where the environment is defined as either the smoothed local matter density or the number of neighbouring haloes within some radius. We point out that the term ‘assembly bias’ is somewhat of a misnomer in this context, as dependencies such as local environment identified today do not necessarily relate to the past assembly history of the haloes. In this paper, we mean assembly bias in its most general sense, inclusive of all secondary dependencies in the galaxy–halo connection model.

However, one must be careful when discussing halo environment in the context of assembly bias, as this may at first seem tautological. Since the environment of the halo is a statement on clustering itself, the halo environment cannot be naively used as the explanation for why there is assembly bias, since it is logically necessary that objects selected from dense environments exhibit stronger clustering. However, if one’s objective is to reproduce realistic galaxy mocks rather than to explain them, then it is fair to use the halo environment as an indicator of assembly bias in decorating galaxy–halo connection models. In fact, as Hadzhiyska et al. (2020) and Xu et al. (2021) have shown, halo environment is potentially the most effective indicator of assembly bias. In the context of dark matter only simulations, halo environment is also powerful because it is an easily accessible property, which bypasses the need to construct halo merger trees or resolve subhalo structure.

There is also growing evidence that halo environment affects galaxy evolution and, therefore, the galaxy–halo connection directly. Both Hadzhiyska et al. (2020) and Xu et al. (2021) showed that while the environment is defined locally (typically between 1 and 5 h⁻¹ Mpc), it seems to capture the assembly bias effects on much larger scales, up to tens of megaparsecs. This suggests that the environment might trace underlying processes that, at least partially, drive assembly bias. This should not be surprising as excursion set theory predicts a correlation between the halo environment and its formation history (Bond et al. 1991; Zentner 2007). In the context of the cosmic web structure, studies have shown that, when controlled for halo mass, galaxy evolution depends on its proximity to close-by cosmic filaments, and the content and kinematics of neighbouring gas reservoirs (e.g. Chen et al. 2017; Poudel et al. 2017; Laigle et al. 2018; Kraljic et al. 2019; Salerno, Martínez & Muriel 2019; Song et al. 2020). Obuljen, Percival & Dalal (2020) found a 5σ detection of an anisotropic assembly bias that correlates galaxy properties with large-scale tidal field in the Baryon Oscillation Spectroscopic Survey (BOSS) CMASS data. A series of papers including Tinker et al. (2017, 2018a,b) studied the relation between various observed galaxy properties and environment at fixed halo mass and stellar mass and found that star-forming galaxies tend to live in underdense environments, whereas quenched passive galaxies tend to occupy overdense environments. Independent studies such as Lee et al. (2018), Dragomir et al. (2018), and Behroozi et al. (2019) found similar trends in simulations and in separate data sets.

In this paper, we propose an extended HOD model that incorporates both a concentration-based assembly bias term and an environment-based assembly bias term. We constrain such an HOD with the observed two-dimensional galaxy redshift-space two-point correlation function (2PCF) of the BOSS (Eisenstein et al. 2011; Dawson et al. 2013) CMASS sample in the range 0.46 ≤ z ≤ 0.6 (Data Release 12). We show that the redshift-space 2PCF strongly prefers the inclusion of both assembly bias terms and affects the fit in a way that reduces the typical host halo mass of galaxies. We also show that the resulting best-fitting HOD predicts the galaxy–galaxy lensing signal to within 1σ, significantly reducing the perceived tension between galaxy clustering and lensing. Our results show that incorporating various galaxy assembly bias effects is an important ingredient in an accurate and flexible HOD model, and that the perceived tension between galaxy clustering and galaxy–galaxy lensing might partially be due to oversimplistic HOD models and the lack of constraining power of the projected 2PCF. We also showcase a consistent preference for a positive environment-based assembly bias by the data. This serves as further evidence that an environment-based assembly bias, together with the traditional concentration-based assembly bias, should be included in future galaxy–halo connection models.

The paper is organized as follows. In Section 2, we describe the extended HOD framework and the implementation of assembly bias parameters. In Section 3, we describe the observed redshift-space 2PCF and the simulation sets we employ in this work. In Section 4, we present the HOD fitting methodology and the optimizations developed for our analysis. In Section 5, we showcase our HOD fits with and without the assembly bias terms, the corresponding lensing predictions, and the best-fitting value of the environment-based assembly bias across variations to the fit. In Section 6, we discuss the parameter recovery of our routine, alternative environment definitions, and compare our assembly bias fit to previous works. We conclude in Section 7.

2 THE EXTENDED HOD FRAMEWORK

The HOD is a popular empirical framework used to populate dark matter haloes with central and satellite galaxies as a function of halo mass. However, given the simplistic nature of galaxy assignment, this model may also be a source of systematic errors for cosmological applications. In this section, we briefly review the baseline HOD formalism and discuss physically motivated extensions to the standard HOD. A more detailed description of the HOD and some of the extensions discussed here can be found in Yuan et al. (2018) and in section 2 of Yuan, Eisenstein & Leauthaud (2020). Our extended HOD code is publicly available as the grand-hod package.¹

2.1 The baseline model

The baseline five-parameter HOD model (Zheng & Weinberg 2007) predicts the mean number of central galaxies and satellite galaxies as a function of halo mass M:

$$\begin{eqnarray*} \bar{n}_{\mathrm{cent}}(M) &=& \frac{1}{2}\mathrm{erfc} \left[\frac{\ln (M_{\mathrm{cut}}/M)}{\sqrt{2}\sigma }\right], \nonumber \\ \bar{n}_{\textrm {sat}}(M) &=& \left[\frac{M-\kappa M_{\textrm {cut}}}{M_1}\right]^{\alpha }\bar{n}_{\mathrm{cent}}(M), \end{eqnarray*}$$

(1)

where halo mass is defined as the mass contained within a radius encompassing 200 times the background density, M_200b. The five parameters of this model are M_cut, M₁, σ, α, and κ. M_cut characterizes the minimum halo mass to host a central galaxy. M₁ characterizes the typical halo mass that hosts one satellite galaxy. σ describes the steepness of the transition from 0 to 1 in the number of central galaxies. α is the power-law index on the number of satellite galaxies. κM_cut gives the minimum halo mass to host a satellite galaxy. The actual number of central galaxies follows a Bernoulli distribution, while the actual number of satellite galaxies follows a Poisson distribution. The spatial distribution of satellites within the halo follows the halo matter density distribution. Traditionally, the satellites in each halo are distributed according to a Navarro–Frenk–White (NFW) profile (Navarro, Frenk & White 1997). Instead, we distribute the satellite positions according to the particle subsample of each halo. While this method incurs a computational cost, this approach has the benefit of preserving the shape and dynamics of the dark matter halo. For the baseline HOD, we simply give each particle of the halo equal probability of hosting a satellite galaxy, and the satellite galaxy inherits the velocity of the host particle.
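As an illustration of equation (1), the mean occupations and the Bernoulli and Poisson draws described above could be sketched in Python as follows. This is a minimal sketch and not the grand-hod implementation: the function names are hypothetical, and clipping the satellite power law at zero for M < κM_cut is an assumption made so that the expression stays real-valued below the satellite mass threshold.

```python
import math
import numpy as np


def n_cent(M, M_cut, sigma):
    """Mean number of centrals per halo (first line of equation 1)."""
    M = np.atleast_1d(np.asarray(M, dtype=float))
    x = np.log(M_cut / M) / (math.sqrt(2.0) * sigma)
    return 0.5 * np.array([math.erfc(v) for v in x])


def n_sat(M, M_cut, M1, sigma, alpha, kappa):
    """Mean number of satellites per halo (second line of equation 1)."""
    M = np.atleast_1d(np.asarray(M, dtype=float))
    # Assumption: haloes below kappa * M_cut host no satellites, so the
    # base of the power law is clipped at zero.
    base = np.clip((M - kappa * M_cut) / M1, 0.0, None)
    return base ** alpha * n_cent(M, M_cut, sigma)


def draw_occupations(M, M_cut, M1, sigma, alpha, kappa, rng):
    """Draw one realization: Bernoulli centrals, Poisson satellites."""
    has_central = rng.random(np.size(M)) < n_cent(M, M_cut, sigma)
    num_sat = rng.poisson(n_sat(M, M_cut, M1, sigma, alpha, kappa))
    return has_central, num_sat
```

Here `rng` is a `numpy.random.Generator`, for example `np.random.default_rng(0)`. Assigning the drawn satellites to halo particles is a separate step that this sketch does not cover.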

Finally, to generate galaxy positions in redshift space, we assume the z-axis to be the line of sight (LOS) and modify the z coordinates of the galaxies according to

$$\begin{eqnarray*} X^{\prime }_z = X_z + \frac{v_z}{H}(1+z), \end{eqnarray*}$$

(2)

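where |$X^{\prime }_z$| is the redshift-space comoving position of the galaxy, X_z is the real-space comoving position of the galaxy, and v_z is the z component of the galaxy velocity. H is the Hubble parameter, and the z in the factor (1 + z) denotes redshift rather than the coordinate axis.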
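Equation (2) amounts to a one-line coordinate shift, sketched below in Python. The function name is hypothetical, and the optional periodic wrap is an assumption appropriate for a periodic simulation box rather than something stated in the text. For positions in h⁻¹ Mpc and velocities in km s⁻¹, `hubble` must be supplied in km s⁻¹ per h⁻¹ Mpc.

```python
import numpy as np


def to_redshift_space(x_z, v_z, hubble, redshift, box_size=None):
    """Shift comoving LOS positions into redshift space (equation 2).

    x_z      : real-space comoving z coordinates
    v_z      : LOS peculiar velocities
    hubble   : Hubble parameter at the snapshot redshift
    redshift : snapshot redshift
    box_size : if given, wrap positions into [0, box_size)
               (assumption: periodic simulation box)
    """
    x_z = np.asarray(x_z, dtype=float)
    v_z = np.asarray(v_z, dtype=float)
    shifted = x_z + v_z / hubble * (1.0 + redshift)
    if box_size is not None:
        shifted = np.mod(shifted, box_size)
    return shifted
```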

In the following subsections, we introduce six additional physically motivated HOD parameters.

2.2 Satellite profile parameters s and s_p

We first introduce two parameters that allow for flexibility in the radial distribution of satellite galaxies within each halo.

The radial distance parameter, s, deviates the satellite spatial distribution away from the halo matter density profile by giving preference to particles based on their radial distance to halo centre. A positive s preferentially situates satellites on the outskirts of the halo, whereas a negative s preferentially situates satellites towards the inner region of the halo. Fig. 2 of Yuan et al. (2018) shows how changing s affects the mock 2PCF. The range of s is defined to be between −1 and 1. The s parameter is motivated by baryonic processes that can bias the concentration of baryons within the dark matter potential well (e.g. Abadi et al. 2010; Duffy et al. 2010; Chua et al. 2017; Peirani et al. 2017).

The perihelion distance parameter, s_p, is related to the s parameter but additionally folds in the velocity information of the particles. Specifically, it gives preference to particles based on their perihelion distance to the halo centre, i.e. their closest approach distance to the halo centre given their current trajectory. The perihelion distance is calculated by solving the Kepler equations within an NFW potential well (equations 6–9 in Yuan et al. 2018). The impact of s_p on the 2PCF is shown in fig. 5 of Yuan et al. (2018). This parameter is motivated by processes such as ram pressure stripping and tidal disruption.

2.3 Velocity bias parameters s_v and α_c

Another important set of parameters we employ in order to accurately model the redshift-space correlation function is the satellite and central velocity bias parameters (e.g. Berlind et al. 2003; Yoshikawa, Jing & Börner 2003; van den Bosch et al. 2005; Skibba et al. 2011; Guo et al. 2015).

First, we define the satellite velocity bias parameter, s_v, which biases the satellite velocity distribution away from that of the host halo. A positive s_v preferentially assigns satellites to high peculiar velocity particles of the halo, and vice versa. We note that our implementation lets the satellites assume the peculiar velocity of their underlying matter field, thus guaranteeing that the satellite galaxies still obey Newtonian physics in the halo potential. This is in contrast to existing velocity bias implementations where satellite velocities are increased/decreased without altering their positions, breaking Newtonian physics. A key difference between these two approaches is that our velocity bias implementation has a small effect on the projected correlation function, whereas existing implementations have strictly zero impact on the projected clustering. Fig. 4 of Yuan et al. (2018) shows how s_v affects the predicted 2PCF. The range of s_v is defined to be between −1 and 1.

We also introduce the central velocity bias parameter, α_c. In the baseline implementation, the central galaxy is assumed to have the position and velocity of the halo centre of mass (CoM). When invoking velocity bias, the central galaxy velocity is given by

$$\begin{eqnarray*} v_{\mathrm{cent}, z} = v_{\mathrm{CoM}, z} + \delta v(\alpha _\mathrm{ c} \sigma _{\mathrm{LOS}}), \end{eqnarray*}$$

(3)

where v_{cent, z} is the LOS velocity of the central, and v_{CoM, z} is the LOS velocity of the halo CoM. σ_LOS is the LOS velocity dispersion of the halo particles. α_c is the central velocity bias parameter. δv is drawn from a Gaussian distribution with zero mean and standard deviation of α_cσ_LOS. The central velocity bias has strictly no effect on projected clustering, but affects the ‘length’ of the Fingers-of-God in redshift space. While α_c can technically vary between 0 and +∞, we expect the true α_c to be not greater than 1.
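Equation (3) can be illustrated with the short Python sketch below, which draws δv from a zero-mean Gaussian of width α_cσ_LOS for each halo. The function name and array layout are hypothetical, and `rng` is assumed to be a `numpy.random.Generator`.

```python
import numpy as np


def central_los_velocity(v_com_z, sigma_los, alpha_c, rng):
    """LOS velocity of centrals with velocity bias (equation 3).

    v_com_z   : LOS velocity of each halo centre of mass
    sigma_los : LOS velocity dispersion of each halo's particles
    alpha_c   : central velocity bias parameter (>= 0)
    """
    v_com_z = np.asarray(v_com_z, dtype=float)
    sigma_los = np.asarray(sigma_los, dtype=float)
    # One Gaussian offset per halo, with standard deviation alpha_c * sigma_LOS.
    delta_v = rng.normal(loc=0.0, scale=alpha_c * sigma_los)
    return v_com_z + delta_v
```

Setting `alpha_c = 0` recovers the baseline case in which the central moves with the halo centre of mass.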

2.4 Assembly bias parameters A and A_e

So far, all our extensions to the baseline HOD have only dealt with the distribution of galaxies within each halo while respecting the assumption that the number of galaxies depends only on halo mass. In this section, we relax this assumption by introducing two secondary dependencies: halo concentration and halo environment.

We first define the concentration-based assembly bias parameter, A. The motivation for using halo concentration as the secondary parameter is that it is correlated with the formation histories of dark matter haloes, with earlier forming haloes having higher concentrations at fixed halo mass (Wechsler et al. 2002, 2006; Zhao et al. 2003, 2009; Villarreal et al. 2017). In this work, we define halo concentration as

$$\begin{eqnarray*} c = \frac{r_{\textrm {vir}}}{r_{\mathrm{ s}, \textrm {Klypin}}}, \end{eqnarray*}$$

(4)

where r_vir is the virial radius of the halo and r_{s, Klypin} is the velocity-based Klypin scale radius (Klypin, Trujillo-Gomez & Primack 2011). Our assembly bias implementation is based on a routine that preserves the overall galaxy number density, first described in Yuan et al. (2018). To summarize briefly, we rank all haloes by their mass, and compute the corresponding list of expected number of galaxies using the baseline HOD that depends only on mass. Then, we perturb the ranking of haloes by defining a pseudo-mass:

$$\begin{eqnarray*} \log _{10} M_\mathrm{pseudo} = \log _{10} M + A\delta _\mathrm{ c}, \end{eqnarray*}$$

(5)

where |$\delta _\mathrm{ c} = (c - \bar{c}(M))/\sigma _\mathrm{ c}(M)$| is the halo concentration minus the mean concentration in that specific mass bin, normalized by the corresponding scatter in concentration. We perturb the halo ranking by sorting by M_pseudo and then map the unperturbed list of expected numbers of galaxies on to the perturbed list of haloes bijectively. Our implementation essentially swaps the galaxies between haloes while preserving the total number density of galaxies.

However, it does not preserve the expected number of galaxies for a given halo mass |$\langle \bar{n}_\mathrm{ g}|M\rangle$|, in contrast to the assembly bias implementation in the halotools-decorated HOD framework (Behroozi, Wechsler & Wu 2013; Hearin et al. 2016). A detailed description of our implementation can be found in section 3 of Yuan et al. (2018). Fig. 6 of Yuan et al. (2018) shows the effect of A on the predicted 2PCF. The range of A is technically between −∞ and ∞, but we expect A to be between −1 and 1.
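The reranking step around equation (5) can be sketched in Python as below. This is an illustrative reading of the procedure, not the code of Yuan et al. (2018): the function name is hypothetical, δ_c is assumed to have been computed beforehand in mass bins, and ties are broken by a stable sort.

```python
import numpy as np


def rerank_occupations(log10_mass, delta_c, n_expected, A):
    """Reassign mass-only expected occupations by pseudo-mass rank.

    log10_mass : log10 of halo mass
    delta_c    : normalized concentration offset of each halo
    n_expected : expected galaxy number from the mass-only HOD
    A          : concentration-based assembly bias parameter
    """
    log10_mass = np.asarray(log10_mass, dtype=float)
    n_expected = np.asarray(n_expected, dtype=float)
    log10_pseudo = log10_mass + A * np.asarray(delta_c, dtype=float)

    mass_order = np.argsort(log10_mass, kind="stable")
    pseudo_order = np.argsort(log10_pseudo, kind="stable")

    # The halo with the k-th smallest pseudo-mass receives the occupation
    # that the halo with the k-th smallest true mass had originally.
    reassigned = np.empty_like(n_expected)
    reassigned[pseudo_order] = n_expected[mass_order]
    return reassigned
```

Because the output is a permutation of `n_expected`, the summed expected number of galaxies is unchanged, while the mean occupation at fixed halo mass is generally not.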

Similarly, we define the environmental assembly bias parameter, A_e, which incorporates the halo environment as the secondary dependence. To define halo environment, we adopt the same formalism as Hadzhiyska et al. (2020). Specifically, for each halo, we find all neighbouring haloes (including subhaloes) beyond its virial radius but within r_max = 5 h⁻¹ Mpc of the halo centre. We sum the mass of all these neighbouring haloes as M_env, and we compute the environment factor, f_env, as

$$\begin{eqnarray*} f_\mathrm{env} = M_\mathrm{env} / \bar{M}_\mathrm{env}(M), \end{eqnarray*}$$

(6)

where |$\bar{M}_\mathrm{env}(M)$| is the mean M_env within halo mass bin M. Finally, we incorporate f_env into the pseudo-mass definition in equation (5) and introduce the environmental assembly bias parameter A_e:

$$\begin{eqnarray*} \log _{10} M_\mathrm{pseudo} = \log _{10} M + A\delta _\mathrm{ c} + A_\mathrm{ e} f_\mathrm{env}. \end{eqnarray*}$$

(7)

Again, we incorporate these assembly bias effects into our HOD by reranking the haloes with M_pseudo to essentially swap galaxies between haloes of different concentration and environment. The choice of r_max is meant to be large enough to capture the immediate vicinity of the halo without extending deep into the two-halo regime. We revisit this definition in Section 6.
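Equations (6) and (7) can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the paper: neighbours are found by brute-force pairwise distances (suitable only for small halo samples), periodic boundaries are ignored, the mass bins are supplied by the user, and all function names are hypothetical.

```python
import numpy as np


def environment_mass(pos, mass, r_vir, r_max=5.0):
    """Sum the mass of neighbours beyond r_vir but within r_max of each halo.

    Brute-force O(N^2) search without periodic wrapping (assumption).
    """
    pos = np.asarray(pos, dtype=float)
    mass = np.asarray(mass, dtype=float)
    r_vir = np.asarray(r_vir, dtype=float)
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    in_shell = (dist > r_vir[:, None]) & (dist < r_max)
    np.fill_diagonal(in_shell, False)  # a halo is not its own neighbour
    return (in_shell * mass[None, :]).sum(axis=1)


def environment_factor(m_env, log10_mass, bin_edges):
    """f_env = M_env / mean(M_env) within each halo mass bin (equation 6)."""
    m_env = np.asarray(m_env, dtype=float)
    idx = np.digitize(log10_mass, bin_edges)
    f_env = np.zeros_like(m_env)
    for b in np.unique(idx):
        sel = idx == b
        mean_env = m_env[sel].mean()
        if mean_env > 0:
            f_env[sel] = m_env[sel] / mean_env
    return f_env


def log10_pseudo_mass(log10_mass, delta_c, f_env, A, A_e):
    """Pseudo-mass with both assembly bias terms (equation 7)."""
    return log10_mass + A * delta_c + A_e * f_env
```

The resulting pseudo-mass would then be used to rerank the haloes in the same way as for the concentration-only case.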

It is important to note that while our implementation has the distinct advantage of incorporating multiple secondary dependencies, our assembly bias model is also limited by the fact that we do not distinguish between assembly biases for the central galaxies and satellite galaxies, as exemplified in some earlier assembly bias frameworks (e.g. Hearin et al. 2016; Xu et al. 2021). Recent simulation-based works have also found evidence that centrals and satellites might indeed have distinct assembly bias signatures (e.g. Bose et al. 2019). For this work, we do not make this distinction for model simplicity and to limit the number of necessary parameters. We explore alternative assembly bias models that distinguish between centrals and satellites in upcoming work.

2.5 Incompleteness factor f_ic

The final parameter in our extended HOD model is the incompleteness factor (e.g. Leauthaud et al. 2016; Rodríguez-Torres et al. 2016; Guo, Yang & Lu 2018). The inclusion of incompleteness is partially motivated by detection of incompleteness in the BOSS CMASS and LOWZ galaxy samples compared with theoretical stellar mass functions, but it is also needed to marginalize over uncertainties in the galaxy number densities since we do not model its full redshift dependence. We define f_ic as a modification to |$\bar{n}_{\mathrm{cent}}$| in equation (1):

$$\begin{eqnarray*} \bar{n}_{\mathrm{cent}} = \frac{f_\mathrm{ic}}{2}\mathrm{erfc} \left[\frac{\ln (M_{\mathrm{cut}}/M)}{\sqrt{2}\sigma }\right], \end{eqnarray*}$$

(8)

where 0 < f_ic ≤ 1. Our implementation simply uniformly downsamples the mock galaxies by f_ic to produce the desired galaxy number density.

To summarize, our extended HOD model contains the five baseline parameters: M_cut, M₁, σ, α, and κ; six extended parameters: s, s_p, s_v, α_c, A, and A_e; and one incompleteness factor f_ic, for a total of 12 parameters.

3 DATA AND SIMULATIONS

3.1 BOSS CMASS galaxy sample

The Baryon Oscillation Spectroscopic Survey (BOSS; Bolton et al. 2012; Dawson et al. 2013) is part of the SDSS-III programme (Eisenstein et al. 2011). BOSS Data Release 12 (DR12) provides redshifts for 1.5 million galaxies in an effective area of 9329 deg² divided into two samples: LOWZ and CMASS. The LOWZ galaxies are selected to be the brightest and reddest of the low-redshift galaxy population at z < 0.4, whereas the CMASS sample is designed to isolate galaxies of approximately constant mass at higher redshift (z > 0.4), most of them being also luminous red galaxies (LRGs; Reid et al. 2016; Rodríguez-Torres et al. 2016). The survey footprint is divided into chunks that are covered in overlapping plates of radius |${\sim} 1.49^{\circ}$|. Each plate can house up to 1000 fibres, but due to the finite size of the fibre housing, no two fibres can be placed closer than 62 arcsec, referred to as the fibre collision scale (Guo, Zehavi & Zheng 2012).

For this paper, we limit our measurements to the galaxy sample in the redshift range 0.46 < z < 0.6 in DR12. We choose this moderate redshift range for completeness and to minimize the systematics due to redshift evolution. Applying this redshift range to both the North and South Galactic Caps gives a total of approximately 600 000 galaxies in our sample. We showcase the number density variation over redshift in Fig. 1. The average galaxy number density is given by n_data = (3.01 ± 0.03) × 10⁻⁴ h³ Mpc⁻³.

Figure 1.

The CMASS DR12 galaxy comoving number density distribution across our redshift range. The red dashed line corresponds to the North Galactic Cap, whereas the blue dotted line corresponds to the South Galactic Cap. The green solid line shows the combined number density. The two vertical lines mark z = 0.46 and z = 0.6, respectively.


Fig. 2 shows the redshift-space 2PCF of the same BOSS sample and its corresponding correlation matrix, assuming Planck 2015 cosmology. The redshift-space 2PCF ξ(r_p, π) is computed using the Landy & Szalay (1993) estimator:

$$\begin{eqnarray*} \xi (r_\mathrm{ p}, \pi) = \frac{\mathrm{ DD} - 2\mathrm{ DR} + \mathrm{ RR}}{\mathrm{ RR}}, \end{eqnarray*}$$

(9)

where DD, DR, and RR are the normalized numbers of data–data, data–random, and random–random pair counts in each bin of (r_p, π), and r_p and π are transverse and LOS separations in comoving units. For this paper, we choose a coarse binning to ensure reasonable accuracy on the covariance matrix, with eight logarithmically spaced bins between 0.169 and 30 h⁻¹ Mpc in the transverse direction, and six linearly spaced bins between 0 and 30 h⁻¹ Mpc along the LOS direction.
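The binning described above and the estimator in equation (9) can be written compactly in Python. The sketch below assumes the normalized pair counts have already been computed by a separate pair counter; the function name is hypothetical, and returning NaN for bins with no random pairs is a choice made here for illustration.

```python
import numpy as np

# Eight logarithmic bins in r_p and six linear bins in pi (both in h^-1 Mpc).
rp_edges = np.logspace(np.log10(0.169), np.log10(30.0), 9)
pi_edges = np.linspace(0.0, 30.0, 7)


def landy_szalay(dd, dr, rr):
    """Landy & Szalay (1993) estimator from normalized pair counts.

    dd, dr, rr : arrays of identical shape, e.g. (8, 6) for the
                 (r_p, pi) binning above. Bins with rr == 0 yield NaN.
    """
    dd, dr, rr = (np.asarray(a, dtype=float) for a in (dd, dr, rr))
    xi = np.full(rr.shape, np.nan)
    ok = rr > 0
    xi[ok] = (dd[ok] - 2.0 * dr[ok] + rr[ok]) / rr[ok]
    return xi
```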

Figure 2.

The redshift-space two-point correlation function (2PCF; left) of the BOSS CMASS DR12 galaxies at 0.46 < z < 0.6 and its corresponding correlation matrix (right). r_p is the transverse comoving distance between galaxies. π is the LOS comoving distance between galaxies. The correlation matrix is calculated from 400 jackknife subsamples. The bins are formed by flattening the ξ bins column-by-column, with large bin number corresponding to large r_p. For example, the bins 0–5 correspond to the first r_p bin but increasing π.


We have corrected the fibre collision effect following the method of Guo et al. (2012), by separating galaxies into collided and decollided populations and assuming those collided galaxies with measured redshifts in the plate-overlap regions are representative of the overall collided population. The final corrected correlation function can be obtained by summing up the contributions from the two populations.

The correlation matrix is defined relative to the covariance matrix as |$\mathrm{Corr}(\xi)_{ij} = \mathrm{Cov}(\xi)_{ij}/\sqrt{\mathrm{Cov}(\xi)_{ii}\mathrm{Cov}(\xi)_{jj}}$|⁠. The covariance matrix is calculated from 400 jackknife samples. The x and y axes of the right-hand panel in Fig. 2 show the same bins as on the left-hand panel, flattened in a column-by-column fashion such that the transverse separation r_p increases with bin number. Overall, we see that the off-diagonal power is relatively small at small transverse scales and becomes more significant at larger transverse scales. This suggests that the error at small r_p is shot-noise dominated, while sample variance becomes dominant at large r_p.
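The relation between the covariance and correlation matrices, together with a jackknife covariance estimate, can be sketched in Python as below. The (N − 1)/N prefactor is the standard delete-one jackknife normalization and is an assumption here, since the text does not spell out the normalization used; the function names are hypothetical.

```python
import numpy as np


def jackknife_covariance(xi_jk):
    """Delete-one jackknife covariance.

    xi_jk : array of shape (n_jk, n_bins); row k is the flattened 2PCF
            measured with jackknife region k removed.
    """
    xi_jk = np.asarray(xi_jk, dtype=float)
    n_jk = xi_jk.shape[0]
    resid = xi_jk - xi_jk.mean(axis=0)
    # Assumption: standard (N - 1) / N jackknife prefactor.
    return (n_jk - 1.0) / n_jk * resid.T @ resid


def correlation_matrix(cov):
    """Corr_ij = Cov_ij / sqrt(Cov_ii * Cov_jj)."""
    cov = np.asarray(cov, dtype=float)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)
```

For the binning used here, `xi_jk` would have shape (400, 48), with the 8 × 6 bins flattened so that r_p increases with bin number.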

3.2 N-body simulations and halo finders

For the primary results of this paper, we generate our galaxy mocks from the AbacusCosmos N-body simulation suite, generated by the fast and high-precision abacus N-body code (Garrison et al. 2016, 2018; Ferrer et al., in preparation; Metchnik & Pinto, in preparation).² We use 20 boxes of comoving size 1100 h⁻¹ Mpc with Planck 2015 cosmology (Planck Collaboration XIII 2016) at redshift z = 0.5. We quote the cosmology parameter values as Ω_ch² = 0.1199, Ω_bh² = 0.02222, σ₈ = 0.830, n_s = 0.9652, h = 0.6726, and w₀ = −1. These boxes are set to different initial phases to generate unique outputs. Each box contains 1440³ dark matter particles of mass |$4\times 10^{10}\, h^{-1}\,\mathrm{ M}_{\odot }$|⁠. The force softening length is 0.06 h⁻¹ Mpc. Dark matter haloes are found and characterized using the rockstar (Behroozi et al. 2013) halo finder.

When testing the environmental assembly bias in Section 5.4, we also incorporate the AbacusSummit simulation suite, which is a set of large, high-accuracy cosmological N-body simulations designed to meet the cosmological simulation requirements of the Dark Energy Spectroscopic Instrument (DESI) survey (Levi et al. 2013). AbacusSummit consists of over 140 simulations, most of which contain 6912³ particles within a box of side 2 h⁻¹ Gpc, which yields a particle mass of 2.1 × 10⁹ h⁻¹ M_⊙.³

The AbacusSummit boxes that we employ include one with the primary Planck 2018 ΛCDM cosmology as the benchmark and four other boxes with perturbed cosmologies, specifically varying cosmological parameters Ω_c and σ₈. The Planck 2018 cosmology parameters are Ω_ch² = 0.1200, Ω_bh² = 0.02237, σ₈ = 0.811355, n_s = 0.9649, h = 0.6736, w₀ = −1, and w_a = 0. Note that this is a slightly different cosmology than the Planck 2015 cosmology used for the AbacusCosmos simulations, the largest difference being a ∼2 per cent smaller σ₈ in the new cosmology. We concentrate our study of these simulations at redshift z = 0.5 and make use of the output data products including particle subsamples, CompaSO (Hadzhiyska et al., in preparation) and rockstar halo catalogues.
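As a consistency check, the particle masses quoted for the two suites follow from the box size, particle number, and matter density. The Python sketch below makes two assumptions: the matter density is approximated as Ω_mh² = Ω_ch² + Ω_bh² (neglecting any neutrino contribution), and the critical density is taken as 2.775 × 10¹¹ h² M_⊙ Mpc⁻³. The function name is hypothetical.

```python
# Critical density today in h^2 M_sun Mpc^-3 (assumed constant value).
RHO_CRIT = 2.775e11


def particle_mass(omega_c_h2, omega_b_h2, h, box_size, n_per_dim):
    """Particle mass in h^-1 M_sun for a box of side box_size (h^-1 Mpc).

    Assumption: Omega_m h^2 = Omega_c h^2 + Omega_b h^2 (no neutrinos).
    """
    omega_m = (omega_c_h2 + omega_b_h2) / h ** 2
    return omega_m * RHO_CRIT * box_size ** 3 / n_per_dim ** 3


# AbacusCosmos: 1100 h^-1 Mpc box, 1440^3 particles, Planck 2015 parameters.
m_cosmos = particle_mass(0.1199, 0.02222, 0.6726, 1100.0, 1440)
# AbacusSummit: 2000 h^-1 Mpc box, 6912^3 particles, Planck 2018 parameters.
m_summit = particle_mass(0.1200, 0.02237, 0.6736, 2000.0, 6912)
```

Under these assumptions the two values come out close to the quoted 4 × 10¹⁰ and 2.1 × 10⁹ h⁻¹ M_⊙, respectively.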

The two simulation suites also utilize different halo finders. The AbacusCosmos simulations use rockstar, whereas the AbacusSummit simulations use CompaSO.

rockstar is a temporal, phase-space halo finder considered to be highly accurate in determining particle–halo membership, as it uses information about both the phase-space distribution of the particles and their temporal evolution (Behroozi et al. 2013). This is because having information about the relative motion of two haloes makes the process of finding tidal remnants and determining halo boundaries substantially more effective, while having temporal information helps to maximize the consistency of halo properties across time, rather than just within a single snapshot.

The CompaSO halo-finding algorithm was designed for the AbacusSummit suite of high-performance cosmological N-body simulations, as a highly efficient on-the-fly group finder (Hadzhiyska et al., in preparation). CompaSO builds on the existing spherical overdensity (SO) algorithm by taking into consideration the tidal radius around a smaller halo before competitively assigning halo membership to the particles in an effort to more effectively deblend haloes. Among other features, the CompaSO finder also allows for the formation of new haloes on the outskirts of growing haloes, which alleviates a known issue of configuration-space halo finders of failing to identify haloes close to the centres of larger haloes.

A detailed comparison between the two halo finders will be made in Hadzhiyska et al. (in preparation).

4 METHODS

The fundamental technical challenge is to search a high-dimensional extended HOD parameter space for a prescription that best reproduces the observed redshift-space 2PCF signal and galaxy number density. Perhaps the most popular approach for fitting an extended HOD model to data is to employ a so-called emulator. In the emulator approach, we first train a surrogate model of the observable on a training set of HODs and then fit that surrogate model to the data. An example of an HOD emulator with the extended HOD model in this work was implemented in Yuan et al. (2020). However, we found that the emulator for such a high-dimensional parameter space has poor accuracy and breaks down outside the training range. The quadratic emulator model we employed was also not flexible enough to cover a training range wide enough for a comprehensive search. Thus, in this paper, we opt for a direct global optimization of the likelihood function to find the optimal HOD. A direct optimization can explore a much wider range of parameter space and is not limited by the shape of the surrogate model in an emulator. However, it requires repeatedly generating mock galaxy catalogues from simulations, which can be prohibitively expensive, especially since we are utilizing the particle catalogue in addition to the halo catalogue in this analysis.

In this section, we describe our likelihood function, its maximization, and then the key methods implemented to accelerate the HOD code.

4.1 The maximum likelihood routine

We assume a Gaussian likelihood and express the log-likelihood in terms of χ². The χ² is given in two parts, corresponding to errors on the redshift-space 2PCF and errors on the galaxy number density:

$$\begin{eqnarray*} \chi ^2 = \chi ^2_{\xi } + \chi ^2_{n_\mathrm{ g}}, \end{eqnarray*}$$

(10)

where

$$\begin{eqnarray*} \chi ^2_{\xi } = (\boldsymbol {\xi }_{\mathrm{mock}} - \boldsymbol {\xi }_{\mathrm{data}})^\mathrm{ T} \boldsymbol{\sf C}^{-1}(\boldsymbol {\xi }_{\mathrm{mock}} - \boldsymbol {\xi }_{\mathrm{data}}) \end{eqnarray*}$$

(11)

and

$$\begin{eqnarray*} \chi ^2_{n_\mathrm{ g}} = \left\lbrace \begin{array}{@{}l@{\quad }l@{}}\left(\frac{n_{\mathrm{mock}} - n_{\mathrm{data}}}{\sigma _{n}/5}\right)^2 & (n_{\mathrm{mock}} < n_{\mathrm{data}}), \\ \left(\frac{n_{\mathrm{data}}(1/f_{\mathrm{ic}} - 1)}{\sigma _{n}}\right)^2 & (n_{\mathrm{mock}} \ge n_{\mathrm{data}}). \end{array}\right. \end{eqnarray*}$$

(12)

We define |$\boldsymbol{\sf C}$| as the jackknife covariance matrix on ξ, and σ_n is the jackknife uncertainty of the galaxy number density. The |$\chi ^2_{n_\mathrm{ g}}$| is an asymmetric normal around the observed number density n_data. When the mock number density is higher than data number density (n_mock ≥ n_data), we invoke the incompleteness fraction f_ic that uniformly downsamples the mock galaxies to match the data number density. In this case, we penalize f_ic values that are far from 1. When the mock number density is less than the data number density (n_mock < n_data), we give a much steeper penalty on the difference between n_mock and n_data. This definition of |$\chi ^2_{n_\mathrm{ g}}$| allows for modest incompleteness in the observed galaxy sample while penalizing HOD models that produce insufficient galaxy number density or too many galaxies. For the rest of this paper, we set n_data = 3.0 × 10⁻⁴ h³ Mpc⁻³ and σ_n = 4.0 × 10⁻⁵ h³ Mpc⁻³. Note that we choose a more lenient σ_n than the jackknife uncertainty on galaxy number density in Section 3.1 because we want to explore a larger HOD parameter space.
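As a minimal sketch of equations (10)–(12), the two χ² terms can be written as follows. The function names are hypothetical and are not taken from our pipeline, and the sketch assumes that the incompleteness fraction is f_ic = n_data/n_mock whenever n_mock ≥ n_data, which is how we read the uniform downsampling described above.

```python
import numpy as np

N_DATA = 3.0e-4   # observed number density in h^3 Mpc^-3 (value quoted in the text)
SIGMA_N = 4.0e-5  # adopted uncertainty in h^3 Mpc^-3 (value quoted in the text)

def chi2_xi(xi_mock, xi_data, inv_cov):
    """Equation (11): quadratic form of the 2PCF residual with the inverse covariance."""
    delta = np.asarray(xi_mock, dtype=float) - np.asarray(xi_data, dtype=float)
    return float(delta @ inv_cov @ delta)

def chi2_ng(n_mock, n_data=N_DATA, sigma_n=SIGMA_N):
    """Equation (12): asymmetric penalty on the mock galaxy number density."""
    if n_mock < n_data:
        # Too few galaxies: five times narrower width, hence a steeper penalty.
        return ((n_mock - n_data) / (sigma_n / 5.0)) ** 2
    # Enough galaxies: downsample by f_ic (assumed to be n_data / n_mock)
    # and penalize values of f_ic far from 1.
    f_ic = n_data / n_mock
    return (n_data * (1.0 / f_ic - 1.0) / sigma_n) ** 2

def chi2_total(xi_mock, xi_data, inv_cov, n_mock):
    """Equation (10): sum of the clustering and number density terms."""
    return chi2_xi(xi_mock, xi_data, inv_cov) + chi2_ng(n_mock)
```

With these values, a mock that overshoots the observed density by one σ_n (n_mock = 3.4 × 10⁻⁴ h³ Mpc⁻³) contributes 1 to χ², whereas a mock that undershoots by the same amount contributes 25.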

To minimize χ², we utilize a global optimization algorithm known as covariance matrix adaptation evolution strategy (CMA-ES; Hansen & Ostermeier 2001). CMA-ES is an evolutionary algorithm that stochastically varies and selects on a population of candidate solutions, resembling the evolution of a biological system. In the simplest terms, the algorithm works by updating the mean and covariance matrix of the sampling distribution in each step to increase the probability of previously successful search vectors, until the candidate solutions converge to the global optimum. The algorithm also records and exploits the time-wise history of the search for faster stepping while also preventing premature convergence. The specific implementation of CMA-ES we use is part of the publicly available STOCHastic OPtimization for PYthon (StochOPy) package.⁴ To assess the error bars on the best fit, we run 22 Markov chain Monte Carlo (MCMC) chains initialized around the best fit with the emcee package (Foreman-Mackey et al. 2013). We quote our 1σ error bars as the standard deviation of the marginalized posterior distribution.
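The core idea of updating the mean and covariance from successful search steps can be illustrated with a deliberately simplified evolution strategy. This is not the full CMA-ES algorithm and not the StochOPy implementation: it omits the evolution paths, step-size control, and weighted recombination, and the function name, toy objective, and settings are all illustrative assumptions.

```python
import numpy as np

def simple_evolution_strategy(objective, mean, sigma, n_gen=60, pop=40,
                              n_elite=10, seed=0):
    """Toy mean-and-covariance update; a simplification, NOT full CMA-ES."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    dim = mean.size
    cov = sigma ** 2 * np.eye(dim)
    for _ in range(n_gen):
        # Draw a population of candidate solutions from the current Gaussian.
        cand = rng.multivariate_normal(mean, cov, size=pop)
        scores = np.array([objective(c) for c in cand])
        elite = cand[np.argsort(scores)[:n_elite]]
        # Successful steps are measured from the OLD mean, so the covariance
        # stretches along directions that keep producing improvements.
        steps = elite - mean
        cov = steps.T @ steps / n_elite + 1e-12 * np.eye(dim)
        mean = elite.mean(axis=0)
    return mean, cov

def toy_chi2(x):
    """Anisotropic quadratic with its minimum at (1, -2); purely illustrative."""
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

best, final_cov = simple_evolution_strategy(toy_chi2, mean=[4.0, 3.0], sigma=1.0)
```

In an actual fit, the toy objective would be replaced by the χ² of a full HOD evaluation, which is why the cost per evaluation dominates the run time.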

4.2 Accelerating the HOD code

The key challenge in our HOD global optimization is speed. Each HOD evaluation requires generating mock galaxies on the halo catalogues and then computing summary statistics. The first step is particularly time-consuming since we are using a particle-based HOD. In the following paragraphs, we describe several key speed-ups to enable a fast particle-based HOD implementation.

The first and most obvious speed-up comes from parallelization. Given that generating mock galaxies is a so-called ‘embarrassingly parallel’ problem on the halo level, the performance gain scales roughly linearly with the number of CPU cores utilized, given a sufficient amount of memory and I/O bandwidth. All computation in this analysis is done on a custom-built desktop, where we distribute the computation over 20 cores on a pair of Intel Xeon E5-2630v4 CPUs clocked at 2.2 GHz for a roughly 20× performance gain.

Another ∼20× speed-up comes from utilizing a numba just-in-time (JIT) compiler (Lam, Pitrou & Seibert 2015), which converts slow python code to fast machine code. numba is especially powerful for long loops of unvectorized code, which is the case for generating mock satellite galaxies. Note that generating satellites is the main performance bottleneck as generating central galaxies does not query the particle subsamples. The numba compiler brings the time to generate the satellites down to ∼10× that of the centrals.

The third speed-up comes from pre-downsampling the haloes and particles for satellite generation. The idea here is that satellites are rare compared to centrals, especially at halo masses below 10¹⁴ h⁻¹ M_⊙. Thus, we aggressively downsample the haloes at smaller halo mass and correspondingly upweight the expected number of satellites in each of the downsampled haloes. This way, we significantly reduce the number of haloes looped over in each HOD evaluation without losing fidelity, yielding a significant performance improvement. Similarly, we can downsample the particles in the haloes to further increase performance. With our final choice of downsampling functions, we achieve another 3× speed-up in our HOD evaluation. We suspect that our downsampling is still relatively conservative, and an even greater speed-up is attainable.
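A minimal sketch of the halo downsampling idea is given below. The keep-probability function, the toy mass distribution, and the parameter values are illustrative assumptions rather than our actual choices, and the satellite occupation is assumed to follow a standard power-law form. The point is only that weighting each retained halo by the inverse of its keep probability preserves the expected satellite count.

```python
import numpy as np

def mean_nsat(mass, m_cut, m1, alpha, kappa):
    """Assumed power-law satellite occupation, zero below kappa * m_cut."""
    x = np.clip((mass - kappa * m_cut) / m1, 0.0, None)
    return x ** alpha

def keep_probability(mass, m_full=1e14, p_min=0.1):
    """Hypothetical downsampling function: keep every halo above m_full,
    keep a declining fraction below it, and never drop below p_min."""
    return np.clip(mass / m_full, p_min, 1.0)

def downsample_haloes(mass, rng):
    """Randomly thin the haloes and return the survivors with 1/p weights."""
    p = keep_probability(mass)
    keep = rng.random(mass.size) < p
    return mass[keep], 1.0 / p[keep]

rng = np.random.default_rng(42)
mass = 10 ** rng.uniform(12.5, 15.0, size=200_000)  # toy halo masses
hod = dict(m_cut=10 ** 13.3, m1=10 ** 14.4, alpha=1.3, kappa=0.2)

full_total = mean_nsat(mass, **hod).sum()
kept_mass, weight = downsample_haloes(mass, rng)
down_total = (weight * mean_nsat(kept_mass, **hod).sum() if False
              else (weight * mean_nsat(kept_mass, **hod)).sum())
```

In this toy set-up, the loop over haloes shrinks to a little over half of its original length while the weighted satellite total agrees with the full-catalogue total to well within a per cent.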

I/O is another performance bottleneck. We largely overcome this by pre-loading the downsampled halo and particle files into memory before we start an optimization chain. To control memory usage, we only load relevant halo and particle information. Since the extended HOD parameters (s, s_p, s_v) only interact with the ranking of particle properties within each halo instead of the particle properties themselves, we can pre-compute these ranks and load them into memory. Another potential slowdown is when the HOD generates too many galaxies. To avoid this problem, we calculate the incompleteness factor f_ic to match the observed galaxy number density before we start an HOD evaluation. Then we scale down the number of centrals and satellites for each halo prior to assigning galaxies.
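The rank pre-computation can be sketched as follows. The function and array names are hypothetical; the sketch simply assigns each particle its rank in some property among the particles of the same halo, so that the ranks can be stored once and reused in every HOD evaluation.

```python
import numpy as np

def ranks_within_haloes(halo_id, prop):
    """Rank (0 = smallest) of each particle's property within its host halo."""
    halo_id = np.asarray(halo_id)
    prop = np.asarray(prop)
    # Sort by halo first, then by the property inside each halo.
    order = np.lexsort((prop, halo_id))
    sorted_ids = halo_id[order]
    # Index at which each halo's block of particles begins in the sorted array.
    starts = np.flatnonzero(np.r_[True, sorted_ids[1:] != sorted_ids[:-1]])
    counts = np.diff(np.r_[starts, sorted_ids.size])
    rank_sorted = np.arange(sorted_ids.size) - np.repeat(starts, counts)
    # Scatter the ranks back to the original particle ordering.
    ranks = np.empty(prop.size, dtype=np.int64)
    ranks[order] = rank_sorted
    return ranks

# Toy example: three particles in halo 0 and two in halo 1.
toy_ranks = ranks_within_haloes([0, 0, 1, 0, 1], [0.5, 0.1, 2.0, 0.3, 1.0])
```

Because only the ranks enter the extended HOD, the raw particle properties do not need to stay in memory during the optimization.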

Finally, we compute the redshift-space 2PCF using the high-performance corrfunc code in parallel (Sinha & Garrison 2020). This specific computation turns out to be fast compared to generating mock galaxies. For the AbacusCosmos simulations, our final optimized pipeline, given our machine specifications, evaluates a new HOD over a ∼10 h⁻³ Gpc³ volume and computes its χ² in roughly 7 s, 5 s of which are spent on generating mocks and 2 s on computing the 2PCF. The performance is similar for the AbacusSummit simulations. For the rest of this paper, we fit the HOD with only the first eight of the 20 AbacusCosmos simulation boxes, due to limitations in system memory. For the AbacusSummit fits, we only use one 2 h⁻¹ Gpc box at each cosmology, which is equivalent to approximately six AbacusCosmos boxes in volume.

5 RESULTS

Table 1 summarizes the four key fits of this study. All four fits are constrained on the observed ξ(r_p, π) and the observed galaxy number density. The four fits are identical except for which assembly bias terms are included. The first fit includes both A and A_e. The second and third fits only use one assembly bias term, A and A_e, respectively. The fourth fit includes neither. The comparison of these four fits lets us assess the importance of each assembly bias term. When a fit does not include an assembly bias parameter, we simply fix that parameter to 0. Besides the HOD parameters, we also list the best-fitting χ², the degrees of freedom, the Bayesian information criterion (BIC), and the average halo mass per galaxy. The χ² shown has been corrected for the finite simulation volume, and also corrected for the covariance matrix inversion bias following Hartlap, Simon & Schneider (2007). The limits of the top-hat prior constraints on the parameters are listed in the second column. The prior constraints on log M₁ are not shown because we choose to constrain the satellite fraction 0 < f_sat < 0.2 instead.

Table 1.

Summary of the key HOD fits in this study. The first column lists the HOD parameters, incompleteness factor f_ic, the final χ², degrees of freedom, and the average halo mass per galaxy |$\log _{10}\bar{M}_\mathrm{ h}/\mathrm{ M}_\odot$|⁠. The second column shows the prior constraints. The next four columns summarize the four key fits of this study. The prior constraints on log M₁ are not listed because we choose to constrain the satellite fraction 0 < f_sat < 0.2 instead. Note there is a one-to-one correspondence between M₁ and f_sat when the other HOD parameters are fixed. The corresponding best-fitting |$f_\mathrm{sat} = 9.4 \ {\rm per \ cent}$|⁠. The errors shown are 1σ marginalized errors.

Parameter name	Prior [min, max]	ξ fit with A and A_e	ξ fit with A	ξ fit with A_e	ξ fit with neither
log₁₀(M_cut/h⁻¹ M_⊙)	[12.5, 14]	13.33 ± 0.03	13.35 ± 0.03	13.37 ± 0.04	13.16 ± 0.02
log₁₀(M₁/h⁻¹ M_⊙)	–	14.47 ± 0.03	14.52 ± 0.02	14.33 ± 0.03	14.34 ± 0.02
σ	[0.1, 2.0]	0.61 ± 0.05	0.52 ± 0.07	0.94 ± 0.06	0.11 ± 0.09
α	[0.7, 1.5]	1.32 ± 0.05	1.39 ± 0.04	1.01 ± 0.04	1.16 ± 0.04
κ	[0.1, 2.0]	0.2 ± 0.1	0.1 ± 0.1	0.2 ± 0.1	0.2 ± 0.1
s	[−1.0, 1.0]	0.1 ± 0.1	0.0 ± 0.1	0.3 ± 0.1	0.6 ± 0.1
s_v	[−1.0, 1.0]	0.8 ± 0.1	0.6 ± 0.1	0.1 ± 0.1	0.5 ± 0.1
α_c	[0.0, 2.0]	0.22 ± 0.03	0.26 ± 0.01	0.07 ± 0.07	0.21 ± 0.02
s_p	[−1.0, 1.0]	−1.0 ± 0.1	−0.9 ± 0.2	−1.0 ± 0.1	−1.0 ± 0.1
A	[−1.0, 1.0]	−0.88 ± 0.08	−0.79 ± 0.06	–	–
A_e	[−1.0, 1.0]	0.040 ± 0.009	–	0.032 ± 0.05	–
f_ic	–	0.91	1.00	0.74	0.67
Final χ² (degrees of freedom)	–	50 (37)	67 (38)	73 (38)	93 (39)
BIC	–	92	107	111	128
|$\log _{10}\bar{M}_\mathrm{ h}/\mathrm{ M}_\odot$|	–	13.52	13.56	13.57	13.61

We define the average halo mass per galaxy simply as

$$\begin{eqnarray*} \bar{M}_\mathrm{ h} = \frac{\sum _\mathrm{ g} M_\mathrm{ h}}{N_\mathrm{ g}}, \end{eqnarray*}$$

(13)

where the numerator sums the halo masses of all galaxies in the mock, and N_g is the total number of galaxies in the mock. The average halo mass per galaxy characterizes the typical halo mass of a galaxy given an HOD prescription, which is interesting in assessing the effect of assembly bias and indicative of the galaxy–galaxy lensing prediction.
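Equation (13) amounts to a plain arithmetic mean over the host halo masses of the mock galaxies. A minimal sketch is shown below; the function name and the optional per-galaxy weights are illustrative assumptions. Note that the average is taken over the masses themselves and only then converted to a logarithm, which is not the same as averaging the log masses.

```python
import numpy as np

def log10_mean_host_mass(host_mass, weight=None):
    """log10 of the (optionally weighted) mean host halo mass per galaxy."""
    host_mass = np.asarray(host_mass, dtype=float)
    return float(np.log10(np.average(host_mass, weights=weight)))

# Toy example: one galaxy in a 1e13 halo and one in a 3e13 halo.
toy_log_mass = log10_mean_host_mass([1.0e13, 3.0e13])
```

For the toy input, the mean mass is 2 × 10¹³, so the function returns log₁₀ of that value rather than the mean of the two log masses.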

5.1 ξ(r_p, π) fit with both A and A_e

The first (leftmost) HOD fit listed in Table 1 uses the full set of extended HOD parameters including both assembly bias terms. We achieve a good fit on ξ(r_p, π), with a final χ² = 50 (degrees of freedom = 37) and a BIC of 92. For comparison, if we fit the ξ(r_p, π) with just the standard five-parameter HOD with no extensions, then we get χ² = 151 and a BIC of 174. This shows that the standard five-parameter HOD is strongly disfavoured by the redshift-space correlation function.

Fig. 3 visualizes the corresponding best-fitting 2PCF. The left-hand panel shows the best-fitting projected 2PCF in orange and the observation in blue. The middle panel shows the normalized difference between the best-fitting ξ(r_p, π) and the observation. The normalization σ(ξ) is derived from the diagonal of the inverse covariance matrix, i.e. |$\sigma = 1/\sqrt{\mathrm{diag}(\boldsymbol{\sf C}^{-1})}$|. The right-hand panel shows the χ² contribution from each bin, computed by multiplying array |$(\boldsymbol {\xi }_{\mathrm{mock}} - \boldsymbol {\xi }_{\mathrm{data}})$| with array |$\boldsymbol{\sf C}^{-1}(\boldsymbol {\xi }_{\mathrm{mock}} - \boldsymbol {\xi }_{\mathrm{data}})$| termwise. Summing over these terms gives the final |$\chi ^2_\xi$|, as in equation (11).
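The two per-bin quantities described above can be sketched in a few lines. The function name and the toy two-bin covariance matrix are illustrative assumptions; the sketch only demonstrates that the termwise products sum to the χ² of equation (11) and that the normalization σ is smaller than the square root of the covariance diagonal when the bins are correlated.

```python
import numpy as np

def per_bin_diagnostics(xi_mock, xi_data, cov):
    """Return the per-bin chi^2 contributions and sigma = 1/sqrt(diag(C^-1))."""
    inv_cov = np.linalg.inv(cov)
    delta = np.asarray(xi_mock, dtype=float) - np.asarray(xi_data, dtype=float)
    # Termwise product of the residual with C^-1 times the residual.
    contrib = delta * (inv_cov @ delta)
    sigma = 1.0 / np.sqrt(np.diag(inv_cov))
    return contrib, sigma

# Toy example: two bins with a correlation coefficient of 0.5.
toy_cov = np.array([[1.0, 0.5],
                    [0.5, 1.0]])
contrib, sigma = per_bin_diagnostics([1.0, 0.0], [0.0, 0.0], toy_cov)
```

Individual contributions can be negative when off-diagonal terms dominate; only their sum is guaranteed to be non-negative.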

Figure 3.

The best-fitting 2PCF using AbacusCosmos boxes, showing the ξ(r_p, π) fit with both assembly bias terms A and A_e. The left-hand panel compares the projected 2PCF of the best-fitting HOD (orange) to that of the data (blue). The error bars on the data are computed from the diagonal of the inverted covariance matrix. The middle panel compares the best-fitting ξ(r_p, π) with the data, where the errors, σ(ξ), are computed from the diagonal of the inverse covariance matrix. The right-hand panel showcases the contribution to the final χ² from each bin.

We see that the best-fitting HOD does provide a good fit to the projected 2PCF and the redshift-space 2PCF. While a few bins in ξ(r_p, π) seem to exhibit higher normalized error in the middle panel, it is important to note that the error bars σ(ξ) are computed from the diagonal of the inverse covariance matrix, underestimating the true uncertainty due to the high off-diagonal power in the covariance matrix (refer to the right-hand panel of Fig. 2). Thus, the actual difference between the mock and the data in these bins is less significant than it appears. The right-hand panel incorporates the full covariance matrix and is thus more informative in judging the consistency between data and mock in each bin. One bin (column 4 row 4) stands out as it contributes 7 to the final χ². Excluding this bin from the HOD fit does not meaningfully change the final parameter values.

Turning to the best-fitting parameter values, we first examine the assembly bias parameters. The concentration-based assembly bias parameter A varies somewhat depending on the details of the fit, but generally yields rather negative best-fitting values from −0.5 to −0.9. The variation is likely affected by degeneracies within the HOD space and multimodality in the likelihood surface, in which case the error bar shown is underestimated. The negative A has the effect of moving galaxies into less massive and puffier haloes. In terms of its clustering signature, a negative A increases the projected clustering at intermediate and large transverse scales (r_p > 1 h⁻¹ Mpc) while reducing clustering at large LOS separations, especially at small transverse scales (r_p < 1 h⁻¹ Mpc), suggesting a smaller velocity dispersion. Lange et al. (2019) also show degeneracies between A and cosmology, specifically fσ₈. Thus, a negative A could also indicate that our presumed fσ₈ is too high. Unfortunately, we do not have a sufficiently large range in the cosmologies probed by our simulations to properly verify this possibility. Thus, we leave it as an opportunity for a future study.

The environment-based assembly bias parameter A_e yields a best-fitting value of 0.04 and is stable across different fits, suggesting that galaxies preferentially populate haloes in denser environments. We point out that the relative magnitudes of the A and A_e parameters are deceptive, as the actual amplitude of the effect depends on the values of the normalized concentration δ_c and the environment factor f_env (equations 5 and 7). The actual contributions of A_e = 0.04 and A = −0.8 to ξ(r_p, π) are comparable in amplitude.

The other extended parameters also yield interesting fits. The parameter s, which sets the radial distribution of satellite galaxies, is slightly positive but consistent with zero. We find a large variation in the best-fitting s_v, anywhere from 0.2 to 0.8, depending on the details of the fit. This variation is again likely due to degeneracies between s_v and some other levers in the extended HOD, leading to multimodality in the likelihood surface. Regardless, the positive s_v has the effect of increasing satellite velocities and thus increasing the Fingers-of-God effect. Unlike s_v, the best fit for the central velocity bias parameter α_c is stable at ∼0.2 across different fits. This is consistent with the multipole fits in Guo et al. (2015). The best-fitting value of s_p is also consistently close to −1 across our fits, suggesting that the observation strongly favours putting some satellites on highly eccentric orbits that pass through the central regions of the halo. s_p is a novel addition to the HOD, and its significance possibly indicates the existence of a subset of infalling or splashback galaxies in the CMASS sample. This is an interesting result, and we reserve a more detailed discussion of s_p for a separate paper.

5.2 ξ(r_p, π) fit without both A and A_e

The second and third fits in Table 1 each incorporate only one assembly bias term, A and A_e, respectively. The fourth fit includes neither assembly bias term. Compared to the first fit with both A and A_e, the second and third fits yield significantly worse χ², with large increases in the BIC, ΔBIC = 15 and 19, respectively. Conversely, compared to the fit with no assembly bias, the inclusion of either A or A_e leads to significantly better fits. Comparing the first fit and the fourth fit, we see that the inclusion of both assembly bias terms is strongly favoured by the data, with a ΔBIC = −36. Thus, we conclude that the redshift-space 2PCF calls for the simultaneous inclusion of both a concentration-based assembly bias and an environmental assembly bias, at least in our HOD framework at Planck 2015 cosmology.
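For reference, the ΔBIC values quoted above follow directly from the BIC row of Table 1. As an assumption (the definition is not restated in this section), we take the standard form for a Gaussian likelihood with k free parameters and n data points, so that each additional parameter must lower χ² by more than ln n to be favoured:

$$\begin{eqnarray*} \mathrm{BIC} &=& \chi ^2 + k\ln n, \\ \Delta \mathrm{BIC}_{A\ \mathrm{only}} &=& 107 - 92 = 15, \\ \Delta \mathrm{BIC}_{A_\mathrm{ e}\ \mathrm{only}} &=& 111 - 92 = 19, \\ \Delta \mathrm{BIC}_{\mathrm{both\ versus\ neither}} &=& 92 - 128 = -36. \end{eqnarray*}$$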

Fig. 4 shows the residuals in the redshift-space 2PCF ξ(r_p, π) compared to the data when neither assembly bias term is included in the fit, i.e. the last fit shown in Table 1. Compared to the middle panel of Fig. 3, where we show the residuals of the HOD fit including both assembly bias terms, we see that the residuals here are noticeably larger, especially at large r_p ∼5–10 h⁻¹ Mpc. The LOS structure reproduced by the fit without assembly bias is also worse across all r_p. To explain this, note that on the one hand, the concentration-based assembly bias A has a strong effect on the galaxy velocity dispersion as it moves galaxies into more or less massive haloes. Thus, A is strongly sensitive to the LOS structure of ξ(r_p, π), at small and large r_p. This explains why the inclusion of A improves the fit on the LOS structure of ξ(r_p, π). The environmental assembly bias A_e, on the other hand, does not produce as strong a LOS signature, but it predominantly affects the projected clustering on intermediate scales r_p ∼2–10 h⁻¹ Mpc. This explains why the inclusion of A_e in the model reduces the residuals at r_p ∼5–10 h⁻¹ Mpc. This comparison between Fig. 4 and the middle panel of Fig. 3 directly showcases how the inclusion of the two assembly bias terms improves the ξ(r_p, π) fit and highlights their importance in a flexible HOD model.

Figure 4.

The residual in the redshift-space 2PCF ξ(r_p, π) relative to the data when neither assembly bias term is included in the HOD, corresponding to the last best fit listed in Table 1. Compared to the fit including both assembly bias terms (middle panel of Fig. 3), we see significantly larger residuals at large r_p and a worse prediction of the LOS structure.

In terms of the average halo mass per galaxy, we see that the inclusion of either assembly bias term results in a 10–12 |${{\rm per\ cent}}$| decrease compared to the fit with no assembly bias, whereas the inclusion of both terms results in a |$23{{\ \rm per\ cent}}$| decrease. In comparison, a five-parameter HOD plus parameters s, A, and A_e constrained on the projected correlation function w_p and galaxy number density gives an average mass of |$\bar{M}_\mathrm{ h} = 10^{13.62}\,\mathrm{ M}_\odot$|, |$26{{\ \rm per\ cent}}$| larger than that of the ξ(r_p, π) fit with both assembly bias terms turned on. This indicates that the redshift-space 2PCF prefers to assign galaxies to lower mass haloes at fixed bias, and the two assembly bias terms give the HOD the flexibility to do so, allowing for a much better fit. The projected 2PCF does not exhibit this preference, even when modelled with both assembly bias terms turned on. This result highlights the extra constraining power offered by the LOS structure of the redshift-space 2PCF. The decrease in typical halo mass for galaxies also has significant implications for the galaxy–galaxy lensing signal, as we will discuss in the following subsection.

While we should not overinterpret the final parameter values of these fits due to the poor χ², we can still compare the parameter values across these fits to gain intuition on what exactly is driving the fit. When A is included, the fit prefers a strongly negative value, which has the effect of increasing large-scale clustering while decreasing the typical halo mass of galaxies. However, a negative A also results in a smaller velocity dispersion on very small scales due to the less massive and puffier haloes. To compensate for this effect, the fit chooses larger log₁₀M₁, α, and s_v, which increases the Fingers-of-God effect on small scales. In other words, it seems that the strong clustering amplitude at large scales is driving galaxies into less massive haloes with a negative A, and the satellite distribution parameters are then tuned to match the small-scale Fingers-of-God signature.

The inclusion of A_e has a similar effect, where a positive A_e increases clustering on larger scales while decreasing the typical halo mass of galaxies. However, compared to A, the clustering signature of A_e is more dependent on r_p. Specifically, its signature is strongest in the 1 < r_p < 4 h⁻¹ Mpc range and weakens beyond that, whereas the signature of A remains strong up to 30 h⁻¹ Mpc. A_e also produces a weaker gradient along the LOS and has a rather small signature on small scales (r_p < 1 h⁻¹ Mpc) compared to A. Thus, the inclusion of A_e triggers less of a response from the parameters that control the satellites and the velocity dispersion, but it further decreases the average halo mass while boosting intermediate- to large-scale clustering.

5.3 The galaxy–galaxy lensing prediction

A well-known tension exists between galaxy clustering and galaxy–galaxy lensing. Leauthaud et al. (2017) found discrepancies of 20–40 |${{ \rm per\ cent}}$| between their measurements of galaxy–galaxy lensing for CMASS galaxies and a model predicted from mock galaxy catalogues generated at Planck cosmology that match the CMASS projected correlation function (Reid et al. 2014; Saito et al. 2016; see fig. 7 of Leauthaud et al. 2017). Lange et al. (2019) extended this result by finding a similar |${\sim } 25{{\ \rm per\ cent}}$| discrepancy between the projected clustering measurement and the galaxy–galaxy lensing measurement in the BOSS LOWZ sample. In Yuan et al. (2020), we reaffirmed this tension by simultaneously fitting the projected galaxy clustering and galaxy–galaxy lensing with an extended HOD incorporating a concentration-based assembly bias prescription. The left-hand panel of Fig. 5 reproduces this tension, where the blue curve showcases the observed lensing signal of the CMASS sample, and the red curve shows the prediction of an HOD constrained on the projected 2PCF. The dashed green curve indicates the joint fit from Yuan et al. (2020). The best-fitting HOD corresponding to the red curve achieves a very good fit of the projected 2PCF (χ² = 5.6 and degrees of freedom = 9), with |$\lt 1{{\ \rm per\ cent}}$| error across all bins. However, it provides a very poor lensing prediction, approximately 20–40 |${{ \rm per\ cent}}$| larger than the observation. The joint fit of the projected 2PCF w_p and the galaxy–galaxy lensing shown in the dashed green curve fails to fit either observable well, resulting in a 10–20 |${{ \rm per\ cent}}$| discrepancy with the observed w_p (shown in fig. 4 of Yuan et al. 2020) while reducing the lensing discrepancy by |${\sim }10{{\ \rm per\ cent}}$|, not enough to reconcile with the observation.

Figure 5.

The comparison between the predicted galaxy–galaxy lensing signal and the observed galaxy–galaxy lensing signal. In both panels, the blue curve shows the observed lensing signal of the CMASS galaxy sample, quoted directly from Leauthaud et al. (2017). In the left-hand panel, the solid red curve shows the predicted lensing signal if we constrain the HOD with just the projected correlation function w_p and the galaxy number density. The dashed green line shows the w_p + ΔΣ emulator fit from Yuan et al. (2020). In the right-hand panel, the magenta curve shows the predicted lensing signal of our standard ξ(r_p, π) fit, including both A and A_e. The dashed yellow (cyan) curve shows the prediction when we fit the ξ(r_p, π) with only A (A_e). The grey dotted line shows the prediction when we do not include either assembly bias term in the HOD model. We see that by fitting the full redshift-space correlation function and incorporating both assembly bias terms into the HOD, we significantly reduce the tension between data and predictions, down to about the 1σ level.

The inclusion of both assembly bias terms (A and A_e) and the switch to the redshift-space 2PCF present an opportunity to reduce this tension. As we have shown, both assembly bias terms allow the redshift-space 2PCF to drive the fit in the direction of reducing the typical halo mass of galaxies. The right-hand panel of Fig. 5 shows the galaxy–galaxy lensing predictions of our redshift-space clustering fits. Again, the solid blue curve shows the measurement from Leauthaud et al. (2017) on the CMASS sample. The solid magenta line represents the ξ(r_p, π) fit with both A and A_e, whereas the dashed yellow and cyan curves show the fit with just A and A_e, respectively. The dotted grey curve shows the fit with neither assembly bias term. As expected, the introduction of both assembly bias terms leads to a lower predicted lensing signal by assigning galaxies to less massive haloes.

The prediction of the ξ(r_p, π) fit with both A and A_e matches the observation to within 1σ. This is a significant improvement compared to the 3σ discrepancies found with previous model predictions that fit the projected 2PCF w_p using more simplistic HOD models (e.g. Reid et al. 2014; Rodríguez-Torres et al. 2016; Saito et al. 2016; Alam et al. 2017; Lange et al. 2019; Yuan et al. 2020). Comparing the predictions on the right-hand panel, we see that the decrease in the lensing signal comes from a combination of incorporating assembly bias and constraining on the redshift-space 2PCF. Without the assembly bias terms, the ξ(r_p, π) fit actually shows no improvement over the w_p fit. The flexibility introduced by the assembly bias terms is what allows for a good ξ(r_p, π) fit, which drives down the typical halo mass of galaxies, thus decreasing the lensing signal. Comparing the ‘with A’ fit on the right with the ‘w_p + ΔΣ fit’ on the left, we see that the inclusion of just a concentration-based assembly bias results in a similar lensing prediction, even though one is constrained on ξ(r_p, π) and the other is constrained on w_p + ΔΣ. However, it is clear that a concentration-based assembly bias term alone is not sufficient to predict the observed lensing signal. The combination of A and A_e is what uniquely brings the lensing prediction into agreement with the observation while also providing a good fit to ξ(r_p, π).

It is interesting that the inclusion of A brings the galaxy–galaxy lensing prediction into better consistency with data than the inclusion of A_e. This hierarchy between A and A_e echoes the fact that the inclusion of A yields a somewhat lower BIC and a slightly better fit to ξ(r_p, π) than the inclusion of A_e in Table 1. This could mean that while both secondary dependencies are required to yield predictions consistent with data, the secondary dependency on concentration is more important than the dependency on environment in producing a more realistic HOD.

While we have shown that the combination of A and A_e is powerful in modelling both redshift-space clustering and galaxy–galaxy lensing, we do not claim that we have found the silver bullet for resolving the lensing tension. However, we believe that this represents a promising path towards reducing the lensing tension. A full solution of the lensing tension will likely also require better handling of systematics and possibly small corrections in cosmology. We also do not claim that we have found the true prescription of galaxy assembly bias or the correct HOD model. Nevertheless, our findings highlight the importance of constructing flexible galaxy–halo connection models and employing more informative clustering statistics such as the redshift-space 2PCF. This result echoes the findings of Zu (2020), where the author found that a sophisticated HOD model with detailed treatments of selection effects could reconcile the galaxy–galaxy lensing tension. However, it has been argued that this model implies an unrealistic satellite fraction (Lange et al. 2021). Amodeo et al. (2020) recently constrained cluster gas dynamics using the Sunyaev–Zel'dovich effect and found an excess non-thermal pressure due to baryonic processes that would reduce the lensing tension by 50 per cent. These energetic ejection processes towards larger halo radii are consistent with the positive environmental dependency that we found, highlighting the importance of baryonic structure beyond the typical virial radius of the halo.

5.4 Investigating the environmental assembly bias A_e

The novel environmental assembly bias parameter deserves some special attention as we showed that it behaves rather differently than the concentration-based assembly bias parameter and that it is indispensable in modelling the redshift-space 2PCF and predicting the observed lensing signal. We find further support for its legitimacy in the fact that its best-fitting value is remarkably stable across all our fits, despite variations to the HOD, likelihood function, compression of the data, and modest perturbations to the assumed cosmology. We visualize its best-fitting values across all these variations in Fig. 6, where the blue markers represent fits using the AbacusCosmos simulations, and the red markers represent fits using the AbacusSummit simulations. We briefly describe each of these fits as follows.

ξ(r_p, π) fit. The ξ(r_p, π) fit with both A and A_e as shown in Table 1, using eight AbacusCosmos boxes at Planck 2015 cosmology.
Seed 1–2. Same as ξ(r_p, π) fit, except using different random number seeds to marginalize over shot noise effects.
Diff boxes. Same as ξ(r_p, π) fit, except using a different set of eight of the 20 simulation boxes to marginalize over sample variance effects.
Weak |$\chi ^2_{n_\mathrm{ g}}$|⁠. Weakening the n_g component of the likelihood function. Specifically, we modify |$\chi ^2_{n_\mathrm{ g}}$| as defined in equation (12) to essentially a step function:
$$\chi^2_{n_\mathrm{g}} = \begin{cases} \left(\dfrac{n_{\mathrm{mock}} - n_{\mathrm{data}}}{\sigma_{n}}\right)^2 & (n_{\mathrm{mock}} < n_{\mathrm{data}}), \\ 0 & (n_{\mathrm{mock}} \ge n_{\mathrm{data}}), \end{cases} \tag{14}$$
where σ_n ≈ 3 × 10⁻⁶ h³ Mpc⁻³ is the jackknife uncertainty on the observed n_g. This new |$\chi ^2_{n_\mathrm{ g}}$| does not penalize the HOD for producing too many galaxies, allowing for rather low incompleteness factors. The best fit yields an incompleteness factor of f_ic = 0.67.
New π bins. Non-linear binning along the LOS direction to give more weight to the very small scales ∼1 h⁻¹ Mpc. The new bins are π = [0, 0.5, 1, 5, 10, 20, 30].
Sigmoid n_cent. Using a sigmoid function instead of an error function for |$\bar{n}_\mathrm{cent}$|⁠. This test is done to address concerns that the error function was chosen arbitrarily for the HOD and might not correctly represent the physical central galaxy occupation distribution. The sigmoid function serves as an alternative ‘switch’ function from 0 to 1, providing a ‘softer’ ramp-up relative to the error function. Fig. 7 showcases the difference between a pair of similar sigmoid and error functions. We find that the best-fitting A_e is consistent with that of the error function fits. We also find no significant preference for using either the error function or the sigmoid function in the HOD.
w_p + multipole. Fitting w_p + ξ₀ + ξ₂ instead of ξ(r_p, π). This fit and the next fit test whether different compressions of the redshift-space 2PCF affect the best-fitting A_e.
ξ(s, μ) fit. Fitting ξ(s, μ) instead of ξ(r_p, π). s and μ essentially represent a polar coordinate system in the pair separation space, where s is the scalar separation between the two galaxies, and μ denotes the angle between the pair separation vector and the LOS.
Summit. ξ(r_p, π) fit using one AbacusSummit box at Planck 2018 cosmology. The haloes are identified using the CompaSO halo finder instead of rockstar.
Summit + 2 per cent σ₈. Same as Summit fit, but with 2 per cent larger σ₈ in the assumed cosmology.
Summit − 2 per cent σ₈. Same as Summit fit, but with 2 per cent lower σ₈ in the assumed cosmology.
Summit + 2 per cent Ω_M. Same as Summit fit, but with 2 per cent larger Ω_M in the assumed cosmology.
Summit − 2 per cent Ω_M. Same as Summit fit, but with 2 per cent lower Ω_M in the assumed cosmology.
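
The weakened number-density term of equation (14) can be illustrated with a minimal sketch. The function name and the example densities below are hypothetical and chosen only for illustration; only σ_n ≈ 3 × 10⁻⁶ h³ Mpc⁻³ is taken from the text.

```python
def chi2_ng_weak(n_mock, n_data, sigma_n):
    """One-sided number-density penalty in the form of equation (14).

    Mocks that are too sparse are penalized quadratically; mocks that are
    at least as dense as the data incur no penalty.
    """
    if n_mock < n_data:
        return ((n_mock - n_data) / sigma_n) ** 2
    return 0.0


# Illustrative inputs: sigma_n follows the text, the densities are made up.
sigma_n = 3e-6     # h^3 Mpc^-3
n_data = 3.0e-4    # hypothetical observed density, h^3 Mpc^-3

too_sparse = chi2_ng_weak(2.94e-4, n_data, sigma_n)  # 2 sigma below the data
too_dense = chi2_ng_weak(4.5e-4, n_data, sigma_n)    # well above the data

print(too_sparse, too_dense)
```

Because the over-dense branch returns zero, an optimizer using this term is free to overpopulate the mock, which is the behaviour that permits low incompleteness factors.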

Figure 6.

The best-fitting values of the environmental assembly bias parameter A_e across various fits. The blue markers represent fits using the AbacusCosmos simulations, whereas the red markers represent fits using the AbacusSummit simulations. The error bars are marginalized 1σ error bars. The dashed green line represents the average best-fitting value across all the fits.

Figure 7.

The top panel shows an example of a sigmoid function compared to a similar error function. The specific functions shown here are 0.5 erfc(x) in blue and sigmoid(2.5x) in orange. The bottom panel shows the difference between the two functions. Overall, the sigmoid produces a steeper incline in the middle but has a slower convergence to 1.
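
The comparison in this caption can be checked numerically with a short sketch. We assume here the rising form of the error-function switch, 0.5 [1 + erf(x)] = 0.5 erfc(−x), since the caption describes convergence to 1; the helper names are ours and are not part of the HOD code.

```python
import math


def erf_switch(x):
    # Rising error-function switch (assumed form): goes from 0 to 1.
    return 0.5 * (1.0 + math.erf(x))


def sigmoid_switch(x, k=2.5):
    # Logistic sigmoid with steepness k, i.e. sigmoid(2.5 x) by default.
    return 1.0 / (1.0 + math.exp(-k * x))


def slope_at_zero(f, h=1e-5):
    # Central finite difference for the slope at the midpoint of the switch.
    return (f(h) - f(-h)) / (2.0 * h)


# Slope in the middle: the sigmoid is steeper than the error function.
print(slope_at_zero(sigmoid_switch), slope_at_zero(erf_switch))

# Distance from 1 at x = 2: the sigmoid converges more slowly.
print(1.0 - sigmoid_switch(2.0), 1.0 - erf_switch(2.0))
```

Analytically, the central slopes are 2.5/4 = 0.625 for the sigmoid and 1/√π ≈ 0.564 for the error function, in line with the steeper incline and slower convergence noted above.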

We see that within the AbacusCosmos fits in blue, all the A_e values are consistent within 1σ, regardless of all the variations we introduced. The A_e values for the AbacusSummit fits in red are more dispersed, likely caused by changes in cosmology and larger uncertainty due to the smaller simulation volume at perturbed cosmologies. The shift in A_e between the two sets of simulations is likely due to a combination of the slightly different cosmology (Planck 2015 versus Planck 2018), different N-body codes, and different halo finders (rockstar versus CompaSO). Regardless, the consistent 3σ–5σ preference for a positive A_e indicates that the environment dependence is a relatively standalone effect independent of concentration-based assembly bias and other HOD parameters, at least in our HOD framework. The positive A_e value with reasonable signal-to-noise ratio is also consistent with the environmental assembly bias signatures found in hydrodynamical simulations (Hadzhiyska et al. 2020, 2021; Xu et al. 2021). Finally, we note that while we find that the environmental dependence is important, we do not yet understand why it is so. We make a few attempts at gaining intuition on this phenomenon in Sections 6.3 and 6.4.

6 DISCUSSION

6.1 Testing HOD parameter recovery

To verify the effectiveness of our fitting procedure, we apply it to a mock redshift-space 2PCF generated from the simulations using a fiducial extended HOD. We also generate the jackknife covariance matrix from the same set of mocks, normalized to the BOSS CMASS volume. In the first test, we generate the mock observed 2PCF from the same set of eight simulation boxes that are then used to do the fitting. This avoids the effects of cosmic variance, and only tests whether our optimization routine is capable of recovering the correct underlying HOD. We find excellent recovery of all HOD parameters, with errors typically <1 per cent. The parameter κ has the worst recovery, with errors of a few per cent.

In the second test, we generate the mock observed 2PCF from eight different simulation boxes than the ones used for fitting, thus introducing sample variance. We repeat this test for different fiducial HODs and find excellent recovery of both A and A_e, with maximum recovery error significantly less than their 1σ error bars quoted in Table 1. We also generally get good recovery accuracy (<1σ error) on most other HOD parameters, such as M_cut, M₁, σ, α, s, s_v, α_c, and s_p. However, the recovery accuracy on κ is notably worse, at around 2σ. This might be attributed to the fact that the redshift-space 2PCF is not particularly sensitive to changes to κ, as it only modifies satellite occupation at small halo mass. Overall, our tests show that at the current level of systematics, the two assembly bias parameters are well constrained by the redshift-space 2PCF on the scales that we chose. Most of the other extended HOD parameters are also well constrained relative to their error bars, except for κ.

6.2 Fitting the redshift-space multipoles

While in this work we adopted the novel approach of directly fitting the 2D ξ(r_p, π), the more common approach is to fit the first multipoles of the redshift-space 2PCF. In principle, the full multipole expansion should contain the same information as the 2D ξ(r_p, π). However, different choices in binning and the fact that most fits were done with only the first two or three multipole terms mean that different regions of the (r_p, π) separation space enter the multipole fit with different weights. Thus, the first multipoles capture a different set of clustering information compared to ξ(r_p, π).
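
To make the relation between the two compressions concrete, the sketch below converts (r_p, π) to (s, μ) and evaluates the standard Legendre multipole ξ_ℓ(s) = (2ℓ + 1)/2 ∫ ξ(s, μ) P_ℓ(μ) dμ over μ ∈ [−1, 1] by Gauss–Legendre quadrature. The function names and the toy correlation function are our assumptions for illustration; this is not the binned estimator used for the fits in this paper.

```python
import numpy as np
from numpy.polynomial import legendre


def to_s_mu(r_p, pi):
    # s is the scalar pair separation; mu is the cosine of the angle between
    # the pair separation vector and the LOS.
    s = np.hypot(r_p, pi)
    return s, pi / s


def multipole(xi_of_mu, ell, n_nodes=32):
    # Gauss-Legendre nodes and weights on [-1, 1].
    mu, weights = legendre.leggauss(n_nodes)
    p_ell = legendre.Legendre.basis(ell)(mu)
    return 0.5 * (2 * ell + 1) * np.sum(weights * xi_of_mu(mu) * p_ell)


# Toy xi(mu) at fixed s with a known monopole (1.0) and quadrupole (0.5).
def toy_xi(mu):
    return 1.0 + 0.5 * 0.5 * (3.0 * mu**2 - 1.0)


print(to_s_mu(3.0, 4.0))
print([multipole(toy_xi, ell) for ell in (0, 2, 4)])
```

In this toy case the input contains no power beyond ℓ = 2, so the hexadecapole vanishes. A signal that is sharply localized in (r_p, π) would instead spread over many ℓ, which is why truncating at the first few multipoles can discard information.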

The most relevant work is Guo et al. (2015), where the authors achieved a good fit on a set of BOSS CMASS redshift-space multipoles without invoking any assembly bias prescription in their HOD, seemingly contradicting our finding. Besides the fact that our study chooses a different data compression in ξ(r_p, π), another key difference between Guo et al. (2015) and this study lies in our novel velocity bias model. While our velocity bias model changes the satellites’ velocities and positions simultaneously to preserve Newtonian physics, the Guo et al. (2015) model modifies the satellite velocities without changing their radial positions, allowing for exotic satellite trajectories that do not obey the physics of the potential well. It is possible that this extra flexibility in satellite velocity removes the need for assembly bias in the HOD model. To test this hypothesis, we perform ξ(r_p, π) fits, replacing our velocity bias model with that of Guo et al. (2015). However, we continue to find strong evidence for assembly bias, where the inclusion of assembly bias parameter A improves the χ²/degrees of freedom by 1.

This suggests that the different data compression (ξ(r_p, π) versus multipoles) might be responsible for the contradicting conclusions regarding assembly bias. To test this, we fit the first multipoles, up to l = 4, adopting the velocity bias model of Guo et al. (2015). Indeed, we find that, in this case, the inclusion of A improves the fit by a much smaller amount (Δχ²/degrees of freedom = 0.2), yielding weak evidence for assembly bias. Furthermore, we showcase the best-fitting multipoles in Fig. A1. We see that the best fit manages to reproduce the data up to l = 6, but fails to reproduce the data at l = 8. This suggests that perhaps the first multipoles do not capture the full redshift-space clustering information at these scales, and there is a significant amount of information left over in the high multipoles. This might not be surprising considering that the small-scale redshift-space distortion (RSD) signature is dominated by pairs with small transverse separation (r_p ∼ 1 Mpc) and large LOS separation (π ∼ 20 Mpc). Such a signature is poorly localized in a multipole decomposition. We also examine the multipoles of the ξ(r_p, π) fit and find that while it does not reproduce the data at l = 2, 4 quite as well as the multipole fit, it does reproduce the data much better at l = 8. Thus, the multipoles capture a different subset of the full clustering information than ξ(r_p, π), resulting in different HOD fits.

Saito et al. (2016) implement a subhalo abundance matching (SHAM) model based on the peak maximum circular velocity V_peak to model the BOSS CMASS 2PCF. The V_peak-based SHAM model is of particular interest here because it naturally accounts for some assembly bias as V_peak is a derivative of the halo merger tree and thus encodes some assembly history information. However, while they found a good fit on the projected 2PCF, they found poor consistency with the first and second redshift-space multipoles. This again highlights how different compressions of the full redshift-space clustering capture different subsets of the information content, and also the lack of constraining power of the projected 2PCF.

6.3 Alternative environment definitions and scale dependence

In this section, we further explore the environmental assembly bias by testing variations to the definition of halo environment. Specifically, we vary the radius, r_max, within which the local environment is calculated. Then, we test a different definition of the local environment altogether, where we use the Gaussian smoothed local density field instead of mass enveloped within r_max.

The left-hand panel of Fig. 8 shows the best-fitting environmental assembly bias parameter A_e and the corresponding best-fitting χ² when we vary the maximum radius r_max of the halo environment definition. Note again that the environment is defined as the sum of the mass of neighbouring haloes beyond the virial radius but within r_max. We refer to this mass definition as the top-hat definition. We see that the amplitude of the best-fitting A_e peaks at r_max = 6 h⁻¹ Mpc, whereas the smallest χ² is achieved at r_max = 4 h⁻¹ Mpc. At smaller and larger r_max, the best-fitting A_e trends towards zero and the χ² increases. Qualitatively, this shows that the goodness-of-fit of the model is sensitive to the choice of r_max; specifically, we find that r_max ≈ 4–6 h⁻¹ Mpc is strongly preferred by the data.
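
The top-hat definition can be sketched as follows. This is a brute-force illustration under simplifying assumptions that are ours: a small halo catalogue, no periodic boundary wrapping, and the virial radius of the halo in question as the inner edge. The function and array names are hypothetical and do not correspond to the released HOD code.

```python
import numpy as np


def tophat_environment(pos, mass, r_vir, r_max):
    """Sum of neighbour masses with r_vir[i] < distance < r_max for halo i.

    pos   : (N, 3) halo positions
    mass  : (N,) halo masses
    r_vir : (N,) virial radii, used as the inner edge for each halo
    r_max : outer radius of the environment
    """
    # All pairwise distances (O(N^2) memory; fine for a toy catalogue only).
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    in_shell = (dist > r_vir[:, None]) & (dist < r_max)
    np.fill_diagonal(in_shell, False)  # a halo is not its own neighbour
    return (in_shell * mass[None, :]).sum(axis=1)


# Three toy haloes on a line, at x = 0, 1 and 10 (arbitrary units).
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
mass = np.array([1.0, 2.0, 4.0])
r_vir = np.array([0.5, 0.5, 0.5])

print(tophat_environment(pos, mass, r_vir, r_max=5.0))
```

Varying r_max in such a routine changes which neighbours contribute, which is the knob explored in the left-hand panel of Fig. 8. A production version would need a spatial tree and periodic boundaries.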

Figure 8.

The scale dependence of the A_e fit. (a) The best-fitting environmental assembly bias parameter A_e and the corresponding best-fitting χ² as a function of maximum radius of the environment r_max. (b) The same as (a), except the environment is now defined in terms of the local halo density field smoothed with a Gaussian of scale r_smoothed. The error bars represent 1σ uncertainties. We see that, in the top-hat case, a positive A_e is only preferred by the data when r_max ≈ 4–6 h⁻¹ Mpc, and in the smoothed case, a positive A_e is only preferred when r_smoothed ≈ 2 h⁻¹ Mpc.

The right-hand panel shows the best-fitting A_e and the corresponding χ² for the Gaussian smoothed environment definition, where we define the environment as the local halo number density smoothed with a Gaussian kernel of scale r_smoothed. We see a qualitatively similar behaviour as the top-hat definition, where the goodness-of-fit of the model is strongly dependent on the choice of r_smoothed. In this case, we find r_smoothed ≈ 2 h⁻¹ Mpc is the smoothing scale that best minimizes χ² and maximizes the A_e value. This is qualitatively consistent with the preference for r_max ≈ 4–6 h⁻¹ Mpc in the top-hat case, given the extended nature of a Gaussian filter. We should note that the actual value of A_e is not comparable between the two environment definitions, as the amplitude of A_e is degenerate with the amplitude of the environment parameter f_env (equation 7). Because the Gaussian smoothed environment is defined on the halo neighbour count while the top-hat environment is defined on mass, the amplitude and distribution of f_env are actually quite different between the two definitions.
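
For comparison with the top-hat case, a Gaussian-smoothed halo number density can be sketched as a kernel sum over neighbouring haloes. The normalization, the exclusion of the halo itself, and the neglect of periodic boundaries are our assumptions for illustration; the function name is hypothetical.

```python
import numpy as np


def gaussian_environment(pos, r_smooth):
    """Local halo number density smoothed with a 3D Gaussian of scale r_smooth.

    Each neighbour contributes a normalized Gaussian weight, so the result
    has units of inverse volume and depends on neighbour counts, not masses.
    """
    diff = pos[:, None, :] - pos[None, :, :]
    d2 = (diff**2).sum(axis=-1)
    norm = (2.0 * np.pi) ** 1.5 * r_smooth**3
    kernel = np.exp(-0.5 * d2 / r_smooth**2) / norm
    np.fill_diagonal(kernel, 0.0)  # assumption: exclude the halo itself
    return kernel.sum(axis=1)


# Two nearby haloes and one isolated halo (arbitrary units).
pos = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [100.0, 0.0, 0.0]])
print(gaussian_environment(pos, r_smooth=2.0))
```

Unlike a top-hat, the Gaussian kernel has no sharp edge and gives non-negligible weight to neighbours at a few times r_smooth, which is consistent with a smaller preferred r_smoothed than r_max.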

Both panels suggest that the halo environment at some specific scale of a few megaparsecs is particularly informative of the assembly bias effects, at least for the particular observable we are fitting. While it is not clear what phenomena drive this preference for a specific scale, it does argue against explanations that focus on much smaller scales.

It is possible that the preference for a specific r_max is a result of the scales that our redshift-space 2PCF is defined on, specifically from 0.169 to 30 h⁻¹ Mpc in the transverse direction and from 0 to 30 h⁻¹ Mpc in the LOS direction. As shown in fig. 4 of Xu et al. (2021), an environment defined at different scales dramatically changes its signature on galaxy clustering. Thus, it is possible that the same observable at a different scale would prefer an environment defined within a different radius. To test this effect, we generate the redshift-space 2PCF at twice the scale, specifically in eight log-spaced bins from 0.338 to 60 h⁻¹ Mpc in the transverse direction and six linearly spaced bins from 0 to 60 h⁻¹ Mpc in the LOS direction. We fit this enlarged redshift-space 2PCF using our extended HOD with the top-hat environment definition and show the best-fitting values of A_e and the corresponding χ² in Fig. 9. We see the same behaviour as the left-hand panel of Fig. 8 despite doubling the scales of the 2PCF, with the amplitude of A_e peaking at r_max = 6 h⁻¹ Mpc and the χ² minimized at r_max = 4 h⁻¹ Mpc. This suggests that the scale preference in r_max is not a result of the scales imprinted in the 2PCF bins. We believe that this serves as further evidence that the local environment, specifically defined with an r_max = 4–6 h⁻¹ Mpc, is a physically meaningful indicator of assembly bias and is likely tracing underlying processes happening at a few megaparsecs around haloes that truly drive the assembly bias signature.
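
The two binning schemes can be written down explicitly. The sketch below assumes that the fiducial 2PCF uses the same bin counts (eight transverse, six LOS) as the enlarged one, which the text implies by describing the enlarged version as the same observable at twice the scale; the helper name is ours.

```python
import numpy as np


def make_bins(rp_min, rp_max, pi_max, n_rp=8, n_pi=6):
    # Log-spaced edges in the transverse direction, linear along the LOS.
    rp_edges = np.geomspace(rp_min, rp_max, n_rp + 1)
    pi_edges = np.linspace(0.0, pi_max, n_pi + 1)
    return rp_edges, pi_edges


# Fiducial and enlarged binning, in h^-1 Mpc.
rp_fid, pi_fid = make_bins(0.169, 30.0, 30.0)
rp_big, pi_big = make_bins(0.338, 60.0, 60.0)

print(np.round(rp_fid, 3))
print(pi_big)
print(np.allclose(rp_big, 2.0 * rp_fid), np.allclose(pi_big, 2.0 * pi_fid))
```

Under this assumption every bin edge is exactly doubled, so any scale preference tied to the binning itself would be expected to shift by a factor of 2 in the enlarged fit.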

Figure 9.

The scale dependence of the A_e fit for the enlarged redshift-space 2PCF. The environment is defined with the top-hat, same as the left-hand panel of Fig. 8. The enlarged redshift-space 2PCF is computed in eight log-spaced bins from 0.338 to 60 h⁻¹ Mpc in the transverse direction and six linearly spaced bins from 0 to 60 h⁻¹ Mpc in the LOS direction. We see the same behaviour as the left-hand panel of Fig. 8, with the amplitude of the A_e best-fitting peaking at r_max = 6 h⁻¹ Mpc and the χ² minimized at r_max = 4 h⁻¹ Mpc.

It is beyond the scope of this paper to explore the exact underlying processes that drive the environmental assembly bias signature, but we speculate that galaxies and haloes in dense environments undergo processes such as mergers, tidal disruptions, and feedback processes, whereas galaxies in underdense environments evolve mostly passively. These processes can lead to environment dependence in halo occupation. Amodeo et al. (2020) found observational evidence of baryons being expelled beyond the halo virial radius due to a number of feedback effects. The same study also found that this effect could account for up to 50 per cent of the lensing tension. Another likely relevant phenomenon is the splashback radius (Adhikari, Dalal & Chamberlain 2014; Diemer & Kravtsov 2014; More et al. 2015), where studies have found that a traditional halo boundary definition such as r_virial excludes some gravitationally bound subhaloes on highly eccentric orbits. Other parallel studies have also suggested that a more physical halo radius is ∼2–3 times larger than the traditional virial radius (Wetzel et al. 2014; Wetzel & Nagai 2015; Sunayama et al. 2016). In the context of HOD models, it might be better to consider these splashback haloes as part of the host halo instead of as individual haloes in the vicinity of larger haloes. The fact that we get an extreme best-fitting value for s_p, which puts some satellites on highly eccentric orbits, might further be indicative of splashback haloes. It is possible that the incorporation of splashback radius could alleviate the need for the environmental assembly bias. In fact, Mansfield & Kravtsov (2020) found that the splashback radius can account for much, though not all, of the halo assembly bias signature in simulations.

The importance of halo environment and the recent pushes to enlarge halo boundaries might also point to shortcomings of the halo model overall. In the context of galaxy–halo connection models, instead of populating galaxies per halo, we might get closer to the true galaxy distribution by populating galaxies in larger groups, which are loosely defined as a group of closely associated haloes and their local environment. Such a group-based galaxy occupation model, if correctly defined, could eliminate the need to account for local environment and to redefine halo radii. It can significantly simplify existing extended HOD models, such as the one we used in this paper. We defer an exploration of this topic to a future paper.

6.4 Comparisons to previous studies on environment-based HOD

Two previous papers, Hadzhiyska et al. (2020) and Xu et al. (2021), systematically tested the effectiveness of various secondary HOD dependences in capturing the assembly bias signature, Hadzhiyska et al. (2020) through hydrodynamical simulations and Xu et al. (2021) through semi-analytical models. While they both found the halo environment to be an excellent indicator of assembly bias, there are some key differences between our work and the two previous papers.

In terms of galaxy samples, in this work we are focusing on LRGs, whereas both previous works focused on much less massive L_⋆-type galaxies. Hadzhiyska et al. (2020) considered a mass-selected sample of L_⋆-type galaxies with a number density of 1.3 × 10⁻³ h³ Mpc⁻³, an order of magnitude higher than our LRG sample. Xu et al. (2021) similarly looked at three samples of mass-selected L_⋆-type galaxies of density n₁ = 0.00316 h³ Mpc⁻³, n₂ = 0.01 h³ Mpc⁻³, and n₃ = 0.0316 h³ Mpc⁻³, which correspond to stellar mass thresholds of 3.88 × 10¹⁰, 1.42 × 10¹⁰, and 0.185 × 10¹⁰ h⁻¹ M_⊙, respectively. In comparison, CMASS LRGs are believed to have a typical stellar mass of a few times 10¹¹ M_⊙ (Maraston et al. 2013).

The definition of halo environment is also different across all three works. Our work largely inherits the environment definition from Hadzhiyska et al. (2020) with minor differences, namely the normalized enclosed mass of subhaloes within 5 h⁻¹ Mpc. Most notably, Hadzhiyska et al. (2020) picked r_200m as the inner radius of the environment, while we used the virial radius for convenience. Xu et al. (2021), however, defined their halo environment as the local mass overdensity smoothed with a Gaussian filter, computed with all simulation particles instead of just subhaloes. They found that the smoothed overdensity with a filter scale of 1.25 h⁻¹ Mpc best predicts the assembly bias signature. In a more recent work, Hadzhiyska et al. (2021) adopt an environment definition more similar to that of Xu et al. (2021), which uses the smoothed matter density field (with a Gaussian kernel of scale 1.1 Mpc h⁻¹) around a halo to quantify its environment dependence. They show that augmenting the HOD model with a secondary dependence on environment in a manner that matches the two-point clustering also successfully recovers additional statistical probes of the galaxy distribution such as the galaxy–galaxy lensing signal, the void–galaxy cross-correlation function, and moments of the galaxy density field. In this work, we find a somewhat larger optimal radius, which can be attributed to the different environment definitions and the different galaxy samples. It is also possible that different underlying physical processes drive the environmental assembly bias in L_⋆-type galaxies versus in LRGs. Finally, we caution that all these studies are limited by the fidelity of the hydrodynamical simulations and the semi-analytical models used.

7 CONCLUSION

In this paper, we model the observed redshift-space 2PCF of the BOSS CMASS galaxy sample with an extended HOD model that includes two prescriptions of galaxy assembly bias and other physically motivated additions. We found that while the standard five-parameter HOD provides a poor fit to the redshift-space 2PCF (χ² = 151, d.o.f = 42), the extended HOD achieves a substantially better fit with χ² = 50 (d.o.f = 37; see Table 1 and Fig. 3). The redshift-space 2PCF also strongly prefers the simultaneous inclusion of both A and A_e, which, respectively, represent the assembly bias associated with halo concentration and halo environment. The preference for HOD models incorporating A and A_e is expressed with their corresponding ΔBIC: ΔBIC = 19 for A and ΔBIC = 15 for A_e. The HOD model that includes both assembly biases is strongly favoured over an HOD that includes neither, with ΔBIC = 36. When only one assembly bias term is included, the fit on the redshift-space 2PCF is significantly worse than when both assembly bias terms are included. Our results highlight the importance of a flexible assembly bias and HOD model in accurately modelling redshift-space clustering on the non-linear scale and expose the deficiencies of the standard five-parameter HOD model in producing realistic galaxy distributions.
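
As a quick arithmetic check of the goodness-of-fit figures quoted above, the reduced χ² of the two models can be computed directly. The `bic` helper uses the textbook definition BIC = k ln n − 2 ln L with −2 ln L = χ² up to a constant for a Gaussian likelihood; the parameter count k and data-point count n are not stated in this section, so the helper is shown without evaluating the paper's ΔBIC values.

```python
import math


def reduced_chi2(chi2, dof):
    return chi2 / dof


def bic(chi2, n_params, n_data):
    # Textbook BIC for a Gaussian likelihood, up to an additive constant.
    return chi2 + n_params * math.log(n_data)


print(round(reduced_chi2(151, 42), 2))  # standard five-parameter HOD: 3.6
print(round(reduced_chi2(50, 37), 2))   # extended HOD: 1.35
```

The extended-HOD value reproduces the χ²/degrees of freedom = 1.35 quoted in the abstract.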

The best fit yields a negative concentration-based assembly bias parameter A (∼10σ, preferentially assigning galaxies to puffier, less massive haloes) and a positive environmental assembly bias parameter A_e (∼3σ–5σ, preferentially assigning galaxies to less massive haloes in denser environments). Compared to the projected correlation function, the redshift-space 2PCF pushes both assembly bias parameters in the direction of reducing typical halo mass per galaxy at fixed clustering, showcasing the extra constraining power contained in the LOS structure of the redshift-space 2PCF. Specifically, the inclusion of A decreases the average halo mass per galaxy by 12 per cent, whereas the inclusion of A_e decreases the average halo mass by 10 per cent. The HOD fit with both assembly biases constrained on redshift-space 2PCF yields 26 per cent lower average halo mass compared to an HOD constrained on the projected correlation function.

We additionally showed that, by assigning galaxies to lower mass haloes, the extended HOD constrained on the redshift-space 2PCF predicts the observed galaxy–galaxy lensing signal to within 1σ (Fig. 5). This represents a significant improvement compared to predictions by HODs constrained on the projected 2PCF. This result translates the perceived tension between galaxy clustering and lensing to a tension between more informative clustering measurements and oversimplified galaxy–halo connection models. This result again highlights the importance of building more flexible galaxy–halo connection models and utilizing the more informative statistics such as the redshift-space 2PCF.

We offer strong evidence for including an environmental galaxy assembly bias term in the HOD. In addition to showing how the environmental assembly bias significantly improves the redshift-space 2PCF fit and the lensing prediction, we further showcase the consistency of a positive A_e fit over variations to the fitting routine, variations to the HOD model, and modest perturbations to the cosmology (Fig. 6). Combining this result with previous simulation studies that showed halo environment as an excellent indicator of assembly bias and being able to recover various observables, we believe an environmental assembly bias term is a physical and indispensable addition to the HOD. We recommend that future studies include environmental assembly bias in their HOD prescriptions in order to construct more realistic galaxy mocks.

In order to better understand the underlying processes driving the environmental dependence, we tested different maximum radii in the environment definitions, and we found that halo environment defined on scales of around 4–6 h⁻¹ Mpc is preferred by the data (Fig. 8). This scale preference holds even when doubling the bin sizes of the observed 2PCF (Fig. 9). This suggests that the preference for 4–6 h⁻¹ Mpc is not a result of the specific scales chosen when binning the observable, but is rather determined by the underlying physical processes that exhibit a similar characteristic scale. We speculate that such processes are intrahalo and might include mergers, tidal disruptions, and feedback processes. We also suggest that the dependence on environment within 4–6 h⁻¹ Mpc might point to the need to define haloes at a much larger radius, such as the splashback radius.

While we found strong observational evidence for our prescriptions of assembly bias, there is no guarantee that our prescriptions capture the full underlying galaxy assembly bias effect. It is entirely possible that a better and possibly simpler assembly bias prescription can achieve the same or even better fit on the redshift-space galaxy clustering and galaxy–galaxy lensing. In upcoming papers, we plan on continuing to explore more physically motivated halo properties as sources of assembly bias, potentially leveraging the halo merger tree and constructing mass-dependent assembly bias prescriptions. We additionally plan on performing a joint analysis on the redshift-space 2PCF, galaxy–galaxy lensing, and the squeezed three-point correlation function (Yuan, Eisenstein & Garrison 2017) with our extended HOD model to derive cosmological constraints.

ACKNOWLEDGEMENTS

We would like to thank Lehman Garrison, Johannes Lange, Jeremy Tinker, Joe DeRose, Andrew Hearin, Martin White, Andrew Zentner, and Josh Speagle for fruitful discussions. DJE is supported by U.S. Department of Energy grant DE-SC0013718 and as a Simons Foundation Investigator. SB is supported by Harvard University through the ITC Fellowship. HG is supported by the National Science Foundation of China (Nos 11773049, 11833005, 11828302, and 11922305).

DATA AVAILABILITY

The simulation data are available at https://lgarrison.github.io/AbacusCosmos/ and https://abacussummit.readthedocs.io/en/latest/. Researchers wishing to gain access to the extended HOD code can refer to the publicly available grand-hod code at https://github.com/SandyYuan/GRAND-HOD or contact the lead author of this paper for details.

Footnotes

1 https://github.com/SandyYuan/GRAND-HOD

2 For more details, see https://lgarrison.github.io/AbacusCosmos.

3 For more details, see https://abacussummit.readthedocs.io/en/latest/abacussummit.html.

4 https://github.com/keurfonluu/StochOPy

REFERENCES

Abadi M. G., Navarro J. F., Fardal M., Babul A., Steinmetz M., 2010, MNRAS, 407, 435, doi:10.1111/j.1365-2966.2010.16912.x

Adhikari S., Dalal N., Chamberlain R. T., 2014, J. Cosmol. Astropart. Phys., 11, 019, doi:10.1088/1475-7516/2014/11/019

Alam S., Miyatake H., More S., Ho S., Mandelbaum R., 2017, MNRAS, 465, 4853, doi:10.1093/mnras/stw3056

Amodeo S. et al., 2020, preprint (arXiv:2009.05558)

Artale M. C., Zehavi I., Contreras S., Norberg P., 2018, MNRAS, 480, 3978, doi:10.1093/mnras/sty2110

Behroozi P. S., Wechsler R. H., Wu H.-Y., 2013, ApJ, 762, 109, doi:10.1088/0004-637X/762/2/109

Behroozi P., Wechsler R. H., Hearin A. P., Conroy C., 2019, MNRAS, 488, 3143, doi:10.1093/mnras/stz1182

Berlind A. A., Weinberg D. H., 2002, ApJ, 575, 587, doi:10.1086/341469

Berlind A. A. et al., 2003, ApJ, 593, 1, doi:10.1086/376517

Blumenthal G. R., Faber S. M., Primack J. R., Rees M. J., 1984, Nature, 311, 517, doi:10.1038/311517a0

Bolton A. S. et al., 2012, AJ, 144, 144, doi:10.1088/0004-6256/144/5/144

Bond J. R., Cole S., Efstathiou G., Kaiser N., 1991, ApJ, 379, 440, doi:10.1086/170520

Bose S., Eisenstein D. J., Hernquist L., Pillepich A., Nelson D., Marinacci F., Springel V., Vogelsberger M., 2019, MNRAS, 490, 5693, doi:10.1093/mnras/stz2546

Chen Y.-C. et al., 2017, MNRAS, 466, 1880, doi:10.1093/mnras/stw3127

Chua K. T. E., Pillepich A., Rodriguez-Gomez V., Vogelsberger M., Bird S., Hernquist L., 2017, MNRAS, 472, 4343, doi:10.1093/mnras/stx2238

Contreras S., Zehavi I., Padilla N., Baugh C. M., Jiménez E., Lacerna I., 2019, MNRAS, 484, 1133, doi:10.1093/mnras/stz018

Croton D. J., Gao L., White S. D. M., 2007, MNRAS, 374, 1303, doi:10.1111/j.1365-2966.2006.11230.x

Dawson K. S. et al., 2013, AJ, 145, 10, doi:10.1088/0004-6256/145/1/10

Diemer B., Kravtsov A. V., 2014, ApJ, 789, 1, doi:10.1088/0004-637X/789/1/1

Dragomir R., Rodríguez-Puebla A., Primack J. R., Lee C. T., 2018, MNRAS, 476, 741, doi:10.1093/mnras/sty283

Duffy A. R., Schaye J., Kay S. T., Dalla Vecchia C., Battye R. A., Booth C. M., 2010, MNRAS, 405, 2161, doi:10.1111/j.1365-2966.2010.16613.x

Eisenstein D. J. et al., 2011, AJ, 142, 72, doi:10.1088/0004-6256/142/3/72

Foreman-Mackey D., Hogg D. W., Lang D., Goodman J., 2013, PASP, 125, 306, doi:10.1086/670067

Gao L., White S. D. M., 2007, MNRAS, 377, L5, doi:10.1111/j.1745-3933.2007.00292.x

Gao L., Springel V., White S. D. M., 2005, MNRAS, 363, L66, doi:10.1111/j.1745-3933.2005.00084.x

Garrison L. H., Eisenstein D. J., Ferrer D., Metchnik M. V., Pinto P. A., 2016, MNRAS, 461, 4125, doi:10.1093/mnras/stw1594

Garrison L. H., Eisenstein D. J., Ferrer D., Tinker J. L., Pinto P. A., Weinberg D. H., 2018, ApJS, 236, 43, doi:10.3847/1538-4365/aabfd3

Guo H., Zehavi I., Zheng Z., 2012, ApJ, 756, 127, doi:10.1088/0004-637X/756/2/127

Guo H. et al., 2015, MNRAS, 446, 578, doi:10.1093/mnras/stu2120

Guo H., Yang X., Lu Y., 2018, ApJ, 858, 30, doi:10.3847/1538-4357/aabc56

Hadzhiyska B., Bose S., Eisenstein D., Hernquist L., Spergel D. N., 2020, MNRAS, 493, 5506, doi:10.1093/mnras/staa623

Hadzhiyska B., Bose S., Eisenstein D., Hernquist L., 2021, MNRAS, 501, 1603

Hansen N., Ostermeier A., 2001, Evolutionary Comput., 9, 159

Hartlap J., Simon P., Schneider P., 2007, A&A, 464, 399, doi:10.1051/0004-6361:20066170

Hearin A. P., Zentner A. R., van den Bosch F. C., Campbell D., Tollerud E., 2016, MNRAS, 460, 2552, doi:10.1093/mnras/stw840

Klypin A. A., Trujillo-Gomez S., Primack J., 2011, ApJ, 740, 102, doi:10.1088/0004-637X/740/2/102

Kraljic K. et al., 2019, MNRAS, 483, 3227, doi:10.1093/mnras/sty3216

Laigle C. et al., 2018, MNRAS, 474, 5437, doi:10.1093/mnras/stx3055

Lam S. K., Pitrou A., Seibert S., 2015, LLVM ’15: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. Association for Computing Machinery, New York

Landy S. D., Szalay A. S., 1993, ApJ, 412, 64, doi:10.1086/172900

Lange J. U., van den Bosch F. C., Zentner A. R., Wang K., Hearin A. P., Guo H., 2019, MNRAS, 490, 1870, doi:10.1093/mnras/stz2664

Lange J. U., Leauthaud A., Singh S., Guo H., Zhou R., Smith T. L., Cyr-Racine F.-Y., 2021, MNRAS, 502, 2074, doi:10.1093/mnras/stab189

Leauthaud A. et al., 2016, MNRAS, 457, 4021, doi:10.1093/mnras/stw117

Leauthaud A. et al., 2017, MNRAS, 467, 3024, doi:10.1093/mnras/stx258

Lee C. T., Primack J. R., Behroozi P., Rodríguez-Puebla A., Hellinger D., Dekel A., 2018, MNRAS, 481, 4038, doi:10.1093/mnras/sty2538

Levi M. et al., 2013, preprint (arXiv:1308.0847)

Li Y., Mo H. J., Gao L., 2008, MNRAS, 389, 1419, doi:10.1111/j.1365-2966.2008.13667.x

McEwen J. E., Weinberg D. H., 2018, MNRAS, 477, 4348, doi:10.1093/mnras/sty882

Mansfield P., Kravtsov A. V., 2020, MNRAS, 493, 4763, doi:10.1093/mnras/staa430

Mao Y.-Y., Zentner A. R., Wechsler R. H., 2018, MNRAS, 474, 5143, doi:10.1093/mnras/stx3111

Maraston C. et al., 2013, MNRAS, 435, 2764, doi:10.1093/mnras/stt1424

More S., Miyatake H., Mandelbaum R., Takada M., Spergel D. N., Brownstein J. R., Schneider D. P., 2015, ApJ, 806, 2, doi:10.1088/0004-637X/806/1/2

Navarro J. F., Frenk C. S., White S. D. M., 1997, ApJ, 490, 493, doi:10.1086/304888

Obuljen A., Percival W. J., Dalal N., 2020, J. Cosmol. Astropart. Phys., 10, 058

Paranjape A., Kovač K., Hartley W. G., Pahwa I., 2015, MNRAS, 454, 3030, doi:10.1093/mnras/stv2137

Peacock J. A., Smith R. E., 2000, MNRAS, 318, 1144, doi:10.1046/j.1365-8711.2000.03779.x

Peirani S. et al., 2017, MNRAS, 472, 2153, doi:10.1093/mnras/stx2099

Planck Collaboration XIII, 2016, A&A, 594, A13, doi:10.1051/0004-6361/201525830

Poudel A., Heinämäki P., Tempel E., Einasto M., Lietzen H., Nurmi P., 2017, A&A, 597, A86, doi:10.1051/0004-6361/201629639

Pujol A., Gaztañaga E., 2014, MNRAS, 442, 1930, doi:10.1093/mnras/stu1001

Reid B. A., Seo H.-J., Leauthaud A., Tinker J. L., White M., 2014, MNRAS, 444, 476, doi:10.1093/mnras/stu1391

Reid B. et al., 2016, MNRAS, 455, 1553, doi:10.1093/mnras/stv2382

Rodríguez-Torres S. A. et al., 2016, MNRAS, 460, 1173, doi:10.1093/mnras/stw1014

Saito S. et al., 2016, MNRAS, 460, 1457, doi:10.1093/mnras/stw1080

Salerno J. M., Martínez H. J., Muriel H., 2019, MNRAS, 484, 2, doi:10.1093/mnras/sty3456

Scoccimarro R., Sheth R. K., Hui L., Jain B., 2001, ApJ, 546, 20, doi:10.1086/318261

Sinha M., Garrison L. H., 2020, MNRAS, 491, 3022, doi:10.1093/mnras/stz3157

Skibba R. A., van den Bosch F. C., Yang X., More S., Mo H., Fontanot F., 2011, MNRAS, 410, 417, doi:10.1111/j.1365-2966.2010.17452.x

Song H. et al., 2020, MNRAS, preprint (arXiv:2009.00013)

Sunayama T., Hearin A. P., Padmanabhan N., Leauthaud A., 2016, MNRAS, 458, 1510, doi:10.1093/mnras/stw332

Tinker J. L., Wetzel A. R., Conroy C., Mao Y.-Y., 2017, MNRAS, 472, 2504, doi:10.1093/mnras/stx2066

Tinker J. L., Hahn C., Mao Y.-Y., Wetzel A. R., Conroy C., 2018a, MNRAS, 477, 935, doi:10.1093/mnras/sty666

Tinker J. L., Hahn C., Mao Y.-Y., Wetzel A. R., 2018b, MNRAS, 478, 4487, doi:10.1093/mnras/sty1263

van den Bosch F. C., Weinmann S. M., Yang X., Mo H. J., Li C., Jing Y. P., 2005, MNRAS, 361, 1203, doi:10.1111/j.1365-2966.2005.09260.x

Villarreal A. S. et al., 2017, MNRAS, 472, 1088, doi:10.1093/mnras/stx2045

Walsh K., Tinker J., 2019, MNRAS, 488, 470, doi:10.1093/mnras/stz1351

Wechsler R. H., Tinker J. L., 2018, ARA&A, 56, 435, doi:10.1146/annurev-astro-081817-051756

Wechsler R. H., Bullock J. S., Primack J. R., Kravtsov A. V., Dekel A., 2002, ApJ, 568, 52, doi:10.1086/338765

Wechsler R. H., Zentner A. R., Bullock J. S., Kravtsov A. V., Allgood B., 2006, ApJ, 652, 71, doi:10.1086/507120

Wetzel A. R., Nagai D., 2015, ApJ, 808, 40, doi:10.1088/0004-637X/808/1/40

Wetzel A. R., Tinker J. L., Conroy C., van den Bosch F. C., 2014, MNRAS, 439, 2687, doi:10.1093/mnras/stu122

White S. D. M., Rees M. J., 1978, MNRAS, 183, 341, doi:10.1093/mnras/183.3.341

Wibking B. D. et al., 2019, MNRAS, 484, 989, doi:10.1093/mnras/sty2258

Xu X., Zehavi I., Contreras S., 2021, MNRAS, in press (arXiv:2007.05545)

Yoshikawa K., Jing Y. P., Börner G., 2003, ApJ, 590, 654, doi:10.1086/375148

Yuan S., Eisenstein D. J., Garrison L. H., 2017, MNRAS, 472, 577, doi:10.1093/mnras/stx2032

Yuan S., Eisenstein D. J., Garrison L. H., 2018, MNRAS, 478, 2019, doi:10.1093/mnras/sty1089

Yuan S., Eisenstein D. J., Leauthaud A., 2020, MNRAS, 493, 5551, doi:10.1093/mnras/staa634

Zehavi I., Contreras S., Padilla N., Smith N. J., Baugh C. M., Norberg P., 2018, ApJ, 853, 84, doi:10.3847/1538-4357/aaa54a

Zentner A. R., 2007, Int. J. Modern Phys. D, 16, 763, doi:10.1142/S0218271807010511

Zentner A. R., Berlind A. A., Bullock J. S., Kravtsov A. V., Wechsler R. H., 2005, ApJ, 624, 505, doi:10.1086/428898

Zentner A. R., Hearin A. P., van den Bosch F. C., 2014, MNRAS, 443, 3044, doi:10.1093/mnras/stu1383

Zentner A. R., Hearin A., van den Bosch F. C., Lange J. U., Villarreal A., 2019, MNRAS, 485, 1196, doi:10.1093/mnras/stz470

Zhao D. H., Mo H. J., Jing Y. P., Börner G., 2003, MNRAS, 339, 12, doi:10.1046/j.1365-8711.2003.06135.x

Zhao D. H., Jing Y. P., Mo H. J., Börner G., 2009, ApJ, 707, 354, doi:10.1088/0004-637X/707/1/354

Zheng Z., Weinberg D. H., 2007, ApJ, 659, 1, doi:10.1086/512151

Zheng Z. et al., 2005, ApJ, 633, 791, doi:10.1086/466510

Zheng Z., Coil A. L., Zehavi I., 2007, ApJ, 667, 760, doi:10.1086/521074

Zhu G., Zheng Z., Lin W. P., Jing Y. P., Kang X., Gao L., 2006, ApJ, 639, L5, doi:10.1086/501501

Zu Y., 2020, preprint (arXiv:2010.01143)

APPENDIX: MULTIPOLE FITS

In Section 6.2, we performed a fit on the redshift-space multipoles, adopting the velocity bias model of Guo et al. (2015). The resulting best-fitting multipoles and projected correlation function w_p are shown in Fig. A1. We see that the multipole fit reproduces the data well up to l = 6, but is inconsistent with the data at l = 8. This suggests that the first three multipoles fail to capture the full information of redshift-space clustering and that a significant amount of information is left over in the higher multipoles.

Figure A1. The redshift-space multipole fit up to l = 4. The fit reproduces the data well up to l = 6, but fails to reproduce the data at l = 8. This suggests that significant information is left over in the higher-order multipoles.
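For readers less familiar with the multipole decomposition referred to above, the following minimal Python sketch shows the standard Legendre projection ξ_l(s) = (2l + 1)/2 ∫ ξ(s, μ) P_l(μ) dμ over μ from −1 to 1, evaluated at a single fixed separation s. The function names and the toy ξ(μ) are hypothetical and chosen only for illustration; this is not the estimator or code used in our analysis.

```python
def legendre(l, mu):
    """Legendre polynomial P_l(mu), built with the Bonnet recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, mu  # P_0 and P_1
    for n in range(1, l):
        # (n + 1) P_{n+1} = (2n + 1) mu P_n - n P_{n-1}
        p_prev, p = p, ((2 * n + 1) * mu * p - n * p_prev) / (n + 1)
    return p


def multipole(xi_of_mu, l, n_mu=2000):
    """Project xi(mu) on to P_l using a midpoint rule over mu in [-1, 1]."""
    dmu = 2.0 / n_mu
    total = 0.0
    for k in range(n_mu):
        mu = -1.0 + (k + 0.5) * dmu
        total += xi_of_mu(mu) * legendre(l, mu)
    return 0.5 * (2 * l + 1) * total * dmu


def toy_xi(mu):
    """Hypothetical anisotropic signal with l = 0, 2, and 8 components only."""
    return 1.5 - 0.4 * legendre(2, mu) + 0.1 * legendre(8, mu)


# Recover the input multipoles; l = 4 and l = 6 should be consistent with zero.
for l in (0, 2, 4, 6, 8):
    print(l, round(multipole(toy_xi, l), 4))
```

In this toy case, a model constrained only by l = 0, 2, and 4 would carry no information about the l = 8 component, which is the sense in which higher multipoles can retain information that the first three do not capture.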


This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
