Foreground modelling via Gaussian process regression: an application to HERA data

Ghosh, Abhik; Mertens, Florent; Bernardi, Gianni; Santos, Mário G; Kern, Nicholas S; Carilli, Christopher L; Grobler, Trienko L; Koopmans, Léon V E; Jacobs, Daniel C; Liu, Adrian; Parsons, Aaron R; Morales, Miguel F; Aguirre, James E; Dillon, Joshua S; Hazelton, Bryna J; Smirnov, Oleg M; Gehlot, Bharat K; Matika, Siyanda; Alexander, Paul; Ali, Zaki S; Beardsley, Adam P; Benefo, Roshan K; Billings, Tashalee S; Bowman, Judd D; Bradley, Richard F; Cheng, Carina; Chichura, Paul M; DeBoer, David R; Acedo, Eloy de Lera; Ewall-Wice, Aaron; Fadana, Gcobisa; Fagnoni, Nicolas; Fortino, Austin F; Fritz, Randall; Furlanetto, Steve R; Gallardo, Samavarti; Glendenning, Brian; Gorthi, Deepthi; Greig, Bradley; Grobbelaar, Jasper; Hickish, Jack; Josaitis, Alec; Julius, Austin; Igarashi, Amy S; Kariseb, MacCalvin; Kohn, Saul A; Kolopanis, Matthew; Lekalake, Telalo; Loots, Anita; MacMahon, David; Malan, Lourence; Malgas, Cresshim; Maree, Matthys; Martinot, Zachary E; Mathison, Nathan; Matsetela, Eunice; Mesinger, Andrei; Neben, Abraham R; Nikolic, Bojan; Nunhokee, Chuneeta D; Patra, Nipanjana; Pieterse, Samantha; Razavi-Ghods, Nima; Ringuette, Jon; Robnett, James; Rosie, Kathryn; Sell, Raddwine; Smith, Craig; Syce, Angelo; Tegmark, Max; Thyagarajan, Nithyanandan; Williams, Peter K G; Zheng, Haoxuan

doi:10.1093/mnras/staa1331

ABSTRACT

The key challenge in the observation of the redshifted 21-cm signal from cosmic reionization is its separation from the much brighter foreground emission. Such separation relies on the different spectral properties of the two components, although, in real life, the foreground intrinsic spectrum is often corrupted by the instrumental response, inducing systematic effects that can further jeopardize the measurement of the 21-cm signal. In this paper, we use Gaussian Process Regression to model both foreground emission and instrumental systematics in ∼2 h of data from the Hydrogen Epoch of Reionization Array. We find that a simple co-variance model with three components matches the data well, giving a residual power spectrum with white noise properties. These consist of an ‘intrinsic’ and instrumentally corrupted component with a coherence scale of 20 and 2.4 MHz, respectively (dominating the line-of-sight power spectrum over scales k_∥ ≤ 0.2 h cMpc⁻¹) and a baseline-dependent periodic signal with a period of ∼1 MHz (dominating over k_∥ ∼ 0.4–0.8 h cMpc⁻¹), which should be distinguishable from the 21-cm Epoch of Reionization signal whose typical coherence scale is ∼0.8 MHz.

instrumentation: interferometers, methods: statistical, dark ages, reionization, first stars, diffuse radiation, large-scale structure of Universe, cosmology: observations

1 INTRODUCTION

Observations of the redshifted 21-cm signal from neutral Hydrogen hold the promise of revealing the detailed astrophysical processes occurring during the Epoch of Reionization (EoR) and the Cosmic Dawn (CD). The 21-cm signal can provide insights into the formation and evolution of the first structures in the Universe (see e.g. Furlanetto, Oh & Briggs 2006; Morales & Wyithe 2010; Pritchard & Loeb 2012; Mellema et al. 2013, for reviews): for example, when the intergalactic medium (IGM) is still largely neutral, it is a sensitive probe of the first sources of Ly α and X-ray radiation (Mesinger & Furlanetto 2007; Santos et al. 2010, 2011; McQuinn & O’Leary 2012; Fialkov, Barkana & Visbal 2014; Fialkov et al. 2017) and, during the subsequent EoR, its large-scale fluctuations map the evolution of the global ionization fraction (Lidz et al. 2008; Bolton et al. 2011). The 21-cm emission gives insights into the nature of formation of the first stars, galaxies, and their impact on the physics of the IGM (Loeb & Furlanetto 2013; Zaroubi 2013).

At present, several experiments are attempting to detect the power spectrum of the 21-cm signal from the EoR (e.g. GMRT,¹ LOFAR,² MWA,³ PAPER⁴) or the sky-averaged 21-cm emission using a single dipole (Bowman & Rogers 2010; Greenhill & Bernardi 2012; Patra et al. 2015; Bernardi et al. 2016). Some of these ongoing efforts have achieved increasingly better upper limits on the 21-cm signal power spectra (Ali et al. 2015; Beardsley et al. 2016; Patil et al. 2017; Barry et al. 2019; Kolopanis et al. 2019; Li, Pober & Barry 2019), showing the way for the second-generation experiments such as the Square Kilometre Array (SKA⁵) and the Hydrogen Epoch of Reionization Array (HERA⁶). Recently, a detection of an absorption profile in the sky-averaged 21-cm signal centred at 78 MHz has been reported (Bowman et al. 2018), although the unexpected depth of the trough is calling for independent confirmations (Fraser et al. 2018) – including interferometric observations (Gehlot et al. 2019).

The main challenge in detecting the faint 21-cm signal is the presence of Galactic and extra-galactic foregrounds that are around 3–4 orders of magnitude stronger (e.g. Bernardi et al. 2009, 2010; Ghosh et al. 2011; Dillon et al. 2014; Parsons et al. 2014). Foregrounds as well as the instrumental response have a highly correlated continuum spectrum and can, in principle, be separate from the 21-cm signal that has structure on smaller frequency scales due to the intrinsic redshift evolution of the IGM (e.g. Bharadwaj & Sethi 2001; Zaldarriaga, Furlanetto & Hernquist 2004; Santos, Cooray & Knox 2005). However, the inherent smoothness of the foreground emission is often compounded by the interferometric response (mode-mixing), including frequency-dependent primary beams, side-lobe from bright, mis-subtracted sources, and ionospheric distortions (Bowman, Morales & Hewitt 2009; Koopmans 2010; Ghosh et al. 2011; Vedantham, Udaya Shankar & Subrahmanyan 2012; Yatawatta et al. 2013; Barry et al. 2016; Vedantham & Koopmans 2016; Patil et al. 2017; Byrne et al. 2019; Gehlot et al. 2019). Polarization leakage due to improper calibration may also add additional spectral structures to the unpolarized cosmological 21-cm window (Geil, Gaensler & Wyithe 2011; Asad et al. 2015; Nunhokee et al. 2017).

The study of foreground properties and their separation from the 21-cm signal have been a very active research area over the years (e.g. Datta, Bowman & Carilli 2010; Liu & Tegmark 2011; Morales et al. 2012; Trott, Wayth & Tingay 2012; Dillon et al. 2014). One strategy is to attempt to ‘avoid’ foregrounds, i.e. to avoid k modes that are contaminated by foregrounds and to estimate the 21-cm power spectrum using the uncontaminated modes. This assumes that foregrounds are well localized in k-space and the mode-mixing effects can be kept well under control (Thyagarajan et al. 2015). This foreground avoidance method has the disadvantage of considerably reducing the sensitivity of the instrument because of reduction in the number of k-modes that can be probed to characterize the EoR signal (Pober et al. 2014). The second approach involves subtracting the best possible foreground model and, possibly, recover access to the foreground-dominated power spectrum region. One of the possible disadvantages here is the risk of contamination of the cosmological 21-cm signal from the cleaning process. Foreground wedge also corrupts nearly all the redshift space 21-cm signal, making it difficult to extract cosmological information without foreground subtraction (Pober 2015; Jensen et al. 2016). There are also recent efforts to develop a somewhat hybrid analysis where a Galactic and Extragalactic All-sky MWA (GLEAM; Hurley-Walker et al. 2017) catalogue of sources including Pictor A and Fornax A were first subtracted from Precision Array for Probing the Epoch of Re-ionization data and then the power spectrum was estimated. This is equivalent to an additional visibility-based filtering within the foreground avoidance paradigm (Kerrigan et al. 2018).

Chapman et al. (2016) pointed out that blind foreground removal methods such as Generalized Morphological Component Analysis (GMCA; Chapman et al. 2013) can still model relatively non-smooth foregrounds effectively on short baselines (k_⊥ ≲ 0.2 h cMpc⁻¹), while avoidance suffers some degradation as the frequency-dependent small-scale structure cannot be confined purely in a region at small k_∥ modes. Several techniques have been proposed to model and remove foreground emission taking advantage of its spectral smoothness, including parametric (Jelić et al. 2008; Bonaldi & Brown 2015) and non-parametric methods (Harker et al. 2009; Chapman et al. 2013; Mertens, Ghosh & Koopmans 2018; Mertens et al. 2020). Both methods have the limitation that they may suppress the 21-cm signal and do not always reach a level of modelling error better than the noise for k ≤ 0.3 h cMpc⁻¹, compared to the desired level of the 21-cm signal power spectrum (Mertens et al. 2018). In general, foreground subtraction allows to use virtually all k modes at the risk of contamination of the 21-cm signal, whereas foreground avoidance does not corrupt the cosmological signal within the EoR window, but cannot take advantage of any mode in the foreground wedge.

Recently, a novel, non-parametric (in the signal) method based on Gaussian Process Regression (GPR) has been studied in detail with simulations, where intrinsic smooth foregrounds, mid-scale frequency fluctuations associated with mode-mixing, Gaussian random noise, and a basic 21-cm signal model, are modelled with Gaussian Process (GP), and subsequently a separation with a precise estimation of the uncertainty was carried out (Mertens et al. 2018; Gehlot et al. 2019). The advantage of this method over previous ones is its implementation in a Bayesian framework that allows to incorporate different physical processes in the form of covariance structure priors (currently spectral and possible spatial implementation in future) on the various components. GPR further allows much better control over the coherence structure (and hence power spectra) of all components rather then be ‘blind’ for their physical origins, as are GMCA, Independent Component Analysis, or fitting polynomials. Further, it also offers a good way to extract foreground models from the data.

In this paper, we apply GPR to model foregrounds in a ∼100 min long observation with HERA-47. The GPR method was originally developed to be applied to observations with good uv coverage, but here we adapted it to work directly to visibilities, without being affected by the sparse HERA uv coverage. Foreground modelling helps us to assess the level of contamination of the data and the covariance models that can properly describe foregrounds. It can ultimately guide the foreground cleaning process and help finding the scales that should be safe to use in a foreground avoidance approach. We use the line-of-sight and the delay power spectrum in the (k_⊥, k_∥) plane as our metric to characterize the foreground models.

The paper is organized as follows: Section 2 summarizes our observations along with the delay power spectrum estimation procedure; Section 3 describes our technique to calculate the foreground power spectrum using the GPR formalism. Finally, we conclude in Section 4. Cosmological parameters used here are from Planck Collaboration XIII (2016).

2 OBSERVATIONS AND DATA REDUCTION

The HERA (DeBoer et al. 2017) is an ongoing experiment to use the red-shifted 21-cm radiation originating from the cosmological distribution of neutral hydrogen (⁠|$\rm H\,{\small I}$|⁠) to study the formation of first stars and black holes from CD (z ∼ 30) to the full IGM ionization history (6 ≲ z ≲ 12). In its final configuration, the array will consist of 350 parabolic dishes of ∼14 m diameter, with an effective area of ∼154 m² per antenna, closely packed in a hexagonal split-core (Dillon & Parsons 2016), plus outriggers up to ∼1 km distance. The experiment is optimized for robust power spectrum detection while minimizing foreground contamination (Pober et al. 2014; Thyagarajan et al. 2015; Ewall-Wice et al. 2017)

In this paper, we used data from the deployment of the first 47 dish (HERA–47) array. It covers a frequency range of 100–200 MHz with a channel resolution of ∼97.6 kHz. The results presented in this paper were generated from 10 nights of HERA–47 data, starting on 2017 October 5, using only the ‘xx’ polarization cross-products. In the paper, we refer to this as stokes ‘I’. We selected snapshots of 10 min (see Fig. 1 for the corresponding uv coverage) of data close to 21^h Local Sidereal Time (LST) over 10 d from the HERA data repository.⁷ In total, we used around 100 min of data. We used the pyuvdata⁸ software (Hazelton et al. 2017) to convert the correlator output to the Common Astronomy Software Applications (casa)⁹ Measurement Set format. Antenna 50 was found to be bad for the initial 7 d and was permanently flagged. We also flagged the band edges as well as the channels that were persistently affected by Radio Frequency Interference, i.e. mostly the following channels: 0–100, 379–387, 510–512, 768–770, 851–852, and 901–1023, where channel ‘0’ corresponds to 100 MHz and channel ‘1023’ to 200 MHz. We then used the casa task RFLAG to perform further flagging in time and frequency. The threshold for ‘timedevscale’ and ‘freqdevscale’ was fixed to the default values of ‘5’ each. This implies that for each channel any visibility will be flagged if the local RMS of its real and imaginary part, is larger than five times (RMS + median deviation) within a sliding time window. Similarly, for each integration time, the real and imaginary parts of the visibilities were flagged if they exceed five times the deviation from the median value across channels.

Figure 1.

This figure shows the 10 min uv coverage around 150 MHz of HERA-47 which are analyzed in this paper.

Open in new tab Download slide

Calibration was performed using custom casa pipelines.¹⁰ The starting flux density model included the five brightest point sources within the HERA field of view (GLEAM 2101–2800, GLEAM 2107–2525, GLEAM 2107–2529, GLEAM 2101–2803, and GLEAM 2100–2829), chosen from the MWA GLEAM point source catalogue (Hurley-Walker et al. 2017). Their model flux density was corrected for the HERA primary beam response following the electromagnetic simulations of the HERA feed and dish (Fagnoni & de Lera Acedo 2016), obtaining a flux density estimate for each source at each frequency channel. This sky model was used to solve for three types of antenna gains: antenna-based delay (‘K’ term in the casa terminology), followed by a complex gain for all the channels and the whole 10 min interval (‘G’ term in the casa terminology) and by a complex bandpass calibration (‘B’ term in the casa terminology). Calibration solutions were determined for the snapshot observation on 2017 October 15 and applied to the rest of the nine nights of data, though data from each day was flagged individually. The calibration solutions are used as-is, and are not smoothed across frequency before applying them to the data. This can allow spectral-dependent calibration errors generated by unmodelled sky sources and baseline-dependent systematics to be applied to the data, and can further corrupt the EoR window (Kern et al. 2020b); however, in this work we seek to model these terms through a combination of the foreground mode mixing and periodic kernel discussed next.

Calibrated visibilities were phased to a common right ascension α = 21^h and Fourier transformed into 21° × 21° images using the w-projection algorithm with 128 planes and the multifrequency synthesis algorithm to combine the whole bandwidth together. Uniform weights were used, leading to a 43.2 arcmin × 35.4 arcmin synthesized beam. Each image was conservatively deconvolved down to a threshold of |$10{{\ \rm per\ cent}}$| of the image peak using the Cotton–Schwab algorithm implemented in the casaCLEAN task.

Images of the 10 snapshots are shown in Fig. 2. Images at different days are very similar, qualitatively showing good instrumental stability. Image to image variation of the RMS noise in regions of the sky that are mostly empty (away from phase center and void of sources) is between 0.35 and 0.45 Jy beam⁻¹. In these parts of the sky, the primary beam response for the individual fields is considerably lower than the field centre and we expect them to be noise dominated. As the primary beam response slightly changes based on the transit time at the HERA location, we find that the peak flux density of the images varies up to |$5\!-\!10{{\ \rm per\ cent}}$| over the 10 d (Fig. 3), likely due to time variations of the bandpass and imperfect primary beam corrections across snapshots – the primary beam was computed for the first snapshot but observations took place at slightly different LSTs. This variation essentially sets the accuracy of our absolute flux density calibration. We also note that Cygnus A is above the horizon at the time when observations were taken. Although ∼70° away from the pointing direction and therefore heavily attenuated by the primary beam, it still appears as a source with ∼7 Jy beam⁻¹ peak flux density, possibly affecting the bandpass calibration. We also leave for future work the application of techniques that leverage on the array redundancy to improve calibration (Marthi & Chengalur 2014; Zhang, Liu & Parsons 2018; Grobler, Bernardi & Kenyon 2018; Dillon et al. 2018; 2020).

This figure shows 10 snapshot images from HERA-47 at 150 MHz. The flux density scale is in Jy beam−1 units.

Figure 2.

This figure shows 10 snapshot images from HERA-47 at 150 MHz. The flux density scale is in Jy beam⁻¹ units.

Open in new tab Download slide

This figure display the ‘difference image’ where the mean image has been subtracted out. The flux density scale is in Jy beam−1 units.

Figure 3.

This figure display the ‘difference image’ where the mean image has been subtracted out. The flux density scale is in Jy beam⁻¹ units.

Open in new tab Download slide

2.1 LST binning and SEFD evaluation

We bin each night of visibility data in LST. We chose a 2 min bin resolution, such that we can minimize the variation of the primary beam. For each observing night and LST bin, we only average redundant baselines (e.g. baselines of the same length and orientation). This ensures that we are coherently averaging the baselines and not mixing up emissions from the sky as the earth rotates.

We empirically estimate the System Equivalent Flux Density (SEFD) of the different LST combined visibility data sets by taking the difference of two adjacent frequency channels (Patil et al. 2016). This difference can be used to estimate the noise RMS, σ(u, v, ν). For each polarization, we have (Thompson, Moran & Swenson 2017)

$$\begin{eqnarray*} \sigma (u, v, \nu) = \frac{1}{\sqrt{N_{\rm vis}(u, v, \nu)}} \frac{\rm {SEFD(\nu)}}{ \sqrt{\Delta \nu \, \Delta t}}, \end{eqnarray*}$$

(1)

where Δν and Δt are the frequency channel width and the integration time, respectively. We use equation (1) to estimate the SEFD for the different LST bins as a function of frequency (Fig. 4). The factor N_vis(u, v, ν) in equation (1) is the number of redundant visibilities. We find a SEFD ∼(9.5 ± 2.4) × 10³ Jy (here, the mean and the uncertainty is estimated from all the LST bins and frequency channels in Fig. 4) for the 157.03–167.09 MHz range that we use for the power spectrum analysis. In temperature units, this is equivalent to ∼327 ± 84 K around a central frequency of 162 MHz. We use a scaling factor of |$(10^{-26} \, \lambda ^2) / (2 \, {\rm k_{B}} \, \Omega _{{\rm P}})$|⁠, where Ω_P is the angular area of the primary beam (Parsons et al. 2017), to convert from Jy to K. The estimated SEFD values are consistent with the HERA system temperature derived using differences of visibility spectra for sky-calibrated data for a fixed LST on two consecutive days (Carilli 2017).

Figure 4.

Estimated SEFD as a function of frequency for the different LST bins. The dashed line shows the mean value for all the LST bins and frequency channels.

Open in new tab Download slide

Visibilities observed at the same LST time should ‘see’ the same sky. Assuming that over the 2-min LST bin the change in the primary beam is not significant, all the 2-min averaged visibilities corresponding to similar LST bins are therefore also coherently averaged (after baselines of same length and slope have been averaged). Visibility data sets from different LST bins, on the other hand, correspond to different parts of the sky and therefore cannot be coherently averaged. However, the 21-cm signal power spectra should only depend on baseline length, not time. We can thus incoherently combine them when producing power spectra (e.g. we average the power spectrum from different LST bins). In the following subsection, we describe our power spectrum estimation procedure. We focus our discussion on the line-of-sight and delay power spectrum in the k_⊥−k_∥ plane or, equivalently, baseline–delay plane.

2.2 Delay power spectrum

Intrinsic flat spectrum sky emission appears as a Dirac delta function in delay space, where the Fourier transform along the frequency axis (delay transform) acts as a one-dimensional, per-baseline ‘image’ (Parsons et al. 2012a). Smooth-spectrum foregrounds are bound by the maximum geometric delay that depends upon the baseline length. We investigate such foreground isolation via the ‘delay spectrum’ |$\hat{V}(\mathbf {u}, \tau)$|⁠, defined as the inverse Fourier transform of |$V(\mathbf {u}, f)$| along the frequency coordinate (Parsons et al. 2012a,b):

$$\begin{eqnarray*} \hat{V}(\boldsymbol {u}, \tau) \equiv \int V(\boldsymbol {u}, f)\, W(f)\, e^{i2\pi f\tau }\, \mathrm{d}f , \end{eqnarray*}$$

(2)

where W(f) is a spectral window function (Vedantham et al. 2012; Thyagarajan et al. 2013; Choudhuri et al. 2016) and τ represents the signal delay between antenna pairs |$\tau = \frac{\boldsymbol {u}\cdot \hat{\boldsymbol{s}}}{c}$|⁠, where|$\boldsymbol {u}$| is the baseline vector towards the direction |$\hat{\boldsymbol{s}}$| and c is the speed of light. We finally squared the visibilities, |$\hat{V}(\boldsymbol {u}, \tau)$|⁠, to form the delay power spectrum. Unlike an image-based estimator where the upper and lower frequencies incorporate information from baselines of different physical length, the delay power spectrum respects baseline migration, i.e. the same baselines contribute to all frequencies (e.g. Morales et al. 2012). In our analysis, we used baselines with length |$|\boldsymbol {u}| \le 60 \, {\rm m}$| and a non-uniform discrete Fourier transform to compute the line-of-sight delay transform of the visibilities in order to take proper account of the flagged frequency channels. We choose a Blackman window function that offers a ∼−67 dB side lobe suppression.

For a single baseline, we can estimate the delay power spectrum (e.g. the cylindrical power spectrum) as (Parsons et al. 2012a):

$$\begin{eqnarray*} P(k_{\perp }, k_{\parallel }) \approx \left (\frac{10^{-26} \lambda ^2}{2 \, {\rm k_B}} \right)^2 \times \frac{X^2 Y}{\Omega _{\mathrm{PP}} B} \left| \hat{V}(\boldsymbol {u}, \tau)\right|^2 , \end{eqnarray*}$$

(3)

where λ corresponds to the wavelength of the mid-frequency of the band, k_B is the Boltzmann constant, B is the bandwidth, Ω_PP is the angular area of the primary beam, and X, Y are the conversion factors from angle and frequency to co-moving scales. As discussed, the power spectrum is averaged over all LST bins. Moreover, we also average over all k modes with the same k_⊥, i.e. the modes that have the same baseline length. We used the power-square beam from the HERA beam measurements¹¹^,12(Parsons et al. 2017) to estimate the beam area (equation B10 in Parsons et al. 2014). The power spectrum has units of |${\rm K^{2} \, [h^{-3}\, cMpc^{3}}]$|⁠. Fourier modes (k_⊥, k_∥) are in units of inverse co-moving distance and are given by (e.g. Morales, Bowman & Hewitt 2006; Trott et al. 2012)

$$\begin{eqnarray*} k_{\perp } = \frac{2 \pi |\boldsymbol {u}|}{D_M(z)}, \end{eqnarray*}$$

(4)

$$\begin{eqnarray*} k_{\parallel } = \frac{2 \pi H_0 \nu _{21} E(z)}{c(1+z)^2} \tau , \end{eqnarray*}$$

(5)

$$\begin{eqnarray*} k = \sqrt{k_{\perp }^2 + k_{\parallel }^2} , \end{eqnarray*}$$

(6)

where D_M(z) is the transverse co-moving distance, H₀ is the Hubble constant, ν₂₁ is the frequency of the hyperfine transition, and E(z) is the dimensionless Hubble parameter (Hogg 1999).

Fig. 5 shows the delay power spectrum for a 2 min LST binned data (e.g. we consider one LST bin only) as a function of k_∥, up to τ ∼ 3.6 μs, corresponding to k_∥ ∼ 2.0 h cMpc⁻¹. We used a 10 MHz bandwidth centred at 162.06 MHz to estimate the delay spectrum. We found most of the foreground power is confined within k_∥ ≤ 0.2 h cMpc⁻¹ and foreground excess beyond that is largely limited for most baselines, however, there is some signature of a signal with an ∼1 MHz period (Kern et al. 2020a), corresponding to k_∥ ∼ 0.5 h cMpc⁻¹, for all the baselines considered.

Figure 5.

Power spectrum of a 2 min LST binned data as a function of k_∥ for the baselines included in our analysis. The baselines in units of meters are shown in the legend.

Open in new tab Download slide

Fig. 6 presents the delay power spectra in the k_⊥−k_∥ plane related to the same LST bin. We found that the smooth diffuse foreground in the k_⊥−k_∥ plane dominates at low k_∥, with most power localized within k_∥ ≤ 0.2 h cMpc⁻¹. The foreground power drops by four to five orders of magnitude in the k_∥ ≥ 0.2 h cMpc⁻¹ region, where the EoR signal is expected to dominate over the foreground emission. We notice some signature of a wedge-like structure in k space (Datta et al. 2010; Morales et al. 2012), although in the current HERA antenna layout we are mostly limited to short baselines and hence the foreground wedge is not clearly visible. This wedge line is defined by (Liu, Parsons & Trott 2014; Dillon et al. 2015)

$$\begin{eqnarray*} k_{\parallel } = \left[\mathrm{sin}(\theta _{\mathrm{field}}) \frac{H_{0}D_{M}(z)E (z)}{c \, (1+z)}\right]k_{\perp }, \end{eqnarray*}$$

(7)

where θ_field is the angular radius of the field of view. We also show on the figure the wedge line corresponding to the horizon limit (θ_max = 90°).

Figure 6.

Delay power spectrum in the (k_⊥−k_∥) plane for a 2 min LST binned data. The dashed line represents the horizon line corresponding to θ_max = 90°.

Open in new tab Download slide

3 GAUSSIAN PROCESS REGRESSION AND FOREGROUND CHARACTERIZATION

The delay power spectrum results show that the data is mostly dominated by foregrounds. GPR offers a way to model these foregrounds in a maximum likelihood way. In this section, we summarize the GPR formalism (for a detailed review of how the method works see Mertens et al. 2018) and apply it to model foreground components in HERA-47 observations.

In this framework, the different components of 21-cm observations, such as the astrophysical foregrounds, mode-mixing contaminants, and the 21-cm signal, are modelled with a GP. A GP is the joint distribution of a collection of normally distributed random variables (Rasmussen et al. 2005; Gelman et al. 2014). The covariance matrix of this distribution is specified by a covariance function, which defines the covariance between pairs of observed data points (i.e. at different frequencies). The co-variance function ultimately determines the structure that the GP will be able to model (for example, here the smoothness of the foregrounds).

The GPR process requires the choice of the model for the covariance function and a selection of the best-fittingting parameters of such a model (what we call the hyper-parameters). Model selection is done in a Bayesian sense by maximizing the marginal likelihood, also called the evidence, which is the integral of the likelihood over the prior range, given the data. For a fixed model, standard gradient-based optimization or Monte Carlo Markov Chain (MCMC) methods can be adapted to determine the best-fitting parameters of the covariance functions. We note here that currently we model the data only in the frequency axis and no baseline dependence has been introduced in the hyper-parameter optimization with GPR (i.e. there is no dependence on baseline length). This assumption is supported by Fig. 5, where we can see that the power spectrum is similar for different baseline lengths.

In the following equations, |$\boldsymbol {d}$| represents the time-averaged visibilities within a given LST bin and we have not explicitly shown the time dependence of the data. Considering an observed data |$\boldsymbol {d}$| and a GP co-variance model that includes a foreground term K_f and a residual term (noise and 21-cm signal) K_r, the data co-variance can be expressed as, K = K_f + K_r. After GPR, we can retrieve the foreground part of the signal |$E(\boldsymbol {f_{\mathrm{ fg}}})$| that always refers to the total signal except for noise or 21-cm signal through basically a Wiener filter (Wiener 1949):

$$\begin{eqnarray*} E(\boldsymbol {f_{\mathrm{ fg}}}) = K_{\mathrm{f}} \left[K_{\mathrm{f}} + K_{\mathrm{r}} \right]^{-1} \boldsymbol {d}. \end{eqnarray*}$$

(8)

In the GPR context, this is referred to as the posterior mean matrix, while

$$\begin{eqnarray*} \mathrm{cov}(\boldsymbol {f_{\mathrm{ fg}}}) = K_{\mathrm{f}} - K_{\mathrm{f}} \left[K_{\mathrm{f}} + K_{\mathrm{r}} \right]^{-1} K_{\mathrm{f}} \end{eqnarray*}$$

(9)

is the posterior co-variance matrix.

Assuming that the GP co-variance model is optimal and taking |$\langle \bf {d} \bf {d}^{H}\rangle = K_{\mathrm{f}} + K_{\mathrm{r}}$|⁠, then |$\langle E(\boldsymbol {f_{\mathrm{ fg}}})E(\boldsymbol {f_{\mathrm{ fg}}})^H \rangle = K_{\mathrm{f}} - \mathrm{cov}(\boldsymbol {f_{\mathrm{ fg}}})$|⁠. This highlights that to obtain the expected co-variance model of the foregrounds, K_f, directly from |$E(\mathbf {f_{fg}})$|⁠, we need to un-bias the estimator using |$\mathrm{cov}(\mathbf {f_{fg}})$|⁠. We implement a similar unbiasing for the delay power-spectra of the different foreground components by first taking the delay transform of |$E(\mathbf {f_{fg}})$| and |$\mathrm{cov}(\mathbf {f_{fg}})$| and then adding them in the power spectrum domain. We finally normalize by the observed cosmological volume to construct the delay power spectrum P(k_⊥, k_∥) in units of |${\rm K^{2} \, [h^{-3}\, cMpc^{3}}]$|⁠. More specifically, we calculate the covariance matrices by fitting the hyper-parameters to all the data, while the posterior mean is obtained for each time-averaged visibility (so, the covariance calculated from |$E(\mathbf {f_{fg}})$| is not necessarily the same as the initial K_f). In this paper, we consider the power spectrum of the different foreground components. This implies calculating |$E(\boldsymbol {f_{\mathrm{ fg}}})$| for each of the foreground components, where we replace K_f by the optimized co-variance of the corresponding foreground component (while keeping the term in square brackets, [K_f + K_r], the same, since it is the total co-variance).

3.1 Covariance functions

In this section, we review the co-variance functions for the different components of the data. The selection of a co-variance function κ for the 21-cm signal can be chosen by comparison to a range of 21-cm signal simulations. For this analysis, we choose a Matern η = 1/2 co-variance function with a frequency coherence scale l parameter:

$$\begin{eqnarray*} \kappa _{\mathrm{Matern}}(\nu _p, \nu _q) = \sigma _{\rm f}^2 \frac{2^{1 - \eta }}{\Gamma (\eta)}\left(\frac{\sqrt{2\eta }r}{l}\right)^{\eta } K_{\eta }\left(\frac{\sqrt{2\eta }r}{l}\right), \end{eqnarray*}$$

(10)

where σ_f is the signal variance, r = |ν_q − ν_p| and K_η is the modified Bessel function of the second kind. The parameter η controls the smoothness of the resulting function. For η = 1/2, the Matern kernel is equivalent to an exponential kernel. The choice of this co-variance kernel well matches the co-variance of the EoR signal with 21cmFAST (Mesinger, Furlanetto & Cen 2011, Fig. 7). Following Mertens et al. (2018), we used a uniform prior in the 0.01–1.25 MHz range on the hyper-parameter l.

Figure 7.

Example of the normalized GP exponential co-variance function with a frequency coherence scale l_inj = 0.8 MHz (dot–dashed line), compared to the co-variance of a simulated 21-cm EoR signal at different redshifts using 21cmFAST (Mesinger & Furlanetto 2007; Mesinger et al. 2011).

Open in new tab Download slide

The intrinsic smooth foregrounds are modelled with a Radial Basis Function (RBF) kernel (also known as the ‘squared-exponential’ or a ‘Gaussian’ kernel):

$$\begin{eqnarray*} \kappa _{\mathrm{RBF}}(\nu _p, \nu _q) = \sigma _{\rm f}^2 \exp \left(- \frac{r^2}{2 l^2}\right), \end{eqnarray*}$$

(11)

where the coherence scale l controls the smoothness of the function, σ_f is the signal variance and the frequency coherence scale was bounded in the 10–200 MHz range. We note that the Matern kernel (equation 10) is a generalization of the RBF kernel, parametrized by an additional parameter η. When η tends to infinity, the kernel becomes equivalent to RBF kernel. Medium-scale fluctuations coming from a combination of the instrumental chromaticity and imperfect calibration (termed as ‘mode-mixing’ components) are also modelled by a GP with an RBF covariance function where the characteristic coherence scale l is bounded in the 2–20 MHz range.

3.2 Foreground modelling

Here, we discuss the GPR model foreground components, including the modelling and subsequent removal of the frequency and amplitude, modulated periodic signal with an additional GP co-variance kernel. Again, following Mertens et al. (2018), we modelled the GP co-variance function by decomposing the foreground co-variance as

$$\begin{eqnarray*} K_{\mathrm{fg}} = K_{\mathrm{sky}} + K_{\mathrm{mix}}, \end{eqnarray*}$$

(12)

where the ‘sky’ denotes the intrinsic smooth foreground sky and ‘mix’ denotes the mode-mixing contaminants that introduce oscillations in frequency mostly caused by the instrument. It is expected that K_sky will pick up the frequency dependence of the foreground signal at a given uv point, whereas the mixing component K_mix can model relatively rapidly varying foreground components such as the fact that the uv point itself also moves in the uv plane with frequency (Morales et al. 2012) and hence is sensitive to extra angular and frequency scale structures. We remind the reader that here we use the RBF kernel to model the foregrounds and an exponential kernel is used to represent the 21-cm signal co-variance function. To select the optimal mode-mixing co-variance function, we considered the Matern kernel with η = 3/2 and 5/2 along with the RBF kernel with a uniform prior in the 2–20 MHz range. We found that the difference in the log-likelihood for the RBF kernel from the Matern 3/2 and the Matern 5/2 kernels are 1717 and 854, respectively (keeping the other covariances fixed). Based on this evidence, we choose to use the RBF kernel to model the mode-mixing component. We optimize the log-marginal-likelihood for the full set of visibilities (real and imaginary part separately) for the six variances and the coherence length scales hyper-parameters (namely, |$\sigma _{21}^2$|⁠, l₂₁, |$\sigma _{{\rm sky}}^2$|⁠, l_sky, |$\sigma _{{\rm mix}}^2$|⁠, and l_mix), assuming the coherence scale is spatially invariant, i.e. the same for each baseline type. The python package gpy¹³ is used to do the optimization using the full set of visibilities. The noise term is modelled with a fixed variance where the covariance matrix describes the variance along the frequency direction. The noise in the real data has both a frequency and a time dependence but here we choose only the frequency axis to approximate the noise variance. We found that the frequency coherence scale of the ‘sky’ and ‘mix’ co-variance kernel are about 20 and 2.4 MHz, respectively.

In general, the coherence scale is expected to be dependent on the baseline length, with longer baselines de-correlating faster than shorter baselines. We investigate this effect by implementing a ‘per-baseline’ GPR approach that allows us to model a coherence scale for different baseline lengths using a smaller data set with an increased number of degrees of freedom. We found that the coherence scale decreases at longer baselines (Fig. 8), from l ∼ 3.2 MHz for the 14.6 m baseline to l ∼ 2.2 MHz for ∼60 m baseline. In the ‘all-baselines’ GPR implementation, the optimal coherence scale was about 2.4 MHz, which falls inside the maximum–minimum range of ‘per-baseline’ GPR approach. These results further agree well with our initial choice of the 2–20 MHz prior range for the medium-scale frequency fluctuations introduced in Section 3.1. The main reason for this behaviour is the limited baseline range used in this analysis over which the foreground co-variance remains similar.

Figure 8.

The optimized co-variance scale l_mix for the mode-mixing kernel as a function of baseline lengths using a ‘per-baseline’ GPR technique.

Open in new tab Download slide

The inclusion of significantly longer baselines would likely require a ‘per-baseline’ GPR fit as the foreground coherence will change more significantly across the range of baseline lengths. This could be implemented without significantly increasing the number of degrees of freedom by allowing the coherence scale parameters to be a function of the baseline length. We leave this investigation for future work.

We then considered all nights, coherently LST combined data sets for which we estimate different foreground components using GPR. For the GPR foreground modelling and the power spectrum estimation we used the python package ps_eor.¹⁴

Fig. 9 shows the power spectrum and the variance across frequency for the different foreground components that we recover using the GPR technique. Note that in our GPR power spectrum estimation method, the hyper-parameters of the covariance model are optimized using the Bayesian evidence. Doing an MCMC, we then get the posterior distribution of these hyper-parameters, and this is the first source of uncertainty that we use and propagate to the power spectra. The shaded area in Fig. 9 shows the propagated 2σ uncertainty on the hyper-parameters and the uncertainty on the model fit (equation 9) and from the MCMC run (for details see Section 3.2.3) on to the different foreground power spectrum components. An important point to note here is that the uncertainty on the power spectra that we estimate are correct assuming that our assumed co-variance functions are appropriate.

Figure 9.

Power spectra and derived foreground components after GPR analysis of the coherently averaged combined data: spherically averaged power spectra (left-hand panel), delay spectra averaged over all baselines (middle panel). The right-hand panel shows the variance of the different foreground components across frequency. The shaded area highlights the uncertainty (2σ) from the GP process (equation 9) and the uncertainty on the model fit from the MCMC run on to the foreground power spectrum components.

Open in new tab Download slide

We notice that the ‘FG mix’ model has a small coherence scale (2–3 MHz) and therefore the variance has a wave-like pattern, but for the intrinsic foregrounds it is mostly smooth across frequency. We detect a ‘bump’ in the power spectrum around k_∥ ∼ 0.5 h cMpc⁻¹, corresponding to a ∼900 ns delay, indicating the presence of a non-negligible contamination in the data (possibly due to internal signal chain reflections, or a more dominant instrumental cross-talk feature spanning delays of 800–1200 ns that does not look like an EoR signal and can therefore be filtered out, Kern et al. 2020a), which we investigate in more detail in Section 3.2.1.

Fig. 10 shows the correlation along the frequency direction of the different GPR components (intrinsic foreground, mode-mixing foreground, and residuals) as a function of LST difference for all the combinations of LST binned data sets, covering an LST range of ∼20.92^h−21.26^h (each cross represents an LST bin difference). Here, we compute the correlation for every combination of the LST binned visibility data sets for each baseline and finally we average over the baselines to determine the final value. The correlation coefficient is given by

$$\begin{eqnarray*} \rho _{{\rm LST1, LST2}}^{{\rm cmpt}} = \frac{{\rm cov\left({\rm fg_{LST1}^{{\rm cmpt}}, \rm fg_{LST2}^{{\rm cmpt}}}\right)}}{\sigma \left(\rm fg_{LST1}^{{\rm cmpt}}\right) \sigma \left(\rm fg_{LST2}^{{\rm cmpt}}\right)} , \end{eqnarray*}$$

(13)

where |$\rm fg_{LST1}^{{\rm cmpt}}$| and |$\rm fg_{LST2}^{{\rm cmpt}}$| are the foreground model components corresponding to the ‘sky’, ‘mix’, or the residual at two different LSTs. For large LST differences, the correlation should go down since we are looking at different parts of the sky. From Fig. 10, we see that the intrinsic foreground correlation remains above |$80 {{\ \rm per\ cent}}$| regardless of the time difference. The correlation coefficient starts to decay only for LST differences >2–4 min (as the sky starts to shift). The mode-mixing de-correlates significantly as a function of LST difference. This typically depends on the coherence scale in the uv-plane as a baseline moves through it and is faster for longer baselines. The mode-mixing is also more affected by LST difference de-correlation because it contains fluctuations due to small beam differences mainly further away from the phase centre.

$Correlation coefficient $\rho _{{\rm LST1, LST2}}^{{\rm sky, mix, res}}$ as a function of LST difference for all the night, LST-binned data. Each marker symbol (‘ + ’, ‘x’ or ‘o’) represents an LST difference.$

Figure 10.

Correlation coefficient |$\rho _{{\rm LST1, LST2}}^{{\rm sky, mix, res}}$| as a function of LST difference for all the night, LST-binned data. Each marker symbol (‘ + ’, ‘x’ or ‘o’) represents an LST difference.

Open in new tab Download slide

3.2.1 Characteristics of the periodic signal

After foreground removal, the residual power spectrum is dominated by noise and an almost periodic signal that reveals itself by an excess power at k_∥ ∼ 0.5 h cMpc⁻¹. We find the periodic signal is baseline dependent and it also varies with LST difference. Fig. 11 shows the correlation of the residual visibilities (equation 13) for different baselines as a function of LST difference. Two periodic signals from two different LST times appear to be phase shifted. A closer inspection reveals that the amplitude and periodicity of this signal does not remain stationary but varies with frequency. For example, the residual visibilities for a specific baseline and the fit to the periodic signal is shown in Fig. 12. Similar frequency-dependent complex patterns are also seen for other baselines. This profile can be fitted using the GPR method and a combination of a RBF and Cosine co-variance function, K_per, on each baseline individually. The co-variance function κ_per for the periodic kernel depends on the characteristic coherence scale l_per over which the periodic signal vary, the signal variance |$\sigma _{\mathrm{per}}^2$|⁠, and the period p_per:

$$\begin{eqnarray*} \kappa _{\mathrm{per}}(\nu _p, \nu _q) = \sigma _{\rm per}^2 \, \exp {\left(- \frac{r^2}{2 l_{\rm per}^2}\right)} \cos \left(\frac{2 \pi r}{p_{per}}\right). \end{eqnarray*}$$

(14)

We found the main periodicity is ∼1 MHz.

Figure 11.

Correlation coefficients of the residual visibility data after GPR as a function of baseline length and LST difference. The plot shows very strong dependence of the periodic signal on baseline length |u| and it also varies with LST difference.

Open in new tab Download slide

$Example of the periodic signal for 38.6 m baseline with coordinates $(u, v) =(-19.8, 7)\, \lambda$. The real part of the visibility is shown in ‘K’ units. The different transparent lines correspond to different LST data sets on which we apply a frequency phase offset to align the periodic signal. This signal can be fitted using GPR with a combination of an RBF and Cosine co-variance kernel (the solid line).$

Figure 12.

Example of the periodic signal for 38.6 m baseline with coordinates |$(u, v) =(-19.8, 7)\, \lambda$|⁠. The real part of the visibility is shown in ‘K’ units. The different transparent lines correspond to different LST data sets on which we apply a frequency phase offset to align the periodic signal. This signal can be fitted using GPR with a combination of an RBF and Cosine co-variance kernel (the solid line).

Open in new tab Download slide

Kern et al. (2020a) provides a thorough investigation of such systematic effect, attributing it to a combination of instrumental cross-coupling (e.g. mutual coupling and cross-talk) and cable reflections within the analogue signal chain. Kern et al. (2019) present methods for modelling and removing these systematic terms in the data. In the following section, we show how it can be modelled and subtracted in the GPR formalism.

3.2.2 Filtering the periodic signal with GPR

To model this locally periodic signal with amplitude varying over a certain coherence scale, we introduce an additional kernel in the foreground co-variance model. We use a combination of an RBF and a cosine kernel to model the period in frequency. The updated foreground co-variance function is modelled as

$$\begin{eqnarray*} K_{\mathrm{fg}} = K_{\mathrm{sky}} + K_{\mathrm{mix}} + K_{\mathrm{per}} , \end{eqnarray*}$$

(15)

where the K_per represents the periodic signal contaminant (see equation 14). We used this updated foreground co-variance model in our GP optimization. The GPR estimates of the parameters for the periodic co-variance function are found to be p_per ∼ 1 MHz and l_per ∼ 1.2 MHz, respectively.

Fig. 13 displays the power spectrum of different GPR modelled components including the periodic signal. We notice that the periodic signal peaks around ∼0.4–0.8 h cMpc⁻¹. In the middle panel, we display the GPR model that nicely isolates the periodic signal component around k_∥ ∼ 0.5 h cMpc⁻¹. In general, we find that the periodic signal is k dependent. It appears at k ∼ 0.17 h cMpc⁻¹, reaching a ∼10⁷ mK² peak at k ∼ 0.4 cMpc⁻¹ – approximately six orders of magnitude brighter than the expected 21-cm power spectrum (Mesinger et al. 2011).

Figure 13.

Same as Fig. 9 but now including the periodic signal and the estimated noise power spectrum and variance.

Open in new tab Download slide

The average variance across frequency for the periodic signal component is ∼3.8 × 10⁴ mK², while the mean variance of the mode-mixing signal is around ∼1.2 × 10⁷ mK², approximately three orders of magnitude higher. The noise power spectrum shown in Fig. 13 is estimated by splitting the data set in even and odd times with a 10.7 s time separation and taking the difference between the two. At this time resolution, the foregrounds cancel out almost perfectly. We find the residual power spectrum level is close to the estimated noise power spectrum, especially at |k_∥| ≥ 0.85 h cMpc⁻¹.

The residual in Fig. 14 reveals that there is still some time correlation left, but overall, we find the residuals have become more uncorrelated and noise-like compared to Fig. 10 where the GP foreground co-variance was modelled only with a combination of ‘sky’ and ‘mode-mixing’ kernels.

Figure 14.

Same as Fig. 10, but here the periodic signal has been modelled with an RBF and cosine kernel and then removed.

Open in new tab Download slide

3.2.3 Foreground model hyper-parameter uncertainties

We sampled the posterior distribution of the foreground model hyper-parameters and characterize their correlation with an MCMC. We used the emcee python package¹⁵ (Foreman-Mackey et al. 2013), which uses an ensemble sampler algorithm based on the affine-invariant sampling algorithm (Goodman & Weare 2010).

Fig. 15 shows the resulting posterior probability distribution of the GP model hyper-parameters. The variance of the EoR kernel, which was modelled with a GP exponential kernel, is found to be un-constrained and low. The data can be well modelled by the ‘sky’, ‘mix’, ‘per’ (periodic) foreground kernels and the noise covariance matrix (modelled with a fixed variance) that contributes a large part of the variance at large k_∥. We compared the evidence values with and without the EoR co-variance kernel in the GP optimization. We find the evidence remains mostly unchanged and the Bayes factor (Jeffreys 1961) is around ∼0.93 for GP models with and without the EoR co-variance kernel. This essentially confirms that the signal is dominated by a noise-like component once the foregrounds are removed and adding an EoR kernel has an insignificant effect. Overall, the confidence intervals of other kernel hyper-parameters are reasonably well constrained, except the variance of the ‘21-cm signal’ component that is consistent with zero. The significance of the coherence scale of the ‘21-cm signal’ is also reduced given the non-significant variance of this component. Table 1 highlights the parameter estimates and confidence intervals for the posterior probability distribution of the foreground model hyper-parameters. The estimated median values of the frequency coherence scale of the ‘sky’ and ‘mix’ covariance kernel is about 19.4 and 2.4 MHz, respectively, which is close to the GPR optimized values as presented in Section 3.2.

$Posterior probability distributions of the GP model hyper-parameters. We show here the coherence scale and strength of the EoR co-variance kernel (l21 in MHz and σ21 in K2), the coherence scale and strength of the mode-mixing foreground kernel (lmix in MHz, σmix in K2), the intrinsic foreground kernel (lsky in MHz, σsky in K2) and the periodic co-variance kernel hyper-parameters ($l_{{\rm per}}, \, p_{{\rm per}}$ in MHz and $\sigma _{{\rm per}}^2$ in K2). The vertical dashed lines show the first, second, and third (Q1, Q2, and Q3) quantile levels. The diagonal panels represent the marginalized probability distribution of each parameter.$

Figure 15.

Posterior probability distributions of the GP model hyper-parameters. We show here the coherence scale and strength of the EoR co-variance kernel (l₂₁ in MHz and σ₂₁ in K²), the coherence scale and strength of the mode-mixing foreground kernel (l_mix in MHz, σ_mix in K²), the intrinsic foreground kernel (l_sky in MHz, σ_sky in K²) and the periodic co-variance kernel hyper-parameters (⁠|$l_{{\rm per}}, \, p_{{\rm per}}$| in MHz and |$\sigma _{{\rm per}}^2$| in K²). The vertical dashed lines show the first, second, and third (Q1, Q2, and Q3) quantile levels. The diagonal panels represent the marginalized probability distribution of each parameter.

Open in new tab Download slide

Table 1.

Open in new tab

Summary of the estimated median and confidence intervals (first and third quantile levels (Q1 and Q3)) of the respective GP model hyper-parameters including the periodic co-variance kernel.

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Table 1.

Open in new tab

Summary of the estimated median and confidence intervals (first and third quantile levels (Q1 and Q3)) of the respective GP model hyper-parameters including the periodic co-variance kernel.

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

4 DISCUSSION AND CONCLUSIONS

In this paper, we have used a novel foreground separation method, first introduced in Mertens et al. (2018), in order to model foregrounds with the HERA-47 array. The mainstream HERA data analysis takes advantage of the concept of avoiding foregrounds and provides a single-baseline power spectrum estimate: recent data analysis showed evidence of systematic effects that contaminate the EoR window, and motivated the development of strategies to mitigate their impact to the avoidance paradigm (Kern et al. 2020b). An alternative effort to detect the 21-cm signal using closure quantities is actively being pursued (Thyagarajan, Carilli & Nikolic 2018; Carilli et al. 2018; Carilli, Thyagarayan & Kent 2020).

The method presented here uses GPR to model various stochastic foreground components, such as the spectrally smooth intrinsic sky, mode-mixing components generating from the chromatic instrument and imperfect calibration, as well as a 21-cm signal. It therefore bears analogies with the avoidance approach as they both attempt to model and subtract systematic effects in the EoR window, but also more broadly models the foreground emission – which is not within the purpose of the avoidance approach. Foreground modelling may be a necessary step in order to reduce the leakage in the EoR window and access the high signal-to-noise ratio small k modes (e.g. Kerrigan et al. 2018; Ewall-Wice et al. 2020; Lanman, Pober & Kern 2020).

Our analysis included a different co-variance function for each of intrinsic sky, mode-mixing, and 21-cm signal components in the GP modelling. We found that the frequency coherence scale of the ‘sky’ and ‘mix’ co-variance kernel are about 20 and 2.4 MHz, respectively. As a comparison, the typical (theoretical) frequency coherence scale for the 21-cm EoR signal is found to be around ∼0.8 MHz, when fitted to the co-variance of a simulated 21-cm EoR template. The foreground power spectrum is shown to be contaminated by an ∼1 MHz periodic signal whose amplitude changes from baseline to baseline. The periodic signal dominates the 0.25 < k < 0.9 h cMpc⁻¹ range. We included a combination of RBF and a cosine kernel to model this signal within our GPR method and found a fairly cleaner and flatter residual power spectrum across the 0.05 < k < 1.83 h cMpc⁻¹ range. The residual power spectrum is also mostly consistent with the estimated noise power spectrum, especially at high k_∥ values, whereas residuals are still present in the foreground and periodic signal-dominated region of the pwoer spectrum.

As foreground subtraction is potentially at risk of altering the 21-cm signal, we plan to further explore this approach using more HERA data and test the cleaning with signal injection tests using full-scale HERA simulations. In this paper, we have restricted ourselves to foreground modelling only and left the characterization of residual power spectra to future work, which will include end-to-end signal injection tests.

Finally, we note that the foreground model used in this paper might not be complete, although it seems to be enough at this noise level. In particular, it does not include other foregrounds contaminants such as the instrumental polarization leakage, residual RFIs, and the phase errors caused by the ionosphere or imperfect calibration. We plan to include these additional subtle effects in our GP co-variance modelling. In addition to these, we intend to implement a per-baseline GPR approach where the coherence scale parameters are a function of the baseline length without exploding the number of degrees of freedom of the GPR fit. This will be relevant for longer HERA baselines where the larger baselines will de-correlate faster compared to the shorter baselines. Also, the present mode-mixing model can be improved by integrating the k_⊥ dependence of the foreground wedge. We further plan to include the isotropic nature of the 21-cm signal and its evolution at different redshift bins, which will also ensure a more sensitive and detailed modelling.

ACKNOWLEDGEMENTS

We thank the anonymous referee and the editors for their useful comments and suggestions. This material is based upon work supported by the National Science Foundation under grant nos. 1636646 and 1836019 and institutional support from the HERA collaboration partners. This work is funded in part by the Gordon and Betty Moore Foundation and the National Research Foundation of South Africa (grants nos. 103424 and 113121). HERA is hosted by the South African Radio Astronomy Observatory (SARAO), which is a facility of the National Research Foundation, an agency of the Department of Science and Innovation. Parts of this research were supported by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013. AG would like to thank SARAO for support through SKA postdoctoral fellowship, 2016. AG would also like to thank Dr. Sushanta Kumar Mondal for editing some of the figures. FGM and LVEK would like to acknowledge support from a SKA-NL Roadmap grant from the Dutch ministry of OCW. LVEK, FGM, and BG also acknowledge support by an NWO–NRF Exchange Programme in Astronomy and Enabling technologies in Astronomy (NWO grant no. 629.003.021). GB acknowledges support from the Royal Society and the Newton Fund under grant NA150184. MGS acknowledges support from the SARAO and the National Research Foundation (grant no. 84156). GB acknowledges funding from the INAF PRIN-SKA 2017 project 1.05.01.88.04 (FORECaST). We acknowledge the support from the Ministero degli Affari Esteri e della Cooperazione Internazionale – Direzione Generale per la Promozione del Sistema Paese Progetto di Grande Rilevanza ZA18GR02. AL acknowledges support from the New Frontiers in Research Fund Exploration grant program, a Natural Sciences and Engineering Research Council of Canada Discovery Grant and a Discovery Launch Supplement, the Sloan Research Fellowship, as well as the Canadian Institute for Advanced Research Azrieli Global Scholars program. We acknowledge the HERA staff who made these observations possible.

Footnotes

1

http://www.gmrt.ncra.tifr.res.in/

2

Low-Frequency Array, http://www.lofar.org.

3

Murchison Widefield Array, http://www.mwatelescope.org.

4

Precision Array to Probe EoR, http://eor.berkeley.edu.

5

http://www.skatelescope.org

6

http://reionization.org

7

https://github.com/HERA-Team/librarian/

8

https://github.com/RadioAstronomySoftwareGroup/pyuvdata

9

http://casa.nrao.edu

10

https://github.com/Trienko/heracommissioning

11

https://github.com/HERA-Team/hera-cst

12

http://reionization.org/science/memos/

13

https://sheffieldml.github.io/GPy/

14

https://gitlab.com/flomertens/ps_eor

15

http://dfm.io/emcee/current/

REFERENCES

Ali

Z. S.

et al. .,

2015

,

ApJ

,

809

,

61

10.1088/0004-637X/809/1/61

Crossref

Search ADS

Asad

K. M. B.

et al. .,

2015

,

MNRAS

,

451

,

3709

10.1093/mnras/stv1107

Crossref

Search ADS

Barry

N.

,

Hazelton

B.

,

Sullivan

I.

,

Morales

M. F.

,

Pober

J. C.

,

2016

,

MNRAS

,

461

,

3135

10.1093/mnras/stw1380

Crossref

Search ADS

Barry

N.

et al. .,

2019

,

884

,

1

Beardsley

A. P.

et al. .,

2016

,

ApJ

,

833

,

102

10.3847/1538-4357/833/1/102

,

Crossref

Search ADS

Bernardi

G.

et al. .,

2009

,

A&A

,

500

,

965

10.1051/0004-6361/200911627

Crossref

Search ADS

Bernardi

G.

et al. .,

2010

,

A&A

,

522

,

A67

10.1051/0004-6361/200913420

Crossref

Search ADS

Bernardi

G.

et al. .,

2016

,

MNRAS

,

461

,

2847

10.1093/mnras/stw1499

Crossref

Search ADS

Bharadwaj

S.

,

Sethi

S. K.

,

2001

,

J Astrophys. Astron.

,

22

,

293

10.1007/BF02702273

Crossref

Search ADS

Bolton

J. S.

,

Haehnelt

M. G.

,

Warren

S. J.

,

Hewett

P. C.

,

Mortlock

D. J.

,

Venemans

B. P.

,

McMahon

R. G.

,

Simpson

C.

,

2011

,

MNRAS

,

416

,

L70

10.1111/j.1745-3933.2011.01100.x

Crossref

Search ADS

Bonaldi

A.

,

Brown

M. L.

,

2015

,

MNRAS

,

447

,

1973

10.1093/mnras/stu2601

Crossref

Search ADS

Bowman

J. D.

,

Rogers

A. E. E.

,

2010

,

Nature

,

468

,

796

Bowman

J. D.

,

Morales

M. F.

,

Hewitt

J. N.

,

2009

,

ApJ

,

695

,

183

10.1088/0004-637X/695/1/183

Crossref

Search ADS

Bowman

J. D.

,

Rogers

A. E. E.

,

Monsalve

R. A.

,

Mozdzen

T. J.

,

Mahesh

N.

,

2018

,

Nature

,

555

,

67

Byrne

R.

et al. .,

2019

,

ApJ

,

875

,

70

10.3847/1538-4357/ab107d

Crossref

Search ADS

Carilli

C. L.

,

2017

,

HERA Memo 60

.

http://reionization.org/science/memos/

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Carilli

C. L.

,

Nikolic

B.

,

Thyagarayan

N.

,

Gale-Sides

K.

,

2018

,

Radio Sci.

,

53

,

845

10.1029/2018RS006537

Crossref

Search ADS

Carilli

C. L.

,

Thyagarayan

N.

,

Kent

J.

,

2020

,

ApJS

,

247

,

67

10.3847/1538-4365/ab77b1

Crossref

Search ADS

Chapman

E.

et al. .,

2013

,

MNRAS

,

429

,

165

10.1093/mnras/sts333

Crossref

Search ADS

Chapman

E.

,

Zaroubi

S.

,

Abdalla

F.

,

Dulwich

F.

,

Jelić

V.

,

Mort

B.

,

2016

,

MNRAS

,

458

,

2928

10.1093/mnras/stw161

Crossref

Search ADS

Choudhuri

S.

,

Bharadwaj

S.

,

Chatterjee

S.

,

Ali

S. S.

,

Roy

N.

,

Ghosh

A.

,

2016

,

MNRAS

,

463

,

4093

10.1093/mnras/stw2254

Crossref

Search ADS

Datta

A.

,

Bowman

J. D.

,

Carilli

C. L.

,

2010

,

ApJ

,

724

,

526

10.1088/0004-637X/724/1/526

Crossref

Search ADS

DeBoer

D. R.

et al. .,

2017

,

PASP

,

129

,

045001

10.1088/1538-3873/129/974/045001

Crossref

Search ADS

Dillon

J. S.

,

Parsons

A. R.

,

2016

,

ApJ

,

826

,

181

10.3847/0004-637X/826/2/181

Crossref

Search ADS

Dillon

J. S.

et al. .,

2014

,

Phys. Rev. D

,

89

,

023002

Dillon

J. S.

et al. .,

2015

,

Phys. Rev. D

,

91

,

123011

Dillon

J. S.

et al. .,

2018

,

MNRAS

,

477

,

5670

10.1093/mnras/sty1060

Crossref

Dillon

J. S.

et al. .,

2020

,

preprint (arXiv:2003.08399)

Ewall-Wice

A.

,

Dillon

J. S.

,

Liu

A.

,

Hewitt

J.

,

2017

,

MNRAS

,

470

,

1849

10.1093/mnras/stx1221

Crossref

Search ADS

Ewall-Wice

A.

et al. .,

2020

,

preprint (arXiv:2004.11397)

Fagnoni

N.

,

de Lera Acedo

E.

,

2016

,

CST Simulation of HERA and Comparison with Measurements, HERA Memo 21

.

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Fialkov

A.

,

Barkana

R.

,

Visbal

E.

,

2014

,

Nature

,

506

,

197

Fialkov

A.

,

Cohen

A.

,

Barkana

R.

,

Silk

J.

,

2017

,

MNRAS

,

464

,

3498

10.1093/mnras/stw2540

Crossref

Search ADS

Foreman-Mackey

D.

,

Hogg

D. W.

,

Lang

D.

,

Goodman

J.

,

2013

,

PASP

,

125

,

306

10.1086/670067

Crossref

Search ADS

Fraser

S.

et al. .,

2018

,

Phys. Lett. B

,

785

,

159

10.1016/j.physletb.2018.08.035

Crossref

Search ADS

Furlanetto

S. R.

,

Oh

S. P.

,

Briggs

F. H.

,

2006

,

Phys. Rep.

,

433

,

181

Gehlot

B. K.

et al. .,

2019

,

MNRAS

,

488

,

4271

10.1093/mnras/stz1937

Crossref

Search ADS

Geil

P. M.

,

Gaensler

B. M.

,

Wyithe

J. S. B.

,

2011

,

MNRAS

,

418

,

516

10.1111/j.1365-2966.2011.19509.x

Crossref

Search ADS

Gelman

et al. .,

2014

,

Bayesian Data Analysis

, 3rd edn.

Chapman & Hall

,

London

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Ghosh

A.

,

Bharadwaj

S.

,

Ali

S. S.

,

Chengalur

J. N.

,

2011

,

MNRAS

,

418

,

2584

10.1111/j.1365-2966.2011.19649.x

Crossref

Search ADS

Goodman

J.

,

Weare

J.

,

2010

,

Commun. Appl. Math. Comput. Sci.

,

5

,

65

Crossref

Search ADS

Greenhill

L. J.

,

Bernardi

G.

,

2012

,

preprint (arXiv:1201.1700)

Grobler

T. L.

,

Bernardi

G.

,

Kenyon

J. S.

,

2018

,

MNRAS

,

476

,

2410

10.1093/mnras/sty357

Crossref

Search ADS

Harker

G.

et al. .,

2009

,

MNRAS

,

397

,

1138

10.1111/j.1365-2966.2009.15081.x

Crossref

Search ADS

Hazelton

B.

et al. .,

2017

,

J. Open Source Softw.

,

2

,

140

10.21105/joss.00140

Crossref

Search ADS

Hogg

D. W.

,

1999

,

preprint (arXiv:astro-ph/9905116)

Hurley-Walker

N.

et al. .,

2017

,

MNRAS

,

464

,

1146

10.1093/mnras/stw2337

Crossref

Search ADS

Jeffreys

H.

,

1961

,

Theory of probability

, 3rd edn.

Clarendon Press

,

Oxford

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Jelić

V.

et al. .,

2008

,

MNRAS

,

389

,

1319

10.1111/j.1365-2966.2008.13634.x

Crossref

Search ADS

Jensen

H.

,

Majumdar

S.

,

Mellema

G.

,

Lidz

A.

,

Iliev

I. T.

,

Dixon

K. L.

,

2016

,

MNRAS

,

456

,

66

10.1093/mnras/stv2679

Crossref

Search ADS

Kern

N. S.

,

Parsons

A. R.

,

Dillon

J. S.

,

Lanman

A. E.

,

Fagnoni

N.

,

de Lera Acedo

E.

,

2019

,

ApJ

,

884

,

105

10.3847/1538-4357/ab3e73

Crossref

Search ADS

Kern

N. S.

et al. .,

2020a

,

ApJ

,

888

,

70

10.3847/1538-4357/ab5e8a

Crossref

Search ADS

Kern

N. S.

et al. .,

2020b

,

ApJ

,

890

,

122

10.3847/1538-4357/ab67bc

Crossref

Search ADS

Kerrigan

J. R.

et al. .,

2018

,

ApJ

,

864

,

131

10.3847/1538-4357/aad8bb

Crossref

Search ADS

Kolopanis

M.

et al. .,

2019

,

ApJ

,

883

,

133

10.3847/1538-4357/ab3e3a

Crossref

Search ADS

Koopmans

L. V. E.

,

2010

,

ApJ

,

718

,

963

10.1088/0004-637X/718/2/963

Crossref

Search ADS

Lanman

A.

,

Pober

J. C.

,

Kern

N. S.

,

2020

,

MNRAS

,

487

,

5840

10.1093/mnras/stz1639

Crossref

Search ADS

Li

W.

,

Pober

J. C.

,

Barry

N.

,

2019

,

ApJ

,

887

,

14

Crossref

Search ADS

Lidz

A.

,

Zahn

O.

,

McQuinn

M.

,

Zaldarriaga

M.

,

Hernquist

L.

,

2008

,

ApJ

,

680

,

962

10.1086/587618

Crossref

Search ADS

Liu

A.

,

Tegmark

M.

,

2011

,

Phys. Rev. D

,

83

,

103006

Liu

A.

,

Parsons

A. R.

,

Trott

C. M.

,

2014

,

Phys. Rev. D

,

90

,

023018

Loeb

A.

,

Furlanetto

S. R.

,

2013

,

The First Galaxies in the Universe, by Abraham Loeb and Steven R. Furlanetto

.

Princeton Univ. Press

,

Princeton, NJ

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Marthi

V. R.

,

Chengalur

J.

,

2014

,

MNRAS

,

437

,

524

10.1093/mnras/stt1902

Crossref

Search ADS

McQuinn

M.

,

O’Leary

R. M.

,

2012

,

ApJ

,

760

,

3

Crossref

Search ADS

Mellema

G.

et al. .,

2013

,

Exp. Astron.

,

36

,

235

10.1007/s10686-013-9334-5

Crossref

Search ADS

Mertens

F. G.

,

Ghosh

A.

,

Koopmans

L. V. E.

,

2018

,

MNRAS

,

478

,

3640

10.1093/mnras/sty1207

Crossref

Mertens

F. G.

et al. .,

2020

,

MNRAS

,

493

,

1662

10.1093/mnras/staa327

Crossref

Search ADS

Mesinger

A.

,

Furlanetto

S.

,

2007

,

ApJ

,

669

,

663

10.1086/521806

Crossref

Search ADS

Mesinger

A.

,

Furlanetto

S.

,

Cen

R.

,

2011

,

MNRAS

,

411

,

955

10.1111/j.1365-2966.2010.17731.x

Crossref

Search ADS

Morales

M. F.

,

Wyithe

J. S. B.

,

2010

,

ARA&A

,

48

,

127

10.1146/annurev-astro-081309-130936

Crossref

Search ADS

Morales

M. F.

,

Bowman

J. D.

,

Hewitt

J. N.

,

2006

,

ApJ

,

648

,

767

10.1086/506135

Crossref

Search ADS

Morales

M. F.

,

Hazelton

B.

,

Sullivan

I.

,

Beardsley

A.

,

2012

,

ApJ

,

752

,

137

10.1088/0004-637X/752/2/137

Crossref

Search ADS

Nunhokee

C. D.

et al. .,

2017

,

ApJ

,

848

,

47

10.3847/1538-4357/aa8b73

Crossref

Search ADS

Parsons

A. R.

,

2017

,

HERA Memo 27, Power Spectrum Normalizations for HERA

.

University of California

,

Berkeley

,

Parsons

A.

,

Pober

J.

,

McQuinn

M.

,

Jacobs

D.

,

Aguirre

J.

,

2012a

,

ApJ

,

753

,

81

10.1088/0004-637X/753/1/81

Crossref

Search ADS

Parsons

A. R.

,

Pober

J. C.

,

Aguirre

J. E.

,

Carilli

C. L.

,

Jacobs

D. C.

,

Moore

D. F.

,

2012b

,

ApJ

,

756

,

165

10.1088/0004-637X/756/2/165

Crossref

Search ADS

Parsons

A. R.

et al. .,

2014

,

ApJ

,

788

,

106

10.1088/0004-637X/788/2/106

Crossref

Search ADS

Patil

A. H.

et al. .,

2016

,

MNRAS

,

463

,

4317

10.1093/mnras/stw2277

Crossref

Search ADS

Patil

A. H.

et al. .,

2017

,

ApJ

,

838

,

65

10.3847/1538-4357/aa63e7

Crossref

Search ADS

Patra

N.

,

Subrahmanyan

R.

,

Sethi

S.

,

Udaya Shankar

N.

,

Raghunathan

A.

,

2015

,

ApJ

,

801

,

138

10.1088/0004-637X/801/2/138

Crossref

Search ADS

Planck Collaboration XIII

,

2016

,

A&A

,

594

,

A13

10.1051/0004-6361/201525830

Crossref

Search ADS

Pober

J. C.

,

2015

,

MNRAS

,

447

,

1705

10.1093/mnras/stu2575

Crossref

Search ADS

Pober

J. C.

et al. .,

2014

,

ApJ

,

782

,

66

10.1088/0004-637X/782/2/66

Crossref

Search ADS

Pritchard

J. R.

,

Loeb

A.

,

2012

,

Rep. Prog. Phys.

,

75

,

086901

10.1088/0034-4885/75/8/086901

Crossref

Search ADS

PubMed

Rasmussen

,

Carl Edward

,

Williams

,

Christopher

K. I.

,

2005

,

Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning

.

MIT Press

,

Cambridge

Google Scholar

Google Preview

OpenURL Placeholder Text

WorldCat

Santos

M. G.

,

Cooray

A.

,

Knox

L.

,

2005

,

ApJ

,

625

,

575

10.1086/429857

Crossref

Search ADS

Santos

M. G.

,

Ferramacho

L.

,

Silva

M. B.

,

Amblard

A.

,

Cooray

A.

,

2010

,

MNRAS

,

406

,

2421

10.1111/j.1365-2966.2010.16898.x

Crossref

Search ADS

Santos

M. G.

,

Silva

M. B.

,

Pritchard

J. R.

,

Cen

R.

,

Cooray

A.

,

2011

,

A&A

,

527

,

A93

10.1051/0004-6361/201015695

Crossref

Search ADS

Thompson

A. R.

,

Moran

J. M.

,

Swenson

G. W.

,

2017

,

Interferometry and Synthesis in Radio Astronomy

, 3rd edn.

Springer

,

Berlin

10.1007/978-3-319-44431-4

Thyagarajan

N.

et al. .,

2013

,

ApJ

,

776

,

6

10.1088/0004-637X/776/1/6

Crossref

Search ADS

Thyagarajan

N.

et al. .,

2015

,

ApJ

,

804

,

14

10.1088/0004-637X/804/1/14

Crossref

Search ADS

Thyagarajan

N.

,

Carilli

C. L.

,

Nikolic

B.

,

2018

,

Phys. Rev. Lett.

,

120

,

251301

10.1103/PhysRevLett.120.251301

Crossref

Search ADS

PubMed

Trott

C. M.

,

Wayth

R. B.

,

Tingay

S. J.

,

2012

,

ApJ

,

757

,

101

10.1088/0004-637X/757/1/101

Crossref

Search ADS

Vedantham

H. K.

,

Koopmans

L. V. E.

,

2016

,

MNRAS

,

458

,

3099

10.1093/mnras/stw443

Crossref

Search ADS

Vedantham

H.

,

Udaya Shankar

N.

,

Subrahmanyan

R.

,

2012

,

ApJ

,

745

,

176

10.1088/0004-637X/745/2/176

Crossref

Search ADS

Wiener

N.

,

1949

,

Extrapolation and Smoothing of Stationary Time Series: With Engineering Applications

.

MIT Press

,

Cambridge

Yatawatta

S.

et al. .,

2013

,

A&A

,

550

,

A136

10.1051/0004-6361/201220874

Crossref

Search ADS

Zaldarriaga

M.

,

Furlanetto

S. R.

,

Hernquist

L.

,

2004

,

ApJ

,

608

,

622

10.1086/386327

Crossref

Search ADS

Zaroubi

S.

,

2013

,

The First Galaxies

,

396

,

45

10.1007/978-3-642-32362-1_2

Crossref

Search ADS

Zhang

Y. G.

,

Liu

A.

,

Parsons

A. R.

,

2018

,

ApJ

,

852

,

110

10.3847/1538-4357/aaa029

Crossref

Search ADS

Author notes

NSF Astronomy and Astrophysics Postdoctoral Fellow.

Nithyanandan Thyagarajan is a Jansky fellow of the National Radio Astronomy Observatory.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Download all slides

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Hyper-parameter	Prior	Estimate
l_mix (MHz)	\|$\mathcal {U}(2, 20)$\|	\|$2.40 \substack{+0.02\\-0.01}$\|
\|$\sigma _{{\rm mix}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.1, 0.9)$\|	\|$0.115 \substack{+0.007\\-0.005}$\|
l_sky (MHz)	\|$\mathcal {U}(10, 200)$\|	\|$19.42 \substack{+1.25\\-1.18}$\|
\|$\sigma _{{\rm sky}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.02, 2.5)$\|	\|$1.89 \substack{+0.10\\-0.09}$\|
l_per (MHz)	\|$\mathcal {U}(1, 5)$\|	\|$1.23 \substack{+0.01\\-0.01}$\|
p_per (MHz)	\|$\mathcal {U}(0.628, 1.256)$\|	\|$0.999 \substack{+0.002\\-0.002}$\|
\|$\sigma _{{\rm per}}^2 \ ({\rm K}^2)$\|	\|$\mathcal {U}(0.00001, 0.01)$\|	\|$0.000183\substack{+0.000004\\-0.000003}$\|

Month:	Total Views:
May 2020	2
June 2020	14
July 2020	5
August 2020	6
September 2020	13
October 2020	5
November 2020	9
December 2020	56
January 2021	16
February 2021	17
March 2021	29
April 2021	17
May 2021	35
June 2021	30
July 2021	14
August 2021	21
September 2021	32
October 2021	18
November 2021	24
December 2021	30
January 2022	15
February 2022	34
March 2022	29
April 2022	15
May 2022	16
June 2022	19
July 2022	36
August 2022	40
September 2022	141
October 2022	241
November 2022	78
December 2022	66
January 2023	61
February 2023	10
March 2023	36
April 2023	128
May 2023	43
June 2023	21
July 2023	17
August 2023	24
September 2023	34
October 2023	12
November 2023	24
December 2023	33
January 2024	56
February 2024	28
March 2024	21
April 2024	23
May 2024	26
June 2024	21
July 2024	25
August 2024	10
September 2024	26
October 2024	25
November 2024	21
December 2024	34
January 2025	19
February 2025	28
March 2025	17
April 2025	12
May 2025	2

Article Contents

Foreground modelling via Gaussian process regression: an application to HERA data

ABSTRACT

1 INTRODUCTION

2 OBSERVATIONS AND DATA REDUCTION

2.1 LST binning and SEFD evaluation

2.2 Delay power spectrum

3 GAUSSIAN PROCESS REGRESSION AND FOREGROUND CHARACTERIZATION

3.1 Covariance functions

3.2 Foreground modelling

3.2.1 Characteristics of the periodic signal

3.2.2 Filtering the periodic signal with GPR

3.2.3 Foreground model hyper-parameter uncertainties

4 DISCUSSION AND CONCLUSIONS

ACKNOWLEDGEMENTS

Footnotes

REFERENCES

Author notes

Citations

Views

Altmetric

Email alerts

Astrophysics Data System

Citing articles via

Latest

Most Read

Most Cited

Article Contents

Foreground modelling via Gaussian process regression: an application to HERA data

ABSTRACT

1 INTRODUCTION

2 OBSERVATIONS AND DATA REDUCTION

2.1 LST binning and SEFD evaluation

2.2 Delay power spectrum

3 GAUSSIAN PROCESS REGRESSION AND FOREGROUND CHARACTERIZATION

3.1 Covariance functions

3.2 Foreground modelling

3.2.1 Characteristics of the periodic signal

3.2.2 Filtering the periodic signal with GPR

3.2.3 Foreground model hyper-parameter uncertainties

4 DISCUSSION AND CONCLUSIONS

ACKNOWLEDGEMENTS

Footnotes

REFERENCES

Author notes

Citations

Views

Altmetric

Email alerts

Astrophysics Data System

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only