Abstract

We present a method for subtracting point sources from interferometric radio images via forward modelling of the instrument response and involving an algebraic non-linear minimization. The method is applied to simulated maps of the Murchison Wide-field Array but is generally useful in cases where only image data are available. After source subtraction, the residual maps have no statistical difference to the expected thermal noise distribution at all angular scales, indicating high effectiveness in the subtraction. Simulations indicate that the errors in recovering the source parameters decrease with increasing signal-to-noise ratio, which is consistent with the theoretical measurement errors. In applying the technique to simulated snapshot observations with the Murchison Wide-field Array, we found that all 101 sources present in the simulation were recovered with an average position error of 10 arcsec and an average flux density error of 0.15 per cent. This led to a dynamic range increase of approximately 3 orders of magnitude. Since all the sources were deconvolved jointly, the subtraction was not limited by source sidelobes but by thermal noise. This technique is a promising deconvolution method for upcoming radio arrays with a huge number of elements and a candidate for the difficult task of subtracting foreground sources from observations of the 21-cm neutral hydrogen signal from the epoch of reionization.

1 INTRODUCTION

The deconvolution of radio point sources is a problem that has been studied for several decades in radio astronomy.

When calibration errors can be neglected, the problem of subtracting point sources from deconvolved radio images ultimately reduces to a problem of fitting their positions and flux densities as accurately as the instrumental noise permits.

The methods used to deconvolve point source sidelobes are typically based on the clean algorithm (Hogbom 1974; Clark 1980). The clean algorithm looks for the brightest pixel in the image and subtracts a fraction of the dirty beam from the image at that location, forming a residual image. The search and subtraction loop is repeated until the sidelobes are reduced below the thermal noise level.

The model components that are found through this iterative process can be convolved with a two-dimensional Gaussian and introduced back into the residual image. The best estimate of flux density and position for each source is then found by fitting a two-dimensional Gaussian to the source.

The subtraction of point sources performed in this way has the known problem that the dynamic range achievable is limited by pixelization effects, i.e. by the fact that data are averaged and arranged into a regular grid. Therefore even a simple point source that does not lie at the centre of the grid cell cannot be represented by a single delta function model, but requires a potentially infinite number of components to be fully represented (Perley 1999; Briggs & Cornwell 2005).

In presence of the visibility data, the pixelization problem can be minimized and the dynamic range improved by subtraction of sources from the ungridded visibilities (Noordam & de Bruyn 1982; Voronkov & Wieringa 2004) and by centring the local pixel grid on the source to be deconvolved (Cotton & Uson 2008).

When the number of antenna elements to be correlated becomes extremely large, however, it becomes harder and harder to store the visibility data and the deconvolution has to be performed on images with, again, a limitation of the dynamic range due to pixelization effects.

This is a relevant issue for upcoming radio telescopes like the Murchison Wide-field Array (MWA; Lonsdale et al. 2009) or future instrumentation like the square kilometer array (SKA)1 since they will produce a huge number of correlated visibilities. MWA will generate data at such a rate (approximately a PByte d−1) that will be impractical to store the raw visibilities and go through the traditional self-calibration loop, and the deconvolution of radio sources will happen in the image plane.

The deconvolution of bright point sources is also a prominent issue in the view of the detection of the epoch of reionization (EoR) through the redshifted 21-cm line emission, which is one of the main goals of the MWA.

The problem of foreground subtraction for EoR experiments has been studied by various authors in the literature (Di Matteo, Ciardi & Miniati 2004; Morales & Hewitt 2004; Santos, Cooray & Knox 2005; McQuinn et al. 2006; Morales, Bowman & Hewitt 2006; Wang et al. 2006; Gleser, Nusser & Benson 2008; Jelić et al. 2008; Bowman, Morales & Hewitt 2009; Harker et al. 2009a; Harker et al. 2009b; Liu, Tegmark & Zaldarriaga 2009a; Liu et al. 2009b). Most of their efforts have been devoted to demonstrations that the diffuse Galactic synchrotron radiation and the classical confusion noise due to unresolved radio sources can be subtracted if it is assumed that they are spectrally smooth and absent of calibration errors. Recent observations (Ali, Bharadwaj & Chengalur 2008; Bernardi et al. 2009; Pen et al. 2009; Bernardi et al. 2010; Parsons et al. 2010) have started to characterize the diffuse foreground component.

All the simulations conducted so far, however, have assumed that the brightest point sources were perfectly subtracted from the data. Bowman et al. (2009) and Liu et al. (2009b) indicated that point sources should be subtracted down to a 10–100 mJy threshold in order to detect the EoR.

Datta, Bhatnagar & Carilli (2009) and Datta, Bowman & Carilli (2010) studied the problem of subtraction of bright sources in the presence of calibration errors and concluded that sources brighter than 1 Jy should be subtracted with a positional precision better than 0.1 arcsec if calibration errors remain correlated over ∼6 h of observation. If the errors are correlated on a shorter time length, however, they will tend to average down with time, and the requirement for positional accuracy will be less stringent.

Pindor et al. (2010) developed a technique based on matched filters to subtract bright point sources in MWA images in the presence of diffuse emission. They showed that the dynamic range of the residual images can be improved by a factor of ∼2–3 in this way.

In this paper, we present a method of subtracting point sources from MWA dirty images that involves forward modelling and a non-linear minimization scheme. Forward modelling is a general concept that can be used to extract astrophysical parameters from the data.

We applied our method to simulated MWA images to show that point sources can be deconvolved with an accuracy limited by thermal noise even without storing the visibility data.

The paper is organized as follows: in Section 2 we present the method; in Section 3 we apply the method to MWA simulated images and we conclude in Section 4.

2 THE METHOD

The method presented here relies on the fact that the sky emission can be forward modelled. Forward modelling is a generative model, i.e. a model that is related to the astrophysical parameters to be measured, is based on physical assumptions and can be generated a priori, independent of the actual data. Once the forward model is determined, a minimization scheme (generally non-linear) can be implemented to fit for the astrophysical parameters of interest.

Forward modelling has already been used in several astrophysical contexts; for example, Bailer-Jones (2010) used a forward modelling algorithm to estimate stellar parameters from optical spectra. Forward modelling finds a natural application in point source subtraction from radio images where visibility data are not accessible anymore.

In this case, the forward model does not need to be approximated by any analytical function, but it is simply the synthesized beam calculated at that particular position in the sky and scaled for the source flux density. In traditional radio astronomy, the synthesized beam can be considered to remain constant throughout the whole field of view. If we consider the future arrays which will operate at low frequencies, however, the synthesized beam changes as a function of position in the map due to wide field effects and direction-dependent primary beams. If very high dynamic range imaging is required – as it is to detect the EoR signal – the exact synthesized beam should be computed at each location in the map without relying on any analytical approximation. For the MWA, real-time calibration data will be stored in a data base and will be used to generate an accurate set of visibilities for each point source of interest. These visibilities can then be imaged and averaged in the same way that the true visibilities were imaged and averaged, resulting in a synthesized beam map for each source.

In the case of point source deconvolution, the astrophysical parameters that have to be determined via forward modelling are the position and flux density of each point source.

For a single point-source case, our algorithm can be described as follows. The image pixels are grouped into an N-element vector y, where N is the number of pixels in the map. Right ascension, declination and flux density of the source – i.e. the parameters to be fitted – are grouped into a three-element vector x. The forward model m(x) is also an N-element vector.

The nth iteration of the method is described as follows.

  • Generate the forward model (i.e. an image of the synthesized beam) mn=m(xn) for the current parameter estimate: right ascension; declination and flux density.

  • Compute the N× 3 Jacobian matrix, J, which contains the derivatives of the forward model-synthesized beam with respect to the parameters computed at the current parameter estimate xn,
  • Estimate the difference between the data and the model
  • Estimate the shift in each parameter which is the solution of the linear system of equations
    1
  • Compute the new estimate of the parameters

Steps (i)–(v) are repeated until convergence is reached (Fig. 1).

Flow chart of the source subtraction scheme.
Figure 1

Flow chart of the source subtraction scheme.

Equation (1) shows that the problem of source subtraction has become a non-linear least-squares minimization.

The forward model has a linear dependence on flux density, but non-linear on position; therefore the partial derivatives with respect to right ascension and declination are computed numerically using finite difference approximation.

Practically, the partial derivative with respect to right ascension is computed by generating an image of the synthesized beam with a small right ascension offset from the current estimated position. An image of the synthesized beam with a small declination offset is generated to compute the partial derivative with respect to declination. The derivative with respect to flux density is just a scaled version of the synthesized beam.

The generalization of the single point source case to M sources is straightforward, since the vector of parameters becomes a 3M vector, and the Jacobian matrix becomes a N× 3M matrix, and all the sources are fitted simultaneously at each iteration. It is also important to note that the matrix JTJ that has to be inverted does not depend upon the number of pixels in the map, but only upon the number of parameters; therefore its size increases linearly only with the number of sources to be subtracted.

In principle, an initial estimate of the parameters could be obtained by generating a grid of likely models, with a range of right ascension, declination and flux densities for each source, and selecting the model, x0, that best fits the data [i.e. minimizes (ym)T (ym)]. In practice, it is easier and faster to fit an elliptical Gaussian to the source position and use its best-fitting parameters as the initial guess.

Equation (1) can be generalized by assigning a weight to each pixel of the image. In this case, it becomes the general expression for non-linear weighted least squares:
2
where formula is the N×N weight matrix. Although different weighting schemes could be explored, in the following applications of our method we will assume that formula is a diagonal matrix with each diagonal element equal to the signal-to-noise ratio (S/N) of the corresponding pixel.

The advantage of this method compared to other image-based deconvolution techniques is that the forward model can be generated with an arbitrary level of precision in the parameter space grid and, therefore, is not affected by any pixelization effect. In the following section, we will apply this method to simulated MWA images.

3 APPLICATIONS

In this section, we test the method with simulated MWA images obtained through the real-time system (RTS; Mitchell et al. 2008). The main RTS data product will be dirty images – i.e. images where the synthesized beam has not been deconvolved – integrated over a period that can range from 8 s to a few minutes. It is these integrated images that require subtraction of point sources. The RTS will also save calibration information (primary beam and atmospheric models) to facilitate accurate off-line deconvolution.

3.1 Simulation set-up

We simulated a realistic MWA observation, with 20°× 20° images covering the MWA field of view. The simulations were constructed as follows. We populated the field of view with point sources according to the following log N–log S distribution:
where dN is the differential source count, N0 is the number of sources per steradian per Jy−1.5 and S is the source flux density. We have chosen N0 in such a way that there are 100 sources greater than 1 Jy in a 20°× 20° field. Random positions were assigned to the sources with no constraint on the minimum distance among them.

Visibility data were then created for the sources at 150 MHz, with a 2-s cadence and over a 40-kHz channel width using the maps package (Wayth et al. 2010).

Random noise was added to each ij visibility, for each polarization, according to the following expression:
3
where we have assumed a System Equivalent Flux Density (SEFD) = 10 000 Jy, Δt = 2 s and Δf = 40 kHz.

The visibilities were imaged through the RTS, which performs calibration and imaging of raw visibility data. However, since we assumed a perfect calibration in this work, only the imaging part was used. The RTS imaging pipeline is described by Ord et al. (2010), and we briefly summarize it here, referring the reader to the paper for a more detailed presentation.

Each MWA tile is constituted of 16 dipoles arranged in a 4 × 4 square configuration. The dipoles are fixed to the ground; therefore their projection on the sky changes with time. The primary beam response to the sky brightness is, therefore, time variable. We have assumed that the tile primary beams can be described by the sum of 16 complex numbers that represent gain terms for the individual, known, dipole beams. Since we are not dealing with calibration, all tile beams are assumed to have the same shape, amplitude and phase.

The RTS expects visibility data from the correlator to be integrated over 2 s and 40 kHz. The visibility data are then averaged and imaged over 8 s (Mitchell et al. 2008). Each individual 8-s snapshot is then resampled into the HEALPix frame (Górski et al. 2005) and integrated over time, with wide-field distortions corrected during the resampling. The time integration is performed, for each pixel, by summing over the measured values weighted by the complex conjugate of their primary beam response (the total weight is now the square of the beam). The sum of the weights – i.e. the square of the primary beam response integrated over the duration of the observation – is divided out at the end of the integration.

We generated images centred at 4h right ascension and −30° declination, which is one of the potential fields for EoR observations. We have assumed that the field was observed 1 h before transit.

We simulated two different sets of observations related to two different array configurations. First, we considered the 5 per cent prototype of the array that is currently deployed on the ground and is constituted by 32 tiles (32T). Secondly, we considered the full MWA configuration which will consist of 512 tiles (512T). The 32T system has less sensitivity compared to the 512T system and a coarser angular resolution since its longest baseline is ∼400 m, whereas the longest baseline is ∼1500 m in the 512T configuration. The instantaneous uv coverage of the 32T is also much worse than the 512T one (Fig. 2).

Simulated 32T uv coverage integrated over 10 min (top panel) and simulated 512T uv coverage for an 8-s snapshot (bottom panel). The image centre is at 4h right ascension and −30° declination.
Figure 2

Simulated 32T uv coverage integrated over 10 min (top panel) and simulated 512T uv coverage for an 8-s snapshot (bottom panel). The image centre is at 4h right ascension and −30° declination.

The presence of wide-field effects makes the synthesized beam position dependent even in the absence of calibration errors (Fig. 3). Since we are aiming at achieving a high dynamic range subtraction, we will generate the synthesized beam for each source at the specific source location to account for the difference.

An example of the difference between synthesized beams at two positions in the 512T simulated image. Solid line: the synthesized beam profile at the image centre. Dashed line: the synthesized beam profile 9° away from the image centre. Dot–dashed line: the difference between the two profiles. In this example, the difference is at the 10 per cent level.
Figure 3

An example of the difference between synthesized beams at two positions in the 512T simulated image. Solid line: the synthesized beam profile at the image centre. Dashed line: the synthesized beam profile 9° away from the image centre. Dot–dashed line: the difference between the two profiles. In this example, the difference is at the 10 per cent level.

3.2 32T results

We used the 32T simulations to test the applicability of our method to integrated snapshot images. Since the long integration images taken with MWA will be obtained by co-adding individual snapshots – whose duration can vary from 8 s to ∼5 min – it is relevant to test the capability of the algorithm to subtract sources over co-added images, where fainter sources can become visible because of sidelobe suppression and the lowering of thermal noise.

We used a 10-min integrated image, which has an rms thermal noise of ∼52 mJy beam−1, enabling detection of sources brighter than ∼300 mJy. However, since the computational load increases with the number of sources and the length of the observation, we limit ourselves to the 16 sources brighter than 4 Jy, working in a case of high S/N.

The subtraction was performed without any a priori assumption about the sky model, that is without identifying in advance sources via catalogued coordinates. This enables us to test the robustness of the algorithm under realistic conditions (i.e. without pre-supposition of a sky model generated by the MWA), where sidelobe structure from sources around the sky cannot be filtered.

For a 16-source model that includes thermal noise, only one source (97 Jy) is clearly visible because its sidelobes are bright enough to cover all the remaining sources (Fig. 4). The initial guess regarding the sky brightness distribution is limited to the parameters for this one source. The subtraction was performed according to the following steps.

Simulated image with the 32T uv coverage integrated over 10 min. The black and white scale runs linearly between −10 and 50 Jy beam−1. The sidelobes of the dominant source (∼97 Jy) obliterate the other 15 sources between it and a 4-Jy floor.
Figure 4

Simulated image with the 32T uv coverage integrated over 10 min. The black and white scale runs linearly between −10 and 50 Jy beam−1. The sidelobes of the dominant source (∼97 Jy) obliterate the other 15 sources between it and a 4-Jy floor.

  • The first source parameters were estimated through the forward modelling minimization (three iterations, Fig. 5); the source model was subtracted and initial guesses obtained for the three newly visible sources.

  • The four brightest sources were included in the sky model and subtracted. After three iterations another seven sources were detected in the image (Fig. 6) and initial estimates of their parameters were made.

  • A sky model made of 11 sources was subtracted. After three iterations all the remaining sources were identified (Fig. 7) and an initial estimate of their parameters performed.

  • The full sky model is minimized and subtracted jointly, giving the residual image of Fig. 8.

Residual image, with the brightest source subtracted, after three iterations. The black and white scale runs between −10 and 15 Jy beam−1. Three new sources are visible.
Figure 5

Residual image, with the brightest source subtracted, after three iterations. The black and white scale runs between −10 and 15 Jy beam−1. Three new sources are visible.

Residual image, with the four brightest sources subtracted, after three iterations. The black and white scale runs between −2 and 5 Jy beam−1. Seven new sources are visible.
Figure 6

Residual image, with the four brightest sources subtracted, after three iterations. The black and white scale runs between −2 and 5 Jy beam−1. Seven new sources are visible.

Residual image, with the eleven brightest sources subtracted, after three iterations. The black and white scale runs between −1 and 3 Jy beam−1. Five new sources are visible.
Figure 7

Residual image, with the eleven brightest sources subtracted, after three iterations. The black and white scale runs between −1 and 3 Jy beam−1. Five new sources are visible.

Residual image, with all the sources subtracted, after three iterations. The black and white scale runs between −0.4 and 0.4 Jy beam−1. The image shows only thermal noise.
Figure 8

Residual image, with all the sources subtracted, after three iterations. The black and white scale runs between −0.4 and 0.4 Jy beam−1. The image shows only thermal noise.

In order to characterize the statistics of the residuals and the accuracy of the subtraction, we compare the true flux densities and positions to the final estimates and to the theoretical measurement errors σRA,DECtheor computed as
4
where Θb∼ 18 arcmin is the synthesized beam (Fig. 9).
Errors in the fitted parameters for sources in the 32T simulated image: right ascension (top panel), declination (middle panel) and flux density (bottom panel). Solid lines in the two upper plots indicate the envelope of the theoretical measurement errors.
Figure 9

Errors in the fitted parameters for sources in the 32T simulated image: right ascension (top panel), declination (middle panel) and flux density (bottom panel). Solid lines in the two upper plots indicate the envelope of the theoretical measurement errors.

We observe that the error distribution narrows with increasing flux density and is within the theoretical values. No systematic offsets appear in the recovered source parameters, based on estimates of median and rms values (Table 1).

Table 1

Median and rms values of the offset between the model and the fitted parameters for the simulated 32T image.

ParameterMedianrms
RA−0.7 arcsec9.6 arcsec
Dec.−0.3 arcsec6.6 arcsec
Flux density0.3 per cent1.2 per cent
ParameterMedianrms
RA−0.7 arcsec9.6 arcsec
Dec.−0.3 arcsec6.6 arcsec
Flux density0.3 per cent1.2 per cent
Table 1

Median and rms values of the offset between the model and the fitted parameters for the simulated 32T image.

ParameterMedianrms
RA−0.7 arcsec9.6 arcsec
Dec.−0.3 arcsec6.6 arcsec
Flux density0.3 per cent1.2 per cent
ParameterMedianrms
RA−0.7 arcsec9.6 arcsec
Dec.−0.3 arcsec6.6 arcsec
Flux density0.3 per cent1.2 per cent
We also computed the angular power spectrum of the residual images as (Seljak 1997; Bernardi et al. 2009)
5
where ℓ = 180/Θ is the usual multipole value, Θ is the angular scale in degrees, Ω is the solid angle in radians, N is the number of Fourier modes around a certain ℓ value, X and X* are the Fourier transform of the image and its complex conjugate, respectively, and l is the two-dimensional coordinate in Fourier space. The power spectrum has a bin width of Δℓ = 50.

The amplitude of the power spectrum of the residual images decreases by more than 2 orders of magnitude as the number of subtracted sources increases (Fig. 10).

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: after subtracting one source (Fig. 5); after subtracting four sources (Fig. 6); after subtracting 11 sources (Fig. 7); after the whole sky model is subtracted (Fig. 8). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum (see text for details).
Figure 10

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: after subtracting one source (Fig. 5); after subtracting four sources (Fig. 6); after subtracting 11 sources (Fig. 7); after the whole sky model is subtracted (Fig. 8). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum (see text for details).

Fig. 10 also shows the noise power spectrum, estimated as the averaged power spectrum of 100 noise realizations. Each noise realization was generated by imaging visibilities which included only noise, following equation (3). A noise power spectrum was computed from each image. The estimated noise power spectrum was determined as the average among 100 power spectrum realizations. The error bars are the standard deviation of the 100 power spectrum realizations in each multipole bin.

It can be seen that the power spectrum of the residual image, after the full 16-source sky model is subtracted, agrees with the estimated thermal noise over the entire range of angular scales probed. This indicates that no systematic errors or statistical deviations from Gaussian-distributed noise are introduced by the method and that the source subtraction is accurate down to the thermal noise level.

3.3 512T results

The 512T simulation included 101 sources brighter than 1 Jy, observed in an 8-s snapshot and in a 40-kHz channel (Fig. 11). The thermal noise in the 512T image is ∼26 mJy beam−1. Unlike the 32T case, the 512T image prior to forward modelling already exhibits a great number of sources, due to the reduced sidelobes of the synthesized beam.

Simulated 8-s snapshot image with 512T uv coverage. The black and white scale runs between −1 and 2 Jy beam−1. The very good synthesized beam has low sidelobe levels and makes most of the sources directly visible without any subtraction.
Figure 11

Simulated 8-s snapshot image with 512T uv coverage. The black and white scale runs between −1 and 2 Jy beam−1. The very good synthesized beam has low sidelobe levels and makes most of the sources directly visible without any subtraction.

The subtraction was performed according to the following steps.

  • The brightest 15 sources were identified and an initial guess of their parameters estimated. They were then subtracted out through the minimization scheme (Fig. 12).

  • Another 35 sources were identified and their parameters estimated. The joint fit is now performed on 50 sources simultaneously (Fig. 13).

  • All the sources were included in the sky model. The minimization was carried out for all the 101 sources simultaneously and convergence was reached after five iterations (Fig. 14).

Residual image, with the brightest 15 sources subtracted, after five iterations. The black and white scale runs between −1 and 2 Jy beam−1.
Figure 12

Residual image, with the brightest 15 sources subtracted, after five iterations. The black and white scale runs between −1 and 2 Jy beam−1.

Residual image, with the 50 brightest sources subtracted, after five iterations. The black and white scale runs between −1 and 2 Jy beam−1. The presence of negative peaks is due to a subtraction in the absence of a full sky model.
Figure 13

Residual image, with the 50 brightest sources subtracted, after five iterations. The black and white scale runs between −1 and 2 Jy beam−1. The presence of negative peaks is due to a subtraction in the absence of a full sky model.

Residual image, with all sources subtracted, after five iterations. The black and white scale runs between −0.3 and 0.3 Jy beam−1. Only thermal noise is left after the subtraction of the whole sky model.
Figure 14

Residual image, with all sources subtracted, after five iterations. The black and white scale runs between −0.3 and 0.3 Jy beam−1. Only thermal noise is left after the subtraction of the whole sky model.

The final residual image after the whole sky model is subtracted is consistent with the initial thermal noise level, indicating an accurate subtraction of the sources. It is worth noting that in the intermediate steps, when only a partial sky model is subtracted, residual features due to an imperfect subtraction exist and appear as positive adjacent to negative peaks.

We computed the difference between the true flux density and position values and their final estimates (Fig. 15) as was done for the 32T simulation.

Same as Fig. 9, but for the 512T simulation. The synthesized beam is now Θb∼ 11.8 arcmin. The solid line in the bottom figure indicates the (S/N)−1 envelope.
Figure 15

Same as Fig. 9, but for the 512T simulation. The synthesized beam is now Θb∼ 11.8 arcmin. The solid line in the bottom figure indicates the (S/N)−1 envelope.

As in the 32T case, errors increase with decreasing flux densities and are well matched to their expected theoretical limits. The median and rms values show no systematic errors in the recovered parameters (Table 2).

Table 2

Median and rms values of the offset between the model and the fitted parameters for the simulated 512T image.

ParameterMedianrms
RA0.99 arcsec17 arcsec
Dec.0.56 arcsec11 arcsec
Flux density−0.15 per cent4 per cent
ParameterMedianrms
RA0.99 arcsec17 arcsec
Dec.0.56 arcsec11 arcsec
Flux density−0.15 per cent4 per cent
Table 2

Median and rms values of the offset between the model and the fitted parameters for the simulated 512T image.

ParameterMedianrms
RA0.99 arcsec17 arcsec
Dec.0.56 arcsec11 arcsec
Flux density−0.15 per cent4 per cent
ParameterMedianrms
RA0.99 arcsec17 arcsec
Dec.0.56 arcsec11 arcsec
Flux density−0.15 per cent4 per cent

The power spectrum of the residual image after subtracting all 101 sources agrees with the expected thermal noise within 1σ error for each multipole value, indicating that there is no significant statistical leftover from source subtraction (Fig. 16). The power spectra spans almost 3 orders of magnitude because the 512T simulation probes the log N–log S at lower flux densities. This demonstrates that the algorithm is able to simultaneously remove a large number of sources which span 2 orders of magnitude in flux density.

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: initial image without subtracting any source (Fig. 11); after subtracting 15 sources (Fig. 12); after subtracting 50 sources (Fig. 13); after the whole sky model is subtracted (Fig. 14). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum.
Figure 16

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: initial image without subtracting any source (Fig. 11); after subtracting 15 sources (Fig. 12); after subtracting 50 sources (Fig. 13); after the whole sky model is subtracted (Fig. 14). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum.

This is a relevant result in the light of EoR measurements, where a high accuracy in source subtraction is required to achieve the necessary dynamic range. The detection of the EoR is believed to require this high accuracy in foreground subtraction because the cosmological signal is 5–6 orders of magnitude below the strongest sources in the sky. Due to the time constraints that come with the real-time nature of the MWA, subtracting sources in the visibility domain –‘peeling’– is only practical for the brightest sources which are also required to accurately constrain antenna primary beam models. In order to achieve accurate calibration and subtraction using these sources, a comprehensive global sky model is required. This will be obtained by surveying the sky in the first months of operation with the full array. At the same time, the actual tile beams will be measured and used to improve the beam models. The knowledge of the sky and the beams can be improved in a bootstrapping fashion by repeating the sky survey.

We expect that the initial 105–106 dynamic range can be alleviated by 2–3 orders of magnitude through a very precise peeling procedure. A further subtraction of the remaining bright sources is required in the integrated images.

If the dynamic range is expressed as the ratio between the brightest source in the map and the noise rms, the source subtraction in the 512T case achieves a dynamic range of ∼3400 through our minimization scheme.

It is also interesting to introduce the relative dynamic range defined as the ratio between the true flux density of a source and the difference between the true and the recovered flux density (Pindor et al. 2010). This is another way of estimating the residual contamination due to an imperfect subtraction. Studies in the literature indicated that bright sources should be subtracted down to the 100–10 mJy level in order not to affect the subtraction of fainter foreground sources and, ultimately, the recovery of the EoR signal (Bowman et al. 2009; Liu et al. 2009b).

Fig. 17 displays the relative dynamic range for source subtraction in the 512T simulation. It can be seen that, with the level of noise present in our simulated image, ∼92 per cent of the sources are above the 100-mJy threshold.

The relative dynamic range as a function of flux density. The solid lines indicate the 10-mJy flux density threshold (upper) and the 100-mJy (lower) flux density threshold for source subtraction.
Figure 17

The relative dynamic range as a function of flux density. The solid lines indicate the 10-mJy flux density threshold (upper) and the 100-mJy (lower) flux density threshold for source subtraction.

Since we are fitting all the sources simultaneously and iteratively, the main limitation to the relative dynamic range comes from thermal noise rather than sidelobe contamination. Given the behaviour shown in Fig. 15, we expect the dynamic range to increase if we consider a longer integration where the thermal noise decreases. Section 3.4 will confirm this statement.

3.4 Out-of-beam sources

An image has the limitation of excluding all the sources outside the image itself (out-of-beam sources). If visibility data were accessible, the information corresponding to out-of-beam sources would still be accessible and they could be subtracted in a traditional self-calibration–deconvolution loop. Once the image is generated and visibility data discarded, information about out-of-beam sources is lost, apart from the sidelobes, which will still contaminate the image if they are bright enough.

We investigated how well our method subtracts out-of-beam sources by minimizing their sidelobe contribution to the image, i.e. by fitting the sidelobe pattern of a source, regardless of being able to image the source itself.

In order to make sure that the out-of-beam source sidelobes have good S/N, we integrated individual 8-s 512T snapshots up to 10 min in a 40-kHz channel (Fig. 18). The thermal noise in our 10-min-simulated image is 2.8 mJy beam−1.

Simulated image with the 10-min 512T uv coverage. Five sources are within the field of view and two outside. The colour scale runs between −0.3 and 1 Jy beam−1.
Figure 18

Simulated image with the 10-min 512T uv coverage. Five sources are within the field of view and two outside. The colour scale runs between −0.3 and 1 Jy beam−1.

In order to reduce the computational load, we included only the brightest seven sources used in the 512T simulation; therefore the faintest source is ∼6 Jy. Two sources were displaced from their previous positions and moved 2° outside the edge of the image. The out-of-beam sources had ∼14.1 and ∼12.7 Jy flux densities, and we assumed that an initial estimate of their parameters is know from a pre-existing source catalogue.

The subtraction was performed according to the following steps:

  • an initial parameter estimate of the five sources within the field of view was computed, and the sources were subtracted ignoring the out-of-beam sources (Fig. 19);

  • the two out-of-beam sources were included in the sky model and a joint parameter estimate performed. The best-fitting model is subtracted from the image (Fig. 20).

Residual image, with only the five in-beam sources subtracted, after five iterations. The colour scale runs between −0.2 and 0.2 Jy beam−1. The five sources within the field of view were well removed, revealing the sidelobe pattern of the unsubtracted out-of-beam sources.
Figure 19

Residual image, with only the five in-beam sources subtracted, after five iterations. The colour scale runs between −0.2 and 0.2 Jy beam−1. The five sources within the field of view were well removed, revealing the sidelobe pattern of the unsubtracted out-of-beam sources.

Residual image where all the sources were subtracted. Five iterations were performed. The colour scale runs between −0.03 and 0.03 Jy beam−1. The sidelobe pattern has been removed down to the thermal noise level.
Figure 20

Residual image where all the sources were subtracted. Five iterations were performed. The colour scale runs between −0.03 and 0.03 Jy beam−1. The sidelobe pattern has been removed down to the thermal noise level.

The final image after the whole sky model was subtracted is consistent with the thermal noise level and its power spectrum agrees with the noise power spectrum at all angular scales (Fig. 21). It is important to note that power spectrum of the residual image after removing only the sources within the field of view is still well above the expected noise power spectrum. The subtraction of the sidelobe pattern of the out-of-beam sources improves the dynamic range by a further factor of ∼5.

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: initial image without subtracting any source (Fig. 18), after the sources within the field of view were subtracted (Fig. 19), after all the sources (in and out-of-beam) were subtracted (Fig. 20). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum.
Figure 21

Power spectra of residual images for an increasing number of subtracted sources. From top to bottom: initial image without subtracting any source (Fig. 18), after the sources within the field of view were subtracted (Fig. 19), after all the sources (in and out-of-beam) were subtracted (Fig. 20). The error bars are at the 1σ confidence level. The dashed line represents the noise power spectrum.

The plot of the relative dynamic range (Fig. 22) confirms the results of Section 3.3. The relative dynamic range of the sources inside the field of view has improved by ∼2 orders of magnitude by longer integration, and the two sources with the worst dynamic range are the out-of-beam sources, which have a poorer S/N. All the sources within the field of view are now above the 10-mJy threshold.

As in Fig. 17, but for the out-of-beam sources. The two sources with the poorest relative dynamic range are the out-of-beam sources for which the S/N is lower.
Figure 22

As in Fig. 17, but for the out-of-beam sources. The two sources with the poorest relative dynamic range are the out-of-beam sources for which the S/N is lower.

3.5 Computational costs

In our simulations, we have shown that forward modelling can achieve a high level of precision in source subtraction. This comes, however, with a significant computational cost. Up to the number of sources that we have considered in our 512T simulation, the greatest computational load comes from imaging rather than generating the visibilities or fitting for the astrophysical parameters. In the case of an 8-s snapshot image with a 20°× 20° field of view, we estimated that ∼70 Gflops are required to generate each image; therefore each iteration of the subtraction scheme requires ∼200 Gflops for every source that has to be subtracted. It takes ∼40 s on a normal Dell 2.4 GHz 2 quad Core machine.

Although such a computational need might require a very long processing time – particularly for long integrations – there are several ways of shortening the processing time length. The most straightforward way is to implement a graphics processing unit (GPU) pipeline in the most computationally intensive part of the process, i.e. imaging. By running our simulations on a GPU, a factor of ∼5 in time is gained.

The second shortcut is to parallelize the forward modelling loop. Although the simulations presented in this work were performed in serial, the calculation of the forward model and the partial derivatives can be run in parallel, potentially on a dedicated GPU machine.

Finally, it is important to note that such a high level of precision in source removal might be superfluous for very faint sources for which calibration errors are larger. In this case, convenient approximations in fitting source positions (e.g. Pindor et al. 2010) will speed up the calculations and might eventually give the same level of accuracy in the subtraction.

4 CONCLUSIONS

We have presented a point source deconvolution technique that makes use of forward modelling and an algebraic non-linear minimization scheme. The main motivation for this implementation was achieving high dynamic range images in the absence of visibility data. Current (MWA) and future (SKA) radio interferometers require such a huge number of elements that they are being forced to rely more and more on real-time calibration and imaging, without the use of traditional self-calibration techniques.

The basic idea of our scheme is to forward model the sky brightness, i.e. to filter the sky model through the same instrumental response that is applied to the data. In the case of radio point sources, the forward model is the synthesized beam which is generated for each source individually. In this way, position-dependent variations of the synthesized beam are accounted for.

Point source astrophysical parameters are recovered through a non-linear minimization over the image pixels. In this way, we overcome the known dynamic range limitations of image-based deconvolution due to pixelization effects. Since the presented technique minimizes all the sources simultaneously and in an iterative way, it is minimally sensitive to sidelobe noise and essentially limited by thermal noise.

It is worth noting that this method can be applied to different sky components and can incorporate calibration parameters such as ionospheric displacements and primary beam shapes measured from the actual data.

The technique was applied to three different simulated cases: a 10-min integration with the 32T MWA, an 8-s snapshot image of the 512T MWA and a 10-min integration with the 512T MWA where sources were placed inside and outside the field of view.

In all cases, we were able to subtract sources down to the thermal noise without assuming an a priori knowledge of the sky, with the exception of initializing the position and flux density of sources placed outside the field of view. The final residual images are consistent with the expected thermal noise on all the angular scales. Errors in the fitted parameters decrease with increasing S/N, in agreement with the expected theoretical measurement error distribution. Even when sources were not physically present in the images, we could subtract their sidelobes down to the thermal noise level.

The 512T simulations are relevant in the light of the MWA EoR experiment. Since only a limited number of sources can be subtracted in real time, an off-line subtraction of the residual sources will have to be performed on the images to a high level of accuracy in order to precisely remove them and their direction-dependent synthesized beams.

In the simulation of an 8-s image with the 512T array, we achieved a dynamic range of ∼3400, indicating that the subtraction of foreground sources can be improved by 3 orders of magnitude through this technique. Source parameters can be retrieved with an average error of 10 arcsec on positions and 0.15 per cent errors on flux densities.

The relative dynamic range of our subtraction is limited by the thermal noise and is above the 100-mJy threshold for 92 per cent of the sources. Since the best-fitting parameters improve with the S/N, a lower threshold – i.e. 10 mJy – can be reached by lowering the thermal noise through a longer integration. In fact, in the 512T 10-min simulation all the five sources present in the image had a dynamic range above the 10-mJy threshold, indicating that bright sources can be subtracted to a level that should not affect the detection of the EoR.

A sky model more realistic than only point sources could be forward modelled by modifying the procedure presented here. Extended sky emission modelled as a list of delta functions (i.e. the equivalent of clean components) could be directly treated by the present approach. More sophisticated modelling of extended emission that uses a set of basis functions like, for instance, shapelets (Yatawatta 2010) or a principal component analysis (de Oliveira-Costa et al. 2008) can be incorporated by convolving the model of the brightness distribution with the instrumental primary beam and then sampling it according to the uv distribution (see Wayth et al. 2010 for an example of this approach).

Future work will investigate these extensions and include a more realistic instrument model to better simulate the strategies for the EoR detection.

We thank an anonymous referee for useful comments. This work was supported by the US National Science Foundation under grants AST-0457585 and PHY-0835713.

REFERENCES

Ali
S. S.
Bharadwaj
S.
Chengalur
J. N.
,
2008
,
MNRAS
,
385
,
2166

Bailer-Jones
C. A. L.
,
2010
,
MNRAS
,
403
,
96

Bernardi
G.
et al.,
2009
,
A&A
,
500
,
965

Bernardi
G.
et al.,
2010
,
A&A
,
522
,
67

Bowman
J. D.
Morales
M. F.
Hewitt
J. N.
,
2009
,
ApJ
,
695
,
183

Briggs
D. S.
Cornwell
T. J.
,
2005
, in
Worall
D. M.
Biemesderfer
C.
,
Barnes
J.
, eds, ASP Conf. Ser. Vol. 347,
Astronomical Data Analysis Software and Systems I
.
Astron. Soc. Pac.
, San Francisco, p.
86

Clark
B. G.
,
1980
,
A&A
,
89
,
337

Cotton
W. D.
Uson
J. M.
,
2008
,
A&A
,
490
,
455

Datta
A.
Bhatnagar
S.
Carilli
C. L.
,
2009
,
ApJ
,
703
,
1851

Datta
A.
Bowman
J. D.
Carilli
C. L.
,
2010
,
ApJ
,
724
,
526

de Oliveira-Costa
A.
Tegmark
M.
Gaensler
B. M.
Jonas
J.
Landecker
T. L.
Reich
P.
,
2008
,
MNRAS
,
388
,
247

Di
Matteo T.
Ciardi
B.
Miniati
F.
,
2004
,
MNRAS
,
355
,
4

Gleser
L.
Nusser
A.
Benson
A. J.
,
2008
,
MNRAS
,
391
,
383

Górski
K.
et al.,
2005
,
ApJ
,
622
,
759

Harker
G.
et al.,
2009a
,
MNRAS
,
393
,
1449

Harker
G.
et al.,
2009b
,
MNRAS
,
397
,
1138

Hogbom
J. A.
,
1974
,
A&AS
,
15
,
417

Jelić
V.
et al.,
2008
,
MNRAS
,
389
,
1319

Liu
A.
Tegmark
M.
Zaldarriaga
M.
,
2009a
,
MNRAS
,
394
,
1575

Liu
A.
Tegmark
M.
Bowman
J. D.
Hewitt
J. N.
Zaldarriaga
M.
,
2009b
,
MNRAS
,
398
,
401

Lonsdale
C.
et al.,
2009
, preprint (arXiv:0903.4890)

McQuinn
M.
Zahn
O.
Zaldarriaga
M.
Hernquist
L.
Furlanetto
S. R.
,
2006
,
ApJ
,
653
,
815

Mitchell
D. M.
Greenhill
L. J.
Wayth
R. B.
Sault
R. J.
Lonsdale
C. J.
Cappallo
R. J.
Morales
M. F.
Ord
S. M.
,
2008
, IEEE J. Selected Topics Signal Processing,
2
,
707

Morales
M. F.
Hewitt
J. N.
,
2004
,
ApJ
,
615
,
7

Morales
M. F.
Bowman
J. D.
Hewitt
J. H.
,
2006
,
ApJ
,
648
,
767

Noordam
J. E.
de Bruyn
A. G.
,
1982
,
Nat
,
299
,
597

Ord
S. M.
et al.,
2010
,
PASP
,
122
,
1353

Parsons
A. R.
et al.,
2010
,
AJ
,
139
,
1468

Pen
U.
Chang
T.
Hirata
C.
Peterson
J. B.
Roy
J.
Gupta
Y.
Odegova
J.
Sigurdson
K.
,
2009
,
MNRAS
,
399
,
181

Perley
R. A.
,
1999
, in
Taylor
G. B.
Carilli
C. L.
Perley
R. A.
, eds, ASP Conf. Ser. Vol. 180,
Synthesis Imaging in Radio Astronomy II
.
Astron. Soc. Pac.
, San Francisco, p. 275

Pindor
B.
Wyithe
J. S. B.
Mitchell
D. A.
Ord
S. M.
Wayth
R. B.
Greenhill
L. J.
,
2010
,
Publ. Astron. Soc
. Australia, in press (arXiv:1007.2264)

Santos
M. G.
Cooray
A.
Knox
L.
,
2005
,
ApJ
,
625
,
575

Seljak
U.
,
1997
,
ApJ
,
74
,
597

Voronkov
M. A.
Wieringa
M. H.
,
2004
,
Exp. Astron.
,
650
,
529

Wang
X.
Tegmark
M.
Santos
M.
Knox
L.
,
2006
,
ApJ
,
650
,
529

Wayth
R. B.
et al.,
2010
,
PASP
, submitted

Yatawatta
S.
,
2010
, preprint (arXiv:1008.1892)