Reconstructing robust background integral field unit spectra using machine learning

ABSTRACT

In astronomy, spectroscopy consists of observing an astrophysical source and extracting its spectrum of electromagnetic radiation. Once extracted, a model is fit to the spectrum to measure the observables, leading to an understanding of the underlying physics of the emission mechanism. One crucial, and often overlooked, aspect of this model is the background emission, which contains foreground and background astrophysical sources, intervening atmospheric emission, and artefacts related to the instrument such as noise. This paper proposes an algorithmic approach to constructing a background model for SITELLE observations using statistical tools and supervised machine learning algorithms. SITELLE is an imaging Fourier transform spectrometer located at the Canada-France-Hawaii Telescope, which produces a three-dimensional data cube containing the position of the emission (two dimensions) and the spectrum of the emission. SITELLE has a wide field of view (11 arcmin × 11 arcmin), which makes the background emission particularly challenging to model. We apply a segmentation algorithm implemented in photutils to divide the data cube into background and source spaxels. After applying a principal component analysis (PCA) on the background spaxels, we train an artificial neural network to interpolate from the background to the source spaxels in the PCA coefficient space, which allows us to generate a local background model over the entire data cube. We highlight the performance of this methodology by applying it to SITELLE observations of a Star-formation, Ionized Gas and Nebular Abundances Legacy Survey galaxy, NGC 4449, and of the central galaxy of the Perseus galaxy cluster, NGC 1275. We discuss the physical interpretation of the principal components and noise reduction in the resulting PCA-based reconstructions.
Additionally, we compare the fit results using our new background modelling approach with standard methods used in the literature and find that our method better captures the emission from H ii regions in NGC 4449 and the faint emission regions in NGC 1275. These methods also demonstrate that the background does change as a function of the position of the data cube. While the approach is applied explicitly to SITELLE data in this study, we argue that it can be readily adapted to any integral field unit style data, enabling the user to obtain more robust measurements on the flux of the emission lines.

Machine Learning, Numerical Methods, Algorithms, Automatic Background Detection, SITELLE

1. INTRODUCTION

Integral field units (IFUs) are rapidly changing our understanding of galaxies (e.g. Barbosa et al. 2009; Edwards et al. 2009; Law et al. 2015; Emsellem et al. 2022; Groves et al. 2023). Due to their complementary spatial and spectral coverage, IFUs enable the study of spectra as a function of complex morphologies. However, extracting key parameters, such as the flux from strong emission lines, absorption lines, and continuum emission, requires careful consideration of the background emission. This problem is exacerbated in many IFU studies of nearby galaxies and galaxy clusters since the astrophysical objects of interest can take up the majority of the instrument’s field of view (FOV), thus rendering the background difficult to determine. Moreover, the background emission in IFUs with a large FOV, such as MUSE WFM (Bacon et al. 2017) and SITELLE (Drissen et al. 2019), is not homogeneous over the data cube.

The background is often decomposed into four primary components: skylines, background emission sources, foreground emission sources, and detector noise. Additionally, a galaxy’s stellar continuum can contribute negatively to the overall flux measurements. Stellar continuum is regularly modelled using GANDALF and/or PPXF (Cappellari 2012; Sarzi et al. 2017). This requires multiple strong absorption features, which are frequently unavailable (such as in SITELLE observations, which typically cover only a small bandpass). Therefore, we need a robust method to model the background.

A standard technique employed in the past consists of taking a large background region without any point sources or emission from the astrophysical source and calculating the mean spectrum of this region; a large background region is taken to reduce the overall noise level. This assumes homogeneity of the background over the entire FOV of the instrument. While this works well for capturing skylines that are stable over the FOV, it neglects additional components such as intervening sources and/or the stellar continuum. Moreover, the skylines can vary strongly over larger FOVs depending on the observing conditions, so this method could fail. While numerous background detection algorithms exist, such as matched-filter analysis, they often depend on a priori knowledge of background region spectra (Ramella et al. 2001; Hopkins et al. 2002; Masias et al. 2012; Bacon et al. 2017). Matched-filter analysis takes a library of sample spectra and tries to find which template best fits the data by minimizing the Euclidean distance between them. Matched-filter analysis and similar methodologies require a broad library of template spectra covering any physical or astrophysical phenomena that may appear in the background. Thus, this method is impractical to implement.
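The template-matching idea described above can be illustrated with a minimal sketch. The function name `best_template` and the toy three-template library are hypothetical and chosen only for illustration; real matched-filter codes are considerably more elaborate.

```python
import numpy as np

def best_template(spectrum, templates):
    """Return the index of the template closest to `spectrum`
    (smallest Euclidean distance) and all of the distances."""
    distances = np.linalg.norm(templates - spectrum, axis=1)
    return int(np.argmin(distances)), distances

# Toy library: three hypothetical templates on a 5-channel grid.
templates = np.array([
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
])
observed = np.array([0.1, 0.0, 0.9, 0.1, 0.0])

index, distances = best_template(observed, templates)
```

The sketch makes the limitation plain: the result can only ever be as good as the closest entry in the library.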

The SITELLE instrument on the Canada-France-Hawaii Telescope boasts an FOV of 11 arcmin × 11 arcmin (e.g. Drissen et al. 2019). SITELLE is an imaging Fourier transform spectrograph, meaning that in addition to taking a two-dimensional (2D) image of a target, it also captures a spectrum over a bandpass for each pixel, generating a 3D data cube where the third axis is the spectrum. SITELLE contains over 4 million spaxels representing source and/or background emission. SITELLE’s unique design allows the observer to change the resolving power, $R = \lambda / \Delta \lambda$, where λ is the wavelength and Δλ is the wavelength step. The spectral resolution of SITELLE ranges from R ∼ 1 up to R ∼ 10 000. The Star-formation, Ionized Gas and Nebular Abundances Legacy Survey (SIGNALS, Rousseau-Nepton et al. 2019) project uses SITELLE to target over 50 000 spatially resolved H ii regions in more than 40 galaxies (D ≤ 40 Mpc). One of the most challenging steps in tackling these observations has been the accurate estimation of the background emission, as most galaxies observed in SIGNALS cover the entire FOV.
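As a rough worked example of the resolving power defined above, assume an illustrative value of R = 5000 at the rest wavelength of Hα (6563 Å). These numbers are chosen for illustration only. The corresponding wavelength step is then

$$\begin{eqnarray} \Delta \lambda = \frac{\lambda }{R} \approx \frac{6563\ \text{Å}}{5000} \approx 1.3\ \text{Å}. \end{eqnarray}$$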

This paper presents a novel approach to modelling the background using a combination of classic image segmentation algorithms, principal component analysis (PCA), and artificial neural networks. In Section 2, we outline each component of the algorithm. In Section 3, we discuss how the algorithm can be applied to SITELLE observations of two different environments: the dwarf galaxy NGC 4449 and the massive elliptical galaxy NGC 1275. These two observations were chosen because they contrast each other and demonstrate this algorithm’s use in studying SIGNALS’ galaxies and other SITELLE targets. Both targets are detailed in Sections 3.1 and 3.2. In Section 4, we compare this methodology with standard background modelling techniques, discuss potential modifications to the algorithm depending on use cases, and explore its potential in other fields of astronomy and applications to other IFU-like instruments.

2. OBSERVATIONS AND METHODS

In this section, we start by describing the SITELLE observations chosen for the testing of our algorithm. We next explain the community’s traditional method of background modelling. We then outline the steps to obtain a background model in the proposed methodology, followed by a detailed discussion of each step.

2.1 Observations

2.1.1 NGC 4449

NGC 4449 was observed by SITELLE using the SN3 (6510–6850 Å) filter centred on RA 12h28m09.8s and Dec. +44°05′51.2″ (P.I. Rousseau-Nepton; R ≈ 5000) as part of the SIGNALS project. The SN3 filter at this redshift contains strong emission lines present in H ii regions, diffuse ionized gas (DIG), planetary nebulae, and supernova remnants: [N ii] λ6548, [N ii] λ6583, Hα, [S ii] λ6716, and [S ii] λ6731. SITELLE samples the object at approximately 5.9 pc per pixel (0.32 arcsec) due to the proximity of the dwarf galaxy (z = 0.00069). The SITELLE observations studied in this paper have been preprocessed using the standard SITELLE preprocessing pipeline known as Outil de Réduction Binoculaire pour SITELLE (ORBS, Martin & Drissen 2017). ORBS reduces SITELLE raw observations by calibrating the raw CCD images, aligning the data cubes from the two cameras on SITELLE, combining the interferometric data from the cameras, applying a phase correction to the combined cube, calculating the Fourier transformation of the combined cube, and applying a final wavelength and flux calibration to the combined cube. Once the data cube has been reduced, we create a deep image of the observation (shown in Fig. 1a). The deep image is created by summing over the spectral axis, creating a 2D image where each pixel value corresponds to the total flux in a pixel. The data analysis (including the background modelling, background subtraction, and emission-line fitting) is completed with luci, an analysis pipeline developed specifically for SITELLE (Rhea et al. 2021). luci is a general-purpose fitting pipeline that uses machine learning to reduce computation time while increasing computational accuracy. Using pre-trained neural networks, luci initializes fit parameters such as the velocity offset and broadening of emission lines, resulting in accurate solutions and robust flux measurements.
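The deep-image step amounts to a single reduction over the spectral axis. The sketch below uses a small synthetic cube. The axis order (x, y, spectral channel) and the use of `nansum` to skip invalid channels are assumptions, not a description of the ORBS or luci internals.

```python
import numpy as np

# Hypothetical cube with axes (x, y, spectral channel); the axis order is assumed.
rng = np.random.default_rng(0)
cube = rng.normal(loc=1.0, scale=0.1, size=(4, 5, 16))

# Summing over the spectral axis gives one total-flux value per spaxel.
# nansum ignores any NaN channels instead of propagating them.
deep_image = np.nansum(cube, axis=-1)
```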

Figure 1.

Deep image of the SN3 filter observation of NGC 4449 (a) and NGC 1275 (b) taken with SITELLE. The colour scale was chosen to highlight the omnipresent DIG emission. Background and foreground objects appear as bright point sources in the image surrounding the galaxy. The cyan square represents the area over which the segmentation map was applied.


2.1.2 NGC 1275

In Fig. 1b, we display the SN3 deep image of NGC 1275 (z = 0.0179) as observed by SITELLE. NGC 1275 was observed in 2016 as part of the science verification stage of SITELLE (P.I. Morrison). Since the observation was part of the science verification phase, the resolution of the cube was set to R ∼ 1800. The data were initially reduced using the ORBS software (Martin & Drissen 2017) and analysed using luci (Rhea et al. 2021). Just as with NGC 4449, the SN3 filter targets the following emission lines: [N ii] λ6548, [N ii] λ6583, Hα, [S ii] λ6716, and [S ii] λ6731.

2.2 Traditional background modelling

Previously, two different methodologies existed to model the background emission in a data cube from optical IFUs targeting extragalactic sources. The first consisted of taking a representative background region¹ and extracting the mean spectrum. In doing so, we capture the average amount of sky emission and minimize the noise. This representative background spectrum is then subtracted from each spaxel before fitting. This technique has been used extensively (e.g. Gendron-Marsolais et al. 2018; Alcorn et al. 2023; Groves et al. 2023), but, contrary to the methodology described in this paper, it assumes homogeneity across the FOV, which is not guaranteed since the amplitude of the skylines and/or the properties of the DIG for an extended source can change over the FOV. Similar strategies for choosing the pixels consisting of the background have been proposed (see Rousseau-Nepton et al. 2019 or Vigneron et al. 2023 for examples), though these also assume relative homogeneity.
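A minimal sketch of this first (global) approach follows. The function name `subtract_mean_background`, the (x, y, channel) axis order, and the toy cube are assumptions made for illustration.

```python
import numpy as np

def subtract_mean_background(cube, bkg_mask):
    """Subtract the mean spectrum of the spaxels flagged in `bkg_mask`
    (a 2D boolean map) from every spaxel of `cube` (x, y, channel)."""
    bkg_spectrum = cube[bkg_mask].mean(axis=0)   # shape (n_channels,)
    return cube - bkg_spectrum, bkg_spectrum

# Toy cube: flat sky of 1.0 everywhere plus one emission line on a small source.
cube = np.ones((6, 6, 8))
cube[2:4, 2:4, 3] += 5.0

# Hypothetical background region: the two left-most columns.
bkg_mask = np.zeros((6, 6), dtype=bool)
bkg_mask[:, :2] = True

subtracted, bkg_spectrum = subtract_mean_background(cube, bkg_mask)
```

Because one spectrum is subtracted everywhere, any spatial variation of the sky or DIG is left in the residuals, which is the homogeneity assumption discussed above.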

The second contemporary methodology extracts the background from individual emission regions. It relies on first pinpointing individual source emission regions, defining an aperture encircling the source emission, masking out the source region, and taking the median spectrum from this region (e.g. Bundy et al. 2015; Law et al. 2016; Schroetter et al. 2016; Jones et al. 2017; Moumen et al. 2019). Although this method creates very accurate backgrounds and skirts the problem of global homogeneity since it constructs local background models, it requires an algorithm that accurately identifies emission regions. Furthermore, it is most accurate when the emission regions are sufficiently disentangled spatially from one another to ensure an accurate background measurement. In the case of NGC 4449, this method is not applicable due to the tangled morphology of individual H ii regions. Similarly, the complex morphology of the emission in NGC 1275 makes this methodology impractical.
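The local approach can be sketched as a median over an annulus around a known source. Using a circular annulus, as well as the function name `local_background` and its radii parameters, is an assumption for illustration; the cited works each define their apertures differently.

```python
import numpy as np

def local_background(cube, centre, r_source, r_outer):
    """Median spectrum in an annulus r_source < r <= r_outer around
    `centre` = (x, y); `cube` has axes (y, x, channel)."""
    ny, nx = cube.shape[:2]
    yy, xx = np.mgrid[0:ny, 0:nx]
    radius = np.hypot(xx - centre[0], yy - centre[1])
    annulus = (radius > r_source) & (radius <= r_outer)
    return np.median(cube[annulus], axis=0)

# Toy cube: constant background of 2.0 with a compact source at (10, 10).
cube = np.full((21, 21, 6), 2.0)
yy, xx = np.mgrid[0:21, 0:21]
cube[np.hypot(xx - 10, yy - 10) <= 3] += 10.0

bkg = local_background(cube, centre=(10, 10), r_source=3, r_outer=6)
```

When neighbouring sources overlap the annulus, the median is biased, which is why the method struggles with tangled morphologies.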

2.3 An overview of the algorithm

The following steps describe our proposed background spectrum methodology:

Use a segmentation algorithm to isolate the background pixels from the source pixels.
Apply PCA to construct a subspace representing the background components.
Project each background spaxel into a truncated PCA space.
Interpolate the truncated PCA coefficients in 2D from the background pixels onto the masked (source) pixels.

Following these steps, we isolate spaxels corresponding to the background, construct a reduced-order representation of the background, and impute the model in masked regions.

2.4 Segmentation algorithm

In order to construct a model of the background spectra, we must determine which pixels are associated with background emission versus source emission, where source emission includes emission from the foreground (or background) point sources, galaxies, and/or the astrophysical phenomena being studied. For example, a SITELLE data cube initially taken to study H ii regions in a nearby galaxy will contain contaminating stars, planetary nebulae, and supernova remnants that should not be included in the background model, as well as the hydrogen emission we aim to study. Here, instead of making any assumptions about the source spectra, we apply the image segmentation implementation from photutils (Bradley et al. 2023), which finds both extended and point sources, to the SITELLE deep image. The segmentation algorithm therefore sees only the net flux in each pixel.

photutils detects sources in an image by applying an image segmentation algorithm. The first step in the segmentation algorithm is to create a rough estimate of the background. To do so, the image is gridded into subregions of a given box size – this is a user-chosen parameter. The box size is chosen so that it is larger than the scale of sources in the image and small enough to capture background variations. We have experimentally found a box size of 50 × 50 pixels to work well for two SITELLE data cubes imaging nearby galaxies. Pixels in a single grid box are convolved with a Gaussian kernel; we select a size of 3 × 3 following the recommendation of photutils. Then, the background level and background rms are calculated for each cell using the sigma-clipped median background. This background map is then interpolated to match the original size of the image and subtracted from the image. The background-subtracted deep image is then convolved with a 3 × 3 Gaussian in order to reduce the noise in the resulting image. Finally, we detect sources that are above a user-defined threshold using photutils.detect_sources. We set the threshold to be 0.01 times the background noise rms level. This produces a map of source regions, and, conversely, background regions, in the deep image.² Our methodology suffers from similar issues to the local background method discussed above if the segmentation algorithm does not accurately disentangle the background from the source emission.
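The sequence of steps above can be illustrated with a simplified NumPy-only stand-in. This is not the photutils implementation: the background map here is piecewise constant per box rather than interpolated, the 3 × 3 kernel weights are an assumed Gaussian-like choice, and the names `sigma_clipped_stats`, `smooth3x3`, and `source_mask` are hypothetical.

```python
import numpy as np

def sigma_clipped_stats(values, sigma=3.0, iters=5):
    """Median and standard deviation after iterative sigma clipping."""
    values = values.ravel()
    for _ in range(iters):
        med, std = np.median(values), values.std()
        keep = np.abs(values - med) <= sigma * std
        if keep.all():
            break
        values = values[keep]
    return np.median(values), values.std()

def smooth3x3(image):
    """Convolve with a normalized 3 x 3 Gaussian-like kernel (edge padded)."""
    k1d = np.array([1.0, 2.0, 1.0]) / 4.0
    kernel = np.outer(k1d, k1d)
    padded = np.pad(image, 1, mode="edge")
    ny, nx = image.shape
    out = np.zeros((ny, nx), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + ny, dx:dx + nx]
    return out

def source_mask(deep_image, box=50, n_sigma=0.01):
    """True where the smoothed, background-subtracted image exceeds
    n_sigma times the local background rms."""
    ny, nx = deep_image.shape
    bkg = np.zeros((ny, nx), dtype=float)
    rms = np.zeros((ny, nx), dtype=float)
    for y0 in range(0, ny, box):
        for x0 in range(0, nx, box):
            cell = deep_image[y0:y0 + box, x0:x0 + box]
            med, std = sigma_clipped_stats(cell)
            bkg[y0:y0 + box, x0:x0 + box] = med   # piecewise-constant background
            rms[y0:y0 + box, x0:x0 + box] = std
    return smooth3x3(deep_image - bkg) > n_sigma * rms

# Toy deep image: Gaussian noise plus one bright extended source.
rng = np.random.default_rng(1)
img = rng.normal(10.0, 1.0, size=(100, 100))
img[40:60, 40:60] += 50.0
mask = source_mask(img, box=50, n_sigma=3.0)   # toy threshold, not the paper's value
```

The toy call uses a threshold of 3 so that pure noise is not flagged; the text quotes a much lower value for the SITELLE data, which deliberately masks more aggressively.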

2.5 Principal component analysis

Following the creation of the background pixel mask in the previous section, we construct a vector subspace that represents the ensemble of background spectra in a reduced-dimensional space. In order to accomplish this, we apply a PCA (e.g. Jolliffe 2005; Ringnér 2008; Abdi & Williams 2010; Bro & Smilde 2014). This dimensionality reduction captures the important spectral features and removes as much noise as possible from the background. PCA decomposes the data into orthogonal vectors ordered by the amount of variance they capture. PCA is typically framed in the following manner:

Calculate the mean background spectrum.
After subtracting the mean from each spectrum, calculate the covariance matrix of the features.
Calculate the singular value decomposition of the mean-subtracted data. Then, obtain the eigenvalues of the covariance matrix from the squares of the singular values (divided by n − 1, where n is the number of spectra).
Construct a transformation matrix from the k leading eigenvectors.

Mathematically, we can reconstruct any spectrum, s_r, using a linear combination of the principal components, p, and the coefficients, α, unique to a given spaxel and the mean spectrum μ:

$$\begin{eqnarray} s_{r} = \mu + \sum _{i=1}^k \alpha _i p_i. \qquad (1) \end{eqnarray}$$

Each principal component covers the entire wavelength coverage of the spaxel. A user-defined hyperparameter, $k$, indicates the number of principal components to retain; in this manner, PCA acts as a dimensionality-reduction technique. Additionally, since the eigenvalues are sorted from greatest to least, the first eigenvector (or eigenspectrum) holds the most variance, the second eigenspectrum holds the second-most variance, and so on. Therefore, eigenspectra of higher orders contain little to no variance and can be discarded as noise. PCA has been used extensively in the literature for these reasons to study galactic spectra (e.g. Heyer & Schloerb 1997; Ronen, Aragón-Salamanca & Lahav 1999; Yip et al. 2004; McGurk, Kimball & Ivezic 2010; Bailey 2012; Smith 2022). In this work, we apply the incremental PCA implementation from scikit-learn (Pedregosa et al. 2011).
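The steps listed above, and the reconstruction in equation (1), can be sketched with a plain singular value decomposition. This is a NumPy stand-in rather than the incremental scikit-learn implementation used in this work; the function names `pca_fit` and `pca_reconstruct` and the synthetic rank-2 data are assumptions for illustration.

```python
import numpy as np

def pca_fit(spectra, k):
    """PCA of `spectra` with shape (n_spaxels, n_channels).
    Returns the mean spectrum, the k leading eigenspectra, the per-spaxel
    coefficients, and the explained-variance ratio of those k components."""
    mean = spectra.mean(axis=0)
    centred = spectra - mean
    # Rows of vt are eigenvectors of the covariance matrix (eigenspectra),
    # already sorted by decreasing singular value.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    eigenvalues = s**2 / (spectra.shape[0] - 1)
    components = vt[:k]
    coeffs = centred @ components.T
    ratio = eigenvalues / eigenvalues.sum()
    return mean, components, coeffs, ratio[:k]

def pca_reconstruct(mean, components, coeffs):
    """Mean spectrum plus the coefficient-weighted sum of eigenspectra."""
    return mean + coeffs @ components

# Synthetic background spectra built from two spectral patterns.
rng = np.random.default_rng(2)
patterns = rng.normal(size=(2, 32))
amplitudes = rng.normal(size=(200, 2)) * np.array([3.0, 1.0])
spectra = 5.0 + amplitudes @ patterns

mean, components, coeffs, ratio = pca_fit(spectra, k=2)
recon = pca_reconstruct(mean, components, coeffs)
```

Because the toy data have exactly two degrees of freedom, two components reproduce them to numerical precision; real background spectra also contain noise that the truncation discards.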

Before applying the PCA, we apply two normalizations; without them, the PCA may rank a feature as more important than the others based only on its scale rather than on the variance it actually explains. We first normalize each spectrum by the maximum value in the spectrum between 670 and 675 nm so that we can easily scale the source pixels as well. We chose this part of the spectrum since there are no strong emission lines, and it is dominated by noise while also encapsulating the continuum level. Since PCA works best with normalized values, we further normalize each spectrum by its maximum value. We do not restrict the wavelength over which the PCA is conducted.
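A minimal sketch of the two normalizations follows. It assumes strictly positive maxima, a wavelength grid in nm, and a hypothetical helper name `normalize_spectra`; it also returns the combined scale factor so that the scaling can later be undone.

```python
import numpy as np

def normalize_spectra(wavelength_nm, spectra, window=(670.0, 675.0)):
    """Normalize each spectrum (rows of `spectra`) first by its maximum
    inside `window`, then by its overall maximum. Returns the normalized
    spectra and the total scale factor applied to each spectrum."""
    in_window = (wavelength_nm >= window[0]) & (wavelength_nm <= window[1])
    scale_window = spectra[:, in_window].max(axis=1)
    step1 = spectra / scale_window[:, None]
    scale_max = step1.max(axis=1)
    step2 = step1 / scale_max[:, None]
    return step2, scale_window * scale_max

# Toy spectra on a 1 nm grid with a strong line outside the window.
rng = np.random.default_rng(3)
wavelength_nm = np.linspace(651.0, 685.0, 35)
spectra = rng.uniform(0.5, 1.5, size=(10, 35))
spectra[:, 5] += 4.0

normalized, scale = normalize_spectra(wavelength_nm, spectra)
```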

The results of constructing a principal component-based subspace are k-eigenspectra and their corresponding coefficients for each spaxel in the background space. Thus, we can assign each pixel a k-dimensional coefficient vector. To use this subspace to make background spectra for masked source pixels, we can interpolate in our k-dimensional space.

Interestingly, the scree plots reveal that the principal components explain a relatively low amount of the total variance in the background spectra; additionally, the eigenspectra show that only the first few components have emission/absorption features while the remaining components are purely noise. Taken together, this indicates that the majority of the variance throughout the FOV is noise. However, since we see signal-related features over several eigenspectra, the assumption of linearity inherent to PCA may not be fully appropriate here and a non-linear reconstruction may be necessary; this is beyond the scope of this work.

2.6 Interpolation via an artificial neural network

The next step is to train an artificial neural network to interpolate the k-dimensional coefficient vectors from the background pixels onto the source pixels. Thus, the network will take the pixel’s x and y coordinates as input and return a k-dimensional vector in which each element corresponds to the pixel’s α value where the α value is the coefficient of the eigenspectrum generated during the PCA. This technique has been used extensively in the scientific literature (e.g. Chen 1996; Plaziac 1999; Rigol, Jarvis & Stuart 2001; Llanas & Sainz 2006). Indeed, this method is used frequently in geoscience to interpolate geophysical properties over maps and in computer science to interpolate network properties. To our knowledge, this is the first work in astronomy using this technique. We note that using standard interpolation techniques such as nearest neighbours, polynomial interpolation, or b-spline interpolation can result in numerous artefacts since the source regions can be large; this is a well-known issue with standard interpolation schemes. We explore the effects of different interpolation schemes in Section 3.1.3.

Simple artificial neural networks are capable of learning an interpolation over sparsely sampled multidimensional space (e.g. Rigol et al. 2001; Sivapragasam, Arun & Giridhar 2010). Therefore, we use a simple neural network deployed in tensorflow (Abadi et al. 2016; Chollet 2015) consisting of two layers of 200 and 300 nodes, respectively, each activated by the tanh function (Clevert, Unterthiner & Hochreiter 2015). We treat structural parameters (such as the number of hidden layers and nodes) as hyperparameters, in addition to the activation function, loss function, optimizer, and standard hyperparameters (i.e. the learning rate and learning rate decay factor). We optimized the hyperparameters of the network using optuna (Akiba et al. 2019). We use the Huber loss function (Huber 1964), the adam optimizer (Kingma & Ba 2017), and a learning rate of 10⁻². We apply a learning rate reducer that reduces the learning rate by a factor of 0.75 if the validation loss has not changed by more than a factor of 0.5 for five epochs. Additionally, we apply early stopping that will end training if the validation loss does not change for 10 epochs; the number of epochs is capped at 100.
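The training schedule described above can be illustrated without any deep learning library by replaying a list of validation losses. This is a sketch under stated assumptions: "has not changed" is interpreted as "has not improved by more than `min_delta`", the function name `run_schedule` is hypothetical, and the exact plateau criterion used in this work may differ.

```python
def run_schedule(val_losses, lr=1e-2, factor=0.75, lr_patience=5,
                 stop_patience=10, min_delta=0.0, max_epochs=100):
    """Replay validation losses and return (epochs_run, final_lr)."""
    best = float("inf")
    since_best = 0     # epochs since the last improvement (early stopping)
    since_reduce = 0   # epochs since an improvement or the last reduction
    epochs_run = 0
    for loss in val_losses[:max_epochs]:
        epochs_run += 1
        if loss < best - min_delta:
            best = loss
            since_best = 0
            since_reduce = 0
        else:
            since_best += 1
            since_reduce += 1
        if since_reduce >= lr_patience:
            lr *= factor            # reduce the learning rate on a plateau
            since_reduce = 0
        if since_best >= stop_patience:
            break                   # early stopping
    return epochs_run, lr

# Three improving epochs followed by a long plateau.
plateau_epochs, plateau_lr = run_schedule([1.0, 0.9, 0.8] + [0.8] * 20)

# A loss that keeps improving runs into the 100-epoch cap instead.
capped_epochs, capped_lr = run_schedule([1.0 / (i + 1) for i in range(200)])
```

In the plateau case the learning rate is reduced twice (after 5 and 10 stalled epochs) before early stopping ends training at epoch 13.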

We construct the training set by randomly selecting 99 per cent of the background pixels’ PCA representation. The remaining 1 per cent is used to validate the training. Although this is a small percentage compared with the standard 10 per cent, we need as many spectra as possible in the training set to ensure an appropriate spatial coverage. Moreover, the goal in training this neural network is not to create a network that will generalize to other PCA coefficient maps but rather to learn how to interpolate for a specific set of maps that are unique to the observation being studied. While we are not concerned with generalizability, we do need to contend with overfitting, which is why we have a small holdout set. We use this holdout set to verify that the network is not overfitting the data. We note that this is standard in the relevant literature.
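A minimal sketch of the 99/1 split follows, with pixel coordinates as inputs and PCA coefficients as targets. The function name `split_background`, the fixed seed, and the toy arrays are assumptions for illustration.

```python
import numpy as np

def split_background(coords, coeffs, val_fraction=0.01, seed=0):
    """Randomly hold out `val_fraction` of the background pixels.
    `coords` has shape (n, 2) and `coeffs` has shape (n, k)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(coords))
    n_val = max(1, int(round(val_fraction * len(coords))))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (coords[train_idx], coeffs[train_idx]), (coords[val_idx], coeffs[val_idx])

# Toy inputs: 1000 background pixels with three PCA coefficients each.
rng = np.random.default_rng(4)
coords = rng.integers(0, 2048, size=(1000, 2)).astype(float)
coeffs = rng.normal(size=(1000, 3))

(train_xy, train_alpha), (val_xy, val_alpha) = split_background(coords, coeffs)
```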

The artificial neural network outperforms standard interpolation techniques when the background covers a small fraction of the FOV and the masked out regions are large. However, standard interpolation methodologies can be used instead of the neural network for small masked regions. We have implemented scipy.interpolate.griddata in the code as an alternative interpolation strategy using nearest neighbours or linear interpolation; these results are discussed in Section 3.1.3. Once the coefficients are calculated for the masked-out spaxels, we can calculate a reconstructed background model by summing the principal component vectors multiplied by their corresponding coefficient and the mean background spectrum. We then rescale these values using the same factors with which we initially scaled the data for the PCA.
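The alternative interpolation route and the final reconstruction can be sketched together. The toy coefficient maps, the random eigenspectra, and all variable names below are assumptions for illustration; the final rescaling step is omitted.

```python
import numpy as np
from scipy.interpolate import griddata

ny, nx, k, n_chan = 20, 20, 2, 12
yy, xx = np.mgrid[0:ny, 0:nx]

# Background mask with a square hole standing in for a masked source region.
bkg_mask = np.ones((ny, nx), dtype=bool)
bkg_mask[8:12, 8:12] = False

# Toy coefficient maps that vary linearly across the field, shape (ny, nx, k).
true_coeffs = np.stack([0.1 * xx, 0.05 * yy], axis=-1)

# Coefficients are only known on background pixels.
filled = true_coeffs.copy()
filled[~bkg_mask] = np.nan

points = np.column_stack([xx[bkg_mask], yy[bkg_mask]])
targets = np.column_stack([xx[~bkg_mask], yy[~bkg_mask]])
for i in range(k):
    # Linear interpolation of the i-th coefficient onto the masked pixels.
    filled[~bkg_mask, i] = griddata(points, true_coeffs[bkg_mask, i],
                                    targets, method="linear")

# Reconstruct a background spectrum for every spaxel:
# mean spectrum plus the coefficient-weighted eigenspectra.
rng = np.random.default_rng(5)
mean_spectrum = rng.normal(size=n_chan)
components = rng.normal(size=(k, n_chan))
background_model = mean_spectrum + filled @ components
```

Passing `method="nearest"` instead gives the nearest-neighbour variant; both can leave visible discontinuities across large masked regions.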

3. RESULTS

In this section, we apply the methodology discussed above to a galaxy and a galaxy cluster in order to showcase its efficacy in different test cases.

3.1 NGC 4449

In this section, we discuss the application of the background construction algorithm to the SIGNALS galaxy NGC 4449. NGC 4449 is an irregular dwarf galaxy (M_⋆ ≈ 1.1 × 10¹¹ M_⊙) at a distance of approximately 3.8 Mpc (e.g. Annibali et al. 2011; Martínez-Delgado et al. 2012; Sacchi et al. 2018). In Fig. 1a, we show the SITELLE SN3 (6480–6850 Å) deep image of NGC 4449. The image captures the active H ii regions leading to NGC 4449’s extreme starburst (0.47 M_⊙ yr⁻¹, Hunter, van Woerden & Gallagher 1999; Sacchi et al. 2018), morphological features such as a northern bar, arcs and bubbles of ionized gas, and a persistent DIG permeating through the galaxy. Moreover, NGC 4449 is an interesting test case due to the omnipresence of this DIG throughout the galaxy, which poses a complication for proper background modelling. This complication arises because the theoretical modelling of the DIG is complex, and it is challenging to disentangle DIG emission from H ii region emission.

3.1.1 Segmentation maps

The segmentation algorithm described in Section 2.4 results in a mask of non-background pixels; Fig. 2a highlights emission regions while black pixels represent background spaxels. We note several structures exist in the segmentation map; primarily, the mauve central structure follows the emission of NGC 4449 nicely, including DIG regions. Several horizontal and vertical structures are marked as source regions but are, in actuality, simply observational artefacts of SITELLE owing to saturation spikes and the fact that the CCDs are combined in SITELLE. Since these spaxels are masked, they do not contribute to the background subspace. However, we are still able to obtain a background model for these spaxels after training the neural network for interpolation described in Section 2.6. Additionally, point sources representing foreground or background active galactic nuclei and stars are masked by the segmentation algorithm. Since we are only interested in modelling the background that will affect the flux measurements of NGC 4449, we crop the segmentation map only to include 400 < x, y < 1600 corresponding to the cyan box in Fig. 1; the units are pixels. We apply this crop since, beyond this region, the cube does not contain visible emission from NGC 4449. Indeed, we also ran the method using a larger area and did not find any emission extending beyond the cyan box.

Figure 2.

Segmentation map created from the deep image of NGC 4449 (a) and NGC 1275 (b). Each coloured region represents a segment found by photutils. The colours in the segmentation maps are randomly selected and assigned to each region. Each pixel of the same colour is considered by the segmentation algorithm as belonging to the same region. For our purposes, this has no meaning other than coloured pixels are source regions, and background pixels are shown in black.


3.1.2 Principal component analysis

We present the first 10 principal components of the background spectra in NGC 4449 in Fig. 3. Several notable features are present across the ensemble of components; most notably, the majority of sky emission lines are present in the mean spectrum, indicating that the mean spectrum is a good first-order approximation of the background. Additionally, the mean spectrum contains emission components typically associated with H ii regions or the DIG such as [N ii] λ6548, [N ii] λ6583, Hα, [S ii] λ6716, and [S ii] λ6731. This is expected since there is undoubtedly interloping DIG emission in the background spaxels due to the segmentation algorithm parameters chosen. For the purposes of this paper, it is good that the DIG is included in the background since we want to study H ii region emission and thus treat the DIG as a component of the background. If the goal was to calculate the total strong emission line flux (e.g. Hα) in a given spaxel regardless of the emission region, then this would be inconsistent. It would be better to change the segmentation algorithm such that all regions included in DIG are not included in the background. We note that this is not an easy task and would likely require the inclusion of spectral information. Alternatively, it is possible not to include the first principal component in the reconstructions; however, this assumes that the DIG emission is not present in the other components.

Figure 3.

The first 10 principal components of the background spaxels in NGC 4449 (left panel) and NGC 1275 (right panel) as identified in Fig. 2. Each spectrum represents a different component, including the mean background emission. The spectra have been normalized to unity in this figure in order to highlight the major features of each. All higher order components (meaning components above 10) contain only noise signatures and have been left out of this representation for readability. The majority of the sky-line emission shows up in the mean emission for both observations.


The first component contains additional Hα emission and additional continuum. Hα emission is the primary emission line observed in the DIG, and the DIG is omnipresent in this galaxy, so it is reasonable that it represents an important component in the background spectra. Moreover, since the coefficients of the principal components can be negative, this can reflect the absence of Hα emission in a considerable portion of background spectra. The scree plot reveals that this component explains nearly twice as much variance as any other component, thus indicating its importance (see Fig. 4). The second component shows negative flux in [N ii] λ6548, [N ii] λ6583, Hα, [S ii] λ6716, and [S ii] λ6731. This is a common signature of DIG emission (e.g. Vale Asari & Stasinska 2021). The remaining components, including those not shown here,³ primarily represent the noise signatures in the background spectra. Although the other components mainly represent noise, components 2–8 have slight contributions near 6560 Å.

Figure 4.

Scree plots for the PCA of NGC 4449 (left panel) and NGC 1275 (right panel). The scree plots show the explained variance ratio as a function of the principal component. Both scree plots show that the first two principal components explain the majority of variance in the observations; the results of a scree plot in addition to the visualization of principal components (Fig. 3) help determine the number of principal components required to accurately reproduce the background.


We keep only the first three principal components since the scree plot (Fig. 4) shows a knee at that point.

3.1.3 Reconstructed backgrounds

We present here the reconstructed background regions. We show the coefficient maps of the first three principal components over the entire interpolation region, including both background and source pixels in Fig. A1. We note that the first principal component coefficient peaks around the emission regions since it describes DIG emission; meanwhile, the second and third component coefficients are more homogeneously dispersed throughout the field (Fig. A1). In Figs A2 and A3, we show the results of using standard linear interpolation and nearest neighbour interpolation. Compared with the neural network reconstruction, the interpolated pixels show strong discontinuities and non-smooth behaviour due to the nature of the interpolation strategies.

In Fig. 5, we show the standard background versus the PCA reconstructed background for the same region. The graphic demonstrates the importance of using a local reconstructed background by highlighting the changes in line amplitudes in spectral regions where strong emission lines (e.g. Hα) are present.

Figure 5.

Here, we show the standard background versus the PCA reconstructed background for the same background region of NGC 4449. Vertical grey dashed lines indicate strong emission lines. In addition to reducing the noise in the background spectrum, the reconstructed method changes the background flux, slightly affecting the measured fluxes of strong emission lines. The emission lines present in the background represent DIG emission.


3.2 NGC 1275

In this section, we apply our methods to the SITELLE observations of the well-studied galaxy NGC 1275. NGC 1275 is the brightest cluster galaxy of the Perseus cluster; it hosts a wide range of multiwavelength astrophysical phenomena (Krabbe et al. 2000; Fabian et al. 2003, 2011; Hitomi Collaboration 2016; Gendron-Marsolais et al. 2018; Vigneron et al. 2024). In the optical bandpass, NGC 1275 hosts a large (several dozens of kpc) and asymmetric emission-line nebula. Unlike the previous test case of NGC 4449, NGC 1275 does not exhibit DIG that needs to be disentangled from the nebular emission; additionally, the background is expected to be relatively stable over the field (Gendron-Marsolais et al. 2018; Vigneron et al. 2024).

3.2.1 Segmentation map

The segmentation map is shown in Fig. 2b. We used the same segmentation algorithm to create this map but changed the σ-threshold to 0.5 to find a better contrast between the background and the nebula. This value was experimentally determined. Similar to the segmentation map of NGC 4449, background pixels are shown as black pixels. Also, all point sources are masked and thus show up as black pixels. The deep image is cropped to include only 800 < x < 1600 and 200 < y < 1200 since the rest of the observation does not contain nebular emission; the units are in pixels. Several sources of note are contained within the mask; the large circular regions, excluding that in teal centred at approximately pixel (400, 250), indicate emission from background and foreground galaxies, which should be excluded from the background model. Additionally, the nebula itself, which is made up of several segments, including the central teal segment, is included in the mask. The horizontal lines near the bottom right are saturation spikes in the SITELLE data.
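The cropping and σ-thresholding steps can be sketched with a simple stand-in. The example below is not the photutils segmentation algorithm used in this work; it is a plain NumPy threshold on a toy image, and the function name background_mask, the clipping scheme, and the toy image are all assumptions made for illustration.

```python
import numpy as np

def background_mask(deep_image, n_sigma=0.5, n_iter=5):
    """Flag pixels below median + n_sigma * std as background (True)."""
    values = deep_image[np.isfinite(deep_image)]
    # Iteratively clip bright pixels so sources do not inflate the statistics.
    for _ in range(n_iter):
        median, std = np.median(values), values.std()
        values = values[values < median + 3.0 * std]
    median, std = np.median(values), values.std()
    return deep_image < median + n_sigma * std

rng = np.random.default_rng(1)
image = rng.normal(loc=10.0, scale=1.0, size=(1300, 1700))   # toy deep image
image[600:700, 1100:1300] += 20.0                             # toy "nebula"

# Crop to the pixel ranges quoted above (rows are y, columns are x).
cropped = image[200:1200, 800:1600]
mask = background_mask(cropped, n_sigma=0.5)
print(cropped.shape, mask.mean())
```

A low threshold such as 0.5σ classifies a sizeable fraction of pure-noise pixels as source, which is conservative for a background model since it only shrinks the training set.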

3.2.2 Principal component analysis

In Fig. 3, we show the first 10 principal components ordered by explained variance (importance) for NGC 1275. Similar to the PCA results of NGC 4449, the mean emission contains the sky-lines and is a good first-order approximation of the background. Unlike in NGC 4449, the principal components do not contain features typical of strong emission lines. In the case of NGC 1275, there is no diffuse nebular emission (the DIG in NGC 4449), so the segmentation algorithm completely masks all emission associated with the nebula. Components 1 and 2 show that there is slight variation in the sky-lines across the observation and that the majority of the variance occurs near the [S ii]-doublet and on the edges of the transmission region of SN3 (at ≈ 6350 Å and ≈ 6750 Å). We note that the overall variance explained is low; however, the components after component 2 clearly show noise. Thus, we only use the first two components in our background reconstructions.

3.2.3 Reconstructed backgrounds

In Fig. 6, we show the standard background versus the PCA reconstructed background for the same background region. Due to the higher redshift of NGC 1275 compared with that of NGC 4449, the positions of the strong emission lines are shifted considerably towards longer wavelengths. This moves the Hα complex out of the forest of skylines between 650 and 660 nm, allowing for easier measurements. However, the [S ii]-doublet is shifted into the spectral bandpass occupied by skylines around 685 nm. Because of this, an accurate background model is crucial for modelling the [S ii]-doublet.
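The size of this shift can be checked with a short calculation of the observed wavelengths, λ_obs = λ_rest (1 + z). The redshift used below (z ≈ 0.0176) is an assumed approximate value for NGC 1275 that is not quoted in the text, and the rest wavelengths are approximate air values.

```python
# Approximate rest-frame air wavelengths in Angstrom for the five lines fitted in this work.
REST_WAVELENGTHS = {
    "[N II] 6548": 6548.05,
    "Halpha": 6562.80,
    "[N II] 6583": 6583.45,
    "[S II] 6716": 6716.44,
    "[S II] 6731": 6730.82,
}

def observed_wavelengths(redshift):
    """Shift each rest wavelength by the factor (1 + z)."""
    return {name: rest * (1.0 + redshift) for name, rest in REST_WAVELENGTHS.items()}

Z_NGC1275 = 0.0176   # assumed approximate redshift, not taken from the text

shifted = observed_wavelengths(Z_NGC1275)
for name, wavelength in shifted.items():
    print(f"{name}: {wavelength:.1f} A")
```

Under this assumption, Hα moves redwards of 660 nm while the [S ii]-doublet lands close to 685 nm, consistent with the description above.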

We note that the noise level in the reconstructed background is considerably lower than the noise in the standard background for both observations. This effect is most evident outside of the transmission region (e.g. between 6300 and 6450 Å). The reduction arises because we disregard the high-order eigenspectra that capture this noise. We do so to obtain a background model containing as little noise as possible, so that subtracting it does not inject additional noise into the spectrum we eventually fit. This choice comes with a potential bias to the signal. Users can trade off this spectral bias against the noise level by including more or fewer eigenspectra.
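The trade-off between the number of retained eigenspectra and the noise level can be illustrated with synthetic data. The sketch below assumes a toy background made of a single sky line of varying strength plus Gaussian noise; the function name pca_reconstruct and all numerical values are hypothetical and chosen only for the example.

```python
import numpy as np

def pca_reconstruct(spectra, n_components):
    """Rebuild spectra from the mean plus the leading n_components eigenspectra."""
    mean = spectra.mean(axis=0)
    centred = spectra - mean
    # Rows of vt are the eigenspectra, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]
    coefficients = centred @ basis.T
    return mean + coefficients @ basis

rng = np.random.default_rng(2)
n_spaxels, n_channels = 400, 150
channel = np.arange(n_channels)
sky_line = np.exp(-0.5 * ((channel - 60) / 2.0) ** 2)      # toy sky line
strength = 1.0 + 0.3 * rng.normal(size=(n_spaxels, 1))     # varies per spaxel
clean = strength * sky_line
noisy = clean + 0.05 * rng.normal(size=clean.shape)

# Scatter of (reconstruction - noiseless truth) for increasing numbers of components.
residual_noise = {}
for k in (1, 10, 100):
    residual_noise[k] = (pca_reconstruct(noisy, k) - clean).std()
print(residual_noise)
```

In this toy case, a single component already captures the real variation, so each additional eigenspectrum mainly adds noise back into the reconstruction. With real data, discarding a component that carries signal would instead introduce the bias mentioned above.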

4. DISCUSSION

In the previous section, we presented the results of our methodology. We demonstrated how the SITELLE deep image can be used to construct a segmentation map, created a principal component decomposition of the spectra belonging to the background, and presented the reconstructed backgrounds using our neural network trained on the PCA coefficients of the background spaxels. In the following section, we compare extracted flux maps using our methodology with the standard methodology in NGC 4449 and NGC 1275. We conclude with a discussion of other uses for this background methodology.

4.1 Advantages over standard methodologies

The methodology proposed here does not assume any homogeneity in the background spectra. If the background does not vary in a given cube, then the coefficients for the eigenspectra will be near zero. Any deviations from homogeneity will be encoded in spatial changes in these coefficients. This method works equally well for systems with complicated emission region complexes and systems with simple emission regions.

This section compares how the different background modelling methods affect the final calculated flux of strong emission lines in source regions. We fit our cubes using luci after subtracting a single global background using the standard methods detailed in Section 2 and compare the combined Hα and [N ii]-doublet amplitude with that calculated using our interpolated background scheme. We present example background spectra using the two methods in Figs 5 and 6 (see Section 3.1.3 for a detailed comparison).

Figure 6.

Here, we show the standard background versus the PCA reconstructed background for the same background region, comprising approximately 100 pixels, of NGC 1275. Vertical grey dashed lines indicate strong emission lines redshifted to Perseus’ global velocity.


We fit the unbinned data with a sinc function, tying the five emission lines in velocity space; the five emission lines are [N ii] λ6548, [N ii] λ6583, Hα, [S ii] λ6716, and [S ii] λ6731. We fit the cube twice separately: once using the novel background method and again using the standard background methodology. For the standard methodology, we selected a background region sufficiently far from NGC 4449⁴ (and from NGC 1275⁵) such that it does not contain emission from the galaxy.
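To make the idea of tying lines in velocity space concrete, the sketch below builds a five-line sinc model in which a single velocity sets every line centre. It is an illustration on a toy wavelength grid and does not reproduce the luci fitting interface; the function name tied_sinc_model, the amplitudes, the width parametrization, and the non-relativistic Doppler shift are assumptions.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

# Approximate rest wavelengths in Angstrom: [N II] 6548, Halpha, [N II] 6583, [S II] 6716, [S II] 6731.
REST = np.array([6548.05, 6562.80, 6583.45, 6716.44, 6730.82])

def tied_sinc_model(wavelength, amplitudes, velocity, width):
    """Sum of five sinc profiles sharing one line-of-sight velocity.

    width is the distance in Angstrom from a line peak to the first zero of its sinc.
    """
    centres = REST * (1.0 + velocity / C_KMS)        # one velocity shifts every line
    model = np.zeros_like(wavelength, dtype=float)
    for amplitude, centre in zip(amplitudes, centres):
        # np.sinc(x) is sin(pi x) / (pi x), so the first zero falls at x = 1.
        model += amplitude * np.sinc((wavelength - centre) / width)
    return model

wavelength = np.linspace(6500.0, 6800.0, 3001)       # 0.1 Angstrom sampling
amplitudes = [1.0, 5.0, 3.0, 1.5, 1.2]               # hypothetical line amplitudes
spectrum = tied_sinc_model(wavelength, amplitudes, velocity=200.0, width=2.0)
peak = wavelength[spectrum.argmax()]
print(peak)
```

Because the velocity is shared, the fit has one kinematic parameter for all five lines instead of five, which stabilizes the weaker lines against a poorly modelled background.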

In Fig. 7, we show the combined Hα and [N ii]-doublet amplitude map for the new background model (left panel), the standard method (centre), and the difference between the two maps (right panel; the difference map is new background model map minus the standard background map) for NGC 4449. The difference map reveals that, in the galaxy’s central regions, the standard background methodology overestimates the flux; however, we can see a stark change in the fits in the outer regions. The purple pixels indicate that more flux (using the amplitude as a proxy) is recovered using the new methodology. We expect this since we are now correctly modelling the background in the DIG.
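A difference map of this kind can be formed as in the minimal sketch below. The function name difference_map and the toy amplitudes are hypothetical, and applying the log-amplitude floor of −17.5 (the value quoted in the caption of Fig. 7) to the new-background map is an assumption of this example.

```python
import numpy as np

def difference_map(new_amplitude, standard_amplitude, log_floor=-17.5):
    """Return new minus standard, masking pixels fainter than 10**log_floor."""
    with np.errstate(divide="ignore", invalid="ignore"):
        faint = np.log10(new_amplitude) < log_floor
    # Also mask non-finite or non-positive amplitudes, whose logarithm is undefined.
    faint |= ~np.isfinite(new_amplitude) | (new_amplitude <= 0)
    difference = new_amplitude - standard_amplitude
    return np.where(faint, np.nan, difference)

# Toy 2 x 2 amplitude maps in arbitrary flux units.
new_map = np.array([[1e-16, 1e-18], [2e-17, 0.0]])
standard_map = np.array([[2e-16, 1e-18], [1e-17, 1e-17]])

diff = difference_map(new_map, standard_map)
print(diff)
```

With this sign convention, positive pixels are those where the new background model recovers a larger amplitude than the standard method.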

Figure 7.

Combined amplitude fit (Hα + [N ii]-doublet) for NGC 4449. The left panel shows the fit using the background method presented in this article. The middle panel shows the fit using the standard background methodology. The right panel shows the difference map between the two. Pixels with a combined log amplitude under −17.5 are masked to highlight the differences.


In Fig. 8, we show the amplitude of the combined Hα and [N ii]-doublet fit using the new background technique (left panel), the standard background method (centre), and the difference between the two (right panel) for NGC 1275. The standard background region was taken from near the nebula, but, importantly, it does not include any nebular emission or point sources. Unlike the results for NGC 4449, the difference map reveals that using the new background method, we recover slightly lower amplitudes on the perimeter of the nebula. This indicates that the standard background methodology underestimates the background emission in these areas. For both objects, we verified that the change in background modelling did not affect the velocity or velocity dispersion values.

Figure 8.

From left to right panel: combined Hα and [N ii]-complex amplitude map using the background method presented in Section 2, combined Hα and [N ii]-complex amplitude map using the standard background method, and the difference between the two maps for NGC 1275.


4.2 Potential modifications

While we have highlighted the use of this methodology on NGC 4449, it can be extended to any other galaxy in the SIGNALS catalogue. Moreover, it can be applied to other SITELLE observations. However, the interpretation of the principal components will likely change as the primary background contaminant may change (e.g. in galaxy clusters, the primary contaminant is expected to be emission from the stellar continuum rather than DIG). This method could also be adjusted to model the stellar continuum by modifying the background segmentation algorithm to only place spaxels containing stellar continuum emission in the background space. While out of the scope of this article, this topic will be explored in further studies. Furthermore, the algorithm is not SITELLE specific but can be used for any IFU data with sufficient spatial resolution to capture background emission in the spaxels (see Section 4.3).

Although we run the background detection algorithm on the deep image, in some cases, such as extreme stellar contamination that muddies the deep image, it is advantageous to use a flux or line-amplitude map of the strongest emission line instead. In doing so, we ensure that only the line-emitting regions are masked. This application is explored in other works (e.g. Hlavacek-Larrondo, in preparation; Rhea et al., in preparation). In addition to changing the image used to detect background regions, this methodology also works with different segmentation algorithms (e.g. thresholding algorithms).

In the two examples showcased in Section 3, the neural network interpolations yielded smooth reconstructions of the coefficient fields. However, users may want less smooth interpolations when high-frequency features are present in the coefficient maps. Inherently, neural fields, such as the one developed in this work, return smooth representations. In order to capture high-frequency (i.e. non-smooth) features in the interpolation, it is possible to add Fourier features to the input vector (Tancik et al. 2020); this will be explored in future work.
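For reference, a Gaussian Fourier feature mapping in the style of Tancik et al. (2020), γ(v) = [cos(2πBv), sin(2πBv)], can be sketched as follows. This is an illustration of the input encoding only and is not part of the network used in this work; the number of frequencies, the bandwidth scale, and the grid size are hypothetical choices.

```python
import numpy as np

def fourier_features(coords, frequency_matrix):
    """Map coordinates v to [cos(2 pi B v), sin(2 pi B v)].

    coords: (n_points, 2) pixel positions scaled to [0, 1].
    frequency_matrix: (n_frequencies, 2) matrix B of spatial frequencies.
    """
    projected = 2.0 * np.pi * coords @ frequency_matrix.T
    return np.concatenate([np.cos(projected), np.sin(projected)], axis=1)

rng = np.random.default_rng(3)
scale = 10.0                                    # hypothetical bandwidth to tune
B = rng.normal(scale=scale, size=(64, 2))       # Gaussian random frequencies

# Pixel coordinates of a toy 32 x 32 field, rescaled to the unit square.
yy, xx = np.mgrid[0:32, 0:32]
coords = np.stack([xx.ravel(), yy.ravel()], axis=1) / 31.0
features = fourier_features(coords, B)
print(features.shape)
```

The encoded vector would replace the raw (x, y) position as the network input; larger values of the bandwidth scale let the network represent sharper spatial changes at the cost of a noisier interpolation.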

4.3 Generalization to other instruments

While we showcased this methodology using data from SITELLE, it can be readily ported to any IFU-style data such as MUSE (Bacon et al. 2017). Since this method does not rely on certain emission lines to constrain the background emission, it is not limited to a given spectral range. In the next subsections, we explore uses in other wavelengths.

4.3.1 Optical

Due to the nature of its design, SITELLE is built to capture primarily emission lines (see Drissen et al. 2019 for details). Our methodology will work equally well for absorption lines. Therefore, it can be used for galactic and solar studies using instruments such as GMOS (Allington-Smith et al. 2002), MUSE (Bacon et al. 2017), and MEGARA (Gil de Paz et al. 2012). For example, instead of using dedicated background fibres that implicitly assume a homogeneous background, such as those in the GMOS instrument, the background can be directly sampled and modelled from the science fibres using our methodology. Moreover, the methodology can also be applied to data from optical IFU-like instruments, such as the GHαFaS instrument on the William Herschel Telescope (Hernandez et al. 2008).

4.3.2 Infrared

With the advent of the JWST, we have access to an IFU in the near-infrared (NIR) and mid-infrared bandpasses (e.g. Gardner et al. 2006; Gordon et al. 2015; Rieke et al. 2015; Böker et al. 2022; Jakobsen et al. 2022; Pontoppidan et al. 2022). Methods for modelling the background in the NIR will suffer from the same complications as optical IFUs; therefore, it is important to have a robust methodology for handling the background. Since our methodology is wavelength-independent, it can be used for JWST NIRSpec IFU and MIRI IFU observations. This methodology will complement the strategies outlined by the JWST team (e.g. Böker et al. 2022) to reduce background contamination in IFU measurements. More specifically, this method can help achieve the primary science goals. For instance, developing an accurate background model is crucial in studying faint distant galaxies. In nearby galaxies, the primary component of the background emission is expected to be emission from the host galaxy’s stellar component and, therefore, requires dedicated modelling; our methodology allows this without assuming an underlying stellar population for the host galaxy. For near-Earth objects, a complete background model is required to obtain accurate chemical abundances that will aid our understanding of our Solar system’s evolution and chemical makeup.

4.3.3 X-ray

The methodology outlined in this article applies not only to IFU data but to any detector that records a photon’s energy at each pixel; therefore, it is well suited to the ACIS instruments on the Chandra X-ray Observatory (e.g. Weisskopf et al. 2000; Weisskopf et al. 2002; Garmire et al. 2003). The ACIS instrument includes two sets of CCD detectors. When a photon falls on the detector, the CCD records the position of the photon (i.e. the pixel), the time of the event, and the measured energy of the incident photon. For diffuse targets such as galaxy clusters, the background can play a crucial role in measurements (e.g. Miller et al. 2012; George, Mushotzky & Miller 2014). Several methods have been specifically developed to model the X-ray background, but they depend on a concrete physical understanding of the mechanisms influencing background emission (e.g. Markevitch et al. 2003; Bartalucci et al. 2014). In contrast, the methodology outlined here does not assume any physical models. Moreover, this method does not assume homogeneity of the background, as many currently implemented methodologies for Chandra do (e.g. Vikhlinin et al. 2006; Cavagnolo et al. 2009; Sun et al. 2009).
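To show how such an event list maps on to the data-cube format assumed by our method, the sketch below bins events by position and energy into a cube of counts. It is a toy example with randomly generated events; the function name events_to_cube, the detector size, and the energy grid are hypothetical and do not correspond to an actual ACIS configuration or to any Chandra software.

```python
import numpy as np

def events_to_cube(x, y, energy, shape, energy_edges):
    """Bin an event list into a (ny, nx, n_energy) cube of counts."""
    ny, nx = shape
    # Pixel edges are placed half a pixel either side of each integer position.
    edges = (np.arange(ny + 1) - 0.5, np.arange(nx + 1) - 0.5, energy_edges)
    cube, _ = np.histogramdd(np.column_stack([y, x, energy]), bins=edges)
    return cube

rng = np.random.default_rng(4)
n_events = 10_000
x = rng.integers(0, 16, n_events)            # detector column of each event
y = rng.integers(0, 16, n_events)            # detector row of each event
energy = rng.uniform(0.5, 7.0, n_events)     # toy energies in keV
energy_edges = np.linspace(0.5, 7.0, 66)     # 65 energy channels

cube = events_to_cube(x, y, energy, (16, 16), energy_edges)
print(cube.shape, cube.sum())
```

Once the events are in this form, each pixel holds a spectrum, so the segmentation, PCA, and interpolation steps could in principle be applied as for an optical cube. The low counts per pixel typical of X-ray data would likely require spatial binning first.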

While this methodology can be applied to existing X-ray observatories, it is uniquely adapted to the requirements of future missions such as XRISM and Athena (Nandra et al. 2013; Barret et al. 2018; Simionescu et al. 2019; XRISM Science Team 2020). The IFUs on these missions will not only map out the complex morphologies of galaxies and galaxy clusters but also the spectra in the background regions. Due to the enhanced spectral resolution of the detectors on XRISM and Athena, the background spectra will be considerably more complicated than what is currently considered and modelled. Therefore, the methodology outlined in this paper can serve as an alternative to physical modelling while achieving a faithful background spectral model.

5. CONCLUSIONS

We present a novel strategy for modelling the background emission (both the sky and contaminant emission) for IFU-style instruments. This methodology uses a combination of image segmentation algorithms and PCA to model the background in regions where no source emission is present. We then developed an artificial neural network to interpolate the model over masked source regions.

This methodology is applied to a nearby irregular dwarf galaxy, NGC 4449, observed as part of the SIGNALS collaboration. We demonstrate the importance of using our background strategy on this target; since NGC 4449 is highly irregular and contains overlapping H ii regions and widespread diffuse ionized gas, other methodologies are not appropriate for background modelling. By comparing the recovered fluxes as computed by luci using different background methods, we demonstrate how our methodology allows us to obtain flux estimates for this system. We also repeat the experiment for NGC 1275 in the Perseus galaxy cluster, showing how the algorithm performs on a different type of object. Finally, we consider how the algorithm, or rather certain aspects of the algorithm, can be modified or substituted to apply it to other related applications.

An optimized implementation of the methodology developed in this paper is now included in the SITELLE data analysis pipeline luci. An example Jupyter notebook can be found at https://github.com/crhea93/LUCI/blob/main/Examples/BackgroundAutomatic.ipynb.

ACKNOWLEDGEMENTS

The authors would like to thank the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of Canada, the Institut National des Sciences de l’Univers of the Centre National de la Recherche Scientifique (CNRS) of France, and the University of Hawaii. The observations at the CFHT were performed with care and respect from the summit of Maunakea which is a significant cultural and historic site. CLR acknowledges financial support from the physics department of the Université de Montréal, the Mitacs scholarship program, and the IVADO doctoral excellence scholarship. JH-L acknowledges support from NSERC via the Discovery grant program, as well as the Canada Research Chair program. MP acknowledges financial support from the European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement No. 896248. LRN is grateful to the National Science Foundation NSF - 2109124 and the Natural Sciences and Engineering Research Council of Canada NSERC - RGPIN-2023-03487 for their support.

We used the following software: tensorflow (Abadi et al. 2016), keras (Chollet 2015), python (Van Rossum & Drake 2009), scipy (Virtanen et al. 2020), matplotlib (Hunter 2007), scikit-learn (Pedregosa et al. 2011), optuna (Akiba et al. 2019), photutils (Bradley et al. 2023), astropy (Robitaille et al. 2013), luci (Rhea et al. 2021), and ds9 (Joye & Mandel 2003). The version of each software package can be found in the requirements.txt file on the luci GitHub page.

DATA AVAILABILITY

All methods used in this paper are available in the crhea93/LUCI GitHub repository (https://github.com/crhea93/LUCI). The data can be accessed at https://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/.

Footnotes

¹ A representative background region is considered a region in the data cube sufficiently distant from the primary source of emission containing no point sources.

² These steps follow the standard procedure outlined at https://photutils.readthedocs.io/en/stable/segmentation.html.

³ We only plot the first 10 principal components since all other components are primarily noise features, and their addition makes the graph unreadable.

⁴ We used a circular region centred at (12:27:53.2,+44:07:59.2) and a radius of 19 arcsec.

⁵ We used a circular region centred at (3:19:42.5,+41:32:01.8) and a radius of 10 arcsec.

References

Abadi et al., 2016, preprint (arXiv)

Abdi, Williams L. J., 2010, WIREs Comput. Stat., 433
