Abstract

We obtain analytical approximations for the expectation and variance of the Spectral Kurtosis estimator in the case of Gaussian and coherent transient time domain signals mixed with a quasi-stationary Gaussian background, which are suitable for practical estimations of their signal-to-noise ratio and duty-cycle relative to the instrumental integration time. We validate these analytical approximations by means of numerical simulations and demonstrate that such estimates are affected by statistical uncertainties that, for a suitable choice of the integration time, may not exceed a few per cent. Based on these analytical results, we suggest a multiscale Spectral Kurtosis spectrometer design optimized for real-time detection of transient signals, automatic discrimination based on their statistical signature, and measurement of their properties.

1 INTRODUCTION

The Spectral Kurtosis estimator (⁠|$\widehat{SK}$|⁠) was originally proposed by Nita et al. (2007) as a statistical tool for real-time radio frequency interference (RFI) detection and excision in a Fast Fourier Transform (FFT) radio spectrograph. The first-ever hardware implementation of an |$\widehat{SK}$| spectrograph, the Korean Solar Radio Burst Locator (KSRBL; Dou et al. 2009), provided comprehensive experimental data that validated its theoretically expected performance (Gary, Liu & Nita 2010).

The |$\widehat{SK}$| estimator is a higher order unbiased statistical estimator associated with an accumulated Power Spectral Density (PSD), which is defined as (Nita & Gary 2010a,b),
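|$\widehat{SK}=\frac{M+1}{M-1}\left(\frac{M S_2}{S_1^2}-1\right),$| (1)
where, at each frequency bin |$f_k$| (k = 1, …, N/2),
|$S_1=\sum _{i=1}^{M}P_i, \qquad S_2=\sum _{i=1}^{M}P_i^2$| (2)
are sums taken over M consecutive raw FFT PSD estimates |$P_i$| and, respectively, their squares.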
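As a minimal illustration of this definition, the Python sketch below computes the estimator from the accumulations S1 and S2, assuming the standard form |$\widehat{SK}=\frac{M+1}{M-1}(M S_2/S_1^2-1)$| of Nita & Gary (2010a). The function name `sk_estimator`, the mean power `mu`, and the sample sizes are illustrative choices and are not taken from the text. The raw PSD estimates of a Gaussian time domain signal are modelled as exponentially distributed random variables.

```python
import random

def sk_estimator(psd):
    """Return the SK estimate for M raw PSD estimates of a single frequency bin."""
    m = len(psd)
    s1 = sum(psd)                  # S1: accumulated power
    s2 = sum(p * p for p in psd)   # S2: accumulated squared power
    return (m + 1) / (m - 1) * (m * s2 / s1**2 - 1)

# A Gaussian time domain signal yields exponentially distributed raw PSD
# estimates; the mean power mu is arbitrary and should not affect SK.
rng = random.Random(0)
mu = 3.0  # illustrative value
sk_values = [sk_estimator([rng.expovariate(1 / mu) for _ in range(512)])
             for _ in range(1000)]
mean_sk = sum(sk_values) / len(sk_values)
print(f"mean SK over {len(sk_values)} accumulations: {mean_sk:.3f}")
```

The sample mean should lie close to unity regardless of `mu`, while a constant-power input (no fluctuations at all) gives an estimate of zero.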

A remarkable property of the |$\widehat{SK}$| estimator is that, in the case of a pure Gaussian time domain signal, its statistical expectation is unity at each frequency bin, while the power spectrum may have an arbitrary spectral shape. This property gives the |$\widehat{SK}$| estimator the ability to discriminate signals deviating from Gaussian time domain statistics against arbitrarily shaped astronomical backgrounds, as is usually the case for the man-made signals producing unwanted RFI contamination of the astronomical signals of interest. Nevertheless, as demonstrated by Nita et al. (2007), the |$\widehat{SK}$| estimator may be equally sensitive to narrow band astronomical transient signals such as radio spikes, which might be mistakenly flagged by a blind RFI excision algorithm assuming that all |$\widehat{SK}$| values deviating from unity should be excised from the astronomical signal of interest. However, as demonstrated here, rather than being a limitation of its practical applicability, this particular sensitivity of the |$\widehat{SK}$| estimator to transient signals may be exploited not only to detect them, but also to quantitatively characterize their properties. For this purpose, we analyse the statistical properties of two special classes of transients, Gaussian and coherent, and obtain analytical expressions for the expected mean and variance of their associated |$\widehat{SK}$| estimators as functions of their effective durations and signal-to-noise ratios.

2 STATISTICS OF GAUSSIAN TRANSIENTS

In this section, we analyse the statistical properties of the |$\widehat{SK}$| estimator in the case of a narrow band Gaussian transient time domain signal mixed with a quasi-stationary Gaussian background. In Section 2.1, we provide an overview of the results previously reported by Nita & Gary (2010a) regarding the expected mean and variance of the |$\widehat{SK}$| estimator associated with a quasi-stationary time domain Gaussian signal. In Section 2.2, we generalize these results and provide an analytical expression for the expectation of the |$\widehat{SK}$| estimator in the case of a transient Gaussian signal mixed with a Gaussian time domain background, which, in the M ≫ 1 limit, reduces to the M ≫ 1 limit of an analytical expression previously reported by Nita et al. (2007). In addition, we obtain an analytical expression for the variance of the |$\widehat{SK}$| estimator under the same conditions. We validate these theoretical expectations by means of numerical simulations.

2.1 Quasi-stationary Gaussian signals

As shown by Nita & Gary (2010a), in the case of a quasi-stationary time domain signal obeying Gaussian statistics, the statistical expectations for the FFT-derived nth-powers of the sums S1 and S2 defined by equation (2) are given by
(3)
where μ represents the mean power of the quasi-stationary Gaussian background at the particular frequency bin considered.
In particular, equation (3) provides the expectations
(4)
and
(5)
which are needed to compute the expected mean and variance of the |$\widehat{SK}$| estimator. From equation (1), it immediately follows that
(6)
and
(7)
where we made use of the non-trivial identity
(8)
which holds because, in the case of a normally distributed time domain signal, |$S_2/S_1^2$| and |$S_1^2$| are uncorrelated random variables (Nita & Gary 2010a).
Hence, by plugging in the expressions provided by equations (4) and (5), and writing down |$\sigma ^2_{\widehat{SK}}=E [\widehat{SK}^2 ]-E [\widehat{SK} ]^2$|, it immediately follows that |$E [\widehat{SK} ]=1$| and
|$\sigma ^2_{\widehat{SK}}=\frac{4M^2}{(M-1)(M+2)(M+3)}\simeq \frac{4}{M},$| (9)
where the approximation is valid for accumulation lengths much larger than unity.
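The unit expectation and the approximately 4/M variance can be checked numerically. The Python sketch below compares the sample variance of simulated |$\widehat{SK}$| values with the closed form |$4M^2/[(M-1)(M+2)(M+3)]$|, which is assumed here to be the exact variance reported by Nita & Gary (2010a). The accumulation length, the number of trials, and all function names are illustrative choices.

```python
import random
import statistics

def sk_estimator(psd):
    """SK estimate from M raw PSD estimates (assumed standard definition)."""
    m = len(psd)
    s1 = sum(psd)
    s2 = sum(p * p for p in psd)
    return (m + 1) / (m - 1) * (m * s2 / s1**2 - 1)

def sk_variance_gaussian(m):
    """Assumed closed-form SK variance for a quasi-stationary Gaussian signal."""
    return 4 * m**2 / ((m - 1) * (m + 2) * (m + 3))

rng = random.Random(1)
M = 64  # illustrative accumulation length
samples = [sk_estimator([rng.expovariate(1.0) for _ in range(M)])
           for _ in range(5000)]
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)
print(f"mean = {sample_mean:.3f}")
print(f"sample variance = {sample_var:.4f}, "
      f"closed form = {sk_variance_gaussian(M):.4f}, 4/M = {4 / M:.4f}")
```

For small M the closed form and the 4/M approximation differ noticeably (by about 6 per cent at M = 64), and the two converge as M grows.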

2.2 Gaussian transients

To investigate how the |$\widehat{SK}$| estimator is expected to deviate from unity in the case of a quasi-stationary Gaussian background mixed with a Gaussian time domain signal lasting shorter than the accumulation time, we adopt the model originally considered by Nita et al. (2007), in which the mean spectral power of the transient signal is characterized by a signal-to-noise ratio ρ relative to the mean power of the quasi-stationary background, μ, and the transient signal is considered to be effectively present only in an integer fraction δM of the M raw PSD estimates contributing to the accumulations S1 and S2. Hence, taking into consideration that S1 and S2 are sums of a series of uncorrelated random variables, a simple binomial expansion leads to
(10)
and
(11)
which are linear combinations of the expectations given by equations (4) and (5).
Although the identities given by equation (8) no longer hold exactly in the case of Gaussian transients, we assume that the expressions given by equations (6) and (7) can still serve as biased estimators of the first and second moments of |$\widehat{SK}$|. Consequently, after a few algebraic manipulations of equation (6), the biased |$\widehat{SK^{\ast }}$| estimator corresponding to the adopted Gaussian transient model can be written as
(12)
which, for accumulation lengths M ≫ 1, reduces to
|$\widehat{SK^{\ast }}\simeq 1+\frac{2\delta (1-\delta )\rho ^2}{(1+\delta \rho )^2}.$| (13)
As a first self-consistency check of the adopted |$\widehat{SK^{\ast }}$| approximation, we note that, for any ρ, equations (12) and (13) reduce to unity when δ = 0 or δ = 1, i.e. the cases of no transient signal or, respectively, the mixture of two quasi-stationary Gaussian signals. Therefore, |$\widehat{SK^{\ast }}$| produces unbiased estimates at both ends of the duty-cycle range. For all other cases, |$\widehat{SK^{\ast }}$| deviates from unity and reaches a maximum at δ = 1/(2 + ρ), i.e.
|$\widehat{SK^{\ast }}_{\rm max}\simeq 1+\frac{\rho ^2}{2(1+\rho )}.$| (14)
Similarly, from equation (7), we obtain an approximation for the variance of |$\widehat{SK}$| that, for M ≫ 1, reduces to
(15)
For δ = 0 and 1, equation (15) reduces to the expressions given by equation (9), while in between these extremes, and for a fixed SNR, |$\sigma ^2_{\widehat{SK^{\ast }}}$| has a single-peaked duty-cycle dependence.
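The duty-cycle dependence described above can be explored with a short Python sketch. It assumes that the M ≫ 1 limit of the estimator takes the form |$1+2\delta (1-\delta )\rho ^2/(1+\delta \rho )^2$|, which we derive here from the first two moments of the adopted mixture model (exponentially distributed PSD estimates with mean μ in the transient-free fraction and μ(1 + ρ) in the remaining fraction δ). This form reproduces the limits and the peak location quoted in the text, but it is a reconstruction rather than a transcription of equation (13); the function name and the grid resolution are illustrative.

```python
def sk_star_gaussian(delta, rho):
    """Assumed M >> 1 limit of SK* for a Gaussian transient (duty-cycle delta, SNR rho)."""
    return 1 + 2 * delta * (1 - delta) * rho**2 / (1 + delta * rho)**2

rho = 5.0
grid = [i / 10000 for i in range(10001)]           # duty-cycles from 0 to 1
peak_delta = max(grid, key=lambda d: sk_star_gaussian(d, rho))
peak_value = sk_star_gaussian(peak_delta, rho)

print(f"numerical peak at delta = {peak_delta:.4f}, "
      f"expected 1/(2 + rho) = {1 / (2 + rho):.4f}")
print(f"peak value = {peak_value:.4f}, "
      f"expected 1 + rho^2/(2(1 + rho)) = {1 + rho**2 / (2 * (1 + rho)):.4f}")
```

For ρ = 5 the peak therefore falls at a duty-cycle of about 14 per cent, well below the midpoint of the duty-cycle range.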

Fig. 1 displays, for a fixed SNR (ρ ≡ 5), the duty-cycle dependence of the |$\widehat{SK^{\ast }}$| approximation (solid lines), as well as the |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| fluctuation ranges (dashed lines), for two selected accumulation lengths, M = 97 (panel a) and M = 9766 (panel b), which correspond to the accumulation lengths of two short-lived prototype instruments equipped with |$\widehat{SK}$| capabilities, the Frequency Agile Solar Radiotelescope Subsystem Testbed (FST; Liu et al. 2007) and the Expanded Owens Valley Solar Array Subsystem Testbed (EST; Gary, Nita & Sane 2012). We compare these theoretical expectations with Monte Carlo simulations of the |$\widehat{SK}(\delta ,\rho \equiv 5)$| distributions (scattered points) that were obtained by generating 1000 |$\widehat{SK}$| values for each of the duty-cycle values ranging from 0 to 100 per cent. This comparison demonstrates an overall good agreement of the δ-dependent mean of the simulated |$\widehat{SK}$| distributions, |$\langle \widehat{SK}\rangle$|⁠, (cross symbols), with the |$\widehat{SK^{\ast }}$| predictions (solid lines), which are affected by a δ-dependent bias that reaches its maximum in the vicinity of the |$\widehat{SK^{\ast }}$| peak, but significantly decreases as the accumulation length M increases.
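A Monte Carlo experiment of the kind described above can be sketched in Python as follows. The sketch assumes exponentially distributed raw PSD estimates, with mean μ for the background alone and μ(1 + ρ) while the transient is present, and it compares the sample mean with the assumed M ≫ 1 form |$1+2\delta (1-\delta )\rho ^2/(1+\delta \rho )^2$|. The accumulation length M = 1000, the number of trials, and all names are illustrative and differ from the values used for Fig. 1.

```python
import random
import statistics

def sk_estimator(psd):
    """SK estimate from M raw PSD estimates (assumed standard definition)."""
    m = len(psd)
    s1 = sum(psd)
    s2 = sum(p * p for p in psd)
    return (m + 1) / (m - 1) * (m * s2 / s1**2 - 1)

def simulate_sk(m, delta, rho, n_trials, rng, mu=1.0):
    """Simulate SK values for a Gaussian transient present in round(delta*m) estimates."""
    n_on = round(delta * m)
    values = []
    for _ in range(n_trials):
        psd = [rng.expovariate(1 / (mu * (1 + rho))) for _ in range(n_on)]
        psd += [rng.expovariate(1 / mu) for _ in range(m - n_on)]
        values.append(sk_estimator(psd))
    return values

rng = random.Random(2)
M, delta, rho = 1000, 0.2, 5.0
sk_values = simulate_sk(M, delta, rho, 500, rng)
sk_mean = statistics.fmean(sk_values)
sk_star = 1 + 2 * delta * (1 - delta) * rho**2 / (1 + delta * rho)**2
print(f"<SK> = {sk_mean:.3f}, SK* = {sk_star:.3f}, "
      f"relative SDEV = {statistics.stdev(sk_values) / sk_mean:.3f}")
```

Repeating the run for smaller M should make the bias of the approximation relative to the sample mean more visible, in line with the trend reported for M = 97.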

Figure 1.

|$\widehat{SK^{\ast }}$| estimator (solid lines) as function of a Gaussian transient duty-cycle for a signal-to-noise ratio ρ = 5 and two accumulation lengths, M = 97 (panel a) and M = 9766 (panel b). The symmetric |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| limits provided by equation (15) are overlaid (dashed lines) on top of the corresponding numerically simulated distributions (point symbols). The SDEV-corrected 68.27 per cent probability ranges, |$\widehat{SK^{\ast }}\pm \gamma \sigma _{\widehat{SK^{\ast }}}$|, are indicated by the dot–dashed lines. The square symbols indicate the sample SDEV around the mean of the simulated |$\widehat{SK}$| distribution (plus symbols). The Pearson Type IV asymmetric detection thresholds, which correspond to standard 0.13 per cent probabilities of false alarm on each side of the unity |$\widehat{SK}$| expectation (dotted lines) if no transient emission was present, are indicated by horizontal thin lines. The SDEV correction factors γ, the maximum values reached by the duty-cycle dependent relative SDEV ϵmax, and the corresponding maximum relative bias of the |$\widehat{SK}$| estimator, βmax, are indicated in each figure inset. Panel (c) displays the percentage of |$\widehat{SK}$|-detected transients as function of their duty-cycle, for M = 97 (thin line) and M = 9766 (thick line).

Figs 1a and b insets indicate the maximum observed relative bias, |$\beta \equiv \widehat{SK^{\ast }}/\langle \widehat{SK}\rangle -1$|, i.e. βmax = 11.0 per cent for M = 97 and βmax = 0.2 per cent for M = 9766, which we compare with the sample maximum relative standard deviations (SDEV), |$\epsilon \equiv \sqrt{\langle \widehat{SK}^2\rangle /\langle \widehat{SK}\rangle ^2-1}$|, i.e. ϵmax = 56.6 per cent and ϵmax = 6.5 per cent, respectively. This comparison reveals that not only is the maximum |$\widehat{SK^{\ast }}$| bias smaller than the absolute relative SDEV ϵ, but also, for any duty-cycle, it is contained within the sample SDEV range |$\langle \widehat{SK}\rangle (1\pm \epsilon )$| (square symbols). Therefore, we may conclude that the analytical expression given by equation (13), used as a biased estimator for the true mean of the |$\widehat{SK}$| distribution, may be accurate enough for practical applications, especially those involving large accumulation lengths M. Nevertheless, despite being consistent with the large scatter of the simulated |$\widehat{SK}$| distribution, the |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| ranges (dashed lines) appear to largely overestimate, especially for small values of M, the true variance of the |$\widehat{SK}$| distribution, as estimated by the sample SDEV range indicated by the square symbols.
To quantify this discrepancy, we calculate the percentages of the |$\widehat{SK}$| values lying outside the |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| range, p = 15.3 per cent for M = 97 and p = 22.2 per cent for M = 9766, which show that, from a practical perspective, the analytical expression given by equation (15) provides a non-standard fluctuation interval that has a larger confidence level of not being crossed by a random sample than a standard ±1σ interval that, in the case of a normal distribution, would be expected to leave 31.73 per cent of the sample points scattered outside its bounds. We note, however, that the percentages of |$\widehat{SK}$| values lying outside the |$\langle \widehat{SK}\rangle (1\pm \epsilon )$| SDEV ranges, p = 26.9 per cent for M = 97 and p = 31.5 per cent for M = 9766, also correspond to higher than standard confidence levels. This behaviour is consistent with the positive skewness of the |$\widehat{SK}$| estimator PDF, which proved to be a non-negligible effect when |$1\pm 3\sigma _{\widehat{SK}}$| RFI detection thresholds were experimentally tested in the first implementation of an |$\widehat{SK}$| spectrometer (Gary et al. 2010).

Having available only the biased approximations of the first two moments of the true |$\widehat{SK}$| distribution associated with a Gaussian transient, |$\widehat{SK^{\ast }}$| and |$\sigma _{\widehat{SK^{\ast }}}^2$|, we can obtain neither a Pearson Type IV PDF analytical approximation, nor an alternative three-moment based Pearson Type III approximation that could be accurate enough for practical applications (Nita & Gary 2010b). Nevertheless, for the practical purpose of estimating a particular pair {δ, ρ} by means of |$\widehat{SK}$| measurements, one can instead use post facto Monte Carlo simulations such as those illustrated here to estimate the true confidence level corresponding to the particular |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| realization. However, to facilitate the use of the mathematically convenient propagation of errors formalism for translating the |$\widehat{SK}$| statistical fluctuations estimated by means of Monte Carlo simulations into formal SDEV |$\{\sigma _\delta ,\sigma _\rho \}$|, we propose the use of an empirical SDEV correction factor, γ, which would produce an equivalent SDEV range |$\widehat{SK^{\ast }}\pm \gamma \sigma _{\widehat{SK^{\ast }}}$| that would leave outside its bounds 31.73 per cent of the simulated |$\widehat{SK}$| random variables. Fig. 1 indicates such SDEV corrections (dot–dashed lines) that, for γ = 0.7013 and γ = 0.8093, result in leaving outside their corresponding ranges p = 31.7 and p = 32.1 per cent of the scattered random variables, for M = 97 and M = 9766, respectively.
We note that, as M increases, not only does the bias of the |$\widehat{SK^{\ast }}$| approximation become practically negligible, but so do the differences between the |$\langle \widehat{SK}\rangle (1\pm \epsilon )$|, |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$|, and |$\widehat{SK^{\ast }}\pm \gamma \sigma _{\widehat{SK^{\ast }}}$| ranges, which indicates that, as the |$\widehat{SK}$| distribution approaches normality, the |$\widehat{SK^{\ast }}$| and |$\sigma _{\widehat{SK^{\ast }}}^2$| approximations become asymptotically unbiased. We thus conclude that equations (13) and (15) provide approximations suitable for accurate estimations of the characteristics of Gaussian transients.
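The text does not spell out how the correction factor γ is computed from the simulations. One plausible procedure, sketched below in Python, takes γ as the empirical 68.27 per cent quantile of the absolute deviations of the simulated values from the analytical centre, expressed in units of the analytical SDEV. The function name, its arguments, and the use of normally distributed test samples are assumptions made only for this illustration.

```python
import math
import random

def sdev_correction_factor(samples, center, sigma, coverage=0.6827):
    """Empirical factor g such that center +/- g*sigma contains `coverage` of samples."""
    devs = sorted(abs(s - center) / sigma for s in samples)
    idx = max(0, math.ceil(coverage * len(devs)) - 1)
    return devs[idx]

# Sanity check on normal samples, for which the factor should be close to 1.
rng = random.Random(3)
normal_samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
gamma = sdev_correction_factor(normal_samples, center=0.0, sigma=1.0)
outside = sum(abs(s) > gamma for s in normal_samples) / len(normal_samples)
print(f"gamma = {gamma:.3f}, fraction outside = {outside:.4f}")
```

Applied to simulated |$\widehat{SK}$| values with `center` and `sigma` set to the analytical approximations, a factor below unity would indicate that the analytical SDEV overestimates the standard-equivalent fluctuation range, as reported for both accumulation lengths.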

To illustrate the expected performance of the |$\widehat{SK}$| estimator in detecting Gaussian transients, we show in Figs 1a and b (horizontal lines) the Pearson Type IV asymmetric detection thresholds (Nita & Gary 2010a) corresponding to the standard 0.13 per cent PFA on each side of the unity |$\widehat{SK}$| expectation, i.e. |$1_{-0.439}^{+0.903}$| for M = 97 and |$1_{-0.058}^{+0.063}$| for M = 9766, and in Fig. 1c, we display the corresponding percentages of detected transients as function of their duty-cycles (thin line and thick line, respectively). Fig. 1c reveals that, for a given SNR, the |$\widehat{SK}$| detection performance may significantly vary with the transient duty-cycle, as the direct result of the duty-cycle dependence of the |$\widehat{SK}$| estimator and its statistical fluctuations, as seen in panels (a) and (b). However, for large accumulation lengths, as is the case of M = 9766, the detection performance is a flat 100 per cent, except for narrow ranges close to both ends of the [0–100] per cent duty-cycle interval.
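The detection test described above amounts to flagging any |$\widehat{SK}$| value that falls outside the asymmetric threshold interval. The Python sketch below applies the M = 97 thresholds quoted in the text (1 − 0.439 and 1 + 0.903) to simulated Gaussian transients with ρ = 5. The simulation model (exponentially distributed PSD estimates), the selected duty-cycles, the number of trials, and all names are illustrative assumptions, and the thresholds are taken as given rather than recomputed from the Pearson Type IV PDF.

```python
import random

def sk_estimator(psd):
    """SK estimate from M raw PSD estimates (assumed standard definition)."""
    m = len(psd)
    s1 = sum(psd)
    s2 = sum(p * p for p in psd)
    return (m + 1) / (m - 1) * (m * s2 / s1**2 - 1)

def detected_fraction(m, delta, rho, lower, upper, n_trials, rng):
    """Fraction of simulated accumulations whose SK falls outside [lower, upper]."""
    n_on = round(delta * m)
    hits = 0
    for _ in range(n_trials):
        psd = [rng.expovariate(1 / (1 + rho)) for _ in range(n_on)]
        psd += [rng.expovariate(1.0) for _ in range(m - n_on)]
        sk = sk_estimator(psd)
        hits += (sk < lower) or (sk > upper)
    return hits / n_trials

rng = random.Random(4)
M, rho = 97, 5.0
lower, upper = 1 - 0.439, 1 + 0.903     # thresholds quoted in the text for M = 97
false_alarm = detected_fraction(M, 0.0, rho, lower, upper, 2000, rng)
detection = detected_fraction(M, 0.15, rho, lower, upper, 2000, rng)
print(f"flagged without transient: {100 * false_alarm:.2f} per cent")
print(f"flagged at 15 per cent duty-cycle: {100 * detection:.1f} per cent")
```

Scanning `delta` over the full duty-cycle range would trace a detection curve of the kind shown in Fig. 1c.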

The overall transient detection performance of the |$\widehat{SK}$| estimator is expected to increase as the signal-to-noise ratio increases, as indicated by Fig. 2, which displays the duty-cycle variation of the |$\widehat{SK^{\ast }}$| estimator (thick lines), and its expected |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| fluctuations (thin lines), for M = 97 (panel a), M = 9766 (panel b), and three selected signal-to-noise ratios, ρ = 5, 7, and 10 (solid, dashed, and dot–dashed lines, respectively). However, as shown by Fig. 2a, the |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| ranges corresponding to different signal-to-noise ratios may overlap, which could result in large uncertainties of the SNR and duty-cycle estimates obtained from |$\widehat{SK}$| measurements. Although, as suggested by Fig. 2b, such uncertainties are expected to decrease as the accumulation length M increases, a more quantitative investigation is called for, which is presented in the next section.

Figure 2.

Gaussian transient duty-cycle variation of the |$\widehat{SK^{\ast }}$| estimator (thick lines) and its expected |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| fluctuations (thin lines) for M = 97 (panel a) and M = 9766 (panel b), and three selected signal-to-noise ratios, ρ = 5, 7, and 10 (solid, dashed, and dot–dashed lines, respectively).

2.3 |$\widehat{SK}$| measurements of Gaussian transients

To demonstrate the ability of an |$\widehat{SK}$| spectrometer to measure Gaussian transients, we consider the favourable case of a Gaussian transient fully contained within the bounds of a single accumulation characterized by the accumulated power |$S_1(t_i)$| and preceded by a transient-free accumulation |$S_1(t_{i-1})$|. Based on such two consecutive power measurements, one may define the apparent signal-to-noise ratio
|$\eta \equiv \frac{S_1(t_i)}{S_1(t_{i-1})}-1,$| (16)
which simply relates to the true SNR and duty-cycle of the transient as η = δρ. Hence, from equation (13), it immediately follows that
(17)
where all magnitudes entering the expressions of δ and, subsequently, ρ depend only on quantities directly measured by the |$\widehat{SK}$| spectrometer.
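The inversion described above can be sketched in Python under two stated assumptions: that the apparent SNR is |$\eta =S_1(t_i)/S_1(t_{i-1})-1$|, and that the M ≫ 1 limit of the estimator is |$1+2\delta (1-\delta )\rho ^2/(1+\delta \rho )^2$|. Substituting ρ = η/δ and solving for δ gives |$\delta =2\eta ^2/[(\widehat{SK}-1)(1+\eta )^2+2\eta ^2]$|. This is our own algebra and may differ in form from equation (17); all function names are illustrative.

```python
def apparent_snr(s1_current, s1_previous):
    """Apparent SNR from the transient accumulation and a transient-free one."""
    return s1_current / s1_previous - 1

def estimate_transient(eta, sk):
    """Return (delta, rho) from the apparent SNR eta and a measured SK value.

    Assumes SK = 1 + 2 d (1 - d) rho^2 / (1 + d rho)^2 with eta = d * rho.
    """
    delta = 2 * eta**2 / ((sk - 1) * (1 + eta)**2 + 2 * eta**2)
    return delta, eta / delta

# Round-trip check with noiseless expectations for delta = 0.25, rho = 5.
true_delta, true_rho = 0.25, 5.0
eta = apparent_snr(1 + true_delta * true_rho, 1.0)   # E[S1] ratio, common factor M*mu dropped
sk = 1 + 2 * true_delta * (1 - true_delta) * true_rho**2 / (1 + true_delta * true_rho)**2
delta_est, rho_est = estimate_transient(eta, sk)
print(f"delta = {delta_est:.4f}, rho = {rho_est:.4f}")
```

With measured rather than expected inputs, the estimates inherit the statistical fluctuations of both accumulations, which is what the error-propagation analysis that follows quantifies.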
Using the general error-propagation formula (Bevington & Robinson 1992)
(18)
the experimental uncertainties of such estimates can be expressed as
(19)
with |$\sigma _{\widehat{SK}}^2$| being provided by equation (15), (optionally scaled by an SDEV correction factor of γ obtained from Monte Carlo simulations involving the estimated parameters δ and ρ), and
(20)
where the variances |$\sigma ^2_{S_1}$| can be straightforwardly expressed in terms of the expectations E[S1] and |$E[S_1^2]$|⁠, which are provided by equation (10) as
(21)
Hence, taking into account that |$\sigma ^2_{S_1}=E[S_1^2]-E[S_1]^2$| and ρ(ti−1) = 0, equation (20) reduces to
(22)
Therefore, in the case of an observed Gaussian transient, equations (1), (15)–(19) and (22) provide the theoretical means for estimating its signal-to-noise ratio and duty-cycle, as well as the corresponding statistical uncertainties, in terms of the S1 and S2 measurements provided by the |$\widehat{SK}$| spectrometer.

To investigate the accuracy of the estimations provided by this method, we display in Fig. 3 the results obtained in the case of the Monte Carlo simulations used to generate the |$\widehat{SK}$| distributions shown in Fig. 1.

Figure 3.

Duty-cycle distributions of the relative errors of the duty-cycle estimates (panels a and b, M = 97 and M = 9766, respectively) and SNR estimates (panels c and d, M = 97 and M = 9766, respectively) obtained from the same set of Gaussian transient simulations used to generate the |$\widehat{SK}$| distributions shown in Fig. 1. The estimates obtained from transients flagged by the 0.13 per cent PFA |$\widehat{SK}$| detection thresholds are indicated by black plus symbols, and the estimates obtained from not flagged |$\widehat{SK}$| simulations are shown as grey symbols. The pairs of solid curves in each panel indicate the range of the expected standard-equivalent ±1σ statistical fluctuations, as derived from equation (19) in which the known true values δtrue and ρtrue have been entered, and the correction factors γ = 0.7013 and γ = 0.8093, for M = 97 and M = 9766, respectively, have been applied to the corresponding |$\widehat{SK}(\delta _{{\rm true}},\rho _{{\rm true}})$| expected fluctuations, as provided by equation (15). The square symbols shown in each panel indicate the true confidence levels of the ±1σ intervals, which are compared with a standard 68.27 per cent confidence level (horizontal lines). Due to much smaller fluctuations affecting the M = 9766 estimates, different scales are used to display on the same plots the relative errors and confidence levels in panels (c) and (d).

Panels 3a and 3b display (plus symbols), for M = 97 and M = 9766, respectively, the relative errors of the duty-cycle estimates versus the true duty cycles of the simulated transients. The pairs of solid curves shown in both panels indicate the expected |$\pm \sigma _{\delta _{{\rm true}}}$| range, as inferred from equation (19), which appears to be in good qualitative agreement with the trend of the observed fluctuations of the duty-cycle estimates. To help quantify this comparison, the square symbols indicate, for each simulated duty-cycle, the percentages of duty-cycle estimates lying within these ranges, i.e. the confidence levels of the |$\delta \pm \sigma _\delta$| intervals. This comparison shows that, for all duty-cycles, these confidence levels remain close to the 68.27 per cent SDEV confidence level we intended to achieve by applying the SDEV correction factors to the |$\sigma _{\widehat{SK^{\ast }}}^2$| variances estimated by equation (15), i.e. γ = 0.7013 for M = 97 and γ = 0.8093 for M = 9766. However, the duty-cycle trend they follow indicates a systematic overestimation of the true standard-equivalent |$\sigma _\delta$| fluctuations for δ ≲ 50 per cent, and a systematic underestimation for larger duty-cycles. Similarly, panels 3c and 3d display the relative errors of the SNR estimates, their expected |$\pm \sigma _{\rho _{{\rm true}}}$| intervals, as inferred from equation (19), and the confidence levels of these intervals, which appear to have a duty-cycle dependence that is systematically higher than a standard 68.27 per cent confidence level. Therefore, for any duty-cycle, equation (19) systematically overestimates the true standard-equivalent fluctuations of the ρ estimates.

Fig. 3 demonstrates that the statistical fluctuations of both δ and ρ estimates, which appear to be symmetrically distributed around the true parameter values, significantly decrease as the duty-cycle increases from zero to about 50 per cent, and continue to decrease at a slower pace as the duty-cycle approaches 100 per cent. Remarkably, these fluctuations decrease by one order of magnitude as the accumulation length increases from M = 97 to 9766.

However, we note that all of the above conclusions are based on the estimates obtained from all simulated Gaussian transients, disregarding whether or not they were actually flagged as such by the 0.13 per cent PFA detection thresholds shown in Fig. 1. To make this distinction, we use black symbols to display the estimates obtained from |$\widehat{SK}$| flagged measurements, and grey symbols for those obtained from transients that would remain undetected in a real-life experiment. The ratio of black symbols to n = 1000, which is the total number of simulated transients having the same duty-cycle, follows, in each case, the duty-cycle dependence of the |$\widehat{SK}$| detection performance shown in Fig. 1c. From this perspective, we have to make the cautionary note that, although any individual |$\widehat{SK}$| measurement may be affected by practically small statistical uncertainties, the sample means of the δ and ρ estimates corresponding to a group of detected Gaussian transients characterized by the same true SNR and duty-cycle may be statistically biased. Nevertheless, as demonstrated by the panels corresponding to M = 9766, this detection-induced statistical bias may be significantly reduced by increasing the accumulation length, or, at the cost of larger probabilities of false alarm, by lowering the detection thresholds for short accumulation lengths.

We thus conclude that the results displayed in Fig. 3 clearly demonstrate the ability of the |$\widehat{SK}$|–based measurement method presented in this section to provide estimates of the true parameters that, for sufficiently large accumulation lengths, may be affected by standard-equivalent statistical fluctuations not larger than a few per cent.

3 STATISTICS OF COHERENT TRANSIENTS

In this section, we analyse the statistical properties of the |$\widehat{SK}$| estimator in the case of a coherent transient time domain signal mixed with a quasi-stationary Gaussian background. In Section 3.1, we obtain a generally valid expression for the |$\widehat{SK}$| estimator that, in the limit M ≫ 1, reduces to the M ≫ 1 approximation of an expression previously reported by Nita et al. (2007), and we also obtain a first order approximation for the variance of this estimator. In Section 3.2, we generalize these results and provide analytical expressions for the |$\widehat{SK}$| estimator and its variance in the case of a transient coherent signal mixed with a Gaussian time domain background. We validate these analytical expectations by means of numerical simulations.

3.1 Quasi-stationary coherent signals

To determine the statistical properties of the |$\widehat{SK}$| estimator associated with coherent signals mixed with a Gaussian background, we follow the same framework employed in Section 2.2 for the case of Gaussian transients, with the only difference being that the underlying probability distribution function of the raw FFT-derived PSD estimate P is in this case given by (McDonough & Whalen 1995)
(23)
where |$\sigma ^2$| represents the variance of a time domain Gaussian background, A is the amplitude of a mixed time domain sinusoidal signal, and |$I_\alpha$| is the modified Bessel function of the first kind.
Taking into consideration that, at each frequency bin, the mean spectral power of the time domain Gaussian background relates to the time domain variance |$\sigma ^2$| as |$\mu =2\sigma ^2$| (Nita et al. 2007), and defining the signal-to-noise ratio of the mixed coherent signal as |$\rho =A^2/2\sigma ^2$|, equation (23) may be rewritten in terms of the normalized random variable x = 2P/μ as a non-central chi-square distribution with k = 2 degrees of freedom and non-centrality parameter λ = 2ρ, i.e. |$\chi ^2_{{\rm pdf}}(x,2,2\rho )$|, where
(24)
Using the linearity property of the expectation operator, and assuming statistical independence of the time series samples, the PDF given by equation (24) provides
(25)
To compute the expectation |$E(S_1^{2})$|⁠, which is needed to evaluate the expectation of the |$\widehat{SK}$| estimator, we derive the probability distribution of a sum of M independent |$\chi ^2_{pdf}(x,2,2\rho )$|–distributed random variables, which is |$\chi ^2_{pdf}(x,2 M,2\rho M)$|⁠, from which we get
(26)
Since (Nita & Gary 2010a),
(27)
where |$\mu _1^{\prime }\equiv \mu$| and |$\mu _k^{\prime }$| are the raw |$\chi ^2_{{\rm pdf}}$| moments of order k, we have
(28)
which, combined with the identity
(29)
leads to the unbiased expectation
(30)
Hence, combining equations (25), (26), and (30), it immediately follows that
(31)
where the M ≫ 1 approximation of the |$\widehat{SK}$| expectation is identical with the spectral variability of the parent population of the PSD estimate, |$\sigma ^2_P/\mu _P^2$| (Nita et al. 2007). This proves that the |$\widehat{SK}$| estimator defined by equation (1), which is an unbiased estimator of the spectral variability of a Gaussian time domain signal (Nita & Gary 2010a), is also a biased estimator of the spectral variability of a coherent signal mixed with a Gaussian background.

We note that, for ρ = 0, equation (31) reduces to unity, as expected for a transient-free Gaussian background, while it decreases from unity towards zero as fast as 2/ρ, as ρ increases.
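The limiting behaviour noted above can be reproduced from the standard moments of a non-central chi-square variable with k degrees of freedom and non-centrality λ, whose mean is k + λ and whose variance is 2(k + 2λ). With k = 2 and λ = 2ρ, the spectral variability |$\sigma ^2_P/\mu _P^2$| becomes |$(1+2\rho )/(1+\rho )^2$|. The Python sketch below evaluates this M ≫ 1 limit and compares it with a Monte Carlo estimate; it does not reproduce the finite-M expression of equation (31), and the simulation parameters and function names are illustrative.

```python
import math
import random
import statistics

def spectral_variability_coherent(rho):
    """Variance over squared mean of x ~ noncentral chi-square(k=2, lambda=2*rho)."""
    mean_x = 2 + 2 * rho            # k + lambda
    var_x = 2 * (2 + 2 * 2 * rho)   # 2 (k + 2 lambda)
    return var_x / mean_x**2

def sk_estimator(psd):
    """SK estimate from M raw PSD estimates (assumed standard definition)."""
    m = len(psd)
    s1 = sum(psd)
    s2 = sum(p * p for p in psd)
    return (m + 1) / (m - 1) * (m * s2 / s1**2 - 1)

def coherent_psd(rho, rng):
    """One normalized PSD estimate: squared modulus of (offset + complex Gaussian noise)."""
    return (rng.gauss(0, 1) + math.sqrt(2 * rho))**2 + rng.gauss(0, 1)**2

rng = random.Random(5)
rho, M = 3.0, 1000
sk_mean = statistics.fmean(
    sk_estimator([coherent_psd(rho, rng) for _ in range(M)]) for _ in range(200))
print(f"<SK> = {sk_mean:.4f}, "
      f"(1 + 2 rho)/(1 + rho)^2 = {spectral_variability_coherent(rho):.4f}")
print(f"rho * V(rho) at rho = 1000: {1000 * spectral_variability_coherent(1000):.4f}")
```

The last line illustrates the 2/ρ decay: the product of ρ and the spectral variability approaches 2 for large ρ.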

To obtain an analytical approximation for the variance of the |$\widehat{SK}$| estimator similar to equation (15), in addition to |$E[S_1^2]$| and E[S2], one would not only need to obtain an analytical expression for the expectations |$E[S_1^n]$|⁠, |$(n=\overline{1,4})$|⁠, which can be exactly computed from the known |$\chi ^2_{pdf}(x,2,2\rho )$| distribution, but also the expectation |$E[S_2^2]$|⁠, which must be computed from the parent distribution of the S2 random variable, for which, so far, we were not able to find a closed-form analytical expression. Instead, we compute a first order approximation of |$\sigma ^2_{\widehat{SK}}$| that is generally valid for any M and for any probability distribution of the raw PSD estimates (Nita et al. 2007), which can be written in terms of the expectations E[x] as
(32)
For the particular case of x being distributed according to |$\chi ^2_{{\rm pdf}}(x,2,2\rho )$|⁠, equation (32) leads to
(33)
where, in addition to the first order approximation in terms of ρ, we have dropped the contribution of |$(M+1)^2/(M-1)^2$|, which becomes negligible for M ≫ 1.

We note that for ρ = 0, as expected, equation (33) reduces to the M ≫ 1 approximation (4/M) of the variance of the |$\widehat{SK}$| estimator associated with a quasi-stationary time domain Gaussian signal, while it vanishes as fast as |$8/(M\rho ^2)$| when ρ goes to infinity.
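Both asymptotic behaviours of the variance can be probed by simulation. The sketch below is a hedged illustration that again assumes the (M + 1)/(M − 1) × (M S2/S1² − 1) form of the estimator and models the coherent signal as a constant-amplitude tone in unit-variance complex Gaussian noise; it compares the sample variance of simulated deviates with 4/M at ρ = 0 and with 8/(Mρ²) at a large ρ. The values M = 1024 and ρ = 50 are arbitrary choices.

```python
import numpy as np

def sk_estimator(psd, axis=-1):
    """Assumed form of equation (1): (M+1)/(M-1) * (M*S2/S1**2 - 1)."""
    m = psd.shape[axis]
    s1 = psd.sum(axis=axis)
    s2 = (psd ** 2).sum(axis=axis)
    return (m + 1) / (m - 1) * (m * s2 / s1 ** 2 - 1)

def coherent_psd(rho, shape, rng):
    """Noncentral chi-square PSD samples (2 dof, noncentrality 2*rho)."""
    re = np.sqrt(2 * rho) + rng.standard_normal(shape)
    im = rng.standard_normal(shape)
    return re ** 2 + im ** 2

rng = np.random.default_rng(2)
M, trials = 1024, 4000

# rho = 0: pure Gaussian background, expected variance close to 4/M
var_noise = sk_estimator(coherent_psd(0.0, (trials, M), rng)).var(ddof=1)
ratio_noise = var_noise / (4 / M)

# large rho: expected variance close to 8/(M*rho**2)
rho = 50.0
var_strong = sk_estimator(coherent_psd(rho, (trials, M), rng)).var(ddof=1)
ratio_strong = var_strong / (8 / (M * rho ** 2))

print(f"variance / (4/M)          at rho = 0 : {ratio_noise:.3f}")
print(f"variance / (8/(M rho^2))  at rho = {rho:.0f}: {ratio_strong:.3f}")
```

Ratios near unity indicate agreement with the quoted limits; residual departures of a few per cent are expected from the finite ρ and from Monte Carlo noise.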

3.2 Coherent transient signals

To investigate how the |$\widehat{SK}$| estimator is expected to deviate from unity in the case of a coherent time domain signal lasting less than the accumulation time, we employ the same framework as in Section 2.2, with the only difference being that the underlying gamma statistical distribution characteristic of a Gaussian PSD estimate (Nita et al. 2007; Nita & Gary 2010a,b) is replaced with the |$\chi _{{\rm pdf}}^2(x,2,\rho )$| distribution (equation 24) describing the statistical properties of the PSD estimates corresponding to a coherent time domain signal characterized by a signal-to-noise ratio ρ relative to a quasi-stationary Gaussian time domain background. This approach straightforwardly leads to the biased expectation
(34)
which, for accumulation lengths M ≫ 1, reduces to
(35)
As expected, for δ = 0 (no transient signal present), both equations (34) and (35) reduce to unity, while for δ = 1 (quasi-stationary coherent signal), they reduce to the full |$\widehat{SK}$| expression and its (M ≫ 1) approximation, respectively, provided by equation (31). However, unlike the case of Gaussian transients, which are exclusively characterized by |$\widehat{SK}$| values larger than unity, the |$\widehat{SK}$| expressions provided by equations (34) and (35) may take values larger than unity for δ < 0.5 and smaller than unity for δ > 0.5, while crossing the |$1\pm 2/\sqrt{M}$| interval for duty-cycles close to 50 per cent, which is a well-known limitation of the |$\widehat{SK}$| or time domain kurtosis-based RFI detection algorithms (Ruf, Gross & Misra 2006; De Roo, Misra & Ruf 2007; Nita et al. 2007; Nita & Gary 2010a,b; Gary et al. 2010). For δ = 1/(4 + ρ), on the other hand, equation (35) reaches its maximum deviation from unity, |$\widehat{SK^{\ast }}=1+\rho ^2/(8+4\rho )$|, which makes |$\widehat{SK}$| a very efficient coherent transient detector. Nevertheless, similar to the case of Gaussian transients addressed in Section 2.2, an evaluation of the |$\widehat{SK}$| statistical fluctuations is needed to fully assess its performance as a detector, as well as the experimental uncertainties affecting any parameter estimated from |$\widehat{SK}$| measurements.
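The properties listed above can be recovered with a short derivation. The sketch below is a consistency check rather than a restatement of equations (34) and (35): it assumes PSD samples normalized to a unit-mean background, a fraction δ of which follow the coherent statistics with signal-to-noise ratio ρ, and it uses the large-M identification of |$\widehat{SK}$| with |$\sigma ^2_P/\mu _P^2$| stated in Section 3.1. Its notation may differ from that of the original equations.

```latex
\begin{align}
E[x]   &= (1-\delta) + \delta\,(1+\rho) = 1+\delta\rho, \\
E[x^2] &= 2\,(1-\delta) + \delta\,(2+4\rho+\rho^2) = 2+4\delta\rho+\delta\rho^2, \\
\widehat{SK^{\ast}} &\simeq \frac{E[x^2]}{E[x]^2}-1
      = 1+\frac{\delta\,(1-2\delta)\,\rho^2}{(1+\delta\rho)^2}, \\
\frac{\partial \widehat{SK^{\ast}}}{\partial \delta}=0
      &\;\Rightarrow\; 1-(4+\rho)\,\delta=0
      \;\Rightarrow\; \delta=\frac{1}{4+\rho},
      \qquad
      \widehat{SK^{\ast}}_{\max}=1+\frac{\rho^2}{8+4\rho}.
\end{align}
```

Under these assumptions the expression equals unity at δ = 0 and δ = 0.5, exceeds unity in between, and reduces to (1 + 2ρ)/(1 + ρ)² at δ = 1, which decays as 2/ρ for large ρ.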

The absence of closed form analytical expressions for the moments of the sums of squared random variables distributed according to |$\chi _{{\rm pdf}}^2(x,2,\rho )$|, which prevented us from obtaining an exact analytical expression for the variance of the |$\widehat{SK}$| estimator associated with a quasi-stationary coherent signal, also prevents us from obtaining an analytical expression in the case of coherent transients. Moreover, the approach we used in Section 3.1 to obtain the first order approximation of the |$\widehat{SK}$| variance is not directly applicable in the case of coherent transients, due to the fact that S1 and S2 are not sums of random variables drawn from the same parent population. Instead, we provide an approximation that, although it may seem to rest on a more or less speculative basis, will be proven to be in agreement with the statistical fluctuations observed in numerical simulations.

Our approach is motivated by the observation that, although the full expressions of the |$\widehat{SK^{\ast }}$| estimators associated with the Gaussian (equation 12) and coherent (equation 34) transients are mathematically different, their (M ≫ 1) approximations (equations 13 and 35, respectively) are mathematically equivalent, in the sense that the same observed |$\widehat{SK}$| value larger than unity may be either the result of a Gaussian transient characterized by the parameter pair {δ, ρ}, with δ anywhere in the 0–100 per cent range, or, alternatively, the result of a coherent transient with a duty-cycle shorter than 50 per cent characterized by the parameters {δ/2, 2ρ}. While, in the absence of additional information, this morphological transformation makes the true physical nature of the observed transients in principle indistinguishable from a single |$\widehat{SK}$| measurement, it offers us enough grounds to speculate that the true variance of the coherent transient |$\widehat{SK}$| estimator might be reasonably approximated by applying the same morphological transformation to the (M ≫ 1) approximation provided by equation (15). This leads to the approximation
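The equivalence described above is easy to verify numerically. The sketch below assumes large-M forms obtained from the first two moments of unit-mean exponential (Gaussian signal) and noncentral chi-square (coherent signal) PSD samples; these are stand-ins for equations (13) and (35), whose exact notation may differ, and the function names are hypothetical.

```python
import numpy as np

def sk_gaussian_large_m(delta, rho):
    """Assumed large-M SK expectation for a Gaussian transient {delta, rho}."""
    return 1 + 2 * delta * (1 - delta) * rho ** 2 / (1 + delta * rho) ** 2

def sk_coherent_large_m(delta, rho):
    """Assumed large-M SK expectation for a coherent transient {delta, rho}."""
    return 1 + delta * (1 - 2 * delta) * rho ** 2 / (1 + delta * rho) ** 2

delta = np.linspace(0.0, 1.0, 101)   # Gaussian duty-cycle, 0 to 100 per cent
for rho in (1.0, 5.0, 10.0):
    gauss = sk_gaussian_large_m(delta, rho)
    coherent = sk_coherent_large_m(delta / 2, 2 * rho)   # the {delta/2, 2*rho} mapping
    # The product delta*rho, and hence the mean accumulated power, is also unchanged.
    print(f"rho={rho:4.1f}  max |difference| = {np.max(np.abs(gauss - coherent)):.2e}")
```

Because the mapping preserves the product δρ, the expected accumulated power is the same for both members of such a pair in this model.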
(36)
which, for δ = 0, reduces to 4/M, which is indeed the (M ≫ 1) expected |$\widehat{SK}$| variance for a quasi-stationary Gaussian time domain signal, while for δ = 1 (the case of a quasi-stationary coherent signal), it reduces to
(37)
which needs to be compared with the non-identical approximation provided by equation (33), which we derived analytically from the true statistical distribution of the PSD samples.

This comparison reveals that, while equation (37) also reduces to 4/M for ρ = 0, unlike equation (33), it does not completely vanish as ρ goes to infinity. Instead, it approaches as fast as 1/2M − 2/Mρ a residual value of 1/2M that practically vanishes for large accumulation lengths M. Based on this comparison, we find that the semi-analytical approximation provided by equation (36) is suitable for practical application, especially in light of the analysis illustrated in Fig. 1, which indicated the need for a numerical SDEV correction of the |$\sigma _{\widehat{SK^{\ast }}}$| analytical expression.

In Fig. 4, we present a similar analysis for the purpose of validating the analytical expressions obtained in this section. Remarkably, when compared with the sample mean of the Monte Carlo simulations (plus symbols), the inaccuracy of the |$\widehat{SK^{\ast }}$| approximation (solid red lines) appears to be negligible even for relatively short accumulation lengths. Nevertheless, for both values of M, the |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| ranges (dashed lines) appear to largely overestimate the sample SDEV (square symbols). In this simulation, we find that the same SDEV correction factor γ = 0.38 needs to be applied for both values of M to ensure that 68.27 per cent of the simulated samples are scattered within the |$\widehat{SK^{\ast }}\pm \gamma \sigma _{\widehat{SK^{\ast }}}$| ranges indicated by the dot–dashed lines.
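A simulation in the spirit of this validation can be sketched as follows. It is an illustration under stated assumptions, not the code used for Fig. 4: the transient is modelled as a constant-amplitude tone present in the first round(δM) raw PSD samples of each accumulation, the estimator is assumed to have the (M + 1)/(M − 1) × (M S2/S1² − 1) form, and the comparison curve is the large-M expression derived from the mixture moments rather than equation (34).

```python
import numpy as np

def sk_estimator(psd, axis=-1):
    """Assumed form of equation (1): (M+1)/(M-1) * (M*S2/S1**2 - 1)."""
    m = psd.shape[axis]
    s1 = psd.sum(axis=axis)
    s2 = (psd ** 2).sum(axis=axis)
    return (m + 1) / (m - 1) * (m * s2 / s1 ** 2 - 1)

def sk_coherent_large_m(delta, rho):
    """Assumed large-M expectation for a coherent transient (sketch only)."""
    return 1 + delta * (1 - 2 * delta) * rho ** 2 / (1 + delta * rho) ** 2

def simulate_coherent_transient(delta, rho, m, trials, rng):
    """SK deviates for a tone present in the first round(delta*m) PSD samples."""
    n_on = int(round(delta * m))
    amp = np.zeros(m)
    amp[:n_on] = np.sqrt(2 * rho)            # zero amplitude while the transient is off
    re = amp + rng.standard_normal((trials, m))
    im = rng.standard_normal((trials, m))
    return sk_estimator(re ** 2 + im ** 2), n_on / m

rng = np.random.default_rng(4)
rho, M, trials = 10.0, 9766, 300
results = []
for delta in (0.1, 0.3, 0.5, 0.7, 0.9):
    sk, delta_eff = simulate_coherent_transient(delta, rho, M, trials, rng)
    model = sk_coherent_large_m(delta_eff, rho)
    results.append((delta_eff, sk.mean(), sk.std(ddof=1), model))
    print(f"delta={delta_eff:.3f}  mean={sk.mean():.4f}  "
          f"sample SDEV={sk.std(ddof=1):.4f}  large-M model={model:.4f}")
```

The sample SDEV column can be compared with any candidate variance expression to estimate a correction factor analogous to γ.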

Figure 4.

|$\widehat{SK^{\ast }}$| estimator (solid lines) as a function of a coherent transient duty-cycle for a signal-to-noise ratio ρ = 10 and two accumulation lengths, M = 97 (panel a) and M = 9766 (panel b). The symmetric |$\widehat{SK^{\ast }}\pm \sigma _{\widehat{SK^{\ast }}}$| limits provided by equation (36) are overlaid (dashed lines) on top of the corresponding numerically simulated distributions (point symbols). The SDEV-corrected 68.27 per cent probability ranges, |$\widehat{SK^{\ast }}\pm \gamma \sigma _{\widehat{SK^{\ast }}}$|, are indicated by the dot–dashed lines. The square symbols indicate the numerical sample SDEV ranges around the mean of the simulated |$\widehat{SK}$| distribution (plus symbols). The Pearson Type IV asymmetric detection thresholds, which correspond to standard 0.13 per cent probabilities of false alarm on each side of the unity |$\widehat{SK}$| expectation if no transient emission was present, are indicated by horizontal lines. The SDEV correction factors γ, the maximum values reached by the duty-cycle dependent relative SDEV ϵmax, and the corresponding maximum relative bias of the |$\widehat{SK}$| estimator, βmax, are indicated in each figure inset.

As in the case of Gaussian transients, the |$\widehat{SK}$| performance in detecting coherent transients, which is illustrated in Fig. 4c, improves as the accumulation length increases, reaching a flat 100 per cent for all duty-cycles except narrow ranges at both ends of the interval, as well as around the 50 per cent duty-cycle mark. Fig. 5 completes this detection performance analysis by showing that the coherent transient detection performance of the |$\widehat{SK}$| estimator increases as the signal-to-noise ratio increases. However, as shown by Fig. 5a, the |$\widehat{SK^{\ast }}\pm \sigma ^2_{\widehat{SK^{\ast }}}$| ranges corresponding to different signal-to-noise ratios may overlap, which could result in large uncertainties of the SNR and duty-cycle estimates obtained from |$\widehat{SK}$| measurements. This aspect is quantitatively investigated in the next section.

Figure 5.

Coherent transient duty-cycle variation of the |$\widehat{SK^{\ast }}$| estimator (thick lines) and its expected |$\widehat{SK^{\ast }}\pm \sigma ^2_{\widehat{SK^{\ast }}}$| fluctuations (thin lines) for M = 97 (panel a) and M = 9766 (panel b), and three selected signal-to-noise ratios, ρ = 5, 7, and 10 (solid, dashed, and dot–dashed lines, respectively).

3.3 |$\widehat{SK}$| measurements of coherent transients

Following the same approach as in Section 2.3, and taking into consideration that, in the case of a coherent transient mixed with a Gaussian background, the variance |$\sigma _{S_1}^2= E(S_1^2)-E(S_1)^2$| can be expressed in terms of the expectations provided by equations (25) and (26), and |$\sigma _{\widehat{SK}}^2=\gamma \sigma _{\widehat{SK^{\ast }}}^2$| is provided by equation (36), the steps leading to the SNR and duty-cycle estimates, and their corresponding statistical uncertainties, are fully described by the following sequence of equations, which ultimately depend only on the directly measured magnitudes S1(ti), S2(ti), and S1(ti−1):
(38)
Fig. 6, which has the same layout as Fig. 3, illustrates the performance of the estimations provided by equation (38) in the case of the coherent transient simulations characterized by the |$\widehat{SK}$| distributions shown in Fig. 4. Fig. 6 demonstrates that the workflow described by equation (38) may provide SNR and duty-cycle estimates that, even for relatively short accumulation lengths, are affected by statistical uncertainties that, for duty-cycles larger than about 20 per cent, do not exceed a few per cent. We also find that the confidence levels of the δ ± σδ intervals, also provided by equation (38), are consistent with a standard 68.27 per cent confidence level for most of the duty-cycle interval, while the confidence levels of the ρ ± σρ intervals are systematically higher. We thus conclude that the |$\widehat{SK}$|–based measurement method presented in this section has a level of accuracy that is suitable for practical applications.
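To illustrate how three measured magnitudes can yield both parameters, the following sketch inverts a simplified model. It is not equation (38): it assumes that S1(ti−1) comes from a transient-free accumulation that fixes the background level, that the relative power excess equals δρ, and that the large-M expectation 1 + δ(1 − 2δ)ρ²/(1 + δρ)² applies. Uncertainty propagation is omitted, and the function name is hypothetical.

```python
def estimate_coherent_transient(s1_now, s2_now, s1_prev, m):
    """Point estimates (rho, delta, sk) from S1(t_i), S2(t_i), and S1(t_{i-1}).

    Assumptions (not taken from equation 38): the previous accumulation is
    transient-free, and the large-M coherent-transient model holds.
    """
    sk = (m + 1) / (m - 1) * (m * s2_now / s1_now ** 2 - 1)
    p = s1_now / s1_prev - 1.0            # relative power excess, delta*rho in the model
    if p <= 0:
        raise ValueError("no power excess relative to the reference accumulation")
    # From sk - 1 = (p*rho - 2*p**2) / (1 + p)**2, solved for rho:
    rho = ((sk - 1.0) * (1.0 + p) ** 2 + 2.0 * p ** 2) / p
    if rho <= 0:
        raise ValueError("measurements are inconsistent with the assumed model")
    delta = p / rho
    return rho, delta, sk

# Round trip on noise-free synthetic sums built from the same model
m, mu = 9766, 2.0
rho_true, delta_true = 5.0, 0.3
p_true = delta_true * rho_true
sk_true = 1 + delta_true * (1 - 2 * delta_true) * rho_true ** 2 / (1 + p_true) ** 2
s1_prev = m * mu
s1_now = m * mu * (1 + p_true)
s2_now = s1_now ** 2 / m * (sk_true * (m - 1) / (m + 1) + 1)

rho_est, delta_est, sk_est = estimate_coherent_transient(s1_now, s2_now, s1_prev, m)
print(f"rho = {rho_est:.3f}, delta = {delta_est:.3f}, SK = {sk_est:.3f}")
```

With real data the inputs fluctuate, so the estimates scatter around the true values; the round trip above only confirms that the algebra is self-consistent.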
Figure 6.

Duty-cycle distributions of the relative errors of the duty-cycle estimates (panels a and b, M = 97 and M = 9766, respectively) and SNR estimates (panels c and d, M = 97 and 9766, respectively) obtained from the same set of coherent transient simulations used to generate the |$\widehat{SK}$| distributions shown in Fig. 4. The estimates obtained from transients flagged by the 0.13 per cent PFA |$\widehat{SK}$| detection thresholds are indicated by black plus symbols, and the estimates obtained from not flagged |$\widehat{SK}$| simulations are shown as grey symbols. The pairs of solid curves in each panel indicate the range of the expected standard-equivalent ±1σ statistical fluctuations computed based on the known true values δtrue and ρtrue. The same correction factor γ = 0.38 has been applied for both M = 97 and M = 9766 to the corresponding |$\widehat{SK}(\delta _{{\rm true}},\rho _{{\rm true}})$| expected fluctuations, as provided by equation (36). The square symbols shown in each panel indicate the true confidence levels of the ±1σ intervals, which are compared with a standard 68.27 per cent confidence level (horizontal lines). Due to much smaller fluctuations affecting the M = 9766 estimates, different scales are used to display on the same plots the relative errors and confidence levels in panels (c) and (d).

4 |$\widehat{SK}$| DISCRIMINATION OF UNDERLYING TRANSIENT STATISTICS

In the previous sections, we demonstrated the ability of an |$\widehat{SK}$| spectrometer to detect and measure two special categories of spectral transients mixed with a Gaussian time domain background. However, this performance analysis involved prior knowledge of the true, Gaussian or coherent, statistical nature of the transients. Therefore, the ability to infer the underlying transient statistics from |$\widehat{SK}$| measurements has yet to be demonstrated. For this purpose, we consider the hypothetical case of two transients, one Gaussian and another coherent, that have the same signal-to-noise ratios and durations, and investigate the variation of their expected |$\widehat{SK}$| estimator as a function of various accumulation lengths.

Fig. 7 presents the result of such analysis for an accumulation length set to M = 97. To model some particular aspects that may be encountered in a real experiment, both transients were purposely chosen to have an SNR ρ = 5, a duration ΔM = 3500 FFT blocks, longer than the accumulation length, and an offset δM = 50 FFT blocks relative to the start of one of the accumulation blocks. Figs 7a and b display the accumulated power and, respectively, the duty-cycle, which both have flat distributions in all but the two accumulation blocks containing the rising and falling edges of the transients. Consequently, as shown in Fig. 7c, the expected |$\widehat{SK}$| of the Gaussian transient deviates from unity only in the accumulation bins that do not have a duty-cycle equal to 0 or 100 per cent. However, due to their relatively large statistical fluctuations, the rising and falling edges of such Gaussian transients may or may not be flagged by the 0.13 per cent PFA detection thresholds, i.e. [0.56, 1.90]. On the contrary, in Fig. 7d, all inner accumulation blocks are flagged as unambiguously containing a coherent transient, because the exact 100 per cent duty-cycle translates into less than unity |$\widehat{SK}$| values. However, the rising edge of such a coherent transient would escape detection due to its ∼50 per cent duty-cycle, while its falling edge, which corresponds to a duty-cycle ∼11 per cent, may or may not escape detection due to its relatively large statistical fluctuation, despite an |$\widehat{SK}=1.90$| that happens to be close to the maximum value attainable by a coherent transient having ρ = 5, which is reached at δ = 1/9 = 11.11 per cent (equation 35).
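The block-by-block behaviour described above can be reproduced with a few lines of code. The sketch below uses the duration (3540 raw FFT blocks) and offset (350 raw FFT blocks) quoted in the Fig. 7 caption, the [0.56, 1.90] thresholds quoted in the text, and assumed large-M expressions for the expected |$\widehat{SK}$|, which are only rough at M = 97. Function names are illustrative.

```python
import numpy as np

def duty_cycle_per_block(m, start, duration, n_blocks):
    """Fraction of each length-m accumulation block overlapped by the transient."""
    edges = np.arange(n_blocks + 1) * m
    lo = np.maximum(edges[:-1], start)
    hi = np.minimum(edges[1:], start + duration)
    return np.clip(hi - lo, 0, None) / m

def sk_gaussian_large_m(delta, rho):
    """Assumed large-M expectation for a Gaussian transient."""
    return 1 + 2 * delta * (1 - delta) * rho ** 2 / (1 + delta * rho) ** 2

def sk_coherent_large_m(delta, rho):
    """Assumed large-M expectation for a coherent transient."""
    return 1 + delta * (1 - 2 * delta) * rho ** 2 / (1 + delta * rho) ** 2

M, rho = 97, 5.0
start, duration = 350, 3540            # values from the Fig. 7 caption
lower, upper = 0.56, 1.90              # 0.13 per cent PFA thresholds quoted in the text

duty = duty_cycle_per_block(M, start, duration, n_blocks=45)
sk_gauss = sk_gaussian_large_m(duty, rho)
sk_coh = sk_coherent_large_m(duty, rho)

for k in np.flatnonzero((duty > 0) & (duty < 1)):
    print(f"edge block {k}: duty-cycle={duty[k]:.3f}  "
          f"SK Gaussian={sk_gauss[k]:.2f}  SK coherent={sk_coh[k]:.2f}")
inner = duty == 1
print("inner blocks, coherent expectation below lower threshold:",
      bool(np.all(sk_coh[inner] < lower)))
print("inner blocks, Gaussian expectation equal to unity:",
      bool(np.all(sk_gauss[inner] == 1)))
```

Only expectations are compared with the thresholds here; whether an individual edge block is flagged in practice also depends on the statistical fluctuations discussed above.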

Figure 7.

Expected |$\widehat{SK}$| discrimination of two transients lasting longer than the accumulation length (M = 97). The transients, which have different underlying statistics, have the same duration (3540 raw FFT blocks) and SNR (ρ = 5), and start at the same offset (350 raw FFT blocks) relative to the start of the first accumulation. (a) SNR (dot–dashed line) and accumulated power (solid line) as a function of the accumulation block index. (b) The duty-cycle profile of both transients. |$\widehat{SK}$| (solid line) and |$\widehat{SK}\pm \sigma _{\widehat{SK^{\ast }}}$| (error bars) for the Gaussian and coherent transients are shown in panels (c) and (d), respectively. The range bounded by the 0.13 per cent PFA detection thresholds, [0.56, 1.90], is indicated by the grey-shaded areas in panels (c) and (d). The accumulation blocks during which the transients start and end are marked by vertical lines in all panels.

Therefore, the results illustrated by Fig. 7 indicate that, for accumulation lengths shorter than the transient duration, the |$\widehat{SK}$| analysis alone is guaranteed to detect 100 per cent duty-cycle coherent transients, unambiguously recognize their coherent dynamics, and even directly measure their duration with an uncertainty comparable with the integration time. However, a Gaussian transient having the same 100 per cent duty-cycle may entirely escape |$\widehat{SK}$| detection. Moreover, even if both rising and falling edges are detected, they could not be unambiguously attributed to the edges of a Gaussian transient, since they could be equally attributed to two unrelated transients, of any of the two types, having durations shorter than half of the integration time. Nevertheless, if the existence of such a Gaussian transient is alternatively flagged by its accumulated power profile, the |$\widehat{SK}$| analysis may unambiguously determine the nature of such a transient, since only Gaussian transients may have 100 per cent duty-cycles and unity |$\widehat{SK}$|. However, such an S1-based transient detection scheme, which would necessarily involve arbitrarily defined empirical detection thresholds, would not be as reliable as an |$\widehat{SK}$|-only detection scheme based on exactly known probabilities of false alarm (Nita & Gary 2010a).

Based on the analysis presented in Fig. 7, we conclude that an |$\widehat{SK}$| spectrometer may efficiently flag continuous or transient coherent signals longer than its integration time, as well as both Gaussian and coherent transients shorter than its integration time, but without being able to unambiguously discriminate the statistical nature of such short transients. Nevertheless, based on a combined S1 and |$\widehat{SK}$| analysis, the statistical nature of the transients lasting longer than the integration time could be inferred, and thus their duration and signal-to-noise ratios could be estimated based on the correct statistical model.

The first experimental validation of such |$\widehat{SK}$| spectrometer capabilities has been provided by Nita et al. (2007). Using data recorded by the FST instrument during a solar radio burst, and a software-implemented |$\widehat{SK}$| spectrometer design involving sets of 100 μs contiguous acquisition blocks followed by 20 ms acquisition gaps, Nita et al. (2007) demonstrated the ability of the |$\widehat{SK}$| spectrometer to selectively filter out RFI transients shorter or longer than the integration time, while leaving untouched the microwave spikes of solar origin, which were inferred to have durations longer than the 100 μs accumulation time.

However, in a more recent study, using data obtained with the EST instrument during another solar burst featuring spiky emission, and a hardware-implemented |$\widehat{SK}$| spectrometer that was designed to integrate 20 ms contiguous blocks (M = 9766), with no acquisition gaps in between, Nita & Gary (2016) demonstrated that the integration blocks containing radio spikes of solar origin were flagged by the |$\widehat{SK}$| 0.13 per cent PFA detection thresholds. Using the analysis framework detailed in Section 2.3, Nita & Gary (2016) estimated that the spectral peak of one of the observed solar radio spikes was characterized by an SNR ρ = 2.14 ± 0.11, and a duration τ = (8.05 ± 0.30) ms, which is consistent with theoretical expectations (Sirenko & Fleishman 2009) and previous time-resolved observations of microwave solar radio spikes (Rozhansky, Fleishman & Huang 2008).

Nevertheless, to assign a Gaussian statistical model to the observed microwave spikes, Nita & Gary (2016) had to rely on theoretical expectations that microwave spikes of solar emission must have a Gaussian time domain distribution, and to discard the possibility that the |$\widehat{SK}$| flagged spikes represent low duty-cycle local instrumental RFI, a hypothesis that was ruled out by serendipitous Very Large Array (VLA, Perley et al. 2011) simultaneous observations of the same spikes, which independently confirmed their genuine solar origin (Chen et al. 2015).

All of the above examples indicate that, if a targeted class of transients is expected to have durations ranging in a certain interval, the accumulation length of an |$\widehat{SK}$| spectrometer may be in principle tuned to an optimal value that would allow intrinsic discrimination of their underlying statistical properties.

To explore such a possible avenue, Fig. 8 illustrates, for the same hypothetical transients considered in Fig. 7, the expected |$\widehat{SK}$| profiles obtained by varying the accumulation length in unit steps, from M = 97, up to a maximum accumulation length several orders of magnitude larger. In addition to the information displayed in Figs 7c and d, a set of Gaussian and coherent transient profiles were numerically generated according to the SNR profile shown in Fig. 8a; their corresponding |$\widehat{SK}$| random realizations were calculated for several integer multiples of M = 97, and overlaid on the corresponding |$\widehat{SK}$| profiles (solid lines) and |$\widehat{SK}\pm \sigma _{\widehat{SK^{\ast }}}$| ranges (dark-grey-shaded areas) shown in Figs 8c and 8d, respectively. This comparison demonstrates a very good agreement between the distribution of the |$\widehat{SK}$| random deviates and the theoretical expectations.

Figure 8.

Expected |$\widehat{SK}$| profiles as a function of a varying accumulation length for the same pair of transients considered in Fig. 7. The SNR (dot–dashed line) and accumulated power (solid line) profiles are shown in panel (a), and the duty-cycle profile is shown in panel (b). A series of numerically generated |$\widehat{SK}$| random deviates corresponding to a set of selected integer multiples of the minimum accumulation length, M = 97, are overlaid (symbols) on the |$\widehat{SK}$| (solid line) and |$\widehat{SK}\pm \sigma _{\widehat{SK^{\ast }}}$| (dark shaded areas) corresponding to the Gaussian (panel c) and coherent (panel d) transients. The range bounded by the 0.13 per cent PFA detection thresholds is indicated by the grey-shaded areas in panels (c) and (d). The start and end of the transients are marked by vertical lines in all panels.

As illustrated in Fig. 8b, although the transients have a fixed duration, their relative duty-cycle increases from 0 per cent up to ∼91 per cent, as an increasing portion of the transient lifetime contributes to the accumulation, and it gradually decreases towards 0 per cent, as more and more transient-free blocks are added to the accumulation. Consequently, the |$\widehat{SK}$| profile of the Gaussian transient (Fig. 8c) features a double-peak evolution that, for all relative duty-cycle realizations, stays above unity. Although the |$\widehat{SK}$| profile of the coherent transient (Fig. 8d) follows a similar double-peak dependence on the accumulation length, unlike the Gaussian |$\widehat{SK}$| profiles, it reaches less than unity values for the range of accumulation lengths corresponding to relative duty-cycles above 50 per cent.
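The dependence of the relative duty-cycle on a growing accumulation length can be sketched as follows. The example assumes the duration (3540 raw FFT blocks) and offset (350 raw FFT blocks) quoted in the Fig. 7 caption, an accumulation that always starts at block 0, and large-M expressions for the expected |$\widehat{SK}$|; the range of lengths is an arbitrary choice.

```python
import numpy as np

def duty_cycle_vs_length(lengths, start, duration):
    """Relative duty-cycle of a fixed transient inside an accumulation [0, L)."""
    overlap = np.clip(np.minimum(lengths, start + duration) - start, 0, None)
    return overlap / lengths

def sk_gaussian_large_m(delta, rho):
    """Assumed large-M expectation for a Gaussian transient."""
    return 1 + 2 * delta * (1 - delta) * rho ** 2 / (1 + delta * rho) ** 2

def sk_coherent_large_m(delta, rho):
    """Assumed large-M expectation for a coherent transient."""
    return 1 + delta * (1 - 2 * delta) * rho ** 2 / (1 + delta * rho) ** 2

rho, start, duration = 5.0, 350, 3540
lengths = np.arange(97, 20001)             # accumulation length varied in unit steps
duty = duty_cycle_vs_length(lengths, start, duration)
sk_gauss = sk_gaussian_large_m(duty, rho)
sk_coh = sk_coherent_large_m(duty, rho)

print(f"maximum duty-cycle = {duty.max():.3f} at accumulation length {lengths[duty.argmax()]}")
print("Gaussian profile never below unity:", bool(np.all(sk_gauss >= 1)))
below = lengths[sk_coh < 1]
print(f"coherent profile below unity for lengths {below.min()} to {below.max()}")
```

The maximum duty-cycle is reached when the accumulation ends together with the transient, and equals duration / (offset + duration).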

Therefore, the particular example illustrated in Fig. 8 demonstrates that, if the fixed accumulation length of the |$\widehat{SK}$| spectrometer is tuned to a value that is shorter than twice the expected duration of a coherent transient, while longer than the expected duration of a Gaussian transient, there is a non-zero probability of a random realization of an observation that would unambiguously discriminate the statistical nature of such transients, and thus allow reliable estimations of their SNR and duty-cycles.

Although such particular condition may seem too restrictive for having wide practical applicability, we demonstrate below that it can be straightforwardly achieved by imposing a less restrictive condition on a fixed |$\widehat{SK}$| spectrometer accumulation length, which would be sufficient to be shorter than the duration of both types of transients, in order to provide automatic discrimination capabilities.

Indeed, given the fact that the standard S1(M) and S2(M) outputs provided by an |$\widehat{SK}$| spectrometer are additive quantities, they can be sequentially grouped and added together to form the variable length accumulations S1(kM) and S2(kM), k being an integer, and thus generate a discrete |$\widehat{SK}$| profile that would follow a continuous accumulation length profile similar to the Gaussian or coherent |$\widehat{SK}$| profiles illustrated in Figs 8c and 8d, as demonstrated by the numerically generated |$\widehat{SK}$| deviates shown in the same panels.
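The additivity argument translates directly into code. The sketch below, which assumes the (M + 1)/(M − 1) × (M S2/S1² − 1) form of the estimator and uses hypothetical function names, regroups per-block S1(M) and S2(M) outputs into S1(kM) and S2(kM) and checks that the result matches a direct computation from the raw PSD samples.

```python
import numpy as np

def sk_from_sums(s1, s2, m):
    """SK from accumulated sums over m raw PSD estimates (assumed equation 1 form)."""
    return (m + 1) / (m - 1) * (m * s2 / s1 ** 2 - 1)

def regroup(s1, s2, k):
    """Add k consecutive accumulations together; a trailing partial group is dropped."""
    n = (len(s1) // k) * k
    return s1[:n].reshape(-1, k).sum(axis=1), s2[:n].reshape(-1, k).sum(axis=1)

rng = np.random.default_rng(5)
M, n_blocks = 97, 240
psd = rng.exponential(1.0, size=(n_blocks, M))   # pure Gaussian-signal PSD samples

# Standard spectrometer outputs, one pair per accumulation of length M
s1 = psd.sum(axis=1)
s2 = (psd ** 2).sum(axis=1)

# Discrete multiscale SK profiles for accumulation lengths k*M
mssk = {}
for k in (1, 2, 4, 8):
    s1_k, s2_k = regroup(s1, s2, k)
    mssk[k] = sk_from_sums(s1_k, s2_k, k * M)
    print(f"k={k}: {mssk[k].size} SK values, mean={mssk[k].mean():.3f}")
```

No raw data need to be stored for this regrouping, since only the two accumulated sums per block are used.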

This concept of multiscale Spectral Kurtosis (MSSK) analysis was originally proposed by Gary et al. (2010), and demonstrated to effectively improve the detection performance of an |$\widehat{SK}$| spectrometer in the case of RFI transients having durations close to half of the fixed accumulation length of the KSRBL instrument. More recently, Nita & Gary (2016) applied the same concept to develop a measurement technique based on fitting the discrete MSSK profiles with their expected functional forms, which, within the statistical uncertainties, provided estimates consistent with those obtained for the same transient signals based on the monoscale analysis described in Sections 2.3 and 3.3. However, given the fact that, in this particular analysis, the ∼20 ms accumulation length of the EST instrument was longer than the inferred ∼8 ms duration of the observed solar microwave spikes, the corresponding MSSK profiles reproduced only the region around the secondary peak of the expected functional form, and thus the underlying Gaussian statistics of the solar microwave spikes could not be directly confirmed.

From this perspective, the classical |$\widehat{SK}$| spectrometer design originally proposed by Nita et al. (2007) has been proven to already provide, in the form of the S1 and S2 measured quantities, an instrumental output suitable for implementing a downstream real-time MSSK analysis pipeline capable of providing, under favourable circumstances, automatic discrimination of the statistical nature of the observed transients. However, to generate full length discriminatory MSSK profiles similar to those shown in Fig. 8, a modified |$\widehat{SK}$| spectrometer design should be considered.

Such a versatile MSSK spectrometer design could involve a hardware implemented continuous computation of |$\widehat{SK}$| estimates that, as the accumulation evolves, would generate Gaussian or coherent |$\widehat{SK}$| flags as soon as one or another transient type is unambiguously identified. If the main goal of such an instrument were the detection and discrimination of transients, the evolving accumulation could be stopped, and the next one initiated, as soon as the transient identification is made, or continued up to an accumulation length that would provide, as demonstrated by Figs 3 and 6, transient duration and SNR estimates having the desired level of accuracy.
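A software analogue of such an evolving accumulation can be sketched with cumulative sums. This is only an illustration of the idea: the function name is hypothetical, the estimator is assumed to have the (M + 1)/(M − 1) × (M S2/S1² − 1) form, and the symmetric thresholds 1 ± 3 × 2/√m are a crude placeholder for the asymmetric Pearson Type IV thresholds used in the text. A value below unity points to a coherent signal with a duty-cycle above 50 per cent, whereas a value above unity is compatible with either transient type.

```python
import numpy as np

def first_flag(psd, m_min=16, n_sigma=3.0):
    """Return (length, label, sk) at the first running SK outside the thresholds.

    Placeholder thresholds: 1 +/- n_sigma * 2/sqrt(m), not the Pearson Type IV
    thresholds. Returns None when no running SK value is flagged.
    """
    m = np.arange(1, psd.size + 1)[m_min - 1:]
    s1 = np.cumsum(psd)[m_min - 1:]            # evolving S1
    s2 = np.cumsum(psd ** 2)[m_min - 1:]       # evolving S2
    sk = (m + 1) / (m - 1) * (m * s2 / s1 ** 2 - 1)
    outside = np.flatnonzero(np.abs(sk - 1) > n_sigma * 2 / np.sqrt(m))
    if outside.size == 0:
        return None
    i = outside[0]
    label = "above unity" if sk[i] > 1 else "below unity"
    return int(m[i]), label, float(sk[i])

rng = np.random.default_rng(6)
n, rho = 2000, 5.0

# Continuous coherent signal (100 per cent duty-cycle) in complex Gaussian noise
re = np.sqrt(2 * rho) + rng.standard_normal(n)
im = rng.standard_normal(n)
print("continuous coherent signal:", first_flag(re ** 2 + im ** 2))
```

Repeatedly testing a growing accumulation raises the overall false-alarm rate relative to a single test, so a real design would need thresholds chosen with that in mind.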

However, if a fixed accumulation length is preferred, the MSSK spectrometer design could include transient counters that, for each fixed accumulation, would provide two additional data outputs representing counts of detected Gaussian and coherent transients, if any. If the coherent transients are believed to be exclusively generated by RFI, while the Gaussian transients are believed to be generated by astronomical sources, this additional information could be subsequently used to safely filter out the accumulations affected by RFI, while preserving the accumulations containing contributions from Gaussian transients shorter than the integration time.

5 CONCLUSIONS

We obtained analytical expressions that provide biased estimations of the true mean and variance of the |$\widehat{SK}$| distributions in two special cases of spectral transients mixed with a Gaussian time domain background, as functions of their signal-to-noise ratio and duty-cycle relative to the instrumental accumulation time. We investigated the bias of these approximations and their transient detection performance by means of Monte Carlo simulations.

We demonstrated that the |$\widehat{SK}$| transient detection performance may be significantly increased, and the bias of |$\widehat{SK}$| -based estimates may be significantly reduced, by increasing the accumulation length. We also developed an analytical workflow leading to estimates of the SNR and duty-cycle of such transients, and their standard-equivalent statistical deviations.

We investigated the accuracy of these estimates and found that, although their statistical uncertainties may vary as a function of the SNR and duty-cycle, they can be reduced to as low as a few per cent by increasing the instrumental accumulation length.

We described a practical adaptive approach that, even for a fixed accumulation length, may improve the transient detection performance and reduce the statistical uncertainties of the SNR and duty-cycle estimates, by taking full advantage of the built-in capabilities of the original |$\widehat{SK}$| spectrometer design.

We suggested an original multiscale |$\widehat{SK}$| spectrometer design optimized for real-time detection, classification, and analysis of various transient astronomical signals generated by flaring stars, pulsars, and extragalactic sources, including the elusive Fast Radio Burst transients (Keane et al. 2016).

Nevertheless, such a design may also be considered for the purpose of investigating the existence of coherent natural or artificial astronomical signals, which could be facilitated by a higher order statistics spectrometer such as the one proposed here (Melrose 2009).

The author thanks the anonymous reviewer for useful comments that helped improve the final version of this manuscript.

REFERENCES

Bevington P. R., Robinson D. K., 1992, Data Reduction and Error Analysis for the Physical Sciences, 2nd edn. McGraw-Hill, Inc., New York

Chen B., Bastian T. S., Shen C., Gary D. E., Krucker S., Glesener L., 2015, Science, 350, 1238

De Roo R., Misra S., Ruf C., 2007, IEEE Trans. Geosci. Remote Sens., 2706

Dou Y., Gary D. E., Liu Z., Nita G. M., Bong S.-C., Cho K.-S., Park Y.-D., Moon Y.-J., 2009, PASP, 121, 512

Gary D. E., Liu Z., Nita G. M., 2010, PASP, 122, 560

Gary D. E., Nita G. M., Sane N., 2012, Am. Astron. Soc. Meeting Abstr. 220, 204.30

Keane E. F. et al., 2016, Nature, 530, 453

Liu Z., Gary D. E., Nita G. M., White S. M., Hurford G. J., 2007, PASP, 119, 303

McDonough R. N., Whalen A. D., 1995, Detection of Signals in Noise, 2nd edn. Academic Press, New York

Melrose D. B., 2009, in Gopalswamy N., Webb D. F., eds, Proc. IAU Symp. 257, Universal Heliophysical Processes. Cambridge Univ. Press, Cambridge, p. 305

Nita G. M., Gary D. E., 2010a, PASP, 122, 595

Nita G. M., Gary D. E., 2010b, MNRAS, 406, L60

Nita G. M., Gary D. E., 2016, J. Geophys. Res., in press

Nita G. M., Gary D. E., Liu Z., Hurford G. J., White S. M., 2007, PASP, 119, 805

Perley R. A., Chandler C. J., Butler B. J., Wrobel J. M., 2011, ApJ, 739, L1

Rozhansky I. V., Fleishman G. D., Huang G.-L., 2008, ApJ, 681, 1688

Ruf C., Gross S., Misra S., 2006, IEEE Trans. Geosci. Remote Sens., 44, 694

Sirenko E. A., Fleishman G. D., 2009, Astron. Rep., 53, 369