-
PDF
- Split View
-
Views
-
Cite
Cite
Christopher Michael Bryant, Joshua Alexander Osborne, Amir Shahmoradi, How unbiased statistical methods lead to biased scientific discoveries: A case study of the Efron–Petrosian statistic applied to the luminosity-redshift evolution of gamma-ray bursts, Monthly Notices of the Royal Astronomical Society, Volume 504, Issue 3, July 2021, Pages 4192–4203, https://doi.org/10.1093/mnras/stab1098
- Share Icon Share
ABSTRACT
Statistical methods are frequently built upon assumptions that limit their applicability to certain problems and conditions. Failure to recognize these limitations can lead to conclusions that may be inaccurate or biased. An example of such methods is the non-parametric Efron–Petrosian test statistic used in the studies of truncated data. We argue and show how the inappropriate use of this statistical method can lead to biased conclusions when the assumptions under which the method is valid do not hold. We do so by reinvestigating the evidence recently provided by multiple independent reports on the evolution of the luminosity/energetics distribution of cosmological Long-duration Gamma-Ray Bursts (LGRBs) with redshift. We show that the effects of detection threshold have been likely significantly underestimated in the majority of previous studies. This underestimation of detection threshold leads to severely incomplete LGRB samples that exhibit strong apparent luminosity-redshift or energetics-redshift correlations. We further confirm our findings by performing extensive Monte Carlo simulations of the cosmic rates and the luminosity/energy distributions of LGRBs and their detection process.
1 INTRODUCTION
Gamma-ray bursts (GRBs) are the most violent and energetic stellar explosions in the known Universe. They radiate huge amounts of gamma-ray energy, comparable to the lifetime energy output of the sun, over a short period of time and are often followed by an afterglow at longer wavelengths (e.g. Mészáros 2006; Zhang 2007; Gehrels, Ramirez-Ruiz & Fox 2009). With their energy concentrated in a collimated beam, they can be seen at much higher redshifts than supernovae (SNe). Amongst other things, GRBs are excellent tools to probe the star formation rate (SFR) of the early as well as the recent Universe.
GRBs are generally divided into two categories: Long-soft GRBs (LGRBs) with T90 ≳ 2 [s], and short-hard GRBs (SGRBs) with T90 ≲ 2 [s].* These values are based on population statistics of the Compton Gamma Ray Observatory’s Burst and Transient Source Experiment (BATSE) detector, which was decommissioned in 2000 (Kouveliotou et al. 1993; Shahmoradi & Nemiroff 2011; Shahmoradi 2013b,a; Shahmoradi & Nemiroff 2015). LGRBs are believed to be the result of the collapse of massive stars into a black hole (Woosley 1993), while SGRBs are theorized to be the result of the merger of two neutron stars or of a neutron star and a black hole (Eichler et al. 1989).
Current research attempts to infer an accurate description and distribution profile of various GRB characteristics, in particular, the class of LGRBs due to their abundant redshift measurements compared to the SGRB class. A recent focus in the community has been on the potential cosmological evolution of LGRB luminosity/energetics Liso/Eiso with redshift, as well as estimating the cosmic rates of LGRBs. A popular technique to constrain these is based on the non-parametric method of Efron–Petrosian (Efron & Petrosian 1992; Petrosian 2002), which is widely used to study observational data sets subject to truncation and censorship (e.g. Kocevski & Liang 2006; Singal et al. 2011; Dainotti et al. 2013).
Yu et al. (2015) (hereafter Y15), Petrosian, Kitanidis & Kocevski (2015) (hereafter P15), and Lloyd–Ronning, Aykutalp & Johnson (2019) (hereafter L19) use the method of Efron–Petrosian to deduce the local (redshift-decorrelated) luminosity function ψ(L0) and cosmic GRB formation rate ρ(z) to infer that local GRBs (z < 1) are in excess of the SFR. Pescalli et al. (2016) (hereafter P16) follows the same approach as the previous three, however, does not find an excess of GRBs relative to the SFR at low redshifts. Tsvetkova et al. (2017) (hereafter T17) simply deduces the luminosity function and GRB formation rate.
In this work, we hypothesize and provide evidence that the effects of detection threshold in the aforementioned studies might have been significantly underestimated. This underestimation of the detector threshold results in an artificial correlation between the luminosity/energetics of LGRBs and redshifts. This could, in turn, lead to the conclusion that the GRB formation rate is different from the SFR at any redshift.
This paper is organized as follows. Section 2 details the methodology used by the aforementioned papers to deduce luminosity/energy evolution L(z)/E(z) given a sample of GRBs, which leads to ψ(L0) and ρ(z). Section 3 is a reanalysis of their work (with the exception of P15, for whom we could not locate their data). Section 4 describes our own Monte Carlo simulation of a synthetic population of LGRBs. Finally, Section 5 is a discussion of the results of our reanalysis and the implications of our Monte Carlo simulations.
2 METHODS

Plotting of the 127 GRBs from Y15. Plot (a) shows isotropic luminosity versus redshift. The black line represents the observational limit of Swift assumed by Y15 to be 2.00 × 10−8 erg cm−2 s−1. The purple line represents the linear regression through the data set whose slope is α = 2.04. Plot (b) is the observer frame visualization of the Y15 data set, where the dashed line is the observational limit of Y15. Plot (c) is a histogram of flux, where the dashed line is the observational limit. Plot (d) shows redshift versus L0 = L(z)/(1 + z)2.04, the redshift-independent luminosity. Plot (e) shows a range of possible threshold limits versus α values at τ = 0. The intersection of this line with Y15’s threshold limit is our value for their α. Plot (f) shows a range of possible threshold limits versus τ values at α = 0. Assuming the detection threshold of Y15, a redshift-independent luminosity distribution is rejected at |τ| = 5.18σ. However, choosing a more conservative detection threshold at 2.00 × 10−7 erg cm−2 s−1 yields no evidence for luminosity-redshift evolution.
3 REANALYSES OF PAST WORK
In the following subsections, we attempt to regenerate and reanalyse the findings of several recent papers that present evidence in favour of a strong evolution of the GRB luminosity/energetics with redshift.
3.1 Yu et al. (2015) (Y15)
To better understand the role of Swift’s BAT detector threshold on the conclusions drawn by Y15, here we attempt to reproduce their analysis of Swift data (Table 1). Figs 1(a)–(c) depict the distributions of the observational LGRB sample used in table 1 of Y15. Specifically, the bivariate distribution of the 1 s total isotropic peak flux (Liso) and the redshifts (1 + z) of LGRBs as shown in plot 1(a) exhibits an apparently strong correlation. However, much of this correlation is potentially due to the BAT detection threshold effects on the observational sample of LGRBs. To quantify and eliminate the effects of detector threshold, Y15 use the proposed non-parametric methodology of Efron & Petrosian (1992) by assuming a parametrization as seen in equation (2) for the luminosity-redshift dependence in the LGRB data, such that the resulting distribution of L0 becomes independent of redshift. We note an apparent inconsistency between equation (5) that we have extracted from Efron & Petrosian (1992) and equation 9 of Y15, where summations are taken in different places.
Using the same value as Y15 for Fmin yields quantitatively similar α and τ values.
. | Y15 Values . | Reanalysis values . |
---|---|---|
α | |$2.43^{+0.41}_{-0.38}$| | |$2.04^{+0.36}_{-0.37}$| |
τ | 4.7σ | 5.18σ |
Fmin | 2.0 × 10−8 erg cm−2 s−1 | 2.00 × 10−8 erg cm−2 s−1 |
. | Y15 Values . | Reanalysis values . |
---|---|---|
α | |$2.43^{+0.41}_{-0.38}$| | |$2.04^{+0.36}_{-0.37}$| |
τ | 4.7σ | 5.18σ |
Fmin | 2.0 × 10−8 erg cm−2 s−1 | 2.00 × 10−8 erg cm−2 s−1 |
Using the same value as Y15 for Fmin yields quantitatively similar α and τ values.
. | Y15 Values . | Reanalysis values . |
---|---|---|
α | |$2.43^{+0.41}_{-0.38}$| | |$2.04^{+0.36}_{-0.37}$| |
τ | 4.7σ | 5.18σ |
Fmin | 2.0 × 10−8 erg cm−2 s−1 | 2.00 × 10−8 erg cm−2 s−1 |
. | Y15 Values . | Reanalysis values . |
---|---|---|
α | |$2.43^{+0.41}_{-0.38}$| | |$2.04^{+0.36}_{-0.37}$| |
τ | 4.7σ | 5.18σ |
Fmin | 2.0 × 10−8 erg cm−2 s−1 | 2.00 × 10−8 erg cm−2 s−1 |
The value of exponent that we infer from the data set of Y15 is consistent with, although not the same as, their inferred α exponent. Y15 found a value of |$\alpha = 2.43^{+0.41}_{-0.38}$| in their analysis. In our reanalysis of their work, we find a value of |$\alpha = 2.04^{+0.36}_{-0.37}$|. However we do believe that their assumption for the value of the flux lower limit of equation (7) is likely a severe underestimation of the detection threshold of Swift’s BAT. In reality the detection threshold limit of the BAT detector is far more complex than a simple hard cutoff, but rather, a fuzzy range arising from a multitude of factors.
The Swift satellite is very well known for its immensely complex triggering algorithm. To our knowledge, it is comprised of at least three separate detection mechanisms that complement each other (e.g. Fenimore et al. 2003):
The first type of trigger is for short time-scales (4 to 64 ms). These are traditional triggers (single background), for which about 25 000 combinations of time-energy-focal plane subregions are checked per second.
The second type of trigger is similar to HETE: fits multiple background regions to remove trends for time-scales between 64 ms and 64 s. About 500 combinations for these triggering mechanisms are checked per second. For these rate triggers, false triggers and variable non-GRB sources are also rejected by requiring a new source to be present in an image.
The third type of trigger works on longer time-scales (minutes) and is based on routine images that are made of the field of view.
The entire complexity of the detection mechanism of Swift’s BAT, as mentioned above, is summarized in a single value, equation (7), in the work of Y15. The consequences of choosing this value is most apparent in Figs 1(b) and (c), where the data set of detected LGRBs virtually ends before reaching the detection threshold line. This implies that the observational LGRB data set is not constrained by the assumed detection threshold of Y15, which is counterintuitive. We do expect the threshold to soft-truncate the data set, and this truncation likely occurs closer to the central peak of the histogram of Fig. 1(c).
This severe underestimation of the detection threshold of BAT by Y15 is very well seen in Fig. 1(d) where we plot the redshift-corrected isotropic peak luminosity versus redshift. The solid black line in this plot represents the redshift-corrected detection threshold. In agreement with our hypothesis in the previous paragraph, we observe that the resulting redshift-corrected detection threshold of Y15 resembles almost a flat line at high redshifts. This is another indication that the inferred relationship between (1 + z) and Liso is likely heavily influenced by the improperly modelled detection threshold of BAT.
In Figs 1(e) and (f) the calculation of α and τ are taken a step further, this time by varying the detector threshold limit, Fmin. Fig. 1(e) depicts how α varies with detector threshold when τ equals zero, while Fig. 1(f) depicts how τ varies with detector threshold when α equals zero. Both plots can therefore show for what value of detector threshold both α and τ equal zero (indicating zero correlation between Liso and z), however, this is only highlighted in Fig. 1(f) with the blue dashed line. In other words, how a more conservative choice of detector threshold can remove the calculated correlation entirely. For the Y15 data set, it takes only one order of magnitude to fully remove the correlation.
To gain a better insight into the effects of detection threshold on observational data, we have reproduced parts of the results of Shahmoradi & Nemiroff (2015) in Fig. 2. This figure illustrates well the subtle fuzzy effects of the BATSE detector threshold on the observed distribution of the energetics of BATSE LGRBs and SGRBs, including the observed peak flux distribution for which a sharp detection threshold cutoff is frequently assumed. The detection threshold causes a deviation in the observed data from the inferred underlying population, to begin just to the right of the histogram peak (the solid lines). This deviation becomes more significant as one moves leftward. Y15 chose their detector threshold to begin far to the left of the histogram peak, when in reality, it should have been chosen close to the peak. If their choice of detector threshold is taken seriously, then its implications are profound: there is a suspicious lack of faint LGRBs in the Universe that are not the result of detector threshold cutoff, or Malmquist bias. This is why the cutoff must be closer to the central peak.

The red and green colours represent data and model for LGRBs and SGRBs, respectively, for BATSE catalogue GRBs. The solid curves illustrate the projection of the multivariate GRB world models of Shahmoradi (2013b), Shahmoradi & Nemiroff (2015) on the BATSE-catalogue peak flux Pbol distribution, subject to the BATSE detection threshold. The colour-shaded areas represent the 90 per cent prediction intervals for the distribution of BATSE data. The dashed lines represent the predicted underlying populations of LGRBs and SGRBs, respectively based on the multivariate GRB world models of Shahmoradi (2013b), Shahmoradi & Nemiroff (2015).
Finally, we turn our attention to Y15’s Monte Carlo simulations, which seemingly confirms their results. Therein, they begin with their inferred value of α to simulate a distribution of LGRBs following the relationship of equation (2) with α = 2.43. They find that the synthetic data and the observed data have similar distributions. This is, however, no surprise considering their simulation was based on the derived results of their observational analysis and the assumed potentially underestimated detection threshold. In other words, their Monte Carlo simulation proves the self-consistency of the Efron–Petrosian statistic and their analysis, but falls short of verifying the accuracy of the detection threshold assumption made in their analysis. Therefore, this circular reasoning concludes that the two observed and synthetic distributions are similar without validating the accuracies of the assumptions made in their work.
The primary conclusion of Y15, of an excess of GRBs at low redshift (z < 1) compared to the SFR, also contradicts previous studies based on the properties of GRB host galaxies. In point of fact, Vergani et al. (2015), Perley et al. (2015, 2016a, b), Krühler et al. (2015), and Schulze et al. (2015) performed spectroscopic and multiwavelength studies on the properties (stellar masses, luminosities, SFR, and metallicity) of GRB host galaxies of various complete GRB samples and compared them to those of the star-forming galaxies selected by galaxy surveys. Their collective results indicate that at low redshift (z < 1) only a small fraction of the star formation produces GRBs (Pescalli et al. 2016).
3.2 Pescalli et al. (2016) (P16)
A value for Fmin was chosen such that the resulting α value matches P16’s value. Using a Band model with Plim was unsuccessful.
. | P16 values . | Reanalysis values . |
---|---|---|
α | 2.5 | |$2.50^{+0.40}_{-0.44}$| |
τ | Not given | 4.99σ |
Fmin | Given as Plim | 8.10 × 10−8 erg cm−2 s−1 |
. | P16 values . | Reanalysis values . |
---|---|---|
α | 2.5 | |$2.50^{+0.40}_{-0.44}$| |
τ | Not given | 4.99σ |
Fmin | Given as Plim | 8.10 × 10−8 erg cm−2 s−1 |
A value for Fmin was chosen such that the resulting α value matches P16’s value. Using a Band model with Plim was unsuccessful.
. | P16 values . | Reanalysis values . |
---|---|---|
α | 2.5 | |$2.50^{+0.40}_{-0.44}$| |
τ | Not given | 4.99σ |
Fmin | Given as Plim | 8.10 × 10−8 erg cm−2 s−1 |
. | P16 values . | Reanalysis values . |
---|---|---|
α | 2.5 | |$2.50^{+0.40}_{-0.44}$| |
τ | Not given | 4.99σ |
Fmin | Given as Plim | 8.10 × 10−8 erg cm−2 s−1 |
P16 proceeds in a similar fashion to Y15, beginning with the observational data set of LGRBs found in their table B.1. We extract from this data set 81 LGRBs that have both redshift and isotropic peak luminosity (Liso) values for our reanalysis. They use the Efron–Petrosian τ statistic to find an α value of α = 2.5, consistent with the results of Y15. This provides the functional form of the luminosity evolution with redshift L(z) = L0(1 + z)2.5. From here they proceed with Lynden–Bell’s c− method to derive the cumulative luminosity function Φ(L0) and the LGRB formation rate ρ(z).
Given the lack of sufficient details about the approach proposed by P16 and the fact they find almost no difference between the traditional approach to computing the Efron–Petrosian statistic and their proposed method, here we follow the traditional formal technique for computing the τ statistic. However, since P16 do not provide an effective bolometric energy flux limit for their sample in units of (erg cm−2 s−1), we searched for an effective energy flux detection threshold that would yield a τ statistic comparable to what P16 find (Table 2).
Therefore, we conclude that the effective detection threshold (Fmin = 8.10 × 10−8 erg cm−2 s−1) that we previously inferred directly from the Efron–Petrosian statistic should likely resemble more the flux limit that is used but not clearly discussed or shown in the work of P16. However, the trade off in choosing this value is that the limit appears to be underestimated, as can be seen in Figs 3(a)–(c). It is not possible to choose a value of Ep that both yields an α value in agreement with P16 and does not appear to underestimate the detector threshold flux limit.

Plotting of the 81 GRBs from P16. Plot (a) shows redshift versus isotropic luminosity. The black line represents the observational limit of Swift, which has been deduced from P16 to be 8.10 × 10−8 erg cm−2 s−1. The purple line represents the linear regression through the data set whose slope is α = 2.50. Plot (b) is the observer frame visualization of the P16 data set, where the dashed line is the observational limit. Plot (c) is a histogram of flux, where the dashed line is the observational limit. Plot (d) shows redshift versus L0 = L(z)/(1 + z)2.50, the redshift-independent luminosity. Plot (e) shows a range of possible threshold limits versus α values at τ = 0. The intersection of this line with P16’s supposed threshold limit is our value for their α. Plot (f) shows a range of possible threshold limits versus τ values at α = 0. Assuming the detection threshold inferred from the analysis of P16, a redshift-independent luminosity distribution is rejected at |τ| = 4.99σ. However, choosing a more conservative detection threshold at 3.94 × 10−7 erg cm−2 s−1 yields no evidence for luminosity-redshift evolution.
Assuming this chosen value for Fmin is indeed an appropriate approximation for that used by P16, we are again faced with an underestimation of the true effective value of the detection threshold of Swift, similar to Y15. This can be clearly seen in Fig. 3(d) where we plot the redshift-corrected isotropic peak luminosity versus redshift, and the solid black line represents the redshift-corrected detection threshold. We observe that this threshold resembles almost a flat line at high redshifts, indicating that the inferred relationship between (1 + z) and Liso is likely heavily influenced by the improperly modelled detection threshold of BAT. In Fig. 3(f), it can be seen that the reported correlation can be removed entirely with a more conservative value of Fmin = 3.94 × 10−7 erg cm−2 s−1.
In addition to potential underestimation of the detection threshold, the observational data in the work of P16 also appears to not have been homogeneously collected. Looking at Fig. 3(c), the histogram of data appears to be multimodal/incomplete, implying the presence of some, yet-unknown, selection effects in the process of constructing this data set.
Finally, we turn our attention to P16’s Monte Carlo simulation, which seems to confirm their results. Unlike Y15, P16 avoid a circular logic inference in their simulations by assuming different Fmin (5 × 10−8 erg cm−2 s−1) and α (2.2) values from their methodology and results. They are able to successfully recover the GRB formation rate and luminosity function that they adopted for their simulated sample.
They further test the consequences of sample incompleteness in two approaches. In the first approach they randomly remove a fraction of the bursts close to Fmin. In the second approach, they lower Fmin by a factor of 5, creating an underestimation of its value. Both approaches artificially create sample incompleteness. The result of both realizations of sample incompleteness is to flatten out the GRB formation rate at low redshift, creating the illusion of an excess of low-redshift GRBs relative to the SFR. This result contradicts the simulations and findings of Y15 and corroborates our conclusions in Section 3.1.
3.3 Tsvetkova et al. (2017) (T17)
Using T17’s value for Fmin does not reproduce their α value, so an Fmin was chosen that does.
. | T17 Values . | Reanalysis values . |
---|---|---|
α | 1.7 | |$1.70^{+0.34}_{-0.26}$| |
τ | 1.2σ | 5.46σ |
Fmin | 2 × 10−6 erg cm−2 s−1 | 7.95 × 10−7 erg cm−2 s−1 |
. | T17 Values . | Reanalysis values . |
---|---|---|
α | 1.7 | |$1.70^{+0.34}_{-0.26}$| |
τ | 1.2σ | 5.46σ |
Fmin | 2 × 10−6 erg cm−2 s−1 | 7.95 × 10−7 erg cm−2 s−1 |
Using T17’s value for Fmin does not reproduce their α value, so an Fmin was chosen that does.
. | T17 Values . | Reanalysis values . |
---|---|---|
α | 1.7 | |$1.70^{+0.34}_{-0.26}$| |
τ | 1.2σ | 5.46σ |
Fmin | 2 × 10−6 erg cm−2 s−1 | 7.95 × 10−7 erg cm−2 s−1 |
. | T17 Values . | Reanalysis values . |
---|---|---|
α | 1.7 | |$1.70^{+0.34}_{-0.26}$| |
τ | 1.2σ | 5.46σ |
Fmin | 2 × 10−6 erg cm−2 s−1 | 7.95 × 10−7 erg cm−2 s−1 |
In T17, the authors explore a data set of GRBs detected in the triggered mode of the Konus-Wind experiment. Beginning with 150 mixed-type GRBs, they prune the data set down to 137 by removing the Type I (short) GRBs as well as GRB 081203A. It is not explained why GRB 081203A is excluded.

Plotting of the 137 GRBs from T17. Plot (a) shows redshift versus isotropic luminosity. The black line represents the observational limit of Swift, which has been deduced from T17 to be 7.95 × 10−7 erg cm−2 s−1. The purple line represents the linear regression through the data set whose slope is α = 1.70. Plot (b) is the observer frame visualization of the T17 data set, where the dashed line is the observational limit. Plot (c) is a histogram of flux, where the dashed line is the observational limit. Plot (d) shows redshift versus L0 = L(z)/(1 + z)1.70, the redshift-independent luminosity. Plot (e) shows a range of possible threshold limits versus α values at τ = 0. The intersection of this line with T17’s supposed threshold limit is our value for their α. Plot (f) shows a range of possible threshold limits versus τ values at α = 0. Assuming the detection threshold inferred from the analysis of T17, a redshift-independent luminosity distribution is rejected at |τ| = 5.46σ. However, choosing a more conservative detection threshold at 2.12 × 10−6 erg cm−2 s−1 yields no evidence for luminosity-redshift evolution.
As can be seen in Figs 4(b) and (c), the analysis of T17 also appears to suffer from an underestimation of the detector threshold limit of Swift. Again, we expect the limit to be closer to the central peak of the histogram, soft-truncating the data set. Otherwise, such a sharp drop in the count of LGRB events before reaching the detection threshold would have truly fundamental and revolutionary implications about the cosmic rates of LGRBs.
We note that T17’s underestimation of the detector threshold limit does not appear to be as severe as Y15 or P16. Once the luminosity evolution has been removed, the detector threshold cut in the (1 + z) − L0 plane does not become flat at high redshift, as can be seen in Fig. 4(d).
Also of note is the disparity in the significance of rejecting no luminosity-redshift evolution between our results and those of T17. Fig. 4(f) shows our inferred significance (τ = 5.46σ), which is hard to reconcile with the inferred significance τ = 1.2σ in T17. This figure also highlights that a more conservative value of Flim = 2.12 × 10−6 erg cm−2 s−1 removes the apparent correlation between Liso and z entirely.
3.4 Lloyd-Ronning et al. (2019) (L19)
Using L19’s value for Fmin does not reproduce their α value, and a reasonable Fmin could not be chosen that does, so instead one was deduced to visually match the threshold cut on their graph. This yields a significantly different α value, however.
. | L19 Values . | Reanalysis values . |
---|---|---|
α | 2.3 ± 0.5 | |$1.31^{+0.21}_{-0.20}$| |
τ | 5.6σ | 6.01σ |
Fmin | 2 × 10−6 erg cm−2 | 1.60 × 10−7 erg cm−2 |
. | L19 Values . | Reanalysis values . |
---|---|---|
α | 2.3 ± 0.5 | |$1.31^{+0.21}_{-0.20}$| |
τ | 5.6σ | 6.01σ |
Fmin | 2 × 10−6 erg cm−2 | 1.60 × 10−7 erg cm−2 |
Using L19’s value for Fmin does not reproduce their α value, and a reasonable Fmin could not be chosen that does, so instead one was deduced to visually match the threshold cut on their graph. This yields a significantly different α value, however.
. | L19 Values . | Reanalysis values . |
---|---|---|
α | 2.3 ± 0.5 | |$1.31^{+0.21}_{-0.20}$| |
τ | 5.6σ | 6.01σ |
Fmin | 2 × 10−6 erg cm−2 | 1.60 × 10−7 erg cm−2 |
. | L19 Values . | Reanalysis values . |
---|---|---|
α | 2.3 ± 0.5 | |$1.31^{+0.21}_{-0.20}$| |
τ | 5.6σ | 6.01σ |
Fmin | 2 × 10−6 erg cm−2 | 1.60 × 10−7 erg cm−2 |
To resolve the disagreement between our inferred value of detection threshold used by L19 and the reported value in their study, we instead settle on visually matching the threshold cut in fig. 1 of L19 to obtain a limit of 1.60 × 10−7 erg cm−2. This yields Fig. 5(a) which looks remarkably similar to the corresponding fig. 1 of L19. Assuming this detection threshold, we obtain a value of |$\alpha = 1.31^{+0.21}_{-0.20}$| using the Efron–Petrosian τ statistic (Table 4).

Plotting of the 376 GRBs from L19. Plot (a) shows redshift versus isotropic emitted energy. The black line represents the observational limit of Swift, which has been deduced from L19 to be 1.60 × 10−7 erg cm−2. The purple line represents the linear regression through the data set whose slope is α = 1.31. Plot (b) is the observer frame visualization of the L19 data set, where the dashed line is the observational limit. Plot (c) is a histogram of fluence, where the dashed line is the observational limit. Plot (d) shows redshift versus E0 = E(z)/(1 + z)1.31, the redshift-independent effective energy. Plot (e) shows a range of possible threshold limits versus α values at τ = 0. The intersection of this line with L19’s supposed threshold limit is our value for their α. Plot (f) shows a range of possible threshold limits versus τ values at α = 0. Assuming the detection threshold inferred from the analysis of L19, a redshift-independent luminosity distribution is rejected at |τ| = 6.01σ. However, choosing a more conservative detection threshold at 1.04 × 10−6 erg cm−2 yields no evidence for luminosity-redshift evolution.
If our inferred detection threshold is indeed the value used by L19 in their study, then Figs 5(b) and (c), lead us to conclude that the detection threshold has been likely severely underestimated in the study of L19. The effective threshold represented by the dashed line in the histogram of Fig. 5(c) is well to the left of the peak of the distribution of LGRB fluences.
Our conclusion in the above is further confirmed by Fig. 5(d), where we plot the redshift-corrected isotropic effective energy versus redshift. The solid black line in this plot represents the redshift-corrected detection threshold and almost resembles a flat line at high redshifts. This is another indication that the inferred relationship between (1 + z) and Eiso is likely heavily influenced by the improperly modelled detection threshold. In this study, however, the effective threshold cut represent the combined effects of the detection thresholds of multiple satellites due to the heterogenous collection of data from multiple independent GRB catalogues. All of these raise the possibility that L19’s single-valued threshold is likely a severe oversimplification of the complex merger of individual satellite thresholds, while none of the individual satellite thresholds might be well represented by a single-valued hard cutoff.
Fig. 5(e) shows a range of possible threshold limits versus α values at τ = 0. One can see the red line approaches L19’s value of α = 2.3 on the far left edge of the graph. Fig. 5(f) indicates that Eiso and z are correlated at 6.01σ significance assuming our inferred threshold has been used by L19. Although this significance is not the same as the value reported by L19, 5.6σ, it is similar. Note, however, that this inferred detection threshold still severely underestimates the actual combined effects of multiple detection thresholds on the heterogenous data set of L19. As can be seen in the figure, a more conservative choice of Fmin = 1.04 × 10−6 erg cm−2 removes the apparent correlation between Eiso and z entirely.
4 MONTE CARLO SIMULATIONS OF LUMINOSITY-REDSHIFT EVOLUTION
To further investigate the effects of detection threshold underestimation on the inferred luminosity-redshift evolution of LGRBs, we also create a Monte Carlo universe of LGRBs. The premise of our simulation is that the LGRB world is devoid of any luminosity-redshift or energetics-redshift correlations. Therefore, the application of the Efron–Petrosian statistic on any collection of events measured from such Monte Carlo universe of LGRBs, subjected to a given simulated detection threshold, should also accurately predict no energetics/luminosity-redshift correlations for the intrinsic underlying population of LGRBs in the Monte Carlo universe.
4.1 The LGRB world model
We use the BATSE catalogue of 1366 LGRBs (Paciesas et al. 1999; Shahmoradi & Nemiroff 2010; Goldstein et al. 2013) to construct a Bayesian hierarchical model (Shahmoradi 2017) of the cosmic distribution of LGRBs in the universe, subject to an accurate modelling of the detection threshold of BATSE Large Area Detectors (LADs) and data uncertainties. Then, we use a variant of the adaptive Markov Chain Monte Carlo techniques to sample the resulting posterior distribution of the parameters of the hierarchical model (Shahmoradi 2019; Shahmoradi, Bagheri & Osborne 2020; Shahmoradi & Bagheri 2020a,b). Details of model construction and sampling are extensively discussed in the aforementioned papers (e.g. Shahmoradi 2013b; Osborne et al. 2020).
4.2 The Monte Carlo universe of LGRBs
Once the best-fitting parameters of the LGRB world model are inferred, we create a Monte Carlo universe of LGRBs by randomly and repeatedly generating LGRB events whose characteristics are distributed according to the LGRB World model constructed in Section 4.1. For each LGRB event synthesis, we use a set of model parameters randomly drawn from the posterior distribution of the LGRB world model parameters explored in Section 4.1. Then, each LGRB passes through the simulated LGRB detection process of the BATSE LADs.
An illustration of the resulting Monte Carlo universe of LGRBs is provided in Fig. 6. The two plots represent the joint distributions of Eiso/Liso and redshift. Clearly, the BATSE LADs create a rather sharp cut on the synthesized z − Liso sample of LGRBs compared to the distribution of z − Eiso, which exhibits much fuzzier detection threshold effects. This is expected and reassuring, since the BATSE LADs are primarily triggered on the peak photon flux at different time-scales.

An illustration of the Monte Carlo universe of LGRBs constructed in Section 4.2 using the B10 rate. Each point in this plot represents one synthetic LGRB. The magenta colour represents a high probability of detection while the cyan represents a low probability of detection.
We note that the specific shape of the energetics or redshift distribution of LGRBs or the specific detection mechanism of LGRBs in our Monte Carlo simulations has no relevance or effects on our assessment of the utility and accuracy of the Efron–Petrosian statistic. All that is important here, is the lack of any a priori correlations between the energetics and the redshifts of LGRBs in our Monte Carlo simulations.
Using our Monte Carlo universe of LGRBs, we generate a random sample of 380 BATSE-detectable LGRBs. This sample size is comparable to the size of the observational data sets collected and analysed in previous studies. We flag an LGRB as detectable by generating a uniform-random number between 0 and 1 and comparing it to the probability of detection of the LGRB. If the probability of detection is higher than the randomly generated number, we include the event in the sample of detected LGRBs for our analysis.
4.3 Analysis of synthetic Monte Carlo data

An illustration of the effects of detection threshold on the outcome of the Efron–Petrosian test statistic. Plot (a) Shows the synthetic data set used for our study of (Eiso, 1 + z) correlation. The solid black line represents the detector threshold at 50 per cent while the dashed black line represents a detector threshold comparable to that of L19 at 99 per cent probability of detection. The colour bar represents the probability of detection by the BATSE LADs where cyan and magenta represent 0 per cent and 100 per cent chances of detection, respectively. Plot (b) shows the redshift-evolution corrected data set based off of the value of alpha calculated using the detector threshold at 50 per cent probability of detection. Plot (c) shows the alpha value calculated corresponding to τ = 0 with varying detection threshold limits. The solid black line represents the detector threshold at 50 per cent, the black circle is the average α value over 50 generated samples for the specific threshold used, while the dashed black line represents the detector threshold at 99 per cent probability of detection, comparable to those of previous studies, and the black square is the average α over 50 generated samples at τ = 0. Plot (d) displays the τ statistic at α = 0. The black line represents the detector threshold at 50 per cent detection probability and the dashed black line represents the detection threshold at 99 per cent detection probability, comparable to those of previous studies. The circle and square in this figure are the average τ values over 50 generated samples.

An illustration of the effects of detection threshold on the outcome of the Efron–Petrosian test statistic. Plot (a) shows the synthetic data set used for our study of (Liso, 1 + z) correlation. The solid black line represents the detector threshold at 50 per cent while the dashed black line represents a detector threshold comparable to that of L19 at 99 per cent probability of detection. The colour bar represents the probability of detection by the BATSE LADs where cyan and magenta represent 0 per cent and 100 per cent chances of detection, respectively. Plot (b) shows the redshift evolution corrected data set based off of the value of alpha calculated using the detector threshold at 50 per cent probability of detection. Plot (c) shows the alpha value calculated corresponding to τ = 0 with varying detection threshold limits. The solid black line represents the detector threshold at 50 per cent, the black circle is the average α value over 50 generated samples for the specific threshold used, while the dashed black line represents the detector threshold at 99 per cent probability of detection, comparable to those of previous studies, and the black square is the average α over 50 generated samples at τ = 0. Plot (d) displays the τ statistic at α = 0. The black line represents the detector threshold at 50 per cent detection probability and the dashed black line represents the detection threshold at 99 per cent detection probability, comparable to those of previous studies. The circle and square in this figure are the average τ values over 50 generated samples.
To further illustrate this, we consider a lower detection threshold, comparable to the value used in L19. We note that a direct application of the assumed detection threshold of L19 to our analysis is not possible since the data set used in L19 has been collected from multiple heterogenous sources, as opposed to our synthetic homogenously detected LGRB sample. However, a detection threshold equivalent to that of L19 can be obtained in our analysis by noting that the detection threshold cutoff assumed in the study of L19 is above only three individual LGRB events. This comprises less than |$1{{\ \rm per\ cent}}$| of the entire data set of 376 LGRBs in L19.
We therefore, choose our detection threshold limit such that only |$1{{\ \rm per\ cent}}$| of our synthetic sample falls below the assumed detection threshold hard cutoff, similar to that of L19. This yields a value of α = 1.55 ± 0.21 which is depicted and illustrated in Fig. 7(c). This inferred non-zero correlation at 7σ significance clearly contradicts the fundamental assumption of our Monte Carlo simulation, and confirms our hypothesis that an underestimation of the detection threshold can readily bias the Efron–Petrosian test statistic.
As seen in Fig. 8, we repeat the above analysis for the joint distribution of z − Liso with a detection threshold hard cutoff set at 50 per cent probability of detection: 1.88 × 10−7 ergs cm−2 s−1. We find α = −0.04 ± 0.24 at this probability of detection. However, when we use a detection threshold comparable to those of Y15, P16, and T17, we find α = 1.72 ± 0.16 at >10σ significance, again contradicting the a priori assumption of our Monte Carlos universe of LGRBs.
5 DISCUSSION
In this work, we re-analysed several previous studies on the evolution of the luminosity/energetics of LGRBs with redshift. To be consistent with the previous studies, we used the method of Efron–Petrosian and the τ statistic to determine the exponent of the power-law relationship, α in g(z) = (1 + z)α, between the luminosity/energetics and redshifts of LGRBs.
Contrary to the previous studies, we conclude that the effects of the detection threshold has been likely severely underestimated. We further confirm our conclusion via Monte Carlo simulation, where we assume no correlation between the energetics and the redshift of LGRBs. We then measure these simulated LGRBs via the BATSE detector (for its simplicity). Our finding is that an underestimation of the effective detection threshold by even less than a factor of two can create artificial correlations between the redshifts and the luminosity/energetics of LGRBs. The Monte Carlo simulation of P16 also shows that an underestimation of the detector flux limit can lead to apparent artificial correlations between luminosity and redshift. They further show that this effect can also give rise to an apparent overabundance of LGRBs at low redshift.
The regression slope (α, on a log–log plot) of the reported correlations between the redshift and the luminosity/energetics of previous studies also resembles their chosen detection threshold (Figs 1a, 3a, and 5a). This further corroborates our hypothesis that the observed correlations are an artefact of the individually chosen detection thresholds of the various gamma-ray detectors.
A more accurate study of the luminosity/energetics-redshift evolution requires a more careful modelling of the detection threshold of gamma-ray detectors, where the detection threshold is not a single cutoff on the distribution of LGRBs but rather a dispersed set of detection probabilities in the entire bivariate distribution. However, such a modelling approach is impossible with the original method of Efron–Petrosian and requires parametric modelling of the luminosity/energetics and redshift distribution as well as the detection threshold.
Le, Ratke & Mehta (2020) uses purely parametric methods to determine the GRB formation rate ρ(z). They find that there is no deviation from the SFR at any redshift for the complete unbiased LGRB Swift-Perley and Swift-Ryan-b samples. They do, however, find an excess at low redshift (z < 1) for the Swift-Ryan 2012 sample, and conclude that the reason for this excess is either incomplete sample size, that ρ(z) does not trace SFR at low redshift, or that it is simply unclear.
Deng et al. (2016) run an in-depth analysis using both the τ statistic and parametric modelling and find that observational biases may lead to an overestimation of the α value, and that the evidence for luminosity evolution with redshift is too weak to argue. They note several biases that can lead to data sets being incomplete in redshift, such as the flux truncation effect, trigger probability, and redshift measurement. They too argue that the true instrument threshold for a GRB is complicated, as we have. Coward et al. (2013) go into further detail on how the biases of redshift measurement are even more complicated, noting factors such as galactic dust extinction, redshift desert, and host galaxy extinction that contribute, independent of the brightness of a GRB. They argue that the observed redshift distribution is compatible with a GRB formation rate ρ(z) that tracks the SFR.
The premise of the previous studies in Section 3 has been to provide a non-parametric investigation of the luminosity/energetics versus redshift. However, upon performing the non-parametric correlation, the majority of these investigations rely on parametric fitting of the luminosity/energetics and redshift distributions, which convolutes the premise. Indeed, Lan et al. (2019) presents a fully parametric study of the redshift/energetics evolution and reports a potential correlation between the two, but nevertheless, their study is founded on the assumption of a simple hard cutoff of the detection threshold of LGRBs. A fully parametric study of the correlation which incorporates a more accurate and detailed description of the detection threshold of gamma-ray detectors remains to be done.
ACKNOWLEDGEMENTS
The authors thank the anonymous referee for their valuable feedback.
DATA AVAILABILITY
All of the data used in this paper, as well as the Matlab code used for our analysis and Monte Carlo simulation, can be found here: https://github.com/cdslaborg/lgrbEnergeticsRedshiftCorrelation. The original data can also be found in the individual four papers analysed in Section 3.
REFERENCES
Author notes
T90 is the duration over which a burst emits from 5 to 95 per cent of its total measured gamma-ray photon flux.