Table 1.

To demonstrate the effectiveness of the emulator, we compare evaluation times (in wall-clock seconds) of the forward and adjoint parts of different approximate physics models available in BORG, benchmarked against a full N-body simulation. These indicative times determine how many BORG samples of initial conditions can be generated per unit time. A comprehensive timing comparison would require optimization for the specific settings and hardware of each method, which is outside the scope of this work. The scenario involves |$128^3$| particles in a cubic volume of side length |$250\,h^{-1}$| Mpc. |$\star$|: Supermicro 4124GS-TNR node with NVIDIA A100 40GiB. |$\dagger$|: 128 cores on a Dell R6525 node with AMD EPYC Rome 7502, using the COLA settings required for sufficiently high accuracy as shown by Stopyra et al. (2024).

Structure formation model                                                       |$\boldsymbol{t}_{\mathrm{forward}}$| [s]    |$\boldsymbol{t}_{\mathrm{adjoint}}$| [s]
1LPT                                                                            |$\lt 10^{-1}$|                              |$\lt 10^{-1}$|
BORG-EM (1LPT |$+$| emulator)|$^{\star}$|                                       1.6                                          2.6
1LPT |$+$| mini-emulator|$^{\star}$|                                            0.4                                          0.6
BORG-PM (COLA, |$n_{\mathrm{steps}} = 20$|, forcesampling |$= 4$|)|$^{\dagger}$|    |$\sim 8$|                               |$\sim 8$|
N-body (P-Gadget-III, |$n_{\mathrm{steps}} = 1664$|)|$^{\dagger}$|              |$\sim 6 \times 10^2$|                       -
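As a rough guide to reading the table, the forward-pass speed-ups relative to the N-body run can be obtained by dividing the tabulated times. The short Python sketch below does this arithmetic. The dictionary `t_forward` and the helper `speedup` are illustrative names that are not part of BORG, and the approximate entries (|$\sim 8$|, |$\sim 6 \times 10^2$|) are treated as point values, which is an assumption. Because the starred and daggered rows were timed on different hardware, the ratios are indicative only.

```python
# Forward-pass wall-clock times in seconds, copied from Table 1.
# Assumption: "~" entries are treated as exact point values.
t_forward = {
    "BORG-EM": 1.6,               # 1LPT + emulator, GPU node
    "1LPT + mini-emulator": 0.4,  # GPU node
    "BORG-PM (COLA)": 8.0,        # 128 CPU cores
    "N-body": 600.0,              # P-Gadget-III, 128 CPU cores
}


def speedup(reference, model, times=t_forward):
    """Ratio of the reference model's time to the given model's time."""
    return times[reference] / times[model]


if __name__ == "__main__":
    for name in ("BORG-EM", "1LPT + mini-emulator", "BORG-PM (COLA)"):
        ratio = speedup("N-body", name)
        print(f"{name}: ~{ratio:.0f}x faster than N-body (forward pass)")
```

With these numbers, the forward pass of BORG-EM is roughly 375 times faster than the N-body run and about 5 times faster than BORG-PM. The same division can be applied to the adjoint column for the models that provide one.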