-
PDF
- Split View
-
Views
-
Cite
Cite
Satoko Yamamoto, Junichiro Makino, Hermite integrator for high-order mesh-free schemes, Publications of the Astronomical Society of Japan, Volume 71, Issue 1, January 2019, 18, https://doi.org/10.1093/pasj/psy137
- Share Icon Share
Abstract
In most mesh-free methods, the calculation of interactions between sample points or “particles” is the most time-consuming. When we use mesh-free methods with high spatial orders, the order of the time integration should also be high. If we use usual Runge–Kutta schemes, we need to perform the interaction calculation multiple times per time step. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of differential equations, since Hermite schemes require a smaller number of interaction calculations than Runge–Kutta schemes do to achieve the same order. In this paper, we construct a Hermite scheme for a mesh-free method with high spatial orders. We performed several numerical tests with fourth-order Hermite schemes and Runge–Kutta schemes. We found that, for both Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial derivatives, for time steps smaller than the stability limit. The calculation cost at the time-step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.
1 Introduction
Lagrangian mesh-free methods, in which particles move following the motion of fluid, have been widely used for astrophysical hydrodynamical simulations. In most mesh-free methods, the calculation of interactions between particles is the most time-consuming part. Typically, one particle interacts with ∼ 100 neighbor particles, and thus the cost of interaction calculations dominates the total calculation cost. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of differential equations, since Hermite schemes require a smaller number of interaction calculations than Runge–Kutta schemes do to achieve the same order.
In the field of stellar dynamics, the fourth-order Hermite scheme (Makino 1991; Makino & Aarseth 1992) is widely used for high-order integration. The basic idea of the Hermite scheme is to calculate the time derivative of gravitational acceleration directly, and use it to construct a high-order interpolation polynomial. If we calculate up to a pth-order time-derivative directly, we can achieve the order of s(p + 1) when we use the s-step linear multi-step method, and in the case of s = 2, we can achieve the order of 2(p + 1). The two-step linear multi-step method can be formulated so that it requires only one force evaluation per time step. In the case of a grid-based scheme for hydrodynamics, Aoki (1997) described a method based purely on the Taylor expansion, which achieves the order p + 1.
In this paper, we combine the Hermite scheme with Consistent Particle Hydrodynamics in Strong Form (CPHSF: Yamamoto & Makino 2017), which is one of the high-order mesh-free methods. One disadvantage of the Hermite scheme is that, even though it requires a smaller number of interaction calculations, the calculation cost of one interaction is higher because we need to calculate high-order derivations. In the case of CPHSF or other moving least squares-based interpolation, high-order interpolation polynomial gives spatial derivatives, and we only need to convert spatial derivatives to time derivations using the original differential equations. Thus, the increase of the calculation cost is small and independent of the number of neighbors.
We performed several numerical tests. Fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes are used for the test with a periodic boundary, and an implicit Hermite scheme, an implicit fourth-order Runge–Kutta scheme and the backward-Euler scheme are used for the test with boundary conditions. We found that, for both the Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial derivatives, for time steps smaller than the stability limit. The calculation cost at the time-step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.
In the rest of this paper, we first present the formulation of the Hermite scheme for CPHSF in section 2, and report the results of numerical tests in section 3. We summarize our study in section 4.
2 Derivation of the high-order scheme
In this section, we present the derivation of the fourth-order Hermite schemes for CPHSF.
2.1 Hermite scheme
2.2 Derivation of high-order time-derivatives for hydrodynamical equations
2.3 Calculation cost for high-order time-derivatives
For the fourth-order Hermite time-integrations, we must derive second spatial order derivatives of physical quantities to calculate jerk, snap, and crackle. However, if we use spatial high-order mesh-free methods (e.g., CPHSF), the additional number of arithmetic operations of jerk, snap, and crackle is much smaller than the original number of calculations for the spatial high-order mesh-free method.
In this section, we compare the original number of arithmetic operations and the additional number of the operations necessary for the Hermite scheme. First, we show how to derive the spatial high-order derivatives of a physical quantity f. Secondly, the original number of arithmetic operations of CPHSF is derived. We call this value Nop. Note that we assume that Nop comprises only the number of operations for the evaluation of the inverse matrix of Bi in equation (35) and interaction calculation between particles since these dominate the total calculation cost of CPHSF. Thirdly, the additional number of arithmetic operations for jerk, snap, and crackle is derived. We call this value Nadd. Finally, we compare Nop and Nadd. To obtain the number of arithmetic operations, we calculate the number of floating-point operations per particle of CPHSF. If a quantity has been derived, we assume that it will not be unnecessarily recalculated. We assume that the numbers of floating-point operations required to evaluate division and square root are both 20.
where i and j are indices of particles, m and α are integers, np and Wij are the spatial order of the scheme and a Kernel function, and xij, yij, and |$z$|ij are xj − xi, yj − yi, and |$z$|j − |$z$|i.
for one, two, and three dimensions, respectively. Table 1 shows the summary of the numbers of floating-point operations for one interaction calculation of CPHSF.
Numbers of floating-point operations for one interaction calculation of CPHSF.
Process . | D = 1 . | D = 2 . | D = 3 . |
---|---|---|---|
N dist | ≃ 22 | ≃ 45 | ≃ 48 |
N kernel | ≃ 33 | ≃ 35 | ≃ 36 |
N sf | N sf(np, 1) | N sf(np, 2) | N sf(np, 3) |
Total | ≃ [55 + Nsf(np, 1)] | ≃ [80 + Nsf(np, 2)] | ≃ [84 + Nsf(np, 3)] |
Process . | D = 1 . | D = 2 . | D = 3 . |
---|---|---|---|
N dist | ≃ 22 | ≃ 45 | ≃ 48 |
N kernel | ≃ 33 | ≃ 35 | ≃ 36 |
N sf | N sf(np, 1) | N sf(np, 2) | N sf(np, 3) |
Total | ≃ [55 + Nsf(np, 1)] | ≃ [80 + Nsf(np, 2)] | ≃ [84 + Nsf(np, 3)] |
Numbers of floating-point operations for one interaction calculation of CPHSF.
Process . | D = 1 . | D = 2 . | D = 3 . |
---|---|---|---|
N dist | ≃ 22 | ≃ 45 | ≃ 48 |
N kernel | ≃ 33 | ≃ 35 | ≃ 36 |
N sf | N sf(np, 1) | N sf(np, 2) | N sf(np, 3) |
Total | ≃ [55 + Nsf(np, 1)] | ≃ [80 + Nsf(np, 2)] | ≃ [84 + Nsf(np, 3)] |
Process . | D = 1 . | D = 2 . | D = 3 . |
---|---|---|---|
N dist | ≃ 22 | ≃ 45 | ≃ 48 |
N kernel | ≃ 33 | ≃ 35 | ≃ 36 |
N sf | N sf(np, 1) | N sf(np, 2) | N sf(np, 3) |
Total | ≃ [55 + Nsf(np, 1)] | ≃ [80 + Nsf(np, 2)] | ≃ [84 + Nsf(np, 3)] |
for one, two and three dimensions, respectively.
In the following, we derive Nadd. To derive jerk, snap, and crackle in the Hermite schemes, we need to calculate the second spatial order derivatives of fi given by equation (35). Here, the values of ∑jfjpα, ijWij and |$[B^{-1}_i]_{m\alpha }$| have been calculated in the derivation of the spatial first-order derivative. Therefore, we must calculate only the multiplication of |$[B^{-1}_i]_{m\alpha }$| by ∑jfjpα, ijWij, and the additional number of calculations for one physical quantity is given by DH2n(np, D). We have density (pressure), energy, and velocity, and thus the number of physical quantities in a numerical calculation is (D + 2). Therefore, the total additional number of calculations is Nadd = (D + 2)dH2n(np, D).
Figure 1 shows Nop and Nadd with respect to np. We assume Nnb = 10, 75, and 600 for one, two, and three dimensions. We can see that Nadd is much smaller than Nop. Therefore, we conclude that the additional number of the calculations of jerk, snap, and crackle is much smaller than the original number of the calculations of CPHSF.

Values of Nadd and Nop plotted against np. The dashed and solid lines show these values for Nadd and Nop. From left to right, the values of D are 1, 2, and 3.
3 Numerical experiments
In this section, we present the result of the Sod shock tube test in subsection 3.1 and that for the test of the surface gravity wave in subsection 3.2. We compare the results of fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes in the Sod shock tube test, and the results of an fully implicit Hermite-scheme, the implicit fourth-order Runge–Kutta scheme, and the backward-Euler scheme in the surface gravity wave test.
3.1 Sod shock tube
We compare results with PEC, PECE, and P(EC)2 forms of Hermite schemes, and Heun’s scheme (hereafter RK2) and the classical fourth-order Runge–Kutta scheme (hereafter RK4). The numbers of particles, Nx, are 1000, 2000, and 4000. Calculation codes used in this study were developed using FDPS (Iwasawa et al. 2016).
Figure 2 shows density profiles at t = 0.1 for the tests with Nx = 1000 and dt ≃ dtmax/4, where dtmax is the maximum time-step in the stability region, with the PEC form of the Hermite scheme. Note that the results for all schemes are similar to that for the PEC form of the Hermite scheme. We can see that the shock wave can be captured. However, the post-shock oscillation is strong for x0 = 0.006. Figure 3 is the same as figure 2, but for Nx = 4000. Note that the results are independent of the time-integration scheme used and the results for Nx = 2000 are similar to those for Nx = 4000. We can see that the shock wave can be captured clearly even if the initial condition is not smooth. Therefore, if the initial condition is not smooth, the resolution of time and space should be higher.

Results of the Sod shock tube tests with Nx = 1000. The density profiles at t = 0.1 are shown. The left- and right-hand panels show the results for x0 = 0.006 and x0 = 0.03.

Now we check the spatial order of the scheme. We used the sixth-order shape function and then the first and second derivatives are fifth and fourth orders in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 4, and thus ερ should be given by |$\epsilon _{\rho } \propto N_x^{-m}$| where m is larger than or equal to 4. Figure 4 shows that ερ for the P(EC)2 form of the Hermite scheme for runs with dt = 10−6 plotted against |$N_x^{-1}$|. The results are independent of the time-integration scheme used. The value of ερ for runs with x0 = 0.006 is proportional to |$N_x^{-4}$|. The value of ερ in the large Nx region for runs with x0 = 0.03 is proportional to |$N_x^{-1}$| since, in this region, the round-off error dominates the total error. In the other region, ερ is proportional to |$N_x^{-4}$|. From these results, we can conclude that the spatial order of the scheme is consistent with theoretical expectation.

ερ at t = 0.1 for the tests with x0 = 0.006 and x0 = 0.03 plotted against |$N_x^{-1}$|. Filled and open circles show results for x0 = 0.006 and 0.03, and the solid curve shows the theoretical models for the error.
Let us look at the time orders of the schemes. Figures 5 and 6 show ερ, Δt for the tests with x0 = 0.006 and x0 = 0.03 plotted against dtic, where dtic is dt divided by the number of interaction calculations per time step. We can see that the errors of RK2 and RK4 are |$\mathcal {O}(dt^2)$| and |$\mathcal {O}(dt^4)$|, respectively, and that of the Hermite schemes is |$\mathcal {O}(dt^2)$|.

ερ, Δt at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right, panels show the results for the Nx = 1000, 2000 and 4000. Triangles, squares and crosses show the results for Hermite schemes in PEC, PECE, P(EC)2 forms, and open and filled circles show the results for RK4 and RK2. Solid and dashed curves show the theoretical models for the error of second- and fourth-order schemes.

Figure 7 shows errors for tests with x0 = 0.006 plotted against dtic. The result shows that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of spatial differentiation approximation determines the overall error.

ερ at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right in the upper panels, the results for the PEC-, PECE- and P(EC)2 forms of the Hermite schemes are shown. The lower left-hand and middle panels show the results for the second and fourth Runge–Kutta schemes. Crosses and open and filled circles show results for Nx = 1000, 2000, and 4000.
Figure 8 shows maximum dtic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|. We can see that the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Hence, we can use larger time-steps with the Hermite schemes. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity, even when the initial condition has a sharp jump.

Maximum dtic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|. Triangles, squares, and crosses show the results for Hermite schemes in PEC, PECE, and P(EC)2 forms, and open and filled circles show the results for RK4 and RK2.
Figure 9 shows errors for x0 = 0.03 plotted against dtic. As in the case of x0 = 0.006, the results show that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of the spatial differentiation approximation determine the overall error.

Figure 10 shows maximum dtic in the numerical stable region for tests with x0 = 0.03 plotted against |$N_x^{-1}$|. As in the case of x0 = 0.006, the results for the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity. We can conclude that Hermite schemes are more computationally efficient than Runge–Kutta schemes for calculation shocks.

3.2 Surface gravity wave test
We compare results of runs with the implicit Hermite scheme, the backward-Euler scheme (hereafter IRK1) and the Gauss–Legendre scheme (hereafter IRK4). The numbers of particles, N, are 16 × 17, 32 × 33, and 64 × 65.
Figure 11 shows the time evolution up to t = 0.75T with the implicit Hermite scheme, N = 16 × 17 and dt ≃ dtmax/4. Figure 12 shows y of the particle initially at (x, y) = (0, 1) with the implicit Hermite scheme, N = 16 × 17 and dt ≃ dtmax/4. Note that the results are independent of the time-integration scheme used and N.

Results of the surface gravity wave tests with N = 16 × 17; from top to bottom, the snapshots at t = 0, 0.25T, 0.5T, and 0.75T are shown.

Time-evolution of the y-coordinate of the particle initially at (x, y) = (0, 1) in the surface gravity wave test with N = 16 × 17.
Now we check the spatial order of the scheme. We used the fifth-order shape function and then the first and second derivatives are fourth and third orders in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 3, and thus |$\epsilon _{v_x}$| should be given by |$\epsilon _{v_x} \propto N_x^{-m}$| where m is larger than or equal to 3 and Nx is the number of particles in the x-direction. Figure 13 shows |$\epsilon _{v_x}$| for the implicit Hermite scheme with dt = T/512 plotted against |$N_x^{-1}$|. We can see that the error |$\epsilon _{v_x}$| is proportional to |$N_x^{-4}$|. Therefore, the error in acceleration determines the overall error. The results are independent of the time integration used. From the result, the spatial order of the scheme is consistent.

|$\epsilon _{v_x}$| at t = 0.2T plotted against |$N_x^{-1}$|. Filled circles show numerical results and solid curves show the theoretical models for the error.
Let us now look at the time order of the scheme. Figure 14 shows |$\epsilon _{v_x, \Delta {t}}$| plotted against dtic. We can see that the errors of the implicit Hermite scheme, IRK4, and IRK1 are |$\mathcal {O}(dt^2)$|, |$\mathcal {O}(dt^4)$|, and |$\mathcal {O}(dt)$|, respectively. As described in subsection 3.1, the time order of the Hermite scheme is equal to 2. From these results, we can conclude that the time orders of the schemes are consistent.

|$\epsilon _{v_x, \Delta {t}}$| plotted against dtic. From left to right, panels show the results for the Nx = 16, 32 and 64. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1. Dashed, solid and treble-dot–dashed curves show the theoretical models for the error of second-, first-, and fourth-order schemes.
Figure 15 shows errors plotted against dtic. The result shows that the accuracy of the implicit Hermite scheme is similar to that of IRK4 and smaller than that of IRK1 with large N.

|$\epsilon _{v_x}$| plotted against dtic. Left-hand, middle and right-hand panels show the results for the implicit Hermite scheme, IRK4, and IRK1. Crosses and open and filled circles show results for Nx = 16, 32, and 64.
Figure 16 shows the maximum dtic in the numerical stable region plotted against |$N_x^{-1}$|. We can see that the region of stability of the implicit Hermite scheme is wider than those of IRK1 and IRK4. Hence, we can use larger time-steps with the implicit Hermite scheme. Therefore, we can conclude that the Hermite scheme is better than Runge–Kutta schemes for simulations of fluid with the surface and gravity wave.

dt ic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1.
4 Summary
If we use multi-stage integration schemes, such as Runge–Kutta schemes, with mesh-free methods we need to perform the interaction calculation, which is the most expensive part of the calculation, multiple times per time step. We constructed a Hermite scheme for a high-order mesh-free method. The accuracy of fourth-order Hermite schemes is at least similar to those of Runge–Kutta schemes and the region of stability of Hermite schemes is better than those of Runge–Kutta schemes. Therefore, we can use a large time-step with the Hermite scheme compare to that for the Runge–Kutta scheme for the same accuracy. We conclude that Hermite schemes are more computationally efficient than commonly used Runge–Kutta schemes for a high-order mesh-free method.
Acknowledgements
We would like to thank the referee for his or her insightful comments and suggestions. We also thank the editor for his or her assistance. We thank Masaki Iwasawa, Keigo Nitadori and Daisuke Namekata for discussions about Hermite schemes and Runge–Kutta schemes. This research was supported by RIKEN Junior Research Associate Program and MEXT as “Exploratory Challenge on Post-K computer” (Elucidation of the Birth of Exoplanets [Second Earth] and the Environmental Variations of Planets in the Solar System).
References