Abstract

In most mesh-free methods, the calculation of interactions between sample points or “particles” is the most time-consuming. When we use mesh-free methods with high spatial orders, the order of the time integration should also be high. If we use usual Runge–Kutta schemes, we need to perform the interaction calculation multiple times per time step. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of differential equations, since Hermite schemes require a smaller number of interaction calculations than Runge–Kutta schemes do to achieve the same order. In this paper, we construct a Hermite scheme for a mesh-free method with high spatial orders. We performed several numerical tests with fourth-order Hermite schemes and Runge–Kutta schemes. We found that, for both Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial derivatives, for time steps smaller than the stability limit. The calculation cost at the time-step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.

1 Introduction

Lagrangian mesh-free methods, in which particles move following the motion of fluid, have been widely used for astrophysical hydrodynamical simulations. In most mesh-free methods, the calculation of interactions between particles is the most time-consuming part. Typically, one particle interacts with ∼ 100 neighbor particles, and thus the cost of interaction calculations dominates the total calculation cost. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of differential equations, since Hermite schemes require a smaller number of interaction calculations than Runge–Kutta schemes do to achieve the same order.

In the field of stellar dynamics, the fourth-order Hermite scheme (Makino 1991; Makino & Aarseth 1992) is widely used for high-order integration. The basic idea of the Hermite scheme is to calculate the time derivative of gravitational acceleration directly, and use it to construct a high-order interpolation polynomial. If we calculate up to a pth-order time-derivative directly, we can achieve the order of s(p + 1) when we use the s-step linear multi-step method, and in the case of s = 2, we can achieve the order of 2(p + 1). The two-step linear multi-step method can be formulated so that it requires only one force evaluation per time step. In the case of a grid-based scheme for hydrodynamics, Aoki (1997) described a method based purely on the Taylor expansion, which achieves the order p + 1.

In this paper, we combine the Hermite scheme with Consistent Particle Hydrodynamics in Strong Form (CPHSF: Yamamoto & Makino 2017), which is one of the high-order mesh-free methods. One disadvantage of the Hermite scheme is that, even though it requires a smaller number of interaction calculations, the calculation cost of one interaction is higher because we need to calculate high-order derivations. In the case of CPHSF or other moving least squares-based interpolation, high-order interpolation polynomial gives spatial derivatives, and we only need to convert spatial derivatives to time derivations using the original differential equations. Thus, the increase of the calculation cost is small and independent of the number of neighbors.

We performed several numerical tests. Fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes are used for the test with a periodic boundary, and an implicit Hermite scheme, an implicit fourth-order Runge–Kutta scheme and the backward-Euler scheme are used for the test with boundary conditions. We found that, for both the Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial derivatives, for time steps smaller than the stability limit. The calculation cost at the time-step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.

In the rest of this paper, we first present the formulation of the Hermite scheme for CPHSF in section 2, and report the results of numerical tests in section 3. We summarize our study in section 4.

2 Derivation of the high-order scheme

In this section, we present the derivation of the fourth-order Hermite schemes for CPHSF.

2.1 Hermite scheme

In this section we present the formulation of the fourth-order Hermite schemes (Makino 1991; Makino & Aarseth 1992). Consider a second-order differential equation,
(1)
(2)
Here, |$\boldsymbol{x}$| and |$\boldsymbol{v}$| denote the position and velocity of one particle. The fourth-order Hermite scheme is derived as follows. The predictor at time tn is given by
(3)
(4)
where |$\boldsymbol{x}_{p}$| and |$\boldsymbol{v}_{p}$| are the predicted position and velocity at the new time, tn + 1 = tn + Δt, |$\boldsymbol{x}_n$| and |$\boldsymbol{v}_n$| are the position and velocity at time tn, and |$\boldsymbol{a}_n$| and |$\boldsymbol{j}_n$| are the acceleration and jerk (first time-derivative of acceleration) at time tn. Using |$\boldsymbol{x}_{p}$| and |$\boldsymbol{v}_{p}$|⁠, we can now calculate the acceleration and jerk, |$\boldsymbol{a}_{n+1}$| and |$\boldsymbol{j}_{n+1}$|⁠, at time tn + 1. Using |$\boldsymbol{a}_n$|⁠, |$\boldsymbol{j}_n$|⁠, |$\boldsymbol{a}_{n+1}$|⁠, and |$\boldsymbol{j}_{n+1}$|⁠, we can construct the third-order Hermite interpolation polynomial for |$\boldsymbol{a}(t)$| as
(5)
where |$\boldsymbol{s}_n$| and |$\boldsymbol{c}_n$| are given by
(6)
(7)
We integrate equation (5) from tn to tn + 1 and obtain correctors given by
(8)
(9)
If we set |$\boldsymbol{x}_{n+1} = \boldsymbol{x}_c$| and |$\boldsymbol{v}_{n+1} = \boldsymbol{v}_c$| at this point, that means we use the PEC (predict–evaluate–correct) form of the linear multi-step method. We can also use PECE or P(EC)2 forms.

2.2 Derivation of high-order time-derivatives for hydrodynamical equations

In this section, we describe how we calculate high-order time-derivatives for hydrodynamics equations in the Lagrangian view. Our approach is essentially the same as that of Aoki (1997), who derived higher-order time-derivatives for the Eulerian view. Aoki (1997) considered the following equation:
(10)
where ξx is some linear operator. By taking time derivatives of both sides of equation (10), they derived a series of equations;
(11)
(12)
and so on. In this paper, we consider the equation
(13)
where d/dt is the Lagrangian derivative,
(14)
The original set of partial differential equations of a Lagrangian formulation of hydrodynamics is given by
(15)
(16)
(17)
(18)
Here, we rewrite |$(d/dt)(\boldsymbol{\nabla })$| as
(19)
The operator |$\boldsymbol{\Diamond }$| is defined as
(20)
where α and β are indices of dimensions, and
(21)
where α = 1, 2, and 3, and |$\boldsymbol{x} = (x_1, x_2, x_3) = (x, y, z)$|⁠. The index β is summed over. Second time-derivatives of ρ, |$\boldsymbol{v}$|⁠, and u are then expressed as
(22)
(23)
(24)
where |$\widetilde{P}$| is defined as
(25)
For the equation of state for ideal gas used in subsection 3.1,
(26)
where γ is the ratio of specific heat, and |$\widetilde{P}$| is given by
(27)
For the equation of state for weakly compressible fluid used in subsection 3.2,
(28)
where ρair, Pair, g, H, and c0 are air density, air pressure, gravity, height of fluid, and sound velocity, respectively. We set
(29)
The parameter |$\widetilde{P}$| is given by
(30)
In this paper, we apply artificial viscosity of the same form as that in Yamamoto and Makino (2017). Note that we do not calculate the contribution of the artificial viscosity to the second time-derivatives since artificial viscosity is not differentiable. Therefore, the artificial viscosity for PEC and P(EC) forms of Hermite schemes are integrated with the Heun’s scheme and the trapezoidal scheme, respectively. We calculate artificial viscosity as follows.
(31)
(32)
(33)
where αAV, βAV, and hAV are coefficients, and cs and ζ are the sound velocity and a parameter which controls the overall strength of AV. In this paper, we set αAV = 1 and βAV = 2. The parameters λm are the eigenvalues of the strain rate tensor |$\boldsymbol{s}$| defined as
(34)
The parameter λmmax is the negative eigenvalue with the maximum absolute value. If all eigenvalues are non-negative, q = 0. In this paper, we use the time-independent coefficient ζ. We set ζ = 1.

2.3 Calculation cost for high-order time-derivatives

For the fourth-order Hermite time-integrations, we must derive second spatial order derivatives of physical quantities to calculate jerk, snap, and crackle. However, if we use spatial high-order mesh-free methods (e.g., CPHSF), the additional number of arithmetic operations of jerk, snap, and crackle is much smaller than the original number of calculations for the spatial high-order mesh-free method.

In this section, we compare the original number of arithmetic operations and the additional number of the operations necessary for the Hermite scheme. First, we show how to derive the spatial high-order derivatives of a physical quantity f. Secondly, the original number of arithmetic operations of CPHSF is derived. We call this value Nop. Note that we assume that Nop comprises only the number of operations for the evaluation of the inverse matrix of Bi in equation (35) and interaction calculation between particles since these dominate the total calculation cost of CPHSF. Thirdly, the additional number of arithmetic operations for jerk, snap, and crackle is derived. We call this value Nadd. Finally, we compare Nop and Nadd. To obtain the number of arithmetic operations, we calculate the number of floating-point operations per particle of CPHSF. If a quantity has been derived, we assume that it will not be unnecessarily recalculated. We assume that the numbers of floating-point operations required to evaluate division and square root are both 20.

First, we show how to derive the spatial high-order derivatives of f. In CPHSF, the mth spatial order derivatives of f is given by the following equations:
(35)
(36)
(37)
(38)

where i and j are indices of particles, m and α are integers, np and Wij are the spatial order of the scheme and a Kernel function, and xij, yij, and |$z$|ij are xjxi, yjyi, and |$z$|j|$z$|i.

In CPHSF, the total number of floating point operations per neighbor particle is given by
(39)
where Nnb is the number of neighbor particles, and Nint and Ninv are the numbers of floating-point operations for the interaction calculation between particles and the evaluation of the inverse matrix of Bi in equation (35). The number of floating-point operations for interaction calculation is given by
(40)
where Ndist and Nkernel are the number of floating-point operations necessary to evaluate the relative distance and the kernel function. The last term, Nsf, represents the number of floating-point operations for the CPHSF fitting. In CPHSF, first of all, we evaluate only |$|\boldsymbol{x}_{ij}|/h_i$|⁠, where |$\boldsymbol{x}_{ij}$| is the displacement of particles i and j and hi is the Kernel length of particle i, to search neighbor particles of particle i, and Ndist is ≃ 22, 45, and 48 for one, two, and three dimensions. Then, we evaluate elements of Bi given by equation (38), the polynomial equation given by equation (36), and the kernel function Wij, to calculate equation (35). One interaction calculation between particle i and particle j in [Bi]αβ is given by {[pij]α[pij]βWij}. The number of combinations of [pij]α[pij]β is n(2np, D). The parameter n(np, D) is the number of bases of a polynomial fitting in equation (35), where D is the number of dimensions, and the value of n(np, D) is given by
(41)
For example, if we consider the one-dimensional case, {[pij]α[pij]βWij} is given by |$x_{ij}^\alpha x_{ij}^\beta W_{ij}$| and thus |$[B_i]_{\alpha _1\beta _1}$| is the same as |$[B_i]_{\alpha _2\beta _2}$| with (α1 + β1) = (α2 + β2). Therefore, the number of the terms of the form of {[pij]α[pij]βWij} is n(2np, D). Since we assume that a quantity which has been derived will not be unnecessarily recalculated, the number of floating-point operations for the evaluation of {[pij]α[pij]βWij} except for {[pij]0[pij]0Wij} is 1. For example, if we consider the one-dimensional case, we can get |$x_{ij}^{m}W_{ij}$| by multiplying |$x_{ij}^{m-1}W_{ij}$| by xij and thus the number of floating-point operations is only 1 for the evaluation of |$x_{ij}^{m}W_{ij}$|⁠. In addition, the number of floating-point operations for summing each term {[pij]α[pij]βWij} with respect to j is 1. Therefore, the total number of floating-point operations for one interaction calculation in Bi is 2n(2np, D) − 1. One interaction calculation between particle i and particle j in the calculation of equation (35) is given by |${W}_{ij}f_j\boldsymbol{p}_{ij}$|⁠. The number of the terms of the form of Wijfj[pij]α is n(np, D). We have density (pressure), energy, and velocity, and thus the number of physical quantities is (D + 2). Therefore, the total number of floating-point operations for one interaction calculation in mth derivatives of density (pressure), energy, and velocity given equation (35) is 2(D + 2)n(np, D). Therefore, the number of floating-point operations for the CPHSF fitting is given by
(42)
The number of floating-point operations necessary for the evaluation of the kernel function, Nkernel, are ≃ 33, 35, and 36 for one, two, and three dimensions, respectively. From the above, the total numbers of floating-point operations for the calculation of equation (35) are ≃ [33 + Nsf(np, 1)], ≃ [35 + Nsf(np, 2)], and ≃ [36 + Nsf(np, 3)] for one, two, and three dimensions.
From the above, the total numbers of floating-point operations for one interaction calculation of CPHSF, Nint, are
(43)
(44)
(45)

for one, two, and three dimensions, respectively. Table 1 shows the summary of the numbers of floating-point operations for one interaction calculation of CPHSF.

Table 1.

Numbers of floating-point operations for one interaction calculation of CPHSF.

ProcessD = 1D = 2D = 3
N  dist≃ 22≃ 45≃ 48
N  kernel≃ 33≃ 35≃ 36
N  sfN  sf(np, 1)N  sf(np, 2)N  sf(np, 3)
Total≃ [55 + Nsf(np, 1)]≃ [80 + Nsf(np, 2)]≃ [84 + Nsf(np, 3)]
ProcessD = 1D = 2D = 3
N  dist≃ 22≃ 45≃ 48
N  kernel≃ 33≃ 35≃ 36
N  sfN  sf(np, 1)N  sf(np, 2)N  sf(np, 3)
Total≃ [55 + Nsf(np, 1)]≃ [80 + Nsf(np, 2)]≃ [84 + Nsf(np, 3)]
Table 1.

Numbers of floating-point operations for one interaction calculation of CPHSF.

ProcessD = 1D = 2D = 3
N  dist≃ 22≃ 45≃ 48
N  kernel≃ 33≃ 35≃ 36
N  sfN  sf(np, 1)N  sf(np, 2)N  sf(np, 3)
Total≃ [55 + Nsf(np, 1)]≃ [80 + Nsf(np, 2)]≃ [84 + Nsf(np, 3)]
ProcessD = 1D = 2D = 3
N  dist≃ 22≃ 45≃ 48
N  kernel≃ 33≃ 35≃ 36
N  sfN  sf(np, 1)N  sf(np, 2)N  sf(np, 3)
Total≃ [55 + Nsf(np, 1)]≃ [80 + Nsf(np, 2)]≃ [84 + Nsf(np, 3)]
The evaluation of the inverse matrix of Bi also dominates in CPHSF and the number of floating-point operations of it, Ninv, is ≃ 2n(np, D)3/3. Therefore, the numbers of floating-point operations per particle of CPHSF are
(46)
(47)
(48)

for one, two and three dimensions, respectively.

In the following, we derive Nadd. To derive jerk, snap, and crackle in the Hermite schemes, we need to calculate the second spatial order derivatives of fi given by equation (35). Here, the values of ∑jfjpα, ijWij and |$[B^{-1}_i]_{m\alpha }$| have been calculated in the derivation of the spatial first-order derivative. Therefore, we must calculate only the multiplication of |$[B^{-1}_i]_{m\alpha }$| by ∑jfjpα, ijWij, and the additional number of calculations for one physical quantity is given by DH2n(np, D). We have density (pressure), energy, and velocity, and thus the number of physical quantities in a numerical calculation is (D + 2). Therefore, the total additional number of calculations is Nadd = (D + 2)dH2n(np, D).

Figure 1 shows Nop and Nadd with respect to np. We assume Nnb = 10, 75, and 600 for one, two, and three dimensions. We can see that Nadd is much smaller than Nop. Therefore, we conclude that the additional number of the calculations of jerk, snap, and crackle is much smaller than the original number of the calculations of CPHSF.

Values of Nadd and Nop plotted against np. The dashed and solid lines show these values for Nadd and Nop. From left to right, the values of D are 1, 2, and 3.
Fig. 1.

Values of Nadd and Nop plotted against np. The dashed and solid lines show these values for Nadd and Nop. From left to right, the values of D are 1, 2, and 3.

3 Numerical experiments

In this section, we present the result of the Sod shock tube test in subsection 3.1 and that for the test of the surface gravity wave in subsection 3.2. We compare the results of fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes in the Sod shock tube test, and the results of an fully implicit Hermite-scheme, the implicit fourth-order Runge–Kutta scheme, and the backward-Euler scheme in the surface gravity wave test.

3.1 Sod shock tube

In this section, we present the result of the Sod shock tube test (Sod 1978). We assume that fluid is an ideal gas with γ = 1.4. The computational domain is −0.5 ≤ x < 0.5 with a periodic boundary, and the initial boundary of two fluids are at x = −0.5 and 0. In this test, we used equal-mass particles. The initial velocity is given by |$v$|x = 0. The density is smoothed by a C5 polynomial, and is given by
(49)
where (b0, b1, b2, b3, b4, b5) = (−693/256, 1155/256, −693/128, 495/128, −385/256, 63/256), and ρh and ρl are the values of initial density in the high- and low-density regions. We used ρh = 1 and ρl = 0.25. The parameter x0 represents the width of the smoothing region, and we used two values of x0. One is an initial condition with x0 = 0.006, and the other is a smooth initial condition with x0 = 0.03. We set the initial condition for 0.25 ≤ x < 0.5 to mirror that of 0 < x ≤ 0.25, and −0.5 ≤ x ≤ −0.25 as mirroring −0.25 ≤ x ≤ 0. The positions of particles in the smoothing region are determined so that position xi of particle i satisfies
(50)
where Nh is the number of particles in the high-density region and the right-hand side of equation (50) is the mass of a particle. The smoothed pressure is given by
(51)
where Ph and Pl are the values of initial pressure in the high- and low-density regions. We used Ph = 1 and Pl = 0.1795. We used equations (31) and (32) for the artificial viscosity with hAV = 2.375 × 10−3. We used a sixth-order interpolation with the value of interpolation polynomial at the position of particle |$\boldsymbol{x}_i$| fixed to the actual value. Therefore, |$\boldsymbol{\delta }$| given by equation (36) and |$\boldsymbol{p}_{ij}$| given by equation (36) are
(52)
(52)
The kernel function is the fourth-order Wendland function (Wendland 1995). The kernel length is given by
(54)
(55)
where ρt = 0, i and ΔVt = 0, i are the density and geometric volume, respectively, of a particle i at t = 0. We set η = 3.8.
We calculated the L1-norm error of density at t = 0.1 to verify the spatial order of the schemes and to compare the accuracy of the schemes;
(56)
where |$\rho _{n}^{\mathrm{hres}}$| is the result of a high-resolution test in which the number of particles, Nx, is 8000 and dt = 10−6. When we derived equation (56), we calculated ρn of particles rearranged at the same positions as those of the high-resolution test. The time integrator for high-resolution test is the Hermite scheme of the P(EC)2 form. For the test of the time order of the scheme for the test with Nx = N0, |$\rho _{n,\Delta {t}}^{\mathrm{hres}}$| is the result of a high-resolution test in which Nx is N0 and dt = 10−6. The time integrator for the high-resolution test is the same as that for ρn. In this case we define the error as
(57)

We compare results with PEC, PECE, and P(EC)2 forms of Hermite schemes, and Heun’s scheme (hereafter RK2) and the classical fourth-order Runge–Kutta scheme (hereafter RK4). The numbers of particles, Nx, are 1000, 2000, and 4000. Calculation codes used in this study were developed using FDPS (Iwasawa et al. 2016).

Figure 2 shows density profiles at t = 0.1 for the tests with Nx = 1000 and dtdtmax/4, where dtmax is the maximum time-step in the stability region, with the PEC form of the Hermite scheme. Note that the results for all schemes are similar to that for the PEC form of the Hermite scheme. We can see that the shock wave can be captured. However, the post-shock oscillation is strong for x0 = 0.006. Figure 3 is the same as figure 2, but for Nx = 4000. Note that the results are independent of the time-integration scheme used and the results for Nx = 2000 are similar to those for Nx = 4000. We can see that the shock wave can be captured clearly even if the initial condition is not smooth. Therefore, if the initial condition is not smooth, the resolution of time and space should be higher.

Results of the Sod shock tube tests with Nx = 1000. The density profiles at t = 0.1 are shown. The left- and right-hand panels show the results for x0 = 0.006 and x0 = 0.03.
Fig. 2.

Results of the Sod shock tube tests with Nx = 1000. The density profiles at t = 0.1 are shown. The left- and right-hand panels show the results for x0 = 0.006 and x0 = 0.03.

Same as figure 2, but for Nx = 4000.
Fig. 3.

Same as figure 2, but for Nx = 4000.

Now we check the spatial order of the scheme. We used the sixth-order shape function and then the first and second derivatives are fifth and fourth orders in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 4, and thus ερ should be given by |$\epsilon _{\rho } \propto N_x^{-m}$| where m is larger than or equal to 4. Figure 4 shows that ερ for the P(EC)2 form of the Hermite scheme for runs with dt = 10−6 plotted against |$N_x^{-1}$|⁠. The results are independent of the time-integration scheme used. The value of ερ for runs with x0 = 0.006 is proportional to |$N_x^{-4}$|⁠. The value of ερ in the large Nx region for runs with x0 = 0.03 is proportional to |$N_x^{-1}$| since, in this region, the round-off error dominates the total error. In the other region, ερ is proportional to |$N_x^{-4}$|⁠. From these results, we can conclude that the spatial order of the scheme is consistent with theoretical expectation.

ερ at t = 0.1 for the tests with x0 = 0.006 and x0 = 0.03 plotted against $N_x^{-1}$. Filled and open circles show results for x0 = 0.006 and 0.03, and the solid curve shows the theoretical models for the error.
Fig. 4.

ερ at t = 0.1 for the tests with x0 = 0.006 and x0 = 0.03 plotted against |$N_x^{-1}$|⁠. Filled and open circles show results for x0 = 0.006 and 0.03, and the solid curve shows the theoretical models for the error.

Let us look at the time orders of the schemes. Figures 5 and 6 show ερ, Δt for the tests with x0 = 0.006 and x0 = 0.03 plotted against dtic, where dtic is dt divided by the number of interaction calculations per time step. We can see that the errors of RK2 and RK4 are |$\mathcal {O}(dt^2)$| and |$\mathcal {O}(dt^4)$|⁠, respectively, and that of the Hermite schemes is |$\mathcal {O}(dt^2)$|⁠.

ερ, Δt at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right, panels show the results for the Nx = 1000, 2000 and 4000. Triangles, squares and crosses show the results for Hermite schemes in PEC, PECE, P(EC)2 forms, and open and filled circles show the results for RK4 and RK2. Solid and dashed curves show the theoretical models for the error of second- and fourth-order schemes.
Fig. 5.

ερ, Δt at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right, panels show the results for the Nx = 1000, 2000 and 4000. Triangles, squares and crosses show the results for Hermite schemes in PEC, PECE, P(EC)2 forms, and open and filled circles show the results for RK4 and RK2. Solid and dashed curves show the theoretical models for the error of second- and fourth-order schemes.

Same as figure 5, but the results for x0 = 0.03.
Fig. 6.

Same as figure 5, but the results for x0 = 0.03.

In the following we explain the reason why the order of the Hermite scheme is |$\mathcal {O}(dt^2)$| for fixed |$N_x^{-1}$|⁠. In a particle-based method, the calculated spatial derivatives contain discretization errors, and therefore the time derivative contains errors. In the case of RK schemes, this error causes the solution in the limit of dt → 0 to converge to a solution that is different from the exact solution, but the rate of the convergence is the order of the time-integration scheme, since we can regard the space-discretized differential equations as the set of ordinal differential equations. However, in the case of the Hermite scheme, we construct the second time-derivatives of physical quantities from the original equations and high-order spatial derivatives, and these spatial derivatives contain discretization errors. Thus, both the first and second time-derivatives contain the errors due to space discretization errors, and therefore the second time-derivatives are not exactly the time derivatives of the first time-derivatives. For simplicity, let us illustrate this behaviour for the integration of velocity in one dimension. Here, we rewrite the correctors by substituting equations (6) and (7) to equation (9). Note that we set dt = Δt in equation (4):
(58)
If we use sixth-order polynomial fitting for deriving spatial derivatives, |$v$|c containing the spatial errors is given by
(59)
where An and An + 1 are accelerations at n and n + 1 steps and Jn, and Jn + 1 are jerks at n and n + 1 steps, all given by sixth-order polynomial fitting for deriving spatial derivatives. Therefore, J is not equal to the time derivative of A:
(60)
where εJ is the error. Here, we integrate equation (59) from t = 0 to t = T,
(61)
where Nt is given by Nt = Tt. Here, we can assume that
(62)
Therefore, equation (61) becomes
(63)
where |$v$|A(T) is the analytical solution for the velocity which satisfies equation (62). We can see that the time order of a Hermite scheme is equal to two. From these results, we can conclude that the time orders of the schemes are consistent. The fact that the apparent error order of the Hermite scheme is 2 does not imply it is a second-order scheme, because when we simultaneously shrink the interparticle distance and time step, the error will be |$\mathcal {O}(dt^4)$| as expected. The second-order behaviour occurs only when the spatial error dominates the total error.

Figure 7 shows errors for tests with x0 = 0.006 plotted against dtic. The result shows that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of spatial differentiation approximation determines the overall error.

ερ at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right in the upper panels, the results for the PEC-, PECE- and P(EC)2 forms of the Hermite schemes are shown. The lower left-hand and middle panels show the results for the second and fourth Runge–Kutta schemes. Crosses and open and filled circles show results for Nx = 1000, 2000, and 4000.
Fig. 7.

ερ at t = 0.1 for the tests with x0 = 0.006 plotted against dtic. From left to right in the upper panels, the results for the PEC-, PECE- and P(EC)2 forms of the Hermite schemes are shown. The lower left-hand and middle panels show the results for the second and fourth Runge–Kutta schemes. Crosses and open and filled circles show results for Nx = 1000, 2000, and 4000.

Figure 8 shows maximum dtic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|⁠. We can see that the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Hence, we can use larger time-steps with the Hermite schemes. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity, even when the initial condition has a sharp jump.

Maximum dtic in the numerical stable region for tests with x0 = 0.006 plotted against $N_x^{-1}$. Triangles, squares, and crosses show the results for Hermite schemes in PEC, PECE, and P(EC)2 forms, and open and filled circles show the results for RK4 and RK2.
Fig. 8.

Maximum dtic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|⁠. Triangles, squares, and crosses show the results for Hermite schemes in PEC, PECE, and P(EC)2 forms, and open and filled circles show the results for RK4 and RK2.

Figure 9 shows errors for x0 = 0.03 plotted against dtic. As in the case of x0 = 0.006, the results show that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of the spatial differentiation approximation determine the overall error.

Same as figure 7, but for x0 = 0.03.
Fig. 9.

Same as figure 7, but for x0 = 0.03.

Figure 10 shows maximum dtic in the numerical stable region for tests with x0 = 0.03 plotted against |$N_x^{-1}$|⁠. As in the case of x0 = 0.006, the results for the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity. We can conclude that Hermite schemes are more computationally efficient than Runge–Kutta schemes for calculation shocks.

Same as figure 8, but for x0 = 0.03.
Fig. 10.

Same as figure 8, but for x0 = 0.03.

3.2 Surface gravity wave test

The surface gravity wave test is useful for the investigation of the capability of numerical schemes to handle two-dimensional fluid dynamics with high accuracy and small dissipation. The initial condition is the same as those in Antuono et al. (2011) and Yamamoto and Makino (2017), but sound velocity given by equation (29) is 10 times smaller than that of Yamamoto and Makino (2017). We assume that fluid is weakly compressible with an equation of state given by equation (28) with ρair = 103 and Pair = 105 and sound velocity given by equation (29) with g = −10 and the height of fluid H = 1. The computational domain is 0 ≤ x < 1, 0 ≤ y ≤ 1. We applied a periodic boundary at x = 0, |$v$|y = 0 at y = 0 and P = Pair for particles initially at y = 1 as boundary conditions. Initial density is
(64)
Initial velocity is
(65)
(66)
where A, k, and ω are the amplitude, the wavenumber, and its frequency. We set A = 0.01, k = 2π, and |$\omega =\sqrt{|g|k\tanh (kH)}$|⁠. In this test, we do not use artificial viscosity to clarify the origin of the error. We used a fifth-order interpolation with the value of the interpolate polynomial at the position of particle |$\boldsymbol{x}_i$| fixed to the actual value.
Therefore, |$\boldsymbol{\delta }$| given by equation (36) and |$\boldsymbol{p}_{ij}$| given by equation (36) are
(67)
(68)
The kernel function is the fourth-order Wendland function (Wendland 1995). We used equation (54) as the kernel length and set η = 3.8.
We calculate the absolute error of |$v$|x at (x, y) = (0.4, 1) and t = 0.2T where T is the period given by 2π/ω for checking the spatial order of the schemes and comparing the accuracy of the schemes.
(69)
where |$v_{x}^{\mathrm{hres}}$| is the result of the high-resolution test in which the number of particles, N, is 128 × 129 and dt = T/1024. The time integrator for high-resolution test is the implicit Hermite scheme. For checking the time order of the scheme for the test with N = N0, |$v_{x,\Delta {t}}^{\mathrm{hres}}$| is the result of a high-resolution test in which N is N0 and dt = T/512. The time integrator for a high-resolution test is same as |$v$|x. We calculated |$v$|x and |$v_{x,\Delta {t}}^{\mathrm{hres}}$| of the particles initially at (x, y) = (0.3125, 1). In this case we define the error as
(70)

We compare results of runs with the implicit Hermite scheme, the backward-Euler scheme (hereafter IRK1) and the Gauss–Legendre scheme (hereafter IRK4). The numbers of particles, N, are 16 × 17, 32 × 33, and 64 × 65.

Figure 11 shows the time evolution up to t = 0.75T with the implicit Hermite scheme, N = 16 × 17 and dtdtmax/4. Figure 12 shows y of the particle initially at (x, y) = (0, 1) with the implicit Hermite scheme, N = 16 × 17 and dtdtmax/4. Note that the results are independent of the time-integration scheme used and N.

Results of the surface gravity wave tests with N = 16 × 17; from top to bottom, the snapshots at t = 0, 0.25T, 0.5T, and 0.75T are shown.
Fig. 11.

Results of the surface gravity wave tests with N = 16 × 17; from top to bottom, the snapshots at t = 0, 0.25T, 0.5T, and 0.75T are shown.

Time-evolution of the y-coordinate of the particle initially at (x, y) = (0, 1) in the surface gravity wave test with N = 16 × 17.
Fig. 12.

Time-evolution of the y-coordinate of the particle initially at (x, y) = (0, 1) in the surface gravity wave test with N = 16 × 17.

Now we check the spatial order of the scheme. We used the fifth-order shape function and then the first and second derivatives are fourth and third orders in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 3, and thus |$\epsilon _{v_x}$| should be given by |$\epsilon _{v_x} \propto N_x^{-m}$| where m is larger than or equal to 3 and Nx is the number of particles in the x-direction. Figure 13 shows |$\epsilon _{v_x}$| for the implicit Hermite scheme with dt = T/512 plotted against |$N_x^{-1}$|⁠. We can see that the error |$\epsilon _{v_x}$| is proportional to |$N_x^{-4}$|⁠. Therefore, the error in acceleration determines the overall error. The results are independent of the time integration used. From the result, the spatial order of the scheme is consistent.

$\epsilon _{v_x}$ at t = 0.2T plotted against $N_x^{-1}$. Filled circles show numerical results and solid curves show the theoretical models for the error.
Fig. 13.

|$\epsilon _{v_x}$| at t = 0.2T plotted against |$N_x^{-1}$|⁠. Filled circles show numerical results and solid curves show the theoretical models for the error.

Let us now look at the time order of the scheme. Figure 14 shows |$\epsilon _{v_x, \Delta {t}}$| plotted against dtic. We can see that the errors of the implicit Hermite scheme, IRK4, and IRK1 are |$\mathcal {O}(dt^2)$|⁠, |$\mathcal {O}(dt^4)$|⁠, and |$\mathcal {O}(dt)$|⁠, respectively. As described in subsection 3.1, the time order of the Hermite scheme is equal to 2. From these results, we can conclude that the time orders of the schemes are consistent.

$\epsilon _{v_x, \Delta {t}}$ plotted against dtic. From left to right, panels show the results for the Nx = 16, 32 and 64. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1. Dashed, solid and treble-dot–dashed curves show the theoretical models for the error of second-, first-, and fourth-order schemes.
Fig. 14.

|$\epsilon _{v_x, \Delta {t}}$| plotted against dtic. From left to right, panels show the results for the Nx = 16, 32 and 64. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1. Dashed, solid and treble-dot–dashed curves show the theoretical models for the error of second-, first-, and fourth-order schemes.

Figure 15 shows errors plotted against dtic. The result shows that the accuracy of the implicit Hermite scheme is similar to that of IRK4 and smaller than that of IRK1 with large N.

$\epsilon _{v_x}$ plotted against dtic. Left-hand, middle and right-hand panels show the results for the implicit Hermite scheme, IRK4, and IRK1. Crosses and open and filled circles show results for Nx = 16, 32, and 64.
Fig. 15.

|$\epsilon _{v_x}$| plotted against dtic. Left-hand, middle and right-hand panels show the results for the implicit Hermite scheme, IRK4, and IRK1. Crosses and open and filled circles show results for Nx = 16, 32, and 64.

Figure 16 shows the maximum dtic in the numerical stable region plotted against |$N_x^{-1}$|⁠. We can see that the region of stability of the implicit Hermite scheme is wider than those of IRK1 and IRK4. Hence, we can use larger time-steps with the implicit Hermite scheme. Therefore, we can conclude that the Hermite scheme is better than Runge–Kutta schemes for simulations of fluid with the surface and gravity wave.

dt  ic in the numerical stable region for tests with x0 = 0.006 plotted against $N_x^{-1}$. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1.
Fig. 16.

dt  ic in the numerical stable region for tests with x0 = 0.006 plotted against |$N_x^{-1}$|⁠. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1.

4 Summary

If we use multi-stage integration schemes, such as Runge–Kutta schemes, with mesh-free methods we need to perform the interaction calculation, which is the most expensive part of the calculation, multiple times per time step. We constructed a Hermite scheme for a high-order mesh-free method. The accuracy of fourth-order Hermite schemes is at least similar to those of Runge–Kutta schemes and the region of stability of Hermite schemes is better than those of Runge–Kutta schemes. Therefore, we can use a large time-step with the Hermite scheme compare to that for the Runge–Kutta scheme for the same accuracy. We conclude that Hermite schemes are more computationally efficient than commonly used Runge–Kutta schemes for a high-order mesh-free method.

Acknowledgements

We would like to thank the referee for his or her insightful comments and suggestions. We also thank the editor for his or her assistance. We thank Masaki Iwasawa, Keigo Nitadori and Daisuke Namekata for discussions about Hermite schemes and Runge–Kutta schemes. This research was supported by RIKEN Junior Research Associate Program and MEXT as “Exploratory Challenge on Post-K computer” (Elucidation of the Birth of Exoplanets [Second Earth] and the Environmental Variations of Planets in the Solar System).

References

Antuono
 
M.
,
Colagrossi
A.
,
Marrone
S.
,
Lugni
C.
 
2011
,
Comput. Phys. Commun.
,
182
,
866

Aoki
 
T.
 
1997
,
Comput. Phys. Commun
.,
102
,
132

Iwasawa
 
M.
,
Tanikawa
A.
,
Hosono
N.
,
Nitadori
K.
,
Muranushi
T.
,
Makino
J.
 
2016
,
PASJ
,
68
,
54

Makino
 
J.
 
1991
,
ApJ
,
369
,
200

Makino
 
J.
,
Aarseth
S. J.
 
1992
,
PASJ
,
44
,
141

Sod
 
G. A.
 
1978
,
J. Comput. Phys.
,
27
,
1

Wendland
 
H.
 
1995
,
Adv. Comput. Math.
,
4
,
389

Yamamoto
 
S.
,
Makino
J.
 
2017
,
PASJ
,
69
,
35

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)