Solving inverse problems in physics by optimizing a discrete loss: Fast and accurate learning without neural networks

Author Notes

Abstract

In recent years, advances in computing hardware and computational methods have prompted a wealth of activities for solving inverse problems in physics. These problems are often described by systems of partial differential equations (PDEs). The advent of machine learning has reinvigorated the interest in solving inverse problems using neural networks (NNs). In these efforts, the solution of the PDEs is expressed as NNs trained through the minimization of a loss function involving the PDE. Here, we show how to accelerate this approach by five orders of magnitude by deploying, instead of NNs, conventional PDE approximations. The framework of optimizing a discrete loss (ODIL) minimizes a cost function for discrete approximations of the PDEs using gradient-based and Newton’s methods. The framework relies on grid-based discretizations of PDEs and inherits their accuracy, convergence, and conservation properties. The implementation of the method is facilitated by adopting machine-learning tools for automatic differentiation. We also propose a multigrid technique to accelerate the convergence of gradient-based optimizers. We present applications to PDE-constrained optimization, optical flow, system identification, and data assimilation. We compare ODIL with the popular method of physics-informed neural networks and show that it outperforms it by several orders of magnitude in computational speed while having better accuracy and convergence rates. We evaluate ODIL on inverse problems involving linear and nonlinear PDEs including the Navier–Stokes equations for flow reconstruction problems. ODIL bridges numerical methods and machine learning and presents a powerful tool for solving challenging, inverse problems across scientific domains.

partial differential equations, inverse problems, physics-informed neural networks, computational fluid dynamics

Subject

Applied Mathematics Mechanical Engineering

Issue Section:

Physical Sciences and Engineering

Editor:

Significance Statement

Inverse problems are fundamental in science and engineering, often revealing solutions unattainable through direct methods. Inspired by efforts to solve such problems using physics neural networks (PINNs), we present the framework of optimizing a discrete loss (ODIL). ODIL deploys conventional discretizations on a grid instead of neural networks. ODIL is a drop-in replacement for PINNs in many scientific and engineering problems, while being several orders of magnitude more efficient in terms of computational cost and achieving higher accuracy.

Introduction

The numerical solution of physical models expressed as partial differential equations (PDEs) is indispensable across all areas of science, engineering, and medicine. Commonly, solutions are presented for the so-called forward problems where the PDEs are discretized on grids or particles (1–3), and the discrete problems are solved to determine quantities of interest given initial and/or boundary conditions. In recent years, the advent of data science has prompted intense interest in solving PDEs in the context of “inverse problems” where the parameters and even the structure of PDE models are inferred from data (4–7). Solving inverse problems in science and engineering entails formidable challenges as the PDEs need to be solved while observing physical and geometrical constraints as well as noisy data (8). Such problems are encountered in several fields of science and engineering, and they have been handled by methods such as PDE-constrained optimization (9), data assimilation (10), optical flow in computer vision (11), and system identification (12). As there is an ever increasing number of problems that involve noisy or missing data, there is a strong need to develop efficient methods for solving such problems and machine learning has emerged as a potent modality (7, 13, 14), albeit with certain limitations (8).

Physics-informed learning using neural networks (NNs) for partial and ordinary differential equations were introduced three decades ago, by Dissanayake and Phan-Thien (15), van Milligen et al. (16), and Lagaris et al. (17). In these works, the unknown fields were represented by NNs with weights obtained by minimizing a loss function composed by the residuals of the governing equations. Related efforts used NNs for modeling nonlinear dynamical systems (18, 19), back-tracking of an N-body system (20), and deep NNs for reconstructing the flow field in the near-wall region of a turbulent flow using wall only information (21). At the time of their introduction, these methods were not favored for solving forward or inverse problems, in particular due to their computational cost and the speed of available computing architectures. Twenty years later, the method was revived by Raissi et al. (22) who used modern machine-learning methods and software (such as deep neural networks, automatic differentiation, and TensorFlow) to carry out its operations. The method has been popularized by the term physics-informed neural network (PINN).

PINNs have created a lot of enthusiasm in the scientific community as they offered an alternative to solving complex physical problems using convenient representations at a significantly reduced cost of implementation. However, it is broadly accepted that PINNs cannot match conventional numerical methods for solving well-posed forward problems involving PDEs but they are positioned as a convenient tool for solving ill-posed and inverse problems and data-driven modeling (7). The proper assessment of this claim hindered by the fact that to date there is no baseline for assessing the capabilities of PINNs with respect to conventional methods for solving forward and inverse problems. Raissi et al. (22) admit that conventional numerical solvers are more efficient for forward problems, a direct comparison with such solvers reveals important drawbacks of the neural network approach (23).

The overall computational cost of PINNs originates from the cost of evaluating the solution in each collocation point. Represented by a fully connected network, the solution in each point depends on all weights of the neural network, therefore the cost of evaluating it in one point is proportional to the number of weights. In contrast to neural networks, conventional grid-based methods represent the solution with local approximations resulting in a constant cost per point. Consequently, the gradient of the residual with respect to the weights in one point is a dense vector and the Hessian of the loss function a dense matrix. This makes higher order optimization methods such as Newton’s method unsuitable for training NNs. Furthermore, the convenience of representation and ease of formalism by PINNs comes with limitations. While PINN evaluates the differential operator exactly on a set of collocation points, it makes no allowance for physically and numerically motivated adjustments that speed up convergence and ensure discrete conservation such as deferred correction, upwind schemes, and discrete conservation laws as represented, for example, in the finite volume method. Furthermore, decades of development and analysis that went into conventional solvers enables understanding, prediction, and control of their convergence and stability properties. Such information is not available with neural networks and recent works aim to remedy this situation (24, 25). Application of PINN to equations with high-order derivatives is also limited (26, 27) as the computational cost of nested automatic differentiation is exponential in the order of differentiation (28, 29), while in the case of finite differences, the cost is linear.

More recently, the architectures deployed in PINNs have undergone significant enhancements, including the introduction of positional encodings, exact enforcement of boundary conditions, incorporation of adaptive activation functions, and improving the treatment of constraints and invariances (30–34). While these techniques significantly improve the performance in certain cases, they do not guarantee a universal improvement in accuracy or training speed. When it comes to studies involving PINNs, it is generally not advisable to start with these techniques, considering the numerous hyperparameters that already require tuning. For instance, there are cases where enforcing exact boundary conditions can lead to worse performance. In such scenarios, a more effective approach may involve blending the boundary conditions into the approximation, even if it diverges from the reference solution. The same principle applies to positional encodings, which entail adding sine waves to the input. While this approach clearly boosts performance in scenarios like the wave equation with periodic boundary conditions, where the solution is a linear combination of sine waves, it may not yield the same benefits for more general conditions. Nevertheless, in our experimentation with PINNs, we applied several well-known algorithmic modifications. Our observations did not indicate any noteworthy enhancements, consistent with similar experiences by other groups (35–37). Moreover, the effects of these modifications were similar to the variations observed when changing the random seed (see SI Appendix). Our observations align with a recent paper authored by one of the co-creators of PINNs, wherein the attempt to solve the $Re = 2000$ lid-driven cavity flow problem not only was unsuccessful but also resulted in several nonphysical, pseudo-solutions (38).

Here, we present a framework that is based on conventional numerical methods that overcomes many of the challenges encountered by PINNs. We show that this valuable approach can be accelerated by five orders of magnitude by deploying conventional numerical algorithms that are facilitated by machine-learning tools. We cast the framework as optimizing a discrete loss (ODIL) by combining discrete formulations of PDEs with modern machine-learning tools to extend their scope to ill-posed and inverse problems.

The framework of ODIL minimizes a cost function for the discrete forms of PDEs using gradient-based and Newton’s methods. We demonstrate that ODIL is effective in solving inverse problems for equations with missing parameters and with limited data. This framework has two key aspects. First, the discretization itself that defines the accuracy, stability, and consistency of the method. Second, the optimization algorithm to solve the discrete problem. Our method uses efficient computational tools that are readily available for both of these aspects. The discretization properties are inherited from the conventional numerical methods building upon the advances in this field over multiple decades. Since the sparsity of the problem is preserved, the optimization algorithm can use a Hessian and achieve a quadratic converge rate, which remains out of reach for gradient-based training of neural networks. We remark that solving the discrete equations as a minimization problem is not a new idea. It has been developed in various forms in different domains, including the discretize-then-differentiate approach in the context of PDE-constrained optimization (9), as the penalty method (39), and is related to the 4D-VAR problem in variational data assimilation (10, 40), in addition to regularization techniques (41). Differentiable solvers (42–45) serve as a tool for computing the gradients of the solution with respect to parameters of the problem. They can be combined with a gradient-based optimization method to fit the solution to known data and infer the parameters. However, they assume that the forward problem is well-posed, which is not always the case as demonstrated by the “notorious test problem” in optimal control (46, 47) discussed in Fig. S4.

Another related approach is the neural bootstrapping method (NBM) (27, 48). NBM represents the solution with a neural network as in PINN but evaluates conventional discretization schemes on a set of random collocation points using a local stencil around each point. The authors have applied this approach to elliptic problems with discontinuities. Although NBM does not reduce the cost of evaluating the solution in each point introduced by a fully connected NN, it can be less computationally expensive than PINN if the equations contain high-order derivatives. The cost of nested automatic differentiation in PINN grows exponentially with the order of the derivative, but NBM uses finite differences for which the cost is linear. While the authors do not provide a direct comparison with conventional solvers in terms of the accuracy and computational cost, they use small neural networks (e.g. 1,000 parameters) and evaluate the loss on a grid with significantly more cells (e.g. $128^{3}$ cells), suggesting that NBM is more efficient than conventional solvers in terms of the number of required degrees of freedom. However, the presented test cases involve smooth functions with only low-frequency modes (e.g. products of sinusoidals). Given the phenomenon of spectral bias (49) of NNs, more complex solutions will require larger networks. Moreover, the authors do not demonstrate applications of NBM to inverse problems, which is the primary domain of ODIL. In a related work, blended inverse-PDE networks (BiPDE) (50) incorporate a differentiable PDE solver as a layer in an NN with the goal of solving inverse problems for parameterized PDEs. BiPDE is an autoencoder that takes a solution field, passes it through an encoder to the unknown hidden parameters, and uses the parameters as input for the PDE solver layer, which outputs the solution. The autoencoder is trained on data obtained by solving the forward problem with random parameters sampled from a given distribution. The trained encoder can be used for finding the parameters from a known solution field, possibly perturbed by noise. We remark that, unlike BiPDE, ODIL includes both the residual of the discretized PDEs and known data into the loss function and obtains the solution and the parameters by minimizing the total loss.

In the following, we present a potent computational framework for solving forward and inverse problems with missing and noisy data and facilitate its deployment by adopting machine-learning tools. Adopting modern machine-learning tools, such as automatic differentiation, renders the implementation of ODIL as convenient as applying gradient-based methods in PINNs, providing a baseline for the comparison of these methods. We present a number of solutions to prototypical and benchmark problems to evaluate the performance of ODIL. Our results indicate that ODIL outperforms PINNs by up to five orders of magnitude in computational speed and with better accuracy while covering the same class of inverse and ill-posed problems. Finally, we note that ODIL is an example of bridging numerical methods and machine learning. The successful fusion of these two domains with complementary potentials can create further developments of computational methods for challenging scientific and engineering problems described by PDEs.

Methods

Formulation of ODIL

Here, we formulate the ODIL framework to solve forward and inverse problems for PDEs. Using a given discretization of the PDEs, we construct a loss function from the residuals of the discretization, data terms, and regularization terms. The applications in this article involve discretizations on a uniform Cartesian grid, but this general formulation is applicable to other cases, e.g. an unstructured triangular grid. Let C denote the cell indices of a grid in a d-dimensional space $R^{d}$ with the number of cells $N_{C} = | C |$ ⁠. Consider a system of $N_{F}$ discrete equations, $N_{u}$ unknown fields, and $N_{θ}$ parameters

F^{(i)} [u, θ] = 0, i = 1, \dots, N_{F},

(1)

where $F^{(i)} : (R^{C})^{N_{u}} \times R^{N_{θ}} \to R^{C}, i = 1, \dots, N_{F}$ are the residual operators of the discrete equations, $u = (u^{(1)}, \dots, u^{(N_{u})})$ ⁠, are the unknown discrete fields $u^{(j)} \in R^{C}$ ⁠, and $θ \in R^{N_{θ}}$ is the vector of parameters. These discrete equations can represent discretizations of the governing PDEs (e.g. a finite difference scheme), equality constraints to impose known data (e.g. measurements of the solution in a finite set of points), and regularization terms (e.g. discrete Laplace operator as a smoothness constraint). The parameters θ include all unknowns that are not discrete fields (e.g. weights of neural networks used to represent nonlinear terms). The relevant boundary and initial conditions can either be incorporated into the discretization of the PDEs or added as extra equality constraints. The ODIL framework solves the unconstrained minimization problem

min_{u, θ} L (u, θ)

(2)

with the loss function

L (u, θ) = \sum_{i = 1}^{N_{F}} [\frac{1}{N_{C}} \sum_{c \in C} (F_{c}^{(i)} [u, θ])^{2}],

(3)

where $\cdot_{c}$ denotes the value of a field in cell $c \in C$ ⁠.

We consider two approaches to solving this minimization problem. The first approach is to use a standard gradient-based method, such as Adam (51) or L-BFGS-B (52), with the gradients of $L (u, θ)$ computed using automatic differentiation (53). One important enhancement of this approach is the multigrid decomposition technique to accelerate the convergence, which is described further below.

The second approach is the Gauss–Newton method, which iteratively solves the problem by updating the solution based on linearization of the residual operators about the current solution. Let $(u^{s}, θ^{s})$ denote the solution at iteration s. Define the linear approximations to the operators $F^{(i)}$ about the point $(u^{s}, θ^{s})$

\begin{aligned} {\tilde{F}}^{(i), s} [u, θ] \\ = F^{(i)} [u^{s}, θ^{s}] + {\frac{\partial F^{(i)}}{\partial u} |}^{s} [u - u^{s}] + {\frac{\partial F^{(i)}}{\partial θ} |}^{s} [θ - θ^{s}], \end{aligned}

(4)

where $\cdot |^{s}$ indicates that the Jacobian is evaluated at $(u^{s}, θ^{s})$ ⁠. Using the linearized operators, define the approximate loss function

L^{s} (u, θ) = \sum_{i = 1}^{N_{F}} [\frac{1}{N_{C}} \sum_{c \in C} ({\tilde{F}}_{c}^{(i), s} [u, θ])^{2}],

(5)

which is now a quadratic function of $u$ and θ. A minimum of this function provides the solution at the next iteration

(u^{s + 1}, θ^{s + 1}) = \arg min_{(u, θ)} L^{s} (u, θ) .

(6)

On the other hand, a minimum of $L^{s}$ satisfies the optimality conditions

\begin{matrix} \nabla_{u} L^{s} (u^{s + 1}, θ^{s + 1}) = 0, \\ \nabla_{θ} L^{s} (u^{s + 1}, θ^{s + 1}) = 0, \end{matrix}

(7)

which is a system of linear equations. To derive the system, we combine all unknowns in one column vector

U = (u, θ) \in R^{N_{u} N_{C} + N_{θ}},

(8)

all residuals in one column vector

F = (F^{(1)}, \dots, F^{(N_{F})}) \in R^{N_{F} N_{C}},

(9)

and the corresponding derivatives in one matrix

A = [\begin{matrix} \frac{\partial F^{(1)}}{\partial u} & \frac{\partial F^{(1)}}{\partial θ} \\ ⋮ & ⋮ \\ \frac{\partial F^{(N_{F})}}{\partial u} & \frac{\partial F^{(N_{F})}}{\partial θ} \end{matrix}] \in R^{(N_{F} N_{C}) \times (N_{u} N_{C} + N_{θ})} .

(10)

In this notation, the linearized operators (4) take the form

{\tilde{F}}^{s} [U] = F^{s} + A^{s} (U - U^{s}),

(11)

where $F^{s} = F [U^{s}]$ ⁠. The approximate loss function (5) becomes

L^{s} (U) = \frac{1}{N_{C}} ‖ {\tilde{F}}^{s} [U] ‖_{2}^{2} = \frac{1}{N_{C}} ‖ F^{s} + A^{s} (U - U^{s}) ‖_{2}^{2} .

(12)

Finally, the optimality conditions (7) are now equivalent to

(A^{s})^{*} A^{s} (U^{s + 1} - U^{s}) = - (A^{s})^{*} F^{s} .

(13)

Solving this linear system for $U^{s + 1}$ provides the iteration update. To solve the resulting sparse linear system at each iteration, we use a direct method (54, 55) or an algebraic multigrid method (56). Iterative methods require an initial guess to start the iteration. Unless stated otherwise, we use a zero initial guess for the discrete fields in all presented applications. We note that if the discretization is linear and the corresponding optimization problem is quadratic, the Gauss–Newton method converges to the solution after one iteration from any initial guess. If a problem involves a neural network, the initial weights are sampled from a uniform distribution and the initial biases are zero. For brevity, we refer to the above Gauss–Newton method as simply Newton’s method throughout the article.

Calculating sparse Jacobians

Discrete equations (1) that result from approximations of PDEs are often local, i.e. the residual in each cell depends on values of unknown fields in a limited number of neighboring cells. In such cases, the Jacobians with respect to the unknown fields in Eq. (10) are sparse. To calculate them, we intend to use automatic differentiation of scalar functions. Coloring techniques (57, 58) can be used to estimate a sparse Jacobian using multiple evaluations of a Jacobian-vector product. However, we use a more explicit approach that directly tracks the dependencies on the neighboring cells. For simplicity, assume that $N_{u} = 1$ ⁠. Cases with $N_{u} > 1$ are handled by extension. To calculate the Jacobian of a residual operator $F : R^{C} \times R^{N_{θ}} \to R^{C}$ ⁠, we assume that the operator explicitly depends on the field values in K neighboring cells

F [u, θ] = g [S^{(1)} [u], \dots, S^{(K)} [u], θ],

(14)

where $g : (R^{C})^{K} \times R^{N_{θ}} \to R^{C}$ is an element-wise operator of its field arguments, i.e. its value $g_{c} [v^{(1)}, \dots, v^{(K)}, θ]$ in each cell $c \in C$ only depends on $(v_{c}^{(1)}, \dots, v_{c}^{(K)}, θ)$ ⁠. Here $S^{(k)} : R^{C} \to R^{C}, k = 1, \dots, K$ are shift operators selecting values from the neighboring cells or possibly the identity operator. On a Cartesian grid, a discrete field is a d-dimensional array and a shift operator simply shifts the array. On an unstructured triangular grid, a shift operator selects adjacent triangles. Then, we introduce a scalar function

G (v^{(1)}, \dots, v^{(K)}, θ) = \sum_{c \in C} g_{c} (v^{(1)}, \dots, v^{(K)}, θ)

(15)

and calculate the desired Jacobian as

\frac{\partial F}{\partial u} = \sum_{k = 1}^{K} \frac{\partial G}{\partial v^{(k)}} ⊙ S^{(k)},

(16)

where the derivatives of $G$ are evaluated at $v^{(k)} = S^{(k)} [u]$ and $⊙$ is the element-wise product. Note that Eq. (16) expresses the Jacobian through K derivatives of the scalar function $G$ as intended.

Multigrid decomposition

Here, we propose a multigrid technique to accelerate the convergence of optimization methods for problems that involve discrete fields on a grid. Multigrid methods are generally accepted as the fastest numerical methods for solving elliptic differential equations (59). A standard multigrid method consists of the following parts: a hierarchy of grids including the original grid and coarser grids, discretizations of the problem on each grid, interpolation operators to finer levels, and restriction operators to coarser levels. The method iteratively updates the solution on each level, interpolates the update to finer levels, and restricts the residuals to coarser levels.

As discussed above, the optimization problem in ODIL can be solved with Newton’s method which involves a linear system at each step, so the multigrid method can be applied directly to that linear system. However, then we need to construct the linear system explicitly and rely on a general multigrid solver which may have limited efficiency especially on GPUs (60). Furthermore, if the equation involves unknown parameters, such as parameters of a neural network, as demonstrated on the problem of inferring conductivity from temperature, in Results, the corresponding Jacobian matrix is dense, which makes Newton’s method impractical.

Instead of Newton’s method, we can solve the problem with a general optimization algorithm, such as Adam or L-BFGS. However, as we demonstrate on the lid-driven cavity flow in Results, they require orders of magnitude more iterations than Newton’s method. To accelerate the convergence, we propose the following technique. Consider a uniform grid with $N_{1} = N$ points in each direction. Introduce a hierarchy of successively coarser grids of size $N_{i} = N / 2^{i - 1}$ for $i = 1, \dots, L$ ⁠, where L is the total number of levels. Define the multigrid decomposition operator as

M_{L} [u_{1}, \dots, u_{L}] = u_{1} + T_{1} u_{2} + \dots + T_{1} T_{2} \dots T_{L - 1} u_{L},

(17)

where each $u_{i}$ is a field on grid $N_{i}$ ⁠, and each $T_{i}$ is an interpolation operator from grid $N_{i + 1}$ to the finer grid $N_{i}$ ⁠. The multigrid decomposition of a discrete field u on a grid of size N reads

u = M_{L} [u_{1}, \dots, u_{L}] .

(18)

Note that this representation is over-parameterized and therefore not unique. The total number of scalar parameters increases from $N^{d}$ of the original field u to $N_{1}^{d} + \dots + N_{L}^{d}$ for the representation $u_{1}, \dots, u_{L}$ ⁠. We define the interpolation operators $T_{i}$ using linear interpolation (59). The rest of the framework remains the same, including the discretized PDEs and the optimization algorithm. Gradients of the resulting loss function can be computed using automatic differentiation. This technique addresses the issue of locality of gradient-based optimizers by extending the domain of dependence of each scalar parameter so that information can propagate through the grid faster. Figure S3 demonstrates the speedup on the lid-driven cavity flow. We refer to Ref. (61) for further details and examples related to the multigrid decomposition technique. All applications of ODIL with a gradient-based optimizer presented in Results use this technique. The only exception is the wave equation, where we intend to compare ODIL and PINN using standard optimization methods without enhancements.

Results

Wave equation: accuracy and cost of PINN

We apply ODIL and PINN to solve an initial-value problem for the wave equation in two dimensions and compare both methods in terms of accuracy and computational cost. The problem consists of the wave equation

u_{t t} = u_{x x}

(19)

with the initial conditions and boundary conditions

\begin{matrix} u (x, 0) = U (x, 0) \\ u_{t} (x, 0) = U_{t} (x, 0) \\ u (- 1, t) = U (- 1, t) \\ u (1, t) = U (1, t) \end{matrix}

(20)

in a rectangular domain $(x, t) \in Ω = [- 1, 1] \times [0, 1]$ ⁠, where $u (x, t)$ is the unknown function and $U (x, t)$ is the exact solution

U (x, t) = \frac{1}{10} \sum_{k = 1}^{5} (cos (x - t + 0.5) π k + cos (x + t + 0.5) π k) .

(21)

Both methods reduce the problem to minimization of a loss function. ODIL represents the solution on a uniform Cartesian grid of size $N_{x} \times N_{t}$ with cell centers $x_{i} = - 1 + (i + 0.5) Δ x, i = 1, \dots, N_{x}$ ⁠, and $t^{n} = (n + 0.5) Δ t, n = 1, \dots, N_{t}$ ⁠, where $Δ x = 2 / N_{x}$ and $Δ t = 1 / N_{t}$ ⁠. A second-order finite volume discretization yields the following residual operator

F_{i}^{n} [u] = \frac{u_{i}^{n} - 2 u_{i}^{n - 1} + u_{i}^{n - 2}}{Δ t^{2}} - \frac{u_{i + 1}^{n - 1} - 2 u_{i}^{n - 1} + u_{i - 1}^{n - 1}}{Δ x^{2}}

(22)

for $n > 2$ and

\begin{matrix} F_{i}^{2} [u] = \frac{u_{i}^{2} - u_{i}^{1}}{Δ t^{2}} - \frac{U_{t} (x_{i}, 0)}{Δ t} - \frac{u_{i + 1}^{1} - 2 u_{i}^{1} + u_{i - 1}^{1}}{Δ x^{2}} \\ F_{i}^{1} [u] = U (x_{i}, 0) + U_{t} (x_{i}, 0) - u_{i}^{1} \end{matrix}

(23)

for $n = 1, 2$ derived using the initial conditions. The boundary conditions enter the discretization through values in the halo cells set to

\begin{matrix} u_{0}^{n} = \frac{1}{3} (u_{2}^{n} - 6 u_{1}^{n} + 8 U (- 1, t^{n})) \\ u_{N_{x} + 1}^{n} = \frac{1}{3} (u_{N_{x} - 1}^{n} - 6 u_{N_{x}}^{n} + 8 U (1, t^{n})) \end{matrix}

(24)

using second-order extrapolation. ODIL solves the minimization problem for the loss function [3] computed from this residual operator. We choose $N_{x} = N_{t}$ and define the number of parameters of ODIL as $N = N_{x} N_{t}$ ⁠. PINN represents the solution as a fully connected neural network and evaluates the residuals of the differential equation exactly on a finite set of collocation points (22). The network consists of two equally sized hidden layers with $tanh$ activation functions. We define the number of parameters N of PINN as the total number of weights. The number of collocation points for PINN amounts to 8,192 points inside the domain and 768 points for the initial and boundary conditions. The optimization problem is solved with L-BFGS-B or Newton for ODIL and with L-BFGS-B for PINN. Here, we do not use the multigrid decomposition technique of ODIL.

Figure 1 presents examples of the solutions obtained using both methods and also shows how the error and execution time of both methods depend on the number of parameters. The results of PINN include 25 runs with random initial weights and positions of the collocation points. In the case of ODIL, solutions from two optimization methods, L-BFGS-B and Newton, coincide. For each run of PINN or ODIL with L-BFGS-B, we determine the best error achieved during 80,000 epochs and estimate the required number of epochs as the number of epochs to achieve 150% of the best error. Then, we estimate the execution time per epoch as the mean over the last 100 epochs for L-BFGS-B and one epoch for Newton’s method, excluding the time spent on startup, building the computational graph, and output. The measurements are performed on one CPU core of Intel Xeon E5-2690 v3 and exclude the startup time spent on building the computational graph required on the first epoch. We report the execution time as the product of the required number of epochs and the execution time per epoch. Both methods demonstrate similar accuracy. The error of ODIL scales as $1 / N$ corresponding to a second-order accurate discretization. The error of PINN is significantly scattered, but the median error also scales as $1 / N$ ⁠. In addition, we refer to Fig. S5 that shows the effect of weights initialization and demonstrates how PINN behaves on an asymmetric solution. The execution times of the methods, however, are drastically different. With $N \approx 19, 000$ parameters, PINN with L-BFGS-B takes 13 h, ODIL with L-BFGS-B takes 35 min, and ODIL with Newton takes only 0.5 s, which is about $100,000$ times faster than PINN. We note that running the same test on a GPU would reduce the timings of PINN since it creates a significant workload for a GPU. However, the timings obtained here on one CPU core serve as an estimate of the overall computational cost of each method, i.e. the number of required floating point operations.

Fig. 1.

Wave equation solved using PINN and ODIL. A–C) Root-mean-square error relative to the exact reference solution (A), required number of epochs (B), and execution time on one CPU core (C) of various methods: PINN with L-BFGS-B , ODIL with L-BFGS-B , ODIL with Newton , and line with slope $1 / N$ . For PINN, the dots show samples with random initial weights and collocation points, and solids lines show the median value. The number of parameters is the number of weights for PINN and the number of grid cells for ODIL. D) Solution obtained using PINN with L-BFGS-B with two hidden layers of 25 neurons, ODIL with Newton on a grid of $25 \times 25$ cells , and exact reference solution .

Open in new tab Download slide

Poisson equation in two dimensions

We apply ODIL and PINN to solve the boundary-value problem for the Poisson equation

u_{x x} + u_{y y} = f (x, y)

(25)

with zero Dirichlet boundary conditions in a unit domain $(x, y) \in [0, 1]^{2}$ ⁠. The exact reference solution is an oscillatory function

u (x, y) = sin (π (k x)^{2}) sin π y

(26)

with $k = 2$ or $k = 4$ and the right-hand side $f (x, y)$ is found by substituting the exact solution into the equation, i.e. using the method of manufactured solutions. ODIL solves the problem on a grid of $64 \times 64$ cells. PINN uses a neural network of size $32 \times 32 \times 32$ with 400 points on the boundaries and 1,000 collocation points inside the domain, which are resampled every 100 iterations. Both methods use the Adam optimizer with the learning rate set to $0.001$ for PINN and $0.005$ for ODIL. To implement the PINN method, we use the DeepXDE framework (62).

Figure 2 shows results of both methods compared to the reference solution along with the convergence history. The error is measured relative to the exact solution. In the case $k = 2$ ⁠, both methods recover the reference solution achieving an error of $0.02$ for PINN and $0.002$ for ODIL. However, PINN takes about 100 times more iterations to achieve the same error. In the case $k = 4$ ⁠, PINN fails to recover the reference solution after 500,000 epochs, while ODIL achieves an error of $0.1$ ⁠.

Fig. 2.

Poisson equation solved using PINN and ODIL. The exact reference solution is an oscillatory function $u (x, y) = sin (π (k x)^{2}) sin π y$ ⁠. A, B) Solution and history of root-mean-square error obtained using PINN with Adam and ODIL with Adam compared to the exact reference solution for $k = 2$ ⁠. C, D) Same for $k = 4$ ⁠.

Open in new tab Download slide

Velocity from tracer

We consider the evolution of a tracer field governed by the advection equation in two dimensions. The problem is to find a velocity field $u (x, y)$ given that the tracer field $c (x, y, t)$ satisfies the advection equation

\frac{\partial c}{\partial t} + u \cdot \nabla c = 0

(27)

in a unit domain, and the initial $c (x, y, 0) = c_{0} (x, y)$ and final $c (x, y, 1) = c_{1} (x, y)$ profiles are known. The discrete problem is solved in space and time on a $64 \times 64 \times 64$ grid. The loss function includes a discretization of the equation with a first-order upwind scheme, terms to impose known initial and final profiles of the tracer, and regularization terms $‖ 10^{- 3} \nabla^{2} u ‖^{2}$ and $‖ u_{t} ‖^{2}$ to prioritize velocity fields that are smooth and stationary. The results of inference are shown in Fig. 3. The inferred velocity field stretches the initial tracer representing a circle to match the final profile.

Fig. 3.

Inferring the velocity field from two snapshots of a tracer. The tracer satisfies the advection equation with an unknown velocity, and the reference tracer is imposed at the first and final time instants. Arrows show the inferred velocity field.

Open in new tab Download slide

Inferring the velocity field from tracers is an example of the optical flow problem (11) in computer vision and the image registration problem (63) in medical imaging. Furthermore, reconstructing the velocity field from a concentration field can assist experimental measurements.

Inferring conductivity from temperature

Here, we consider an inverse problem of inferring a conductivity function from temperature measurements. We solve the problem in a unit domain $(x, t) \in [0, 1]^{2}$ ⁠. The problem is to find a nonlinear conductivity function $k (u)$ and temperature field $u (x, t)$ that satisfies the heat equation

u_{t} - (k (u) u_{x})_{x} = 0

(28)

with zero Dirichlet boundary conditions $u (0, t) = u (1, t) = 0$ and initial conditions $u (x, 0) = U (x)$ ⁠, where $U (x) = g (x) - g (0)$ and $g (x) = e^{- 50 (x - 0.5)^{2}}$ ⁠. In addition, the temperature field needs to take known values $u (x_{j}, t_{j}) = {\tilde{U}}_{j}$ in a finite set of measurement points $({\tilde{x}}_{j}, {\tilde{t}}_{j})$ for $j = 1, \dots, N_{data}$ with $N_{data} = 200$ ⁠. We discretize the equation on a uniform grid of $N_{x} \times N_{t}$ cells with a second-order Crank–Nicolson scheme such that the residual operator takes the form

\begin{aligned} F_{i}^{n} [u, k] & = \frac{u_{i}^{n} - u_{i}^{n - 1}}{Δ t} \\ - \frac{k (u_{i + 1 / 2}^{n - 1 / 2}) (u_{i + 1}^{n - 1 / 2} - u_{i}^{n - 1 / 2}) - k (u_{i - 1 / 2}^{n - 1 / 2}) (u_{i}^{n - 1 / 2} - u_{i - 1}^{n - 1 / 2})}{Δ x^{2}} \end{aligned}

(29)

for $i = 1, \dots, N_{x}$ and $n = 1, \dots, N_{t}$ ⁠, where $u_{i + 1 / 2} = (u_{i + 1} + u_{i}) / 2$ ⁠, $u^{n - 1 / 2} = (u^{n} + u^{n - 1}) / 2$ ⁠, and $k : R \to R$ is a conductivity function. The initial and boundary conditions enter the discretization through values in the halo cells defined as

\begin{matrix} u_{i}^{0} = \frac{1}{3} (u_{i}^{2} - 6 u_{i}^{1} + 8 U (x_{i})) \\ u_{0}^{n} = \frac{1}{3} (u_{2}^{n} - 6 u_{1}^{n}) \\ u_{N_{x} + 1}^{n} = \frac{1}{3} (u_{N_{x} - 1}^{n} - 6 u_{N_{x}}^{n}) \end{matrix}

(30)

using second-order extrapolation. The loss function for the inverse problem consists of the residual and terms to impose the temperature measurements

\begin{aligned} L (u, θ) & = \frac{1}{N_{x} N_{t}} \sum_{i = 1}^{N_{x}} \sum_{n = 1}^{N_{t}} (F_{i}^{n} (u, k_{θ}))^{2} \\ + \frac{w_{data}^{2}}{N_{data}} \sum_{i = 1}^{N_{x}} \sum_{n = 1}^{N_{t}} ((u_{i}^{n} - {\tilde{U}}_{i}^{n}) χ_{i}^{n})^{2}, \end{aligned}

(31)

where $w_{data} = 2$ ⁠. We assume that the set of measurement points $({\tilde{x}}_{j}, {\tilde{t}}_{j})$ is a subset of the cell centers. If a cell $(i, n)$ contains a measurement point j, then we set ${\tilde{U}}_{i}^{n} = {\tilde{U}}_{j}$ and define a mask field $χ_{i}^{n} = 1$ ⁠, otherwise $χ_{i}^{n} = 0$ ⁠. For Newton’s method, the loss function additionally includes a damping term for the weights of the neural network

L^{Newton} (u, θ) = L (u, θ) + \frac{w_{θ}^{2}}{N_{θ}} \sum_{i = 1}^{N_{θ}} (θ_{i} - θ_{i}^{*})^{2},

(32)

where $w_{θ} = 1$ ⁠, θ is a vector of all weights, and $θ^{*}$ is a vector with the same weights but “frozen” so they are ignored in the linearization of the problem. This regularization makes the matrix in the linear system [13] nonsingular but does not affect the solution if the iteration converges.

To generate the temperature measurements and the reference solution, we specify the reference conductivity function as

k (u) = 0.02 e^{- 20 (u - 0.5)^{2}}

(33)

and solve the forward problem using ODIL on a grid of $256 \times 256$ cells. Then, we solve the inverse problem using PINN and ODIL and compare the results. To represent the unknown conductivity function $k (u)$ ⁠, we use a fully connected neural network of size $1 \times 5 \times 5 \times 1$ ⁠, i.e. one input u, two hidden layers with five neurons in each layer and $tanh$ activations, and its output q passed through $k = 0.1 / (1 + e^{- q})$ to ensure that the conductivity is non-negative. To represent the temperature field $u (x, t)$ ⁠, we use a fully connected neural network of size $2 \times 32 \times 32 \times 32 \times 32 \times 1$ in PINN and a uniform grid of $64 \times 64$ cells in ODIL. The number of collocation points for PINN is 4,096 points inside the domain and 384 for the initial and boundary conditions. To solve the optimization problem with PINN and ODIL with use Adam with the learning rate set to 0.001, and also use Newton’s method with ODIL. Figure 4 shows the convergence history, the inferred temperature and conductivity. Both PINN and ODIL infer similar temperature fields and conductivity functions. The convergence history includes the root-mean-square error in the temperature field relative to the reference solution and the root-mean-square error in the conductivity function in the range $u \in (0, 1)$ with both quantities normalized by their maximum values in the reference solution. ODIL with Newton demonstrates the fastest convergence and takes about 25 iterations to achieve an error of 5% in the conductivity function. ODIL with Adam takes 13,000 iterations to achieve the same error. In contrast, PINN converges slower achieving the same error after 55,000 iterations. The execution time on one CPU core amounts to about 250 ms per iteration for PINN, 3.2 ms per iteration for ODIL with Adam, and 5,500 ms per iteration for ODIL with Newton. On a GPU, PINN takes 3.5 ms per iteration. Consistent with our previous observations, ODIL takes fewer iterations than PINN and each iteration is cheaper, which results in two to three orders of magnitude lower computational cost overall. Figure 5 shows the results if the temperature measurements are perturbed by Gaussian noise with standard deviation $σ = 0.1$ ⁠. The error is computed relative to the unperturbed reference solution. All methods achieve lower accuracy of recovering the conductivity function, ODIL with Newton is affected more by the added noise as the inferred conductivity has a larger error than that of ODIL and PINN optimized with Adam.

Fig. 4.

Inferring conductivity from temperature measurements using ODIL and PINN. Temperature field (A) and conductivity function (B) inferred by PINN with Adam , ODIL with Adam , and ODIL with Newton compared to reference . The reference solution is from forward problem solved using ODIL on a finer grid. The temperature measurements in 200 points (black dots) are taken from the reference solution. History of root-mean-square error in temperature (C) and conductivity (D).

Open in new tab Download slide

Fig. 5.

Inferring conductivity from noisy temperature measurements using ODIL and PINN. Temperature field (A) and conductivity function (B) inferred by PINN with Adam , ODIL with Adam , and ODIL with Newton compared to reference . The reference solution is from forward problem solved using ODIL on a finer grid. The temperature measurements in 200 points (black dots) are taken from the reference solution perturbed by Gaussian noise with $σ = 0.1$ ⁠. History of root-mean-square error in temperature (C) and conductivity (D) relative to unperturbed reference solution.

Open in new tab Download slide

Lid-driven cavity: forward problem

The lid-driven cavity problem is a standard test case (64) for numerical methods for the steady-state Navier–Stokes equations in two dimensions

\begin{matrix} u_{x} + v_{y} = 0, \\ u u_{x} + v u_{y} = - p_{x} + \frac{1}{Re} (u_{x x} + u_{y y}), \\ u v_{x} + v v_{y} = - p_{y} + \frac{1}{Re} (v_{x x} + v_{y y}), \end{matrix}

(34)

where $u (x, y)$ and $v (x, y)$ are the two velocity components and $p (x, y)$ is the pressure. The problem is solved in a unit domain with no-slip boundary conditions. The upper boundary is moving to the right at a unit velocity while the other boundaries are stagnant. We solve this problem using ODIL and PINN. In ODIL, we use a finite volume discretization on a uniform Cartesian grid with the second-order upwind scheme for the momentum equations, Rhie-Chow interpolation (65) to prevent oscillations in the pressure field, and the deferred correction approach (66) that treats high-order discretization explicitly and low-order discretization implicitly to obtain an operator with a compact stencil and therefore reduce the size of the linear system for Newton’s method. In PINN, we use a fully connected network of size $2 \times 64 \times 64 \times 64 \times 64 \times 64 \times 3$ ⁠, with 400 collocation points on the boundaries, and inside the domain there are 10,000 points which are resampled every 100 epochs. The implementation of PINN is based on the DeepXDE framework (62). We refer to Figs. S6 and S7 for a hyperparameter study for PINN.

Figure 6 shows the streamlines for values of $Re$ between 100 and 3,200 obtained using ODIL with Newton on a $128 \times 128$ grid, as well as a convergence history of PINN with L-BFGS-B, ODIL with Adam, and ODIL with Newton. The learning rate of Adam is $10^{- 3}$ ⁠. The root-mean-square error in velocity u in ODIL is computed relative to the solution after 100 epochs with Newton. The error of PINN is measured relative to the solution after 500,000 epochs with L-BFGS-B. To get an idea about the convergence speed of each method, we measure how many epochs it takes to reach an error of $1 %$ ⁠. At $Re = 100$ ⁠, ODIL with Adam takes 5,500 epochs and PINN takes 18,000 epochs. At $Re = 1000$ ⁠, ODIL with Adam takes 6,050 epochs and PINN formally takes 420,000 epochs, but that indicates that the PINN optimization has not converged after 500,000 epochs. In contrast, ODIL with Newton takes three iterations at $Re = 100$ and eight iterations at $Re = 1000$ ⁠, showing that Newton’s method converges much faster, but at the cost of more expensive iterations. ODIL with Newton takes 2,200 ms per epoch, ODIL with Adam takes 8 ms on one CPU core, and PINN takes 30 ms on a GPU Nvidia A100. The results shown in Fig. 6 as contour plots of vorticity and profiles of vertical velocity agree well with reference numerical data (64) for both methods at $Re = 100$ ⁠, but PINN at $Re = 1000$ fails to reproduce the reference data.

Fig. 6.

Lid-driven cavity problem solved using PINN and ODIL. A) Streamlines for $Re = 100, 400, 1000,$ and $3200$ obtained using ODIL. The arrow indicates the direction of upper wall. B–D) Results for $Re = 100$ obtained using PINN with L-BFGS-B and ODIL with Newton . E–G) Same for $Re = 1000$ ⁠. B, E) Contours of vorticity compared to reference (64) . Reprinted with permission from Elsevier. C, F) Profiles of velocity v along the centerline $y = 0.5$ compared to reference (64) . D, G) History of root-mean-square error in velocity u obtained using PINN with L-BFGS-B , ODIL with Newton , and ODIL with Adam . The error of PINN is relative to the solution after 500,000 epochs with L-BFGS-B. The error of ODIL is relative to the solution after 100 epochs with Newton.

Open in new tab Download slide

Lid-driven cavity: flow reconstruction

Here, we solve the same equations [34] as in the forward problem, but we impose the known velocity values $u (x_{j}) = {\tilde{u}}_{j}$ in a finite set of measurement points $x_{i}$ for $j = 1, \dots, N_{data}$ with $N_{data} = 100$ ⁠. However, we remove the no-slip boundary conditions and replace them with linear extrapolation, turning this problem into a flow reconstruction problem. The loss function is the same as in the forward problem, but the known velocity values are imposed through reparameterization of the velocity field, which takes the known value in cells containing a measurement point, and an unknown value otherwise. The problem is solved on a grid of $128 \times 128$ cells and the reference velocity field is obtained from the forward problem. Figure 7 shows the results of reconstruction from 100 points.

Fig. 7.

Lid-driven cavity flow at $Re = 3200$ reconstructed using ODIL from velocity measurements in 100 points (black dots) without boundary conditions for velocity. Inferred and reference velocity u (A), velocity v (B), and vorticity ω (C).

Open in new tab Download slide

Measure of flow complexity

Based on the flow reconstruction from velocity measurements using ODIL in the previous case, we introduce a measure of flow complexity. Consider a velocity field $u_{ref}$ that satisfies the Navier–Stokes equations. First, we define the minimal reconstruction error given K points

E_{\min} (K) = min_{| X | = K} ‖ u_{ref} - u (X) ‖,

(35)

where $u (X)$ is the velocity field reconstructed from points X and the minimum is taken over all sets of K points. Then, we define a measure of flow complexity for a given accuracy $ε > 0$ as the minimal number of points required to achieve that accuracy

K_{\min} (ε) = {K | E_{\min} (K) < ε} .

(36)

To illustrate this complexity measure, we consider four types of flow: uniform velocity, Couette flow (linear profile), Poiseuille flow (parabolic profile), and the flow in a lid-driven cavity. The reference velocity fields in the first three cases are chosen once at an arbitrary orientation and such that the maximum velocity magnitude is unity. As in the previous case, we do not impose any boundary conditions for the velocity, replacing them with linear extrapolation. However, to generate the samples faster, we use a smaller $N \times N$ grid with $N = 64$ and optimize using Newton’s method. Newton’s method with ODIL requires adding a regularization term, for which we choose a discretization of

k_{reg}^{2} (‖ u_{x x} ‖^{2} + ‖ u_{y y} ‖^{2} + ‖ v_{x x} ‖^{2} + ‖ v_{y y} ‖^{2})

(37)

with $k_{reg} = 10^{- 3}$ ⁠. Figure 8 shows the values of $E (K)$ and $K_{\min} (0.05)$ ⁠. The error is the mean absolute difference between the inferred and reference solutions summed over both velocity components. As expected, the uniform flow takes 1 point to reconstruct while adding higher order terms to the velocity increases the number of points. Finally, 29 points are required to reach an error of $0.05$ for the lid-driven cavity flow.

Fig. 8.

Fluid flow reconstruction error depending on the number of measurement points for various flows: uniform, Couette flow, Poiseuille flow, and lid-driven cavity. The dots show samples of random sets of K points and the minimum over them is an estimate of the minimal reconstruction error $E_{\min} (K)$ .

Open in new tab Download slide

Inferring body shape from velocity in two dimensions

Here, we consider a 2D inverse problem of inferring the shape of a body from measurements of the flow velocity around the body. The model consists of the steady-state Navier–Stokes equations with penalization terms to impose the no-slip conditions on the body (67)

\begin{matrix} u_{x} + v_{y} = 0, \\ (1 - χ) (u u_{x} + v u_{y} + p_{x} - \frac{D}{Re} (u_{x x} + u_{y y})) + λ χ u = 0, \\ (1 - χ) (u v_{x} + v v_{y} + p_{y} - \frac{D}{Re} (v_{x x} + v_{y y})) + λ χ v = 0, \end{matrix}

(38)

where λ is a penalization parameter and D is a characteristic length of the body. The shape of the body is described by the body fraction $χ (x, y)$ which takes values $χ = 1$ inside the body and $χ = 0$ outside. The problem is solved in the domain $[0, 2] \times [0, 1]$ with the inlet conditions $u = 1$ and $v = 0$ at $x = 0$ ⁠, outlet at $x = 2$ ⁠, and free-slip walls at $y = 0$ and $y = 1$ ⁠. The discretization of the Navier–Stokes equations follows the one used for the lid-driven cavity problem. The forward problem is to find the velocities u and v and pressure p given equations [38] with boundary conditions and a prescribed body fraction χ. The inverse problem is to find the velocities u and v, pressure p, and body fraction χ given equations [38] with boundary conditions and values of u and v in a set of measurement points.

To solve the inverse problem using ODIL, we formulate it as minimization of the loss function in terms of the unknown fields: velocities u and v, pressure p, and a level-set function $φ$ ⁠. The problem is solved on a $2 N \times N$ grid with $N = 64$ ⁠. The known velocity values are imposed through reparameterization of the velocity field, which takes the known value in cells containing a measurement point, and an unknown value otherwise. The loss function consists of the residuals of the governing equations [38] and a term with the eikonal equation for the level-set function

k_{ϕ}^{2} ‖ | \nabla φ |^{2} - 1 ‖_{2}^{2}

(39)

discretized using an upwind scheme (68). The parameter decays as $k_{ϕ} = max (1, 10 \times (0.5)^{s / 1, 000})$ with epoch s, and the penalization parameter is set to $λ = 10$ ⁠. The level-set function determines the body fraction as

χ = max (0, min (1, \frac{1}{2} + \frac{ϕ}{4 Δ x})) .

(40)

This approach with the level-set function ensures that the interface is diffused over a constant number of cells. The reference data are obtained from the forward problem. The chosen grid is sufficiently fine to resolve the flow based on the grid refinement study in Fig. S11. Figures 9–11 show the results of inference for three different bodies: a circle, an ellipse, and a nonconvex body. The characteristic length of the body is $D = 0.4$ ⁠, and the Reynolds number is $Re = 60$ ⁠. To solve the optimization problem, we use Adam with a learning rate of $0.001$ ⁠. The initial guess is $u = 1$ ⁠, $v = 0$ ⁠, $p = 0$ ⁠, and the level-set function

ϕ (x, y) = 0.02 - \sqrt{(x - 0.5)^{2} + (y - 0.5)^{2}}

(41)

describing a small circle. The optimization takes 20,000 iterations and completes in about 3 min on one CPU core. ODIL recovers the reference shape from velocity measurements in 100 points if the shape is convex. However, we observe that the method does not infer nonconvex shapes and instead finds a convex shape that matches the velocity measurements.

$Inferring body shape from velocity measurements for a flow past a circle at Re=60 using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction χ=0.5. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S13 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.$

Fig. 9.

Inferring body shape from velocity measurements for a flow past a circle at $Re = 60$ using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction $χ = 0.5$ ⁠. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S13 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.

Open in new tab Download slide

$Inferring body shape from velocity measurements for a flow past an ellipse at Re=60 using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction χ=0.5. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S14 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.$

Fig. 10.

Inferring body shape from velocity measurements for a flow past an ellipse at $Re = 60$ using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction $χ = 0.5$ ⁠. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S14 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.

Open in new tab Download slide

$Inferring body shape from velocity measurements for a flow past a nonconvex body at Re=60 using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction χ=0.5. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S15 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.$

Fig. 11.

Inferring body shape from velocity measurements for a flow past a nonconvex body at $Re = 60$ using ODIL. The reference solution is from the forward problem solved using ODIL. A) Inferred and reference vorticity ω overlapped by contours of body fraction $χ = 0.5$ ⁠. The velocity measurements are imposed in 100 points (black dots) chosen at a distance from the body. B) History of root-mean-square in velocity u. C) History of root-mean-square in body fraction χ normalized by the mean of reference. See Fig. S15 for visualizations of velocity u, velocity v, body fraction χ, and level-set function ϕ.

Open in new tab Download slide

Inferring body shape from velocity in three dimensions

Here, we consider a 3D inverse problem of inferring the shape of a body from measurements of the flow velocity around the body. The model consists of the steady-state Navier–Stokes equations with penalization terms to impose the no-slip conditions on the body (67)

\begin{matrix} \nabla \cdot u = 0, \\ (1 - χ) ((u \cdot \nabla) u + \nabla p - \frac{D}{Re} \nabla^{2} u) + λ χ u = 0, \end{matrix}

(42)

where λ is a penalization parameter and D is a characteristic length of the body. The shape of the body is described by the body fraction $χ (x)$ which takes values $χ = 1$ inside the body and $χ = 0$ outside. The problem is solved in the domain $[0, 2] \times [0, 1] \times [0, 1]$ with the inlet condition $u = (1, 0, 0)$ at $x = 0$ ⁠, outlet condition $p = 0$ at $x = 2$ ⁠, and free-slip walls on the other boundaries. The discretization of the Navier–Stokes equations follows the one used for the lid-driven cavity problem. The forward problem is to find the velocity $u$ and pressure p that satisfy Eqs. (42) given a prescribed body fraction χ. The inverse problem is to find the velocity $u$ ⁠, pressure p, and body fraction χ that satisfy Eqs. (42) such that the velocity field takes known values $u (x_{i}) = u_{i}$ in a finite set of N measurement points $x_{i}$ for $i = 1, \dots, N$ ⁠.

To solve the inverse problem using ODIL, we formulate it as minimization of the loss function in terms of the unknown fields: velocity $u$ ⁠, pressure p, and body fraction $\hat{χ}$ ⁠. Here, $\hat{χ}$ is a transformed body fraction defined as $χ = 1 / (1 + e^{- (\hat{χ} + 5)})$ ⁠, so that during the optimization the body fraction χ only takes values between 0 and 1. The loss function is a sum of the residuals of equations (42) and terms to impose the reference data. We do not include any regularization terms. The penalization parameter is set to $λ = 1$ ⁠. The problem is solved on a $129 \times 65 \times 65$ grid, and the reference data are obtained from the forward problem. The characteristic length of the body is $D = 0.4$ and the Reynolds number is $Re = 60$ ⁠. To solve the optimization problem, we use L-BFGS (69) implemented in TensorFlow Probability (70). The initial guess is $u = (1, 0, 0)$ for the velocity, $p = 0$ for the pressure, and $\hat{χ} = 0$ for the transformed body fraction. According to the above transformation, the corresponding initial guess for the body fraction is $χ = 1 / (1 + e^{5})$ ⁠. We terminate the algorithm after 10,000 epochs for the forward problem and 20,000 epochs for the inverse problem.

Figures 12 and 13 show the results of the inference from 684 measurement points for two different bodies: a sphere and a hemisphere. The convergence history includes the velocity error and the body fraction error which are defined relative to the solution of the forward problem. In both cases, ODIL recovers a body shape that qualitatively agrees with the reference, although the relative error in the body fraction field amounts to 50%, so the inferred body volume is larger. On a GPU Nvidia A100, the forward problem with a sphere takes 53 min in total and $320 ms$ per epoch, while the inverse problem takes 122 min in total and $366 ms$ per epoch. We note that solving the same inverse problem on a finer grid of $257 \times 129 \times 129$ cells takes 132 min in total and $400 ms$ per epoch. Therefore, an eightfold increase in the number of grid cells will lead to a minor additional cost in the execution time of 8% since the GPU operates more efficiently with larger arrays.

$Inferring body shape from velocity measurements for a flow past a sphere using ODIL. A, B) Inferred (A) and reference (B) body shape and contours of vorticity magnitude. The velocity measurements are imposed in 684 points (red dots) chosen at a distance from the body. C–E) History of the square root of the loss (C), root-mean-square error in x-velocity (D), and body fraction χ normalized by the mean of reference (E) for the inverse problem and forward problem . The error is relative to the solution of the forward problem after 10,000 epochs.$

Fig. 12.

Inferring body shape from velocity measurements for a flow past a sphere using ODIL. A, B) Inferred (A) and reference (B) body shape and contours of vorticity magnitude. The velocity measurements are imposed in 684 points (red dots) chosen at a distance from the body. C–E) History of the square root of the loss (C), root-mean-square error in x-velocity (D), and body fraction χ normalized by the mean of reference (E) for the inverse problem and forward problem . The error is relative to the solution of the forward problem after 10,000 epochs.

Open in new tab Download slide

$Inferring body shape from velocity measurements for a flow past a hemisphere using ODIL. A, B) Inferred (A) and reference (B) body shape and contours of vorticity magnitude. The velocity measurements are imposed in 684 points (red dots) chosen at a distance from the body. C–E) History of the square root of the loss (C), root-mean-square error in x-velocity (D), and body fraction χ normalized by the mean of reference (E) for the inverse problem and forward problem . The error is relative to the solution of the forward problem after 10,000 epochs.$

Fig. 13.

Inferring body shape from velocity measurements for a flow past a hemisphere using ODIL. A, B) Inferred (A) and reference (B) body shape and contours of vorticity magnitude. The velocity measurements are imposed in 684 points (red dots) chosen at a distance from the body. C–E) History of the square root of the loss (C), root-mean-square error in x-velocity (D), and body fraction χ normalized by the mean of reference (E) for the inverse problem and forward problem . The error is relative to the solution of the forward problem after 10,000 epochs.

Open in new tab Download slide

Conclusion

We introduce the ODIL framework for solving inverse problems for PDEs by casting their discretization as an optimization problem and applying optimization techniques that are widely available in machine-learning software. The concept of casting the PDE as is closely related to the neural network formulations proposed by (15–17) and recently revived as PINNs. However, the fact that we use the discrete approximation of the equations allows for ODIL to be orders of magnitude more efficient in terms of computational cost and accuracy compared to the PINN for which complex flow problems “remain elusive” (71).

We remark that the error of PINN typically scales as a square root of the number of training points regardless of the space dimensionality (25), which can be beneficial for high-dimensional problems (72, 73). However, a recent attempt (36) to apply PINN for simulating an open quantum system has proven unsuccessful, outperformed by the Q-Flow method based on normalizing flows proposed therein. In ODIL, the grid-based discretization may suffer from the curse of dimensionality as the error of a p-order accurate method with N grid points scales as $O (N^{p / d})$ in a d-dimensional space. To solve high-dimensional problems with ODIL, we suggest the use of use of sparse grids (74–76) that introduce a hierarchical basis so that the error scales as $O (N^{p} (\log N)^{d})$ with the number of space dimensions. Another approach is to use a grid in some of the dimensions (e.g. spatial dimensions) and discretize the remaining dimensions with neural networks or basis functions such as spherical harmonics. The latter approach has outperformed fully connected networks by two orders of magnitude on problems of view synthesis in computer graphics (77). The treatment of complex geometries with PINNs is equivalent to treatments deployed in particle methods, using globally interpolating kernels (1). as the collocation points can be positioned arbitrarily. For ODIL, we have demonstrated here the use of the penalization method for the flow past a body. Another possibility is to use an unstructured grid fitted to the complex boundaries or even ideas from particle methods.

We have presented the applications of ODIL to several forward and inverse problems involving PDEs, including the Poisson equation, the heat equation, the advection equation, and the Navier–Stokes equations. In most cases, the discrete loss has been formulated using second-order accurate finite volume discretizations and incorporate exactly the boundary and initial conditions through extrapolation. We emphasize that the primary scope of ODIL is in solving ill-posed and inverse problems. For forward, and in particular time-dependent problems, a multitude of classical solvers based on marching in time are far more efficient than PINNs and ODIL since they do not require storing all the time steps as unknowns.

The ideas and results presented in this article suggest that ODIL, and even more the ideas it entails, can serve as inspiration for advancing the solution of inverse problems. We remark that while ODIL is shown to vastly outperform PINNs, its development was motivated by the widespread use of PINNs and their treatment of equations and data. We believe that there are many promising venues at such interfaces (and competitions) of machine learning and scientific computing for solving inverse problems using physics and data informed methodologies. Ongoing works using ODIL include tracking the evolution of brain tumors from medical images (78), the derivation of material properties for specific heat conduction and the identification of submerged obstacles from surface information.

Acknowledgments

The authors thank Professors Michael P. Brenner (Harvard University), Georges-Henri Cottet (retired), and Thomas Y. Hou (Caltech) for several insightful discussions. They are also grateful to the five anonymous Reviewers for their multiple, valuable suggestions that helped us to improve this article.

Supplementary Material

Supplementary material is available at PNAS Nexus online.

Funding

This work was supported in part by funds from The European High Performance Computing Joint Undertaking (EuroHPC) Grant DCoMEX (956201-H2020-JTI-EuroHPC-2019-1). P.Ko. acknowledges support by the AFOSR program (FA9550-21-1-0058).

Author Contributions

P.Ka.: methodology, software implementation of ODIL, visualization, and writing (original draft). S.L.: methodology, software implementation of PINN, validation, and writing (review and editing). P.Ko.: conceptualization, methodology, supervision, and writing (review and editing).

Preprints

A preprint of this article is published at arXiv:2205.04611.

Data Availability

The source code of the implementation of the method is available at https://github.com/cselab/odil along with examples and instructions to reproduce the results.

References

Koumoutsakos

2005

Multiscale flow simulations using particles

Annu Rev Fluid Mech

(

457

–

487

Month:	Total Views:
January 2024	1,734
February 2024	664
March 2024	457
April 2024	439
May 2024	474
June 2024	339
July 2024	420
August 2024	680
September 2024	438
October 2024	1,443
November 2024	1,405
December 2024	786
January 2025	472
February 2025	506
March 2025	369
April 2025	289
May 2025	45

Article Contents

Solving inverse problems in physics by optimizing a discrete loss: Fast and accurate learning without neural networks

Abstract

Introduction

Methods

Formulation of ODIL

Calculating sparse Jacobians

Multigrid decomposition

Results

Wave equation: accuracy and cost of PINN

Poisson equation in two dimensions

Velocity from tracer

Inferring conductivity from temperature

Lid-driven cavity: forward problem

Lid-driven cavity: flow reconstruction

Measure of flow complexity

Inferring body shape from velocity in two dimensions

Inferring body shape from velocity in three dimensions

Conclusion

Acknowledgments

Supplementary Material

Funding

Author Contributions

Preprints

Data Availability

References

Author notes

Supplementary data

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

This Feature Is Available To Subscribers Only