ABSTRACT

Neural networks are vulnerable to small, non-random perturbations known as adversarial attacks. Such attacks, constructed from the gradient of the loss function with respect to the input, can be identified as conjugates of the input, revealing a systemic fragility within the network structure. Intriguingly, this mechanism is mathematically congruent with the uncertainty principle of quantum physics, casting light on a hitherto unanticipated interdisciplinary connection. This susceptibility is largely intrinsic to neural network systems, highlighting not only the innate vulnerability of these networks but also the potential of interdisciplinary approaches for understanding these black-box networks.

INTRODUCTION

Despite the widely demonstrated success across various domains—from image classification [1] and speech recognition [2] to predicting protein structures [3], playing chess [4] and other games [5], etc.—deep neural networks have recently come under scrutiny for an intriguing vulnerability [6,7]. The robustness of these intricately trained models is being called into question, as they seem to falter under attacks that are virtually imperceptible to human senses.

A growing body of both empirical [8–16] and theoretical [17–20] evidence suggests that these sophisticated networks can be tripped up by minor, non-random perturbations, producing high-confidence yet erroneous predictions—a striking and succinct example being the fast gradient sign method (FGSM) attack [18]. These findings raise significant concerns about the vulnerabilities of such neural networks. If their performance can indeed be undermined by such slight disruptions, the reliability of technologies that hinge on state-of-the-art deep learning could be at risk.
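As a concrete illustration of how such a perturbation is formed, the following PyTorch sketch applies the one-step FGSM attack of [18] to a generic classifier; the model, the cross-entropy loss and the value of ϵ are placeholder choices, not the settings used in the experiments below.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.1):
    """One-step FGSM [18]: perturb x along the sign of the input gradient.

    `model`, `epsilon` and the cross-entropy loss are illustrative choices,
    not the exact settings of this paper.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # l(f(X, theta), Y)
    loss.backward()
    # The gradient w.r.t. the input is the 'conjugate' direction; FGSM keeps
    # only its sign and takes a single step of size epsilon.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()     # assumes pixels live in [0, 1]
```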

A natural question emerges concerning the vulnerability of deep neural networks. Despite the classical approximation theorems [21–24] promising that a neural network can approximate a continuous function to any desired level of accuracy, is the observed trade-off between accuracy and robustness an intrinsic and universal property of these networks?

This query stems from the intuition that stable problems, described by stable functions, should intrinsically produce stable solutions. The debate within the scientific community is still ongoing. If this trade-off is indeed an inherent feature, then a comprehensive exploration of the foundations of deep learning is warranted. Alternatively, if the phenomenon is merely an outcome of how neural networks are constructed and trained, it would be beneficial to concentrate on enhancing these processes, as has already been undertaken, e.g. through certified adversarial robustness via randomized smoothing [25–27] and concurrent training strategies [19,28–34].

In this study, we uncover an intrinsic characteristic of neural networks: their vulnerability shares a mathematical equivalence with the uncertainty principle in quantum physics [35,36]. This emerges when gradient-based attacks [18,37–41] are identified as conjugate variables of the inputs. Since modern networks are trained by minimizing a loss function over their inputs, we can 'design' the conjugate variables by taking the gradient of the loss function with respect to the input variables. These conjugate variables, when added to the inputs, can drastically decrease the prediction accuracy. The uncertainty relation is thus a natural consequence of minimizing the loss function, which makes it inevitable and universal.

Consider a trained neural network model, denoted f(X, θ), where θ signifies the parameters and X represents the input variable of the network. We observe a consistent pattern: the network cannot simultaneously resolve, to arbitrary precision, both the conjugate variable |$\nabla _{X}l(f(X,\theta ),Y)$| (where Y denotes the underlying ground-truth label of X) and the input X, leading to the observed accuracy-robustness trade-off. This phenomenon, analogous to the uncertainty principle of quantum physics, offers a nuanced understanding of the limitations inherent in neural networks.

RESULTS

Conjugate variables as attacks

In quantum mechanics, the concept of conjugate variables plays a critical role in understanding the fundamentals of particle behavior. Conjugate variables are a pair of observables, typically represented by operators, which do not commute. This non-commutativity implies that the order of their operations is significant, and it is intrinsically tied to Heisenberg’s uncertainty principle [35,36]. A prime example of such a pair is the position operator, |$\hat{x}_{\text{qt}}$|, and the momentum operator, |$\hat{p}_{\text{qt}}=-i{\partial }/{\partial x_{\text{qt}}}$|. Here, the order of operations matters such that |$\hat{x}_{\text{qt}}\hat{p}_{\text{qt}}$| is not equal to |$\hat{p}_{\text{qt}}\hat{x}_{\text{qt}}$|, indicating the impossibility of simultaneously determining the precise values of both the position and momentum. This inherent uncertainty is quantitatively expressed in Heisenberg’s uncertainty relation (in natural units): |$\Delta x_{\text{qt}} \Delta p_{\text{qt}} \ge \frac{1}{2}$|, with |$\Delta x_{\text{qt}}$| and |$\Delta p_{\text{qt}}$| respectively representing the standard deviations of the position and momentum measurements.

Drawing an analogy from quantum mechanics, we can formulate the concepts of conjugate variables within the realm of neural networks. Specifically, the features of the input data provided to a neural network can be conceptualized as feature operators, denoted |$\hat{x}_{i}$|, while the gradients of the loss function with respect to these inputs can be viewed as attack operators, denoted |$\hat{p}_{i}={\partial }/{\partial x_{i}}$|. Here, subscript i refers to the ith feature of the entire input feature vector. The attack operators, corresponding to the gradients on inputs, hold a clear relationship with gradient-based attacks, such as the FGSM attack (the application of such attacks often involves a sign function, although this is not strictly necessary [42,43]).

This analogy leads us to an inherent uncertainty relation for neural networks, mirroring Heisenberg’s uncertainty principle in quantum mechanics. For a trained neural network with a properly normalized loss function, the relation reads |$\Delta x_{i} \Delta p_{i} \ge \frac{1}{2}$| (see the derivations in the online supplementary material). This relation, which depends on both the dataset and the network structure, implies an intrinsic limit on how precisely features and attacks can be resolved simultaneously. It thereby reveals an inherent vulnerability of neural networks, echoing the uncertainty we observe in the quantum world.
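The full derivation appears in the online supplementary material; as a reading aid, one standard route to a bound of this form is sketched below, under assumptions not stated in the main text: that the 'neural packet' ψY(X) introduced in Table 1 is real, normalized and vanishes at the boundary of the integration domain, and that Δxi and Δpi are read as the L2 norms ‖(xi − ⟨x̂i⟩)ψY‖ and ‖∂ψY/∂xi‖. This is an illustrative reconstruction, not necessarily the authors' exact argument.

```latex
% One route to the bound, assuming a real, normalized \psi_Y(X) that vanishes
% at the boundary of the integration domain (integration by parts used twice).
\[
\langle \hat{p}_i \rangle
  = \int \psi_Y\, \partial_{x_i}\psi_Y \, dX
  = \tfrac{1}{2}\int \partial_{x_i}\bigl(\psi_Y^{2}\bigr)\, dX = 0 ,
\qquad
\int \bigl(x_i - \langle \hat{x}_i\rangle\bigr)\,\psi_Y\, \partial_{x_i}\psi_Y \, dX
  = -\tfrac{1}{2}\int \psi_Y^{2}\, dX = -\tfrac{1}{2} .
\]
% The Cauchy--Schwarz inequality then gives the uncertainty relation:
\[
\tfrac{1}{2}
  = \Bigl|\int \bigl(x_i - \langle \hat{x}_i\rangle\bigr)\psi_Y \cdot \partial_{x_i}\psi_Y \, dX\Bigr|
  \le \bigl\|(x_i - \langle \hat{x}_i\rangle)\psi_Y\bigr\|\;\bigl\|\partial_{x_i}\psi_Y\bigr\|
  = \Delta x_i\, \Delta p_i .
\]
```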

To intuitively visualize the manifestation of |$\Delta x = (\sum _{i}\Delta x_{i})^{1/2}$| and |$\Delta p = (\sum _{i}\Delta p_{i})^{1/2}$| within neural networks, we use the MNIST dataset as a representative example. The neural network is trained and subsequently attacked at each training epoch.

In this scenario, a trained network partitions the hyperspace (the space inhabited by the samples) into distinct regions. A given input, represented as a point in this space, is classified based on the label of the region it falls within. After 50 epochs of training, the shaded areas encapsulate most correctly labeled data points (Fig. 1a). Conversely, the attacks shift these input points slightly, leading to misclassification. The shifted points do not overlap with the regions defined by the trained network (Fig. 1b).

Figure 1.

Illustration of Δx and Δp in a three-layer convolutional neural network trained on the MNIST dataset over 50 epochs. The data’s high-dimensional feature space was reduced to two dimensions using the t-distributed stochastic neighbor embedding (t-SNE) algorithm for easy visualization. (a) Shaded regions indicate the class predictions of the finally trained network, and the colors of individual points indicate the true labels of the corresponding test samples. (b) All test samples were subjected to the projected gradient descent (PGD) adversarial attack [37,38] with ϵ = 0.1 and α = 0.1/4 over four iterative steps; the adversarially perturbed samples evidently deviate from the class regions within which they should be located. (c) The evolution of the prediction region for the digit ‘8’ at epochs 1, 21 and 41. The deeper the color, the more confident the prediction of the network. (d) The shaded area is the same as in (c), but the points represent the adversarial predictions of the attacked images, illustrating the impact of the PGD attack on model accuracy over the course of training.
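The perturbations shown in panels (b) and (d) follow a standard PGD loop. The sketch below uses the ϵ = 0.1, α = 0.1/4 and four-step settings quoted in the caption; the PyTorch model, the cross-entropy loss and the [0, 1] pixel range are generic placeholders rather than the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.1, alpha=0.1 / 4, steps=4):
    """Projected gradient descent [37,38]: iterate small signed-gradient steps
    and project back into an L-infinity ball of radius epsilon around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            # Projection step: stay within the epsilon-ball and the pixel range.
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```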

We pay particular attention to class number 8, which exhibits the most interconnections with other classes. This class is further illustrated in Fig. 1c and d. As the training epochs progress, the ‘effective radius’ of the shaded area shrinks, causing the area to gradually coincide with the correctly labeled data points (Fig. 1c). Simultaneously, the attacked points deviate further from the shaded regions, and thus from the correctly labeled data (Fig. 1d). An intuitive correspondence of the ‘effective radii’ is found in the time and frequency domain representations of the Fourier transform, where a narrow time-domain representation (Δt) corresponds to a wide frequency-domain representation (Δω), constrained by a similar relation, |$\Delta t\, \Delta \omega \ge 1/(4\pi )$|. Intuitively, Δt and Δω play the role of the ‘effective radii’ in neural networks.
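This time-frequency analogy can be checked numerically. The short NumPy script below (an illustrative aside, not part of the paper's experiments) computes the spreads Δt and Δω of a Gaussian pulse and of its discrete Fourier transform: narrowing the pulse in time widens it in frequency, while the product ΔtΔω stays comfortably above the bound quoted above.

```python
import numpy as np

def spreads(sigma_t, n=4096, t_max=50.0):
    """Return (dt, dw) for a Gaussian pulse of width sigma_t."""
    t = np.linspace(-t_max, t_max, n)
    psi = np.exp(-t**2 / (4 * sigma_t**2))            # Gaussian pulse
    w = 2 * np.pi * np.fft.fftfreq(n, d=t[1] - t[0])  # angular frequencies
    phi = np.fft.fft(psi)                             # frequency-domain amplitude

    def std(x, amplitude):
        p = np.abs(amplitude) ** 2
        p /= p.sum()
        mean = (p * x).sum()
        return np.sqrt((p * (x - mean) ** 2).sum())

    return std(t, psi), std(w, phi)

for s in (0.5, 1.0, 2.0):
    dt, dw = spreads(s)
    print(f"sigma_t={s}: dt={dt:.3f}, dw={dw:.3f}, product={dt * dw:.3f}")
```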

This visualization reveals an inherent trade-off: a reduction in the effective radius of the trained class corresponds to an increase in the effective radius of the attacked points. These two radii can be conceptualized as the visual representations of the uncertainties, Δx and Δp, highlighting the delicate balance between precision and vulnerability in neural networks.

In addition to the adversarial attacks explored in this study, there exist analogous effective conjugates in other types of adversarial attacks as well [37,39–41]. While we are currently unable to explicitly define the conjugates associated with black-box attacks as referenced in [44,45], it is plausible that these methods may adhere to the same underlying principle.

Manifestation of the uncertainty principle in neural networks

The shaded areas in Fig. 1a are actually representative of wave functions in quantum physics. Specifically, for the MNIST dataset, we have 10 corresponding wave functions, one for each of the 10 digit classes. Therefore, the uncertainty relation |$\Delta x \Delta p \ge \frac{1}{2}$| shown in Fig. 1c and d should be reinterpreted as |$\Delta x[\text{class 8}] \Delta p[\text{class 8}] \ge \frac{1}{2}$|, indicating that we are concentrating on the class of number 8. This relation clearly depicts the trade-off between Δx[class 8] and Δp[class 8], as shown in Fig. 2b, accompanied by the associated trade-off between accuracy and robustness (Fig. 2a). The CIFAR-10 dataset, having a higher complexity than MNIST, makes it harder to single out one class that is most connected with the others. In this case, the average values Δx = Mean(Δx[All classes]) and Δp = Mean(Δp[All classes]) are employed instead. Similar results obtained on CIFAR-10 underscore the inherent uncertainty relation that drives the accuracy-robustness trade-off, as demonstrated in Fig. 2c and d.
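For concreteness, the kind of Monte Carlo estimate used for Fig. 2 could be organized roughly as follows; the sampler, the construction of the 'neural packet' ψY from the normalized loss, and the interpretation of Δpi as a weighted spread of ∂ψY/∂xi are all assumptions of this sketch rather than the authors' exact procedure (which is detailed in the online supplementary material).

```python
import torch

def uncertainty_estimates(psi_Y, sampler, n_samples=10_000):
    """Monte Carlo estimate of (Delta x_i, Delta p_i) for one class.

    psi_Y:   callable X -> non-negative per-sample 'neural packet' values,
             assumed to be a suitably normalized function of the loss (Table 1).
    sampler: callable n -> tensor of shape (n, M), assumed to draw roughly
             uniformly over the integration domain, so that self-normalized
             weights proportional to psi_Y**2 approximate the expectations.
    """
    X = sampler(n_samples).requires_grad_(True)        # (n, M)
    psi = psi_Y(X)                                      # (n,)
    grad = torch.autograd.grad(psi.sum(), X)[0]         # d psi_Y / d x_i, (n, M)

    w = (psi ** 2).detach()                             # MC weights ~ |psi_Y|^2
    w = w / w.sum()
    Xd = X.detach()

    mean_x = (w[:, None] * Xd).sum(dim=0)               # <x_i>
    dx = torch.sqrt((w[:, None] * (Xd - mean_x) ** 2).sum(dim=0))

    # Interpreting Delta p_i as the weighted spread of the attack operator
    # acting on psi_Y (see Table 1), approximated from the sampled gradients.
    g = grad / psi.detach().clamp_min(1e-12)[:, None]   # (p_i psi_Y)/psi_Y per sample
    mean_p = (w[:, None] * g).sum(dim=0)
    dp = torch.sqrt((w[:, None] * (g - mean_p) ** 2).sum(dim=0))
    return dx, dp
```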

Figure 2.

Results for three different types of neural networks: a three-layer convolutional network on the MNIST dataset, a four-layer convolutional network on the CIFAR-10 dataset and a residual network [46] with eight convolutional layers on the CIFAR-10 dataset. The term ‘feature’ in the labels denotes results obtained by attacking the features of the input images, while ‘pixel’ corresponds to attacks directed at the pixels themselves. Each neural network was trained for 50 epochs. The quantities Δx and Δp were determined through high-dimensional Monte Carlo integrations. Subfigures (a), (c), (e), (g), (i) and (k) depict the test and robust accuracy, with the robust accuracy evaluated on images perturbed by the PGD adversarial attack using parameters ϵ = 8/255 and α = 2/255 across four iterative steps. Subfigures (b), (d), (f), (h), (j) and (l) illustrate the trade-off between Δx and Δp.

DISCUSSIONS

To elucidate the impact of the uncertainty principle at different levels, we examine its repercussions for input selection, for variations in network architecture and for the interplay between the physical sciences and artificial intelligence.

Attacking features is more effective than attacking pixels

The pixels in our datasets are the raw, unprocessed data gathered directly from the detectors; they carry the features that accurately represent the real world. While it is possible to manipulate these features directly, it is more common and practical to attack the pixels themselves. Doing so, we again observe the accuracy-robustness trade-off (Fig. 2e and g), a fundamental behavior underpinned by the uncertainty relation shown in Fig. 2f and h.

However, it is important to note, as evidenced by the testing accuracy results from the MNIST dataset, that there is an initial learning curve or ‘kick’ that is encountered (Fig. 2e). This is to be expected as the neural network must first familiarize itself with, or ‘learn,’ the features before it can effectively classify the images.

During the initial learning stages, it is also worth noting the fluctuations in both Δx and Δp for input pixels. These fluctuations are more pronounced than those seen for the features, highlighting the random-exploration nature of the learning algorithm. As illustrated in Fig. 2h, the fluctuations can be attributed to the inherent randomness of the learning process, a factor that is crucial for potentially uncovering more optimal weight configurations.

Meanwhile, since attacking features is more effective than attacking pixels, we may expect that providing sufficient training data to the neural network, so that its internal features become better resolved, will also make such attacks more effective. A sketch contrasting the two attack targets is given below.
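Here, 'features' is taken to mean an intermediate representation produced by the network's feature extractor; that reading, and the split of the model into `features` and `classifier` modules, are assumptions of this illustrative sketch rather than the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def input_gradients(features, classifier, x, y):
    """Return gradient-based 'conjugates' taken at the pixel level and at an
    assumed intermediate feature level of the same network."""
    # Pixel-level conjugate: gradient of the loss w.r.t. the raw input.
    x_px = x.clone().detach().requires_grad_(True)
    loss_px = F.cross_entropy(classifier(features(x_px)), y)
    pixel_grad = torch.autograd.grad(loss_px, x_px)[0]

    # Feature-level conjugate: gradient w.r.t. the intermediate representation,
    # which could then be used to perturb the features directly.
    feat = features(x).detach().requires_grad_(True)
    loss_ft = F.cross_entropy(classifier(feat), y)
    feature_grad = torch.autograd.grad(loss_ft, feat)[0]

    return pixel_grad, feature_grad
```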

Phenomenon in attacking well-designed neural networks

Typically, network structures are meticulously designed to fit the demands of specific tasks. Take panels c, d, g and h of Fig. 2 as examples: there, the network only achieves a test accuracy of around 65% because of its relatively simple architecture. To address this, we introduce a more advanced structure that incorporates residual connections and additional convolutional layers, which raises the accuracy to nearly 90%. (Because the quantities Δx and Δp are approximated through high-dimensional Monte Carlo integrations, a process that is exceedingly time consuming, networks of this complexity are the largest for which we can feasibly perform these computations. With stronger computational resources, allowing more accurate calculations on more complex and accurate networks, we believe the computed patterns would conform even better to the expected regularities.) One can still observe a clear trade-off between Δx and Δp for both features and pixels (Fig. 2i–l), and this trade-off is again more pronounced for features than for pixels. Understanding this trade-off allows for more effective optimization of the network structure. In closing, constructing a network structure that best fits the task at hand is pivotal to delivering optimal performance.

Neural network as a complex physical system

As scientific research and engineering become increasingly reliant on artificial intelligence (AI) methods, questions about the future role of human beings in these fields naturally arise. Whether guiding AI or being guided by it, understanding the fundamental principles underpinning these sophisticated structures is paramount. One approach to glean this understanding is to treat neural networks as complex physical systems, thereby applying principles of physics to elucidate the inner mechanisms of AI.

In the study at hand, it is posited that neural networks, much like quantum systems, are subject to a form of the uncertainty principle. This connection potentially uncovers intrinsic vulnerabilities within the neural networks. A comparison of formulas from these distinct fields is presented in Table 1. Here, concepts from quantum physics such as the position, momentum and wave function are juxtaposed with their counterparts in neural networks: image, attack, normalized loss function and so on. This comparison not only reveals striking similarities, but also indicates that the methodologies employed in physical sciences could potentially be harnessed to investigate the properties of neural networks.

Table 1.

Comparison of the uncertainty principle between quantum physics and neural networks. Subscript i represents the ith dimension. For physics, i stands for the spatial coordinates (x, y and z), whereas in the context of neural networks, i refers to the ith feature. When we consider pixels, i simply pertains to the ith pixel. Additionally, we utilize Dirac notation, for instance, |$\langle \hat{x}_{i,\text{qt}}\rangle = \int \psi ^{*}(X)x_{i,\text{qt}}\psi (X)dX$|, where |$\langle \hat{x}_{i,\text{qt}}\rangle$| is the expectation value of the ith dimension. Similarly, |$\langle \hat{x}_{i}\rangle = \int \psi _{Y}(X)x_{i}\psi _{Y}(X)dX$| for neural networks.

Quantum physics concept | Quantum physics formula | Neural-network formula | Neural-network concept
Position | |$X = (x, y, z)$| | |$X = (x_{1}, \ldots , x_{i}, \ldots , x_{M})$| | Image/feature (input)
Momentum (conjugate of position) | |$P = (p_{x}, p_{y}, p_{z})$| | |$P = (p_{1}, \ldots , p_{i}, \ldots , p_{M})$| | Attack (conjugate of input)
Wave function | |$\psi (X)$| | |$\psi _{Y}(X)$| | Normalized loss function (neural packet)
Normalized condition | |$\int |\psi (X)|^{2}dX = 1$| | |$\int |\psi _{Y}(X)|^{2}dX = 1$| | Normalized condition
Position operator | |$\hat{x}_{i,\text{qt}}\psi (X)=x_{i,\text{qt}}\psi (X)$| | |$\hat{x}_{i}\psi _{Y}(X)=x_{i}\psi _{Y}(X)$| | Feature operator
Momentum operator | |$\hat{p}_{i,\text{qt}}\psi (X)=-i{\partial \psi (X)}/{\partial x_{i,\text{qt}}}$| | |$\hat{p}_{i}\psi _{Y}(X)={\partial \psi _{Y}(X)}/{\partial x_{i}}$| | Attack operator
Standard deviation for measuring position | |$\sigma _{x_{i,\text{qt}}}=\langle (\hat{x}_{i,\text{qt}}-\langle \hat{x}_{i,\text{qt}} \rangle )^{2}\rangle ^{1/2}$| | |$\Delta {x_{i}}=\langle (\hat{x}_{i} -\langle \hat{x}_{i}\rangle )^{2}\rangle ^{1/2}$| | Standard deviation for resolving pixel
Standard deviation for measuring momentum | |$\sigma _{p_{i,\text{qt}}}=\langle (\hat{p}_{i,\text{qt}}-\langle \hat{p}_{i,\text{qt}} \rangle )^{2}\rangle ^{1/2}$| | |$\Delta {p_{i}}=\langle (\hat{p}_{i}-\langle \hat{p}_{i}\rangle )^{2}\rangle ^{1/2}$| | Standard deviation for resolving attack
Uncertainty relation | |$\sigma _{x_{i,\text{qt}}}\sigma _{p_{i,\text{qt}}}\ge \frac{1}{2}$| | |$\Delta {x_{i}}\Delta {p_{i}}\ge \frac{1}{2}$| | Uncertainty relation

Meanwhile, the inherent uncertainty principle should, in principle, also hold for large foundation models. Since these large networks have much more complex structures and are trained on huge amounts of data, they can be treated as complex physical systems containing features at many different levels and scales. Therefore, these large models should be more sensitive to the gradient-based conjugates. Indeed, recent research has revealed that large language models can be easily fooled by adversarial attacks [47] and jailbreak prompts [48], posing potential risks as we come to rely more heavily on these models. However, thorough and versatile empirical studies along this line have not yet been widely conducted, leaving considerable room for future work.

The intersection of AI and physics has the potential to provide novel insights into the intricate complexities of neural networks. For instance, the emergent capabilities exhibited by large language models might be correlated with principles found in statistical physics. Moreover, phenomena such as small data learning could be linked to concepts from Noether’s theorem and gauge transformations [49]. By drawing inspiration from physical processes such as weak interactions, we can devise innovative generative models, such as ‘Yukawa generative models’ [50]. Viewing neural networks through the lens of physics can give us a deeper understanding of their structure and functionality from an entirely new perspective.

The synergy between AI and physics, two seemingly distinct fields, could lead to advancements in both domains. It is a two-fold benefit: AI could gain from the structured, universal laws of physics, and in return, physics could possibly leverage the predictive and analytical power of AI.

CONCLUSION

This study reveals a remarkable link between quantum physics and neural networks, demonstrating that these artificial systems, like quantum systems, are subject to a form of the uncertainty principle. This principle, which here manifests as a trade-off between precision and vulnerability, provides new insight into the frailties inherent in neural networks.

Our findings also indicate that attacking the features of a neural network can be more effective than focusing on its pixels. This insight could possibly influence the optimization of network structures for better performance.

Meanwhile, viewing neural networks as complex physical systems allows us to apply principles from physics to understand the behaviour of these AI systems better. This interdisciplinary approach not only enhances our comprehension of AI systems, but also suggests a wealth of potential applications and advancements in both fields.

As we move forward, further exploration of this accuracy-robustness trade-off and its influence on the design of neural networks will be crucial. While this study provides a valuable perspective on the relationship between quantum physics and AI, additional research is still needed to more comprehensively understand how these principles can be applied to improve neural network robustness and design.

METHODS

Detailed methods and materials are given in the online supplementary material.

DATA AVAILABILITY

All data are available in the main text or the online supplementary material. Additional data related to this paper are available at https://doi.org/10.7910/DVN/SWDL1S and https://doi.org/10.48550/arXiv.2205.01493.

FUNDING

This work was partly supported by the National Key Research and Development Program of China (2020YFA0713900) and the National Natural Science Foundation of China (12105227, 12226004 and 62272375).

AUTHOR CONTRIBUTIONS

J.J.Z. contributed to the origin, concepts and physical perspective of the proposed framework. D.Y.M. contributed to the machine learning perspective of this work and supervised the whole project.

Conflict of interest statement. None declared.

REFERENCES

1. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Commun ACM 2017; 60: 84–90.
2. Hinton G, Deng L, Yu D et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag 2012; 29: 82–97.
3. Senior AW, Evans R, Jumper J et al. Improved protein structure prediction using potentials from deep learning. Nature 2020; 577: 706–10.
4. Silver D, Huang A, Maddison CJ et al. Mastering the game of go with deep neural networks and tree search. Nature 2016; 529: 484–9.
5. Schrittwieser J, Antonoglou I, Hubert T et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 2020; 588: 604–9.
6. Szegedy C, Zaremba W, Sutskever I et al. Intriguing properties of neural networks. International Conference on Learning Representations, Banff, Canada, 14–16 January 2014.
7. Ren K, Zheng T, Qin Z et al. Adversarial attacks and defenses in deep learning. Engineering 2020; 6: 346–60.
8. Su D, Zhang H, Chen H et al. Is robustness the cost of accuracy? – A comprehensive study on the robustness of 18 deep image classification models. In: Ferrari V, Hebert M, Sminchisescu C, Weiss Y (eds). Computer Vision—ECCV 2018. Cham: Springer, 2018, 644–61.
9. Eykholt K, Evtimov I, Fernandes E et al. Robust physical-world attacks on deep learning visual classification. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society, 2018, 1625–34.
10. Jia R, Liang P. Adversarial examples for evaluating reading comprehension systems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2017, 2021–31.
11. Chen H, Zhang H, Chen PY et al. Attacking visual language grounding with adversarial examples: a case study on neural image captioning. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2018, 2587–97.
12. Carlini N, Wagner AD. Audio adversarial examples: targeted attacks on speech-to-text. In: 2018 IEEE Security and Privacy Workshops (SPW). Los Alamitos: IEEE Computer Society, 2018, 1–7.
13. Xu H, Caramanis C, Mannor S. Sparse algorithms are not stable: a no-free-lunch theorem. IEEE Trans Pattern Anal Mach Intell 2012; 34: 187–93.
14. Benz P, Zhang C, Karjauv A et al. Robustness may be at odds with fairness: an empirical study on class-wise accuracy. In: NeurIPS 2020 Workshop on Pre-registration in Machine Learning. PMLR, 2021, 325–42.
15. Morcos SA, Barrett GTD, Rabinowitz CN et al. On the importance of single directions for generalization. International Conference on Learning Representations, Vancouver, Canada, 30 April–5 May 2018.
16. Springer J, Mitchell M, Kenyon G. A little robustness goes a long way: leveraging robust features for targeted transfer attacks. In: Advances in Neural Information Processing Systems. Red Hook: Curran Associates, 2021, 9759–73.
17. Zhang H, Yu Y, Jiao J et al. Theoretically principled trade-off between robustness and accuracy. In: Proceedings of the 36th International Conference on Machine Learning. PMLR, 2019, 7472–82.
18. Goodfellow JI, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. International Conference on Learning Representations, San Diego, US, 7–9 May 2015.
19. Tsipras D, Santurkar S, Engstrom L et al. Robustness may be at odds with accuracy. International Conference on Learning Representations, Vancouver, Canada, 30 April–5 May 2018.
20. Colbrook JM, Antun V, Hansen CA. The difficulty of computing stable and accurate neural networks: on the barriers of deep learning and Smale’s 18th problem. Proc Natl Acad Sci USA 2021; 119: e2107151119.
21. Cybenko G. Approximations by superpositions of a sigmoidal function. Math Control Signals Syst 1992; 2: 303–14.
22. Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Networks 1989; 2: 359–66.
23. Gelenbe E. Random neural networks with negative and positive signals and product form solution. Neural Comput 1989; 1: 502–10.
24. Gelenbe E, Mao ZH, Li YD. Function approximation with spiked random networks. IEEE Trans Neural Networks 1999; 10: 3–9.
25. Yang G, Duan T, Hu JE et al. Randomized smoothing of all shapes and sizes. In: Proceedings of the 37th International Conference on Machine Learning. JMLR, 2020, 10693–705.
26. Hao Z, Ying C, Dong Y et al. GSmooth: certified robustness against semantic transformations via generalized randomized smoothing. In: Proceedings of the 39th International Conference on Machine Learning. JMLR, 2022, 8465–83.
27. Cohen J, Rosenfeld E, Kolter Z. Certified adversarial robustness via randomized smoothing. In: Proceedings of the 36th International Conference on Machine Learning. JMLR, 2019, 1310–20.
28. Yang YY, Rashtchian C, Zhang H et al. A closer look at accuracy vs. robustness. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2020, 8588–601.
29. Arani E, Sarfraz F, Zonooz B. Adversarial concurrent training: optimizing robustness and accuracy trade-off of deep neural networks. 31st British Machine Vision Conference 2020, Virtual, 7–10 September 2020.
30. Arcaini P, Bombarda A, Bonfanti S et al. ROBY: a tool for robustness analysis of neural network classifiers. In: 2021 14th IEEE Conference on Software Testing, Verification and Validation (ICST). Los Alamitos: IEEE Computer Society, 2021, 442–7.
31. Sehwag V, Mahloujifar S, Handina T et al. Improving adversarial robustness using proxy distributions. International Conference on Learning Representations, Virtual, 3–7 May 2021.
32. Leino K, Wang Z, Fredrikson M. Globally-robust neural networks. In: Proceedings of the 38th International Conference on Machine Learning. PMLR, 2021, 6212–22.
33. Antun V, Renna F, Poon C et al. On instabilities of deep learning in image reconstruction and the potential costs of AI. Proc Natl Acad Sci USA 2020; 117: 30088–95.
34. Rozsa A, Günther M, Boult ET. Are accuracy and robustness correlated? In: 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA). Los Alamitos: IEEE Computer Society, 2016, 227–32.
35. Heisenberg W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik 1927; 43: 172–98.
36. Bohr N. On the notions of causality and complementarity. Science 1950; 111: 51–4.
37. Kurakin A, Goodfellow JI, Bengio S. Adversarial examples in the physical world. International Conference on Learning Representations, San Juan, US, 2–4 May 2016.
38. Madry A, Makelov A, Schmidt L et al. Towards deep learning models resistant to adversarial attacks. International Conference on Learning Representations, Vancouver, Canada, 30 April–3 May 2018.
39. Papernot N, McDaniel P, Jha S et al. The limitations of deep learning in adversarial settings. In: 2016 IEEE European Symposium on Security and Privacy (EuroS&P). Los Alamitos: IEEE Computer Society, 2016, 372–87.
40. Moosavi-Dezfooli SM, Fawzi A, Frossard P. DeepFool: a simple and accurate method to fool deep neural networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Society, 2016, 2574–82.
41. Modas A, Moosavi-Dezfooli SM, Frossard P. SparseFool: a few pixels make a big difference. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Society, 2019, 9079–88.
42. Zhao M, Dai X, Wang B et al. Further understanding towards sparsity adversarial attacks. In: Sun X, Zhang X, Xia Z, Bertino E (eds). Advances in Artificial Intelligence and Security. Cham: Springer, 2022, 200–12.
43. Zhang C, Benz P, Lin C et al. A survey on universal adversarial attack. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21. International Joint Conferences on Artificial Intelligence Organization, 2021, 4687–94.
44. Su J, Vargas DV, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Trans Evol Comput 2019; 23: 828–41.
45. Andriushchenko M, Croce F, Flammarion N et al. Square attack: a query-efficient black-box adversarial attack via random search. In: Vedaldi A, Bischof H, Brox T et al. (eds). Computer Vision—ECCV 2020. Cham: Springer, 2020, 484–501.
46. He K, Zhang X, Ren S et al. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Society, 2016, 770–8.
47. Shayegani E, Mamun MAA, Fu Y et al. Survey of vulnerabilities in large language models revealed by adversarial attacks. arXiv: 2310.10844.
48. Liu Y, Deng G, Xu Z et al. Jailbreaking ChatGPT via prompt engineering: an empirical study. arXiv: 2305.13860.
49. Peskin ME, Schroeder DV, Martinec E. An introduction to quantum field theory. Phys Today 1996; 49: 69–72.
50. Liu Z, Luo D, Xu Y et al. GenPhys: from physical processes to generative models. arXiv: 2304.02637.
