Abstract

The automatic and accurate extraction of diverse biomedical relations from the literature constitutes a core element of medical knowledge graphs, which are indispensable for healthcare artificial intelligence. Currently, fine-tuning pre-trained language models (PLMs) with various neural networks stacked on top represents a common framework for end-to-end resolution of the biomedical relation extraction (RE) problem. Nevertheless, sequence-based PLMs, to a certain extent, fail to fully exploit the connections between semantics and the topological features formed by these connections. In this study, we present a graph-driven framework named BioGSF for RE from the literature, which integrates shortest dependency paths (SDP) with the entity-pair graph through a graph neural network model. Initially, we leverage dependency relationships to obtain the SDP between entities and incorporate this information into the entity-pair graph. Subsequently, a graph attention network is utilized to acquire the topological information of the entity-pair graph. Ultimately, the obtained topological information is combined with the semantic features of the contextual information for relation classification. Our method was evaluated on two distinct datasets, S4 and BioRED. The outcomes reveal that BioGSF not only attains superior performance over previous models, with micro-F1 scores of 96.68% (S4) and 96.03% (BioRED), but also requires the shortest running time. BioGSF emerges as an efficient framework for biomedical RE.

Introduction

Amidst the exponential growth of biomedical literature, efficiently tapping into this extensive knowledge base has escalated into a formidable task. Knowledge graph (KG) technologies, with their ability to decipher the intricate web of connections among biomedical concepts, have surfaced as potential solutions to this predicament, thereby garnering significant research focus [1, 2]. Currently, a multiplicity of knowledge graphs have emerged, such as NetMe 2.0 [3], SPOKE [4], MKG-GC [5], etc. Knowledge graphs are making significant contributions in various domains, including facilitating drug repurposing initiatives [6], enhancing drug discovery efforts [7], elucidating molecular regulatory mechanisms [8], and informing clinical decision-making processes [9], among others. The knowledge extraction phase in KG development comprises two key components: entity extraction, involving the identification and retrieval of pertinent entities from the literature, and relation extraction (RE), which determines the connections among these extracted entities.

Relation extraction, a pivotal component of knowledge extraction, has garnered considerable attention, driving extensive research by various scholars and collaborative teams. Early attempts at RE predominantly relied on rule-based methods, focusing on analyzing textual syntactic and semantic patterns [10, 11]. However, such methods require manual updates of expressions, and the constructed rules may be tailored only to specific tasks. As a result, researchers have transitioned towards machine-learning techniques for automated relation determination. Singhal et al. [12], for instance, introduced a decision tree method to accurately identify disease-related point mutations from biomedical literature and enhanced model performance by incorporating features such as statistical, distance, and affective elements.

The aforementioned methods require manual feature engineering, with distinct features needing to be designed for different text types. However, deep learning-based methods elegantly address this challenge by automatically learning feature representations from text, capturing various levels of information. These methods exhibit a superior understanding of contextual information and semantic relations. Currently, deep learning models for the RE task can be broadly categorized into two types: (i) Pre-trained language models that utilize large-scale text datasets for pre-training, enabling them to learn the language’s deep semantic and syntactic nuances; (ii) Task-specific deep learning models, which enrich relation information by integrating sentence features, entity information features, dependency features, and other relevant features.

In the realm of pre-trained language models, Lee and colleagues [13] introduced BioBERT, a model based on the BERT architecture. It was trained on data from Wikipedia, BooksCorpus, and PubMed article abstracts, emerging as the most widely used medical pre-trained language model. Meanwhile, the Generative Pre-trained Transformer (GPT), a recently popular large language model, has gained significant attention and increasing application within the biomedical sphere. For instance, Luo et al. [14] presented BioGPT, a GPT variant obtained by pre-training a GPT-2 architecture on a wealth of domain-specific biological data; this model emphasizes comprehensive coverage of biomedical knowledge. Recently, Jin and colleagues [15] proposed a GPT model tailored for the genetics domain, innovatively utilizing the NCBI Web API to answer genomic inquiries.

On the other hand, to extract relational information more accurately, researchers have integrated more effective sentence features into deep learning models. Features based on dependency paths have contributed to improving the accuracy of RE. Miwa et al. proposed an end-to-end LSTM-RNN model based on dependency trees [16]. Later, Peng et al. introduced graph LSTM, which enables bidirectional information transfer between child and parent nodes through dependency relationships [17]. Lai et al. incorporated a neighbor attention mechanism and considered the dependency items of words [18]. Tian et al. designed a dependency-driven approach with attentive graph convolutional networks, which derives local connections from head words and child nodes in dependency trees and global connections from the shortest dependency paths (SDP) [19]. Chen et al. not only utilized the dependency relationships and types between words but also distinguished between reliable and noisy dependency information through weighting [20].

Although external information such as dependency relationships improves the performance of relation recognition, the connections between entity pairs themselves also provide rich feature information. Motivated by this, we propose BioGSF, a novel graph-driven framework for biomedical RE based on the entity-pair graph and the SDP. We use dependency relationships to obtain the SDP between entities and incorporate this information into the entity-pair graph. We then employ a graph attention network (GAT) to obtain topological information from the entity-pair graph. Finally, we combine the acquired topological information with the semantic features of the contextual information for relation classification. This framework harnesses the complementary strengths of diverse methodologies to overcome the constraints of current approaches and to enhance the model's efficacy and generalization capability in RE.

Methods

Datasets

In this study, we opted for two extensive datasets, S4 and BioRED [21], for the training and testing of our model. The S4 dataset was curated by selecting four distinct relationship types: Protein–Protein Interaction (PPI), Drug–Drug Interaction (DDI), Chemical–Protein Interaction (CPI), and Chemical–Disease Interaction (CDI). These relationships were extracted from Dataset-11 in our prior research [5], which amalgamated eight different resources, including BC5CDR [22], BC6ChemProt [23], BC7DrugProt [24], LLL [25], AIMed, HPRD-50, BioInfer, and IEPA [26]. BioRED, on the other hand, is a document-level biomedical relation dataset encompassing 600 PubMed abstracts [21]. Apart from the aforementioned four relationship types, BioRED also features Protein–Disease Interaction (PDI), Disease–Variant Interaction (DVI), and Chemical–Variant Interaction (CVI). Both datasets provide pre-annotated entity information, and the sentences used in our study are drawn from them with the entity tags already in place. As a result, entity recognition falls outside the scope of this work. Details of the two datasets are summarized in Table 1.

Table 1. The statistics of the S4 and BioRED datasets

| Relation type | S4: Train set | S4: Test set | BioRED: Train set | BioRED: Test set |
|---|---|---|---|---|
| PPI | 21,090 | 1114 | 3744 | 614 |
| DDI | 11,785 | 629 | 2407 | 690 |
| CPI | 60,850 | 3204 | 3984 | 899 |
| CDI | 5659 | 298 | 4303 | 1166 |
| PDI | - | - | 7216 | 1715 |
| DVI | - | - | 1819 | 686 |
| CVI | - | - | 260 | 26 |
| Total | 99,384 | 5236 | 23,733 | 5796 |

Overall architecture

Our model BioGSF is primarily divided into three modules: the embedding layer, the feature fusion layer, and the classification layer. BioGSF takes into account not only the dependencies between entities but also the topological information of the entity-pair graph. The overall framework is illustrated in Fig. 1. We chose BioBERT [13] as the pre-trained language model to obtain the embedding information of sentences. The feature fusion layer integrates both sentence semantic features and graph topological features. For sentence semantics, we use a fully connected layer to output contextual information, entity information, and SDP information. To capture graph information, we combine entity-pair information and SDP information, which are then fed into a GAT to obtain topological information. Finally, the semantic and topological information are integrated and fed into the classification layer for categorization.

Figure 1. The overall architecture of BioGSF.

Embedding layer and entity feature representation

Given an input sentence S, we derive its embedding via BioBERT, which has been trained on a corpus comprising 4.5 billion words from PubMed abstracts and 13.5 billion words from PubMed Central full-text articles. Within the sentence, we designate the positions of two entities, denoted as Entity 1 and Entity 2. The start and end of Entity 1 are marked by ‘[s1]’ and ‘[e1]’, respectively, while ‘[s2]’ and ‘[e2]’ similarly delineate the boundaries of Entity 2. This annotation process aids in the extraction of entity features during subsequent steps. Initially, the sentence undergoes tokenization to yield individual tokens. BioBERT then generates embeddings for each token, encoding positional, semantic, and contextual information. Furthermore, BioBERT produces a ‘[CLS]’ output that encapsulates the overall semantic content of the text sequence. We employ this as a sentence feature, which will be integrated with the forthcoming fusion features.
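As a concrete illustration of this step, the following minimal sketch (not the authors' released code; the Hugging Face checkpoint name and the marker handling are assumptions for illustration) adds the entity markers as special tokens and encodes a sentence with BioBERT:

```python
# A minimal sketch of entity-marker insertion and BioBERT encoding,
# assuming the Hugging Face transformers library and the public
# dmis-lab BioBERT checkpoint.
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.1")
model = AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.1")

# Register the marker tokens so the tokenizer does not split them apart.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["[s1]", "[e1]", "[s2]", "[e2]"]}
)
model.resize_token_embeddings(len(tokenizer))

sentence = "[s1] TREM2 [e1] variants increase the risk of [s2] Alzheimer's disease [e2] ."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

H = outputs.last_hidden_state   # token embeddings, shape (1, seq_len, 768)
h_cls = H[:, 0]                 # '[CLS]' embedding used as the sentence feature
```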

However, it is important to note that an entity can be tokenized into multiple tokens. To capture more precise semantic features of the entity, we encode the positions of the classified entities by labeling their locations with a mask. Furthermore, the entities are represented by the average embedding of their constituent tokens. Let estart and eend denote the start and end positions of the entity, respectively, and H represent the output embedding from BioBERT. The representation of the entity is given by Equation (1).

$$E = \mathrm{FCLayer}\left(\frac{1}{e_{end}-e_{start}+1}\sum_{i=e_{start}}^{e_{end}} H_i\right) \tag{1}$$

where FCLayer signifies a linear transformation layer coupled with a dropout layer, whose purposes are to regulate the output dimensionality of the entity embedding and to mitigate overfitting, respectively.
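A minimal PyTorch sketch of Equation (1), assuming a standard linear-plus-dropout FCLayer (the dropout rate and output dimension here are illustrative, not the paper's exact settings):

```python
import torch
import torch.nn as nn

class FCLayer(nn.Module):
    """Linear transformation with dropout, as described for Equation (1)."""
    def __init__(self, in_dim: int, out_dim: int, p_drop: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(p_drop)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        return self.linear(self.dropout(x))

def entity_representation(H, e_start: int, e_end: int, fc: FCLayer):
    """Average the token embeddings of the entity span, then project (Eq. 1)."""
    span = H[:, e_start : e_end + 1]   # (batch, span_len, hidden)
    return fc(span.mean(dim=1))        # (batch, out_dim)

# Usage: H is BioBERT's output, e.g. shape (1, seq_len, 768).
H = torch.randn(1, 32, 768)
fc = FCLayer(768, 256)
E1 = entity_representation(H, e_start=3, e_end=5, fc=fc)
```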

Dependency feature representation

Although pre-trained language encoding achieves good performance, it often captures contextual information while ignoring the grammatical role of words in sentences. For instance, a word can function as a subject, predicate, attribute, adverbial, or complement in different sentences. To address this issue, researchers have proposed using dependency relationships to obtain richer entity relationship information. In this study, we employed the NLP tool ScispaCy [27] to obtain dependency relationships from sentences. ScispaCy is a natural language processing library built on the spaCy framework and trained with biomedical domain expertise, and it yields superior outcomes for semantic analysis of medical data. The part-of-speech tagging and dependency parsing modules within ScispaCy use the GENIA 1.0 corpus [28], PubMed abstracts, and the OntoNotes 5.0 corpus [29, 30] as their corpora, while the named entity recognition module relies on the MedMentions dataset [31]. The parsing results of ScispaCy include the head word, token, category, and child nodes for each token. The category is used to distinguish the annotated entity pairs. When an entity is parsed into multiple tokens, this facilitates marking all entity positions, their head words, and child nodes.

Leveraging the ScispaCy toolkit, we can tokenize sentences and delve into their syntactic structure to extract part-of-speech tags and establish dependency relationships among words, as depicted in Fig. 2A. The resultant dependency relationships are organized into a tree structure that interlinks the entities. Drawing upon this dependency tree, we implemented the Breadth-First Search algorithm to identify the SDP between pairs of entities, as depicted in Fig. 2B. In cases where entity pairs span across sentences, the ROOT node may not be singular, potentially hindering the identification of an SDP. To mitigate this, we utilize the head word of the entity, as identified by ScispaCy, as the SDP. The head word typically serves as the central term within a subtree, capturing the primary semantic relationship within the sentence or clause. This approach ensures that the model maintains robust comprehension and inferential capabilities even within intricate syntactic frameworks by constructing pathways that connect semantic cores. The detailed algorithm is outlined in Algorithm 1.

Figure 2. Shortest dependency path acquisition and processing module. A. ScispaCy parsing results. B. Example of shortest dependency path. C. Shortest dependency path representation acquisition module.

Algorithm 1: Shortest Dependency Path Calculation

Input: Marked sentences S1, S2
Output: Shortest dependency path
Tools: ScispaCy

1. $doc1 = [\{h_1, w_1, t_1, c_1\}, \dots] \leftarrow ScispaCy(S1)$
2. $doc2 \leftarrow ScispaCy(S2)$
3. Identify the candidate $ROOT$ nodes from the parsing results, where $\{h, w, t, c\}$ = (head word, word, type, children)
4. for each $ROOT$ in candidate $ROOT$s do
     $path1 \leftarrow ScispaCy(ROOT, Entity1)$
     $path2 \leftarrow ScispaCy(ROOT, Entity2)$
   end
5. if $path1$ and $path2$ share a common path then
     $Path \leftarrow$ shortest dependency path
   else
     $Path \leftarrow$ Entity1's head word + Entity2's head word
   end
6. return $Path$
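For concreteness, the following sketch implements the breadth-first search step of Algorithm 1 over a spaCy/ScispaCy dependency parse. The pipeline name en_core_sci_sm is one of ScispaCy's released models and must be installed separately; the single-document handling here is a simplification of the two-sentence case:

```python
# A minimal sketch of SDP extraction via BFS over the undirected
# dependency tree produced by a spaCy/ScispaCy parse.
from collections import deque
import spacy

nlp = spacy.load("en_core_sci_sm")  # a ScispaCy biomedical pipeline

def shortest_dependency_path(doc, tok1, tok2):
    """Breadth-first search between two tokens over head-child edges."""
    # Build undirected adjacency from head->child edges.
    adj = {t.i: set() for t in doc}
    for t in doc:
        if t.head.i != t.i:          # in spaCy, the ROOT is its own head
            adj[t.i].add(t.head.i)
            adj[t.head.i].add(t.i)
    # Standard BFS with parent pointers to recover the path.
    parents, queue, seen = {tok1.i: None}, deque([tok1.i]), {tok1.i}
    while queue:
        cur = queue.popleft()
        if cur == tok2.i:
            path, node = [], cur
            while node is not None:
                path.append(doc[node].text)
                node = parents[node]
            return list(reversed(path))
        for nxt in adj[cur]:
            if nxt not in seen:
                seen.add(nxt)
                parents[nxt] = cur
                queue.append(nxt)
    # No common path (e.g. entities in different parse trees):
    # fall back to the entities' head words, as in Algorithm 1.
    return [tok1.head.text, tok2.head.text]

doc = nlp("TREM2 variants increase the risk of Alzheimer disease.")
print(shortest_dependency_path(doc, doc[0], doc[6]))
```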

To extract more effective information from the SDP, we designed a dedicated feature extraction module for it, as shown in Fig. 2C. First, the obtained SDP is encoded by BioBERT, denoted as HD; richer and more effective embedded information is then obtained through a multi-head attention mechanism. Finally, a one-dimensional convolutional network (1D-CNN) is used for dimensionality reduction to obtain the SDP representation. The 1D-CNN effectively extracts local features from sequential data and captures local patterns and characteristics at different positions in the sequence. The process is represented in Equation (2):

$$D = \mathrm{CNN}_{1D}\big(\mathrm{MultiHead}(H_D)\big) \tag{2}$$
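A minimal sketch of this module in PyTorch; the head count, kernel size, and the max-pooling used for the final reduction are illustrative assumptions rather than the paper's exact settings:

```python
import torch
import torch.nn as nn

class SDPEncoder(nn.Module):
    """Multi-head self-attention over the SDP embeddings, followed by a
    1D CNN for dimensionality reduction (a sketch of Equation 2)."""
    def __init__(self, hidden: int = 768, out_dim: int = 256,
                 heads: int = 8, kernel: int = 3):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.conv = nn.Conv1d(hidden, out_dim, kernel_size=kernel, padding=1)

    def forward(self, H_D):                 # (batch, sdp_len, hidden)
        A, _ = self.attn(H_D, H_D, H_D)     # self-attention over the SDP
        A = A.transpose(1, 2)               # (batch, hidden, sdp_len)
        C = torch.relu(self.conv(A))        # local-pattern extraction
        return C.max(dim=2).values          # pool positions -> (batch, out_dim)

H_D = torch.randn(2, 16, 768)               # BioBERT-encoded SDP tokens
D = SDPEncoder()(H_D)                        # (2, 256)
```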

Graph topology information representation

The mainstream graph neural networks primarily treat the words in a single sentence as nodes in the graph. Here, we utilize the entity-pair graph [32] to obtain graph topology information. In this graph, the node set represents entity pairs, and edges represent connections between entity pairs. If two entity pairs share a common entity, they are connected by an edge, represented as 1 in the adjacency matrix. As illustrated in Fig. 3, sentence A contains the entity pair (e1: TREM2; e2: AD), sentence B contains (e1: R47H; e2: AD), and sentence C contains (e1: R47H; e2: TREM2). It is evident that each pair of sentences A, B, and C shares a common entity, thus they are all connected by edges.

Figure 3. Schematic diagram of entity pairs.
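As a concrete illustration of Fig. 3, the following sketch builds the adjacency matrix of the entity-pair graph for the three example sentences (the self-loops are an assumption, commonly used with graph attention networks, rather than a detail stated in the paper):

```python
# A sketch of building the entity-pair graph's adjacency matrix:
# nodes are entity pairs, and two nodes are linked when they share an entity.
import numpy as np

pairs = [("TREM2", "AD"),    # sentence A
         ("R47H", "AD"),     # sentence B
         ("R47H", "TREM2")]  # sentence C

n = len(pairs)
adj = np.eye(n, dtype=int)                 # self-loops (assumption)
for i in range(n):
    for j in range(i + 1, n):
        if set(pairs[i]) & set(pairs[j]):  # shared entity -> edge
            adj[i, j] = adj[j, i] = 1

print(adj)  # every off-diagonal entry is 1 for this example
```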

GAT is a network architecture for graph-structured data that assigns weights to neighbors through an attention mechanism and dynamically adjusts the relationship weights between different nodes during learning. We therefore combine the information of entity pairs with the SDP information and input them into the GAT. The combined information is denoted as the fused embedding $H = [E_1, E_2, D]$, where $E_1$ and $E_2$ represent the feature information of the two entities and $D$ represents the dependency feature information. Given the input embeddings of the graph as $H = \{h_1, h_2, \dots, h_n\}$, the attention score is calculated as $a(Wh_i, Wh_j)$, which represents the importance of node $j$ to node $i$. Let $N_i$ denote the set of all neighbors of node $i$ in the graph. We normalize over the neighboring nodes using softmax with a LeakyReLU activation to obtain the attention coefficient $a_{ij}$, as shown in Equation (3):

$$a_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}\big(a(Wh_i, Wh_j)\big)\big)}{\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}\big(a(Wh_i, Wh_k)\big)\big)} \tag{3}$$

Once the normalized attention scores have been obtained, we can extract the output features corresponding to each node. To guarantee the robustness of the self-attention learning process, a multi-head attention mechanism is utilized to aggregate $K$ distinct attention results, yielding the output denoted as $h_i^{(0)}$. Furthermore, a final GAT layer is appended as the output layer:

$$h_i^{(0)} = \Big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in N_i} a_{ij}^{k} W^{(0)} h_j\Big) \tag{4}$$

$$F_g = \sigma\Big(\sum_{j \in N_i} a_{ij} W^{(1)} h_j^{(0)}\Big) \tag{5}$$

where $F_g$ is the topological information of the final graph, $W^{(0)}$ and $W^{(1)}$ are the weight matrices of the linear transformations, which reduce the input embeddings to dimensions $d_m$ and $d_g$, respectively, and $\sigma$ is the activation function.
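A minimal sketch of the two GAT layers of Equations (4) and (5), here expressed with PyTorch Geometric's GATConv; the dimensions d_m and d_g and the ELU activation are illustrative assumptions, not the paper's exact configuration:

```python
# A sketch of the multi-head GAT layer (Eq. 4) and the single-head
# output layer (Eq. 5), using torch_geometric.
import torch
from torch.nn.functional import elu
from torch_geometric.nn import GATConv

d_in, d_m, d_g, K = 768, 128, 64, 4

gat1 = GATConv(d_in, d_m, heads=K, concat=True)  # multi-head layer, Eq. (4)
gat2 = GATConv(d_m * K, d_g, heads=1)            # output layer, Eq. (5)

x = torch.randn(3, d_in)                         # fused node features [E1, E2, D]
edge_index = torch.tensor([[0, 1, 0, 2, 1, 2],   # undirected edges of Fig. 3
                           [1, 0, 2, 0, 2, 1]])

h0 = elu(gat1(x, edge_index))
F_g = gat2(h0, edge_index)                       # topological features, (3, d_g)
```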

Relation classification

We utilize the aforementioned information to classify relationships. Initially, we integrate the semantic features of the sentence, the entity features, and the shortest dependency path information to generate $F_S$, as represented in Equation (6):

$$F_S = \mathrm{FCLayer}\big(H_{cls} \oplus E_1 \oplus E_2 \oplus D\big) \tag{6}$$

where $H_{cls}$ denotes the sentence feature, $\oplus$ represents the concatenation operation, and the FCLayer reduces the dimensionality of the four types of information to $d_s$.

Then the topological information $F_g$ is incorporated, and the outcome is classified using a softmax classifier. Assuming $r$ represents the number of label categories, $W_r$ denotes a weight matrix of size $r \times (d_s + d_g)$, and $b_r$ is the bias. The resulting probability is given in Equation (7):

$$p(y \mid t) = \mathrm{softmax}\big(W_r\,(F_S \oplus F_g) + b_r\big) \tag{7}$$

To quantify the model’s loss, we employ cross-entropy loss, as defined in Equation (8):

$$\mathcal{L} = -\sum_{i} y_i \log p\big(y_i \mid t_i\big) \tag{8}$$

For tag prediction, we apply the argmax function to determine the most probable label from p(y|t).

$$\hat{y} = \arg\max_{y}\, p(y \mid t) \tag{9}$$
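A minimal PyTorch sketch of the classification layer of Equations (7)-(9), taking the fused semantic feature F_S and topological feature F_g as inputs with illustrative dimensions; note that CrossEntropyLoss applies the softmax of Equation (7) internally when computing the loss of Equation (8):

```python
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    """Concatenate semantic (F_S) and topological (F_g) features, then
    classify with a softmax layer (a sketch of Eqs 7-9)."""
    def __init__(self, d_s: int, d_g: int, num_labels: int):
        super().__init__()
        self.W_r = nn.Linear(d_s + d_g, num_labels)  # W_r is r x (d_s + d_g)
        self.loss_fn = nn.CrossEntropyLoss()         # cross-entropy, Eq. (8)

    def forward(self, F_S, F_g, labels=None):
        logits = self.W_r(torch.cat([F_S, F_g], dim=-1))  # Eq. (7), pre-softmax
        preds = logits.argmax(dim=-1)                     # Eq. (9)
        loss = self.loss_fn(logits, labels) if labels is not None else None
        return preds, loss

clf = RelationClassifier(d_s=256, d_g=64, num_labels=5)
preds, loss = clf(torch.randn(2, 256), torch.randn(2, 64), torch.tensor([1, 3]))
```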

The detailed model parameters are listed in Supplementary Table S1.

Evaluation metrics

To evaluate the performance of our method and compare it with previous methods, we utilized the micro-average and macro-average F1-score metrics. Both scores are determined based on precision and recall, with their formulas defined as follows:

$$P_i = \frac{TP_i}{TP_i + FP_i} \tag{10}$$

$$R_i = \frac{TP_i}{TP_i + FN_i} \tag{11}$$

$$F1_i = \frac{2 P_i R_i}{P_i + R_i} \tag{12}$$

$$P_{micro} = \frac{\sum_{i} TP_i}{\sum_{i} (TP_i + FP_i)} \tag{13}$$

$$R_{micro} = \frac{\sum_{i} TP_i}{\sum_{i} (TP_i + FN_i)} \tag{14}$$

$$F1_{micro} = \frac{2\, P_{micro} R_{micro}}{P_{micro} + R_{micro}} \tag{15}$$

$$F1_{macro} = \frac{1}{n} \sum_{i=1}^{n} F1_i \tag{16}$$

where $TP_i$, $FP_i$, and $FN_i$ are the true positives, false positives, and false negatives for class $i$, and $n$ is the number of classes.
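For reference, these micro- and macro-averaged scores can be computed with scikit-learn, as in this small sketch with toy labels:

```python
# Micro- and macro-averaged F1 computed with scikit-learn for illustration.
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1, 0, 3]
y_pred = [0, 1, 2, 1, 1, 0, 3]

print(f1_score(y_true, y_pred, average="micro"))  # Eq. (15)
print(f1_score(y_true, y_pred, average="macro"))  # Eq. (16)
```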

Results

Ablation studies

To validate the efficacy of our model, we performed ablation studies on each of the proposed components: the choice of graph neural network, the SDP network layer, the maximum length L of the SDP, and the learning rate.

Initially, to validate the effectiveness of the graph neural network, we compared three variants: one that outputs the fused information (obtained by combining the SDP with entity pairs) directly to the classification layer, and two that first feed this information into a graph neural network (GCN or GAT) before classification. As shown in Table 2, the results demonstrate that the combination of GAT, the multi-head attention mechanism, and the 1D-CNN yields the best performance, with micro-F1 scores of 96.47 and 96.03 on the S4 and BioRED datasets, respectively, significantly outperforming the other methods. Furthermore, we compared our method with the approach that uses the average embedding of entities, as in RBERT [33]. The results show that the combined use of entity information and the SDP outperforms the method that relies solely on entity information, indicating that the SDP has a positive effect on relation classification.

Table 2. Comparison of model performance (micro-F1) with different network layers

| Layer | S4 | BioRED |
|---|---|---|
| AVG | 94.79 | 95.48 |
| MA + 1D-CNN | 94.71 | 95.67 |
| GCN + AVG | 95.03 | 95.60 |
| GCN + MA + 1D-CNN | 94.65 | 95.69 |
| GAT + AVG | 95.02 | 95.26 |
| GAT + MA + 1D-CNN (ours) | 96.47 | 96.03 |

AVG: entity average embedding method; MA: multi-head self-attention mechanism.

Next, we investigated the maximum length L of the SDP. Here, the length of the SDP refers to the length of the sequence obtained after tokenizing the SDP with BioBERT. When sentences are excessively long, the SDP between two entities may also become long, potentially introducing significant noise into the obtained shortest dependency. We computed statistics on the SDP length L in the two datasets. As shown in Fig. 4A, over 95% of sentences have an SDP length of at most 16, and all sentences have an SDP length below 64. To test the effectiveness and generalization of our model, we set the maximum SDP length to 16, 32, and 64, i.e., we truncated the SDP so that the dataset size remained the same across the three length settings.

Figure 4. SDP length statistics and their effect. A. Proportion of different SDP lengths. B. Comparison of different SDP lengths on model performance.

The results are shown in Fig. 4B. When L = 32, the model performs best on the S4 dataset but poorly on the BioRED dataset, achieving only 94.6% accuracy. When L = 64, although it performs slightly worse on the S4 dataset compared to L = 32, it significantly outperforms other settings on BioRED. L = 64 covers the SDP of all sentences, ensuring the completeness and generalization of the model.

Moreover, we performed an ablation study on the learning rate, testing settings of 1e-5, 2e-5, and 3e-5 to analyze its impact on the model's performance. We observed that with a learning rate of 1e-5 the model performed better than with the other settings: it converged more effectively and avoided issues such as oscillation or overfitting. Based on this result, we selected 1e-5 as the optimal learning rate for this study and conducted all subsequent experiments on this basis. Detailed hyper-parameter information is listed in Table S1 and Fig. S1.

Cross-validation

To rigorously validate the performance and stability of our proposed model, we conducted five-fold cross-validation experiments on the S4 and BioRED datasets. As shown in Table 3, on the S4 training set, the average micro-F1 and macro-F1 are 95.93 ± 0.29 and 90.42 ± 0.61, respectively. Similarly, on the BioRED training set, the micro-F1 and macro-F1 averages stand at 94.83 ± 0.37 and 93.53 ± 0.41. These results strongly suggest that our model is remarkably stable on both datasets, indicating consistent performance and confirming its robust generalization capabilities.

Table 3. Cross-validation performance of our model on the training sets of S4 and BioRED

| Dataset | Micro-F1 | Macro-F1 |
|---|---|---|
| S4 | 95.93 ± 0.29 | 90.42 ± 0.61 |
| BioRED | 94.83 ± 0.37 | 93.53 ± 0.41 |

Performance comparison with different models

To assess the efficacy of our model, we conducted a comparative analysis of BioGSF's performance against six previous methods (BioBERT [13], RBERT [33], EPGNN [32], AGCN [19], BioEGRE [34], and GPT-4o [35]) on the S4 and BioRED datasets.

Based on the comparative results in Table 4, it is apparent that our BioGSF model exhibits optimal performance on both datasets. This achievement is primarily due to the efficient integration of the SDP with entity-pair information, a strategy that considerably boosts classification performance. Although AGCN demonstrates commendable performance on the BioRED dataset, it demands extensive computational resources and endures a lengthy validation phase: our model generates results in 75 s under similar conditions, whereas AGCN requires a significantly longer 311 s. Additionally, the EPGNN model, which incorporates a GCN approach centered on entity pairs, shows robust performance on both datasets; however, it disregards the critical importance of the SDP and relies exclusively on internal data. Furthermore, we evaluated GPT-4o under zero-shot and one-shot settings. Specifically, we designed appropriate prompts to guide the model's responses, without additional training or tuning, in order to directly leverage its general language understanding and rapidly test its performance on the RE task. However, as shown in Table 4, its performance was not satisfactory: although GPT-4o is strong at general language tasks, its effect on this specific task is poor, and it fails to fully exploit the contextual information.

Table 4. Comparative performance evaluation of our model against other state-of-the-art models on the test sets of the S4 and BioRED datasets

| Model | S4 Micro-F1 | S4 Macro-F1 | S4 Time | BioRED Micro-F1 | BioRED Macro-F1 | BioRED Time |
|---|---|---|---|---|---|---|
| BioBERT | 93.85 | 86.20 | 78 s | 94.38 | 89.71 | 75 s |
| RBERT | 94.89 | 88.09 | 75 s | 94.81 | 94.27 | 75 s |
| EPGNN | 94.98 | 90.62 | 75 s | 94.63 | 92.17 | 75 s |
| AGCN | 92.22 | 79.11 | 311 s | 96.01 | 94.42 | 362 s |
| BioEGRE | 89.57 | 75.41 | 1038 s | 94.17 | 93.69 | 1366 s |
| GPT-4o (zero-shot) | 57.79 | 52.33 | 1 h 30 m | 76.45 | 80.65 | 1 h 30 m |
| GPT-4o (one-shot) | 49.33 | 50.8 | 1 h 30 m | 89.33 | 80.41 | 1 h 30 m |
| BioGSF | 96.68 | 92.46 | 75 s | 96.03 | 94.65 | 73 s |

Finally, to further validate the efficacy of our model, we constructed a new dataset based on the most recently published papers. Specifically, the dataset was built from the abstracts of papers indexed in PubMed during October 2024 (search terms: protein–protein interaction[tiab] cancer[tiab]; filter: 2024/10/1–2024/10/31), which were manually annotated (Table S2). We re-evaluated our method on this dataset, and the results demonstrate that our model achieved the best performance, with micro-F1 and macro-F1 scores of 97.44 and 78.77, respectively (Table S3).

Our study underscores the profound benefits of merging SDP information with entity pair data, thereby greatly facilitating information acquisition by graph neural networks. This observation aligns seamlessly with the findings derived from ablation studies, further validating the efficacy of our approach.

Case study

To further investigate the effectiveness of our model for entity relationship extraction in medical literature, we randomly selected recently published research articles from PubMed as the basis for our case study. The results, presented in Table 5, aim to demonstrate the performance differences among various models in predicting relationship types. A notable observation from the table is that advanced models like BioBERT accurately identify the relationship type as CDI between widely known drug types, such as 5-Fluorouracil, and specific diseases. This indicates that the models perform well when dealing with drug entities that have extensive literature support.

Table 5. Case study demonstrating the efficacy of BioGSF

| Sentence | Gold standard | BioBERT | RBERT | EPGNN | AGCN | BioEGRE | BioGSF |
|---|---|---|---|---|---|---|---|
| Finally, according to the results of drug susceptibility analysis, docetaxel, 5-Fluorouracil, [s1] gemcitabin [e1], and paclitaxel were found to be more sensitive to [s2] gastric cancer [e2]. | CDI | PDI | PDI | PDI | CDI | CDI | CDI |
| Finally, according to the results of drug susceptibility analysis, docetaxel, [s1] 5-Fluorouracil [e1], gemcitabin, and paclitaxel were found to be more sensitive to [s2] gastric cancer [e2]. | CDI | CDI | CDI | PDI | CDI | CDI | CDI |
| Here, we constructed [s1] ACZ2 [e1] and investigated its efficacy and potential mechanism for [s2] gastric cancer [e2] in vitro and in vivo. | CDI | PDI | PDI | PDI | PDI | CDI | CDI |
| Changes in gene expression, chemokine and cytokine secretion, plasma [s1] IgE [e1], and lung histology were quantified using RT-qPCR, ELISA, and immunohistochemistry, respectively. Arg1 elimination [s2] OVA [e2] also decreased the number and tightness of correlations between adaptive changes in lung function and inflammatory parameters in OVA/OVA-treated female mice. | PPI | DDI | DDI | DDI | CVI | CPI | DDI |

However, the situation changes when encountering new or rarely reported drugs, such as gemcitabin and ACZ2, especially when they appear in abbreviated forms. Owing to the scarcity of entity information for these newer drugs, relying solely on entity information to determine relationship types becomes more challenging. It is worth noting that the EPGNN model struggles in such cases, primarily because it relies heavily on direct information between entity pairs and neglects the richer, crucial contextual information in the text. This design limitation often leads it to mistakenly predict relationships as PDI when faced with the aforementioned challenges. In contrast, our model and BioEGRE adopt different strategies, focusing on extracting and utilizing rich entity and contextual information from dependency information. This approach enables accurate determination of relationship types between new drugs and other entities, even when direct entity information is scarce. Furthermore, our model transmits entity information and dependency information through the entity-pair graph, allowing other related entities to learn relationship type information.

Discussion

In this study, we present a model that integrates SDPs and the entity-pair graph for multi-type biomedical entity relation extraction from the literature. For diverse medical entity types, it proficiently handles semantic information and acquires the relationships among entities, thereby contributing significantly to the construction of medical knowledge graphs.

First, we utilize the multi-head attention mechanism to capture the relationships and features of different input parts in the SDPs, thereby better understanding complex patterns and relationships. The SDPs contain the words and dependencies most directly related to the relationship between the target entity pair; they simplify the sentence structure, eliminate noise in the sentence, and represent the semantic relationship between the entities more intuitively within the syntactic structure. By combining local and global features, we can extract the useful parts of each SDP word based on local features. Second, different attention 'heads' can learn various representations, and these representations guide the subsequent 1D-CNN in screening features. The 1D-CNN captures local dependency relationships and temporal features, extracting local information from sequence data to reduce the influence of local noise and to select the most beneficial words from the SDP. Regarding the choice of SDP length, L = 64 not only covers all the words on the SDPs but also achieves good results on both datasets. Although adjustments may be needed for different datasets, this also reflects the good generalization of our model.

When selecting the graph neural network model, we compared the effects of GCN and GAT. Our results demonstrated that the combination of GAT and MA + 1DCNN achieved the optimal outcome. Meanwhile, we also found that the combination of GAT and AVG did not yield very good results, and was even worse than the combination of GCN and average pooling. This might be because GAT relies on a more complex structure when learning node representations and cannot capture node features only through simple average pooling. The combination of multi-head attention and one-dimensional CNN can provide GAT with richer feature expressions and enhance feature integration on the original basis.

Although our method has achieved good results through the strategy of integrating the SDPs and the entity-pair graph, there are still some shortcomings. First, the method used to represent entities is the approach adopted by most models, namely average pooling over the entity positions of the embeddings obtained from the pre-trained language model. The entity information extracted in this way is not rich enough, especially when the sentences carry limited entity information, as shown in the cases in Table 5. In future work, we will integrate more methods and incorporate multiple modalities of information, such as text, image, and voice, to acquire richer entity representations. Second, our model currently supports only binary RE and has not yet achieved n-ary RE. N-ary RE involves complex relationships among multiple entities, and its information processing is more complex. To accurately capture and understand these relationships, especially when the connection between two entities, such as IgE and OVA in Table 5, is not sufficiently clear, the model requires stronger context understanding capabilities and more computing power to conduct effective reasoning and relationship identification among multiple entities. In future research, we will refine the dependency relationships between words and construct different weight coefficients for different dependency relationship types, making the model differentially sensitive to dependency types and dependent words and reducing the influence of noise on word representations. Furthermore, the application of large language models (such as Llama 3 [36]) to RE tasks will provide new strategies for capturing complex context information and multi-entity relationships. Finally, it is worth noting that our approach focuses solely on the identification of entity relations and does not perform entity recognition.

In this paper, we innovatively put forward the idea of combining the SDPs with the entity-pair graph, and utilized the graph neural network model based on the two to extract the relationships of medical entity pairs. This model not only significantly reduced the time but also performed more outstandingly in terms of efficacy, opening up a new approach for extracting knowledge from biomedical texts and demonstrating broad application prospects.

Conclusion

In this paper, we innovatively proposed the concept of integrating the SDP with the entity-pair graph and employed the graph neural network model based on the two to extract the relationships of medical entity pairs. This model not only shortened the running time but also achieved excellent performance, opening up a new avenue for extracting knowledge from biomedical texts and demonstrating broad application prospects.

Key Points
  • We introduced BioGSF, a novel graph-driven framework that integrates shortest dependency paths (SDP) with entity-pair graphs, enhancing the extraction of biomedical relations from literature.

  • We obtained rich SDP information through the combination of multi-head attention and one-dimensional convolutional neural network, which takes into account global features as well as local patterns and characteristics at different locations.

  • BioGSF demonstrates exceptional performance in biomedical relation extraction tasks, achieving a high micro-F1 score of 96.68% on the S4 dataset and 96.03% on the BioRED dataset with faster processing times than existing models.

Conflict of interest: The authors declare no conflicts of interest.

Funding

This research was funded by the start-up fund from Suzhou City University; Medical and Health Science and Technology Innovation Project of Suzhou (SKY2022010, SKYD2022097); Foundation of Suzhou Medical College of Soochow University (MP13405423, MX13401423); the Priority Academic Program Development of Jiangsu Higher Education Institutions; the open research fund of Suzhou Key Lab of Multi-modal Data Fusion and Intelligent Healthcare.

Data availability

The data and source codes used in this paper are available at https://github.com/serien-zzx/BioGSF

References

1. Peng C, Xia F, Naseriparsa M. et al. Knowledge graphs: Opportunities and challenges. Artif Intell Rev 2023;56:13071–102.
2. Yang Y, Lu Y, Yan W. A comprehensive review on knowledge graphs for complex diseases. Brief Bioinform 2023;24.
3. Di Maria A, Bellomo L, Billeci F. et al. NetMe 2.0: A web-based platform for extracting and modeling knowledge from biomedical literature as a labeled graph. Bioinformatics 2024;40.
4. Morris JH, Soman K, Akbas RE. et al. The scalable precision medicine open knowledge engine (SPOKE): A massive knowledge graph of biomedical information. Bioinformatics 2023;39:btad080.
5. Yang Y, Lu Y, Zheng Z. et al. MKG-GC: A multi-task learning-based knowledge graph construction framework with personalized application to gastric cancer. Comput Struct Biotechnol J 2024;23:1339–47.
6. Bang D, Lim S, Lee S. et al. Biomedical knowledge graph learning for drug repurposing by extending guilt-by-association to multiple layers. Nat Commun 2023;14:3570.
7. Ni S, Kong X, Zhang Y. et al. Identifying compound-protein interactions with knowledge graph embedding of perturbation transcriptomics. Cell Genomics 2024;4:100655.
8. Li J, Zhang H, Wang J. et al. Development and validation of an AI-driven system for automatic literature analysis and molecular regulatory network construction. Adv Sci 2024;11:2405395.
9. Santos A, Colaco AR, Nielsen AB. et al. A knowledge graph to interpret clinical proteomics data. Nat Biotechnol 2022;40:692–702.
10. Mahmood AA, Wu T-J, Mazumder R. et al. DiMeX: A text mining system for mutation-disease association extraction. PloS One 2016;11:e0152725.
11. Li L, Wang P, Yan J. et al. Real-world data medical knowledge graph: Construction and applications (MKG). Artif Intell Med 2020;103:101817.
12. Singhal A, Simmons M, Lu Z. Text mining for precision medicine: Automating disease-mutation relationship extraction from biomedical literature. J Am Med Inform Assoc 2016;23:766–72.
13. Lee J, Yoon W, Kim S. et al. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020;36:1234–40.
14. Luo R, Sun L, Xia Y. et al. BioGPT: Generative pre-trained transformer for biomedical text generation and mining. Brief Bioinform 2022;23:bbac409.
15. Jin Q, Yang Y, Chen Q. et al. GeneGPT: Augmenting large language models with domain tools for improved access to biomedical information. Bioinformatics 2024;40:btae075.
16. Miwa M, Bansal M. End-to-end relation extraction using LSTMs on sequences and tree structures. In: Erk K, Smith N (eds). Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Berlin, Germany: Association for Computational Linguistics, 2016.
17. Peng N, Poon H, Quirk C. et al. Cross-sentence N-ary relation extraction with graph LSTMs. Trans Assoc Comput Linguist 2017;5:101–15.
18. Lai P-T, Lu Z. BERT-GT: Cross-sentence N-ary relation extraction with BERT and graph transformer. Bioinformatics 2020;36:5678–85.
19. Tian Y, Chen G, Song Y. et al. Dependency-driven relation extraction with attentive graph convolutional networks. In: Zong C, Xia F, Li W, Navigli R (eds). Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Vol. 1: Long Papers. Online: Association for Computational Linguistics, 2021, 4458–71.
20. Chen G, Tian Y, Song Y. et al. Relation extraction with type-aware map memories of word dependencies. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. 2021, 2501–12.
21. Luo L, Lai P-T, Wei C-H. et al. BioRED: A rich biomedical relation extraction dataset. Brief Bioinform 2022;23:bbac282.
22. Li J, Sun Y, Johnson RJ. et al. BioCreative V CDR task corpus: A resource for chemical disease relation extraction. Database 2016;2016:baw068.
23. Krallinger M, Rabal O, Akhondi SA. et al. Overview of the BioCreative VI chemical-protein interaction track. In: Proceedings of the Sixth BioCreative Challenge Evaluation Workshop. Bethesda, MD, USA: BioCreative, 2017, 141–6.
24. Miranda-Escalada A, Mehryary F, Luoma J. et al. Overview of DrugProt task at BioCreative VII: Data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical–protein relations. Database 2023;2023:baad080.
25. Nédellec C. Learning language in logic-genic interaction extraction challenge. In: Proceedings of the 4th Learning Language in Logic Workshop (LLL05). Bonn, Germany, 2005, 7:1–7.
26. Pyysalo S, Airola A, Heimonen J. et al. Comparative analysis of five protein-protein interaction corpora. In: Baker C, Jian S (eds). BMC Bioinformatics. Singapore: BioMed Central, 2008, 1–11.
27. Neumann M, King D, Beltagy I. et al. ScispaCy: Fast and robust models for biomedical natural language processing. In: Demner-Fushman D, Cohen K, Ananiadou S, Tsujii J (eds). Proceedings of the 18th SIGBioMed Workshop on Biomedical Natural Language Processing (BioNLP 2019). Florence, Italy: Association for Computational Linguistics, 2019, 319–27.
28. Kim J-D, Ohta T, Tateisi Y. et al. GENIA corpus: A semantically annotated corpus for bio-textmining. Bioinformatics 2003;19:i180–2.
29. Hovy E, Marcus M, Palmer M. et al. OntoNotes: The 90% solution. In: Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers. 2006, 57–60.
30. Pradhan S, Ramshaw L. OntoNotes: Large scale multi-layer, multi-lingual, distributed annotation. In: Ide N, Pustejovsky J (eds). Handbook of Linguistic Annotation. Dordrecht: Springer Netherlands, 2017, 521–54.
31. Murty S, Verga P, Vilnis L. et al. Hierarchical losses and new resources for fine-grained entity typing and linking. In: Gurevych I, Miyao Y (eds). Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia, 2018, 97–109.
32. Zhao Y, Wan H, Gao J. et al. Improving relation classification by entity pair graph. In: Lee WS, Suzuki T (eds). Proceedings of the Asian Conference on Machine Learning. Nagoya, Japan: PMLR, 2019, 1156–71.
33. Wu S, He Y. Enriching pre-trained language model with entity information for relation classification. In: Proceedings of the 28th ACM International Conference on Information and Knowledge Management. New York, NY, USA: Association for Computing Machinery, 2019, 2361–4.
34. Zheng X, Wang X, Luo X. et al. BioEGRE: A linguistic topology enhanced method for biomedical relation extraction based on BioELECTRA and graph pointer neural network. BMC Bioinformatics 2023;24:486.
35. OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
36. Llama Team, AI@Meta. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.

Author notes

Yang Yang and Zixuan Zheng contributed equally to this work.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]