Abstract

Lamiales is an order of core eudicots with abundant diversity, and many Lamiales plants have important medicinal and ornamental values. Here, we comparatively reanalyzed 11 Lamiales species with well-assembled genome sequences and found evidence that Lamiales plants, in addition to a hexaploidization or whole-genome triplication (WGT) shared by core eudicots, experienced further polyploidization events, establishing new groups in the order. Notably, we identified a whole-genome duplication (WGD) that occurred just before the split of Scrophulariaceae from the other Lamiales families, such as Acanthaceae, Bignoniaceae, and Lamiaceae, suggesting that it was likely a cause of the establishment and fast divergence of these families. We also found that a WGT occurred ∼68 to 78 million years ago (Mya), near the split of Oleaceae from the other Lamiales families, implying that it may have caused their fast divergence and the establishment of the Oleaceae family. Then, by exploring and distinguishing intra- and intergenomic chromosomal homology due to recursive polyploidization and speciation, respectively, we inferred that the Lamiales ancestral cell karyotype had 11 proto-chromosomes. We reconstructed the evolutionary trajectories from these proto-chromosomes to the extant chromosomes in each Lamiales plant under study. Notably, most of the inferred 11 proto-chromosomes, duplicated during a subsequent WGD, have been well preserved in the jacaranda (Jacaranda mimosifolia) genome, supporting the credibility of the present inference, which implements a telomere-centric chromosome repatterning model. These efforts are important for understanding genome repatterning after recursive polyploidization, and they shed light on the origin of new plant groups and on angiosperm cell karyotype evolution.

Introduction

Lamiales plants, forming an order of angiosperms, play important roles in our daily life, health preservation, medicinal use, ecological balance, and biodiversity conservation. According to the APG IV system (The Angiosperm Phylogeny Group 2016), the order includes thousands of plants divided into 24 families, such as Lamiaceae, Gesneriaceae, Plantaginaceae, and Scrophulariaceae. As the largest family in Lamiales, Lamiaceae includes more than 240 genera and more than 7,800 plants, which are widely distributed around the world, with 99 genera and 808 species found on the Chinese mainland (Zhao et al. 2021). Many plants in Lamiaceae have medicinal values, such as basil and perilla (Dudai and Belanger 2016; Ahmed 2019). Plants in Gesneriaceae are widely used in horticulture and green construction (Moller et al. 2019). Plantaginaceae plants have medicinal values such as clearing heat, detoxifying, diuresis, and reducing swelling (Mutinda et al. 2023). Most plants in Scrophulariaceae have certain economic and medicinal values (Cock et al. 2022). To date, there has been considerable progress in sequencing Lamiales plant genomes. According to the PlabiPD database (http://plabipd.de/index.ep), the genomes of 80 Lamiales plants, distributed in 13 families and 52 genera, have been sequenced, including 27 genome sequences assembled to the chromosome level, providing valuable data to understand their biology and evolution.

Polyploidy plays important roles during the evolution of plants (Bowers et al. 2003; Jiao et al. 2011; Barker et al. 2016). Through polyploidization, plants can increase their genotype and phenotype diversity and improve their adaptability to the environment due mainly to the production of thousands of simultaneously duplicated genes, enhancing the evolutionary potentials of plants (Soltis and Soltis 2016; Van de Peer et al. 2017). After polyploidization, a genome often becomes unstable, resulting in phenomena such as chromosome breakages, abnormal recombination, and widespread gene losses. Genome instability provides nearly infinite possibilities for plant evolution and divergence (Leitch and Leitch 2008; Wendel 2015). As one of the most diverse orders in eudicots, Lamiales experienced the whole-genome triplication event common to major core eudicots (γ WGT), having occurred ∼115 to 130 million years ago (Mya; Jaillon et al. 2007). Previously, it was inferred that Oleaceae, a Lamiales family, experienced an extra WGT (named ω WGT) in the late Cretaceous about 66 Mya, likely contributing to the split of Oleaceae and the other Lamiales plants, and the fast diversification of Oleaceae plants (Xu et al. 2022). Further analysis of olive (Olea europaea) and green ash (Fraxinus pennsylvanica), both from the family of Oleaceae, suggested that they experienced an extra whole-genome duplication (named χ WGD) after the ω WGT (Rao et al. 2021; Huff et al. 2022). Extra polyploidization may have occurred in other Lamiales families. For example, it was inferred that rosemary (Salvia rosmarinus), a Lamiaceae plant, underwent 3 polyploidization events, including the γ WGT, an intermediate one having occurred ∼48 Mya, and a recent one ∼15 Mya (Han et al. 2023).

Chromosomes double during a WGD, and after a period of time, chromosome number often decreases mainly due to fusions of chromosomes, accompanied by different types of DNA rearrangement (Schubert and Lysak 2011). By performing pan-family or cross-family analysis of extant genomes, researchers have attempted to reconstruct ancestral cell karyotypes (Salse et al. 2009a, 2009b; Murat et al. 2012). A telomere-centric model was proposed to reconstruct not only ancestral cell karyotypes but also the evolutionary trajectories of chromosomes (Wang et al. 2015a, 2015b, 2015c). The model considers that telomeres protect the integrity of chromosomes and that their removal, by crossing-over between telomere-proximal regions, would likely facilitate chromosome reorganization. The model explains how chromosome numbers decreased during evolution, especially after polyploidization. Three main types of chromosome reorganization were proposed, namely nested chromosome fusion (NCF), chromosome end–end joining (EEJ), and reciprocal exchanges of chromosomal arms (RTA). NCF is inferred to involve crossing-over between the proximal regions of the 2 telomeres of the same chromosome to form a circular chromosome, and the resolution of the crossing-over could result in a free-end chromosome and a satellite chromosome containing mainly the telomeric DNA. The free-end chromosome may invade another chromosome and break the latter, eventually merging into it. This NCF model explained how recurrent fusions of different chromosomes formed the extant chromosomes in Brachypodium (Vogel et al. 2010). EEJ is inferred to occur between different chromosomes; that is, a crossing-over may occur between the telomere-proximal regions of the 2 chromosomes. The resolution of the crossing-over could result in a fused chromosome and a satellite chromosome. 
The model predicts that EEJ and NCF can form satellite chromosomes, previously often called B chromosomes, which will be lost during evolution, eventually resulting in chromosome number reduction. The model has been applied to the study of karyotype evolution in multiple plant groups, including Cucurbitales, Leguminosae, Apiales, Poales, and other monocot orders (Wang et al. 2015a, 2015b, 2015c, 2019; Zhuang et al. 2019; Song et al. 2021; Wang et al. 2022). Until now, there has been no research on Lamiales ancestral cell karyotypes and chromosome evolution.
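The effect of the 3 reorganization types on chromosome number can be illustrated with a minimal sketch. It assumes, purely for illustration, that a chromosome is an ordered list of block labels; the function names, chromosome names, and block labels are hypothetical, and telomeres, centromeres, and segment orientation are not modeled. Satellite chromosomes are assumed to be lost, as the model predicts.

```python
# Toy model of the three rearrangement types. Assumption for illustration:
# a chromosome is an ordered list of block labels; orientation, telomeres,
# and centromeres are ignored, and satellite chromosomes are simply dropped.

def eej(chrom_a, chrom_b):
    """End-end joining: two chromosomes fuse end to end into one."""
    return chrom_a + chrom_b

def ncf(host, guest, position):
    """Nested chromosome fusion: guest is inserted into host at position."""
    return host[:position] + guest + host[position:]

def rta(chrom_a, chrom_b, break_a, break_b):
    """Reciprocal exchange of arms: two chromosomes swap their distal arms."""
    new_a = chrom_a[:break_a] + chrom_b[break_b:]
    new_b = chrom_b[:break_b] + chrom_a[break_a:]
    return new_a, new_b

# Hypothetical karyotype with four chromosomes of two blocks each.
karyotype = {
    "c1": ["a1", "a2"],
    "c2": ["b1", "b2"],
    "c3": ["c1", "c2"],
    "c4": ["d1", "d2"],
}

# One EEJ removes one chromosome from the karyotype.
after_eej = dict(karyotype)
after_eej["c1+c2"] = eej(after_eej.pop("c1"), after_eej.pop("c2"))

# One NCF also removes one chromosome.
after_ncf = dict(after_eej)
after_ncf["c3<c4"] = ncf(after_ncf.pop("c3"), after_ncf.pop("c4"), 1)

# An RTA rearranges two chromosomes but keeps both.
swapped = rta(karyotype["c1"], karyotype["c2"], 1, 1)

print(len(karyotype), len(after_eej), len(after_ncf))
print(swapped)
```

In this toy representation, each EEJ or NCF lowers the chromosome count by 1, whereas an RTA changes chromosome content without changing the count.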

Here, we reanalyzed the genome sequences of 11 representative Lamiales plants, assembled to the chromosome level (Hamilton et al. 2020; Xu et al. 2020a, 2020b; Friis et al. 2021; Jia et al. 2021; Ma et al. 2021; Rao et al. 2021; Wang et al. 2021a, 2021b; Huff et al. 2022; Xu et al. 2022), by referring to grape (Vitis vinifera), which has a conserved genome and therefore is often used as a reference to understand other eudicot genomes’ structure and polyploidization. We firstly determined whether the abovementioned polyploidization events were shared by Lamiales plants or not, secondly inferred the cell karyotypes of the ancestors of Lamiales at key evolutionary nodes, especially Lamiales proto-chromosomes, and thirdly reconstructed the evolutionary trajectories from those proto-chromosomes to form the extant chromosomes in Lamiales plants under consideration.

Results

Inference of gene collinearity

To clarify the polyploidization events experienced by Lamiales plants, we conducted a comparative analysis of the genome structures of 11 representative species in Lamiaceae, Acanthaceae, Bignoniaceae, Scrophulariaceae, and Oleaceae (Supplementary Table S1). First, a phylogenetic tree was constructed with V. vinifera as the outgroup (Fig. 1A). Second, using WGDI (Sun et al. 2022), a gene collinearity inference toolkit, we extracted the collinear gene pairs within each genome and between the genomes of the studied species (Supplementary Table S2). Compared to the other genomes under consideration, the genome of F. pennsylvanica contains many more collinear genes, with 31,189 homologous gene pairs residing in 2,126 homologous genomic blocks. In contrast, we found only 7,482 pairs in jasmine (Jasminum sambac). In addition, jacaranda (Jacaranda mimosifolia) from Bignoniaceae has more intergenomic collinear genes with the other plants (22,245 to 38,551), located in 1,074 to 1,999 blocks. Third, we estimated the synonymous nucleotide substitution rate (Ks) between collinear gene pairs within a genome or between different genomes, which helped date polyploidization events and speciation events (Fig. 2; Supplementary Table S3).

Phylogenetic trees and examples of local homologous gene dot plots of Lamiales plants. A) The phylogenetic relationship between the 5 families of Lamiales plants and the outgroup plant, V. vinifera; circles represent WGD; the light blue triangle represents WGT. B to E) Local homologous gene dot plots, where the numbers represent the Ks values at the corresponding positions. B) The dot plot between J. mimosifolia and C. americana; the orange boxed segment shows 1 segment of J. mimosifolia that has 1 best-matched or orthologous segment in C. americana. C) The dot plot between J. mimosifolia and T. grandis; the blue boxed segment shows 1 segment of J. mimosifolia that has 1 best-matched or orthologous segment in T. grandis. D) The dot plot between J. mimosifolia and B. alternifolia; the purple boxed segment shows 1 segment of J. mimosifolia that has 1 best orthologous segment in B. alternifolia. E) The dot plot between J. mimosifolia and A. marina; the green boxed segment shows 1 segment of J. mimosifolia that has 2 best orthologous segments in A. marina. F) Species phylogenetic tree for the V. vinifera, J. sambac, F. pennsylvanica, B. alternifolia, A. marina, J. mimosifolia, and C. americana genomes. The light blue triangle represents the core eudicot common hexaploidization (ECH or γ) event, while the pink triangle, blue circle, orange circle, and purple circle represent the polyploidization events in Lamiales, respectively. G) Corresponding gene tree. In V. vinifera (V), there were 3 paralogous genes generated by the ECH, namely V1, V2, and V3. If no gene losses had occurred after the ECH, each grape gene would be expected to have 3 orthologous genes in the J. sambac (J) genome, 6 orthologs in the F. pennsylvanica (F) genome, 2 orthologs in each of the B. alternifolia (B), J. mimosifolia (Jm), and C. americana (C) genomes, and 4 orthologs in the A. marina (A) genome.
Figure 1.

Distribution of Ks among collinear genes within or between the compared genomes. A timeline is provided for dating. Mya, million years ago.
Figure 2.

Inference of homology ratios

Ratios of orthology between genomes or paralogy within a genome are key to understanding whether or not polyploidization event(s) are shared and the ploidy levels of a specific polyploidization event. Here, by integrating the information about gene collinearity inferred above, we used Blast-supported putative homologous genes to draw homologous gene dot plots within a genome and between different genomes to find homology ratios (Tang et al. 2011; Wang et al. 2022). The intraspecific homologous gene dot plot of each of these plants, callicarpa (Callicarpa americana; a Lamiaceae plant), teak (Tectona grandis; a Lamiaceae plant), J. mimosifolia (a Bignoniaceae plant), and butterfly bush (Buddleja alternifolia; a Scrophulariaceae plant), indicated that the paralogy ratio, produced by the best-matched genes between their chromosomes, is 1:1. This indicates that each of these plants has experienced a WGD (Supplementary Figs. S1 to S4). Moreover, the interspecific homologous gene dot plots between J. mimosifolia and each of these 3 plants, C. americana, T. grandis, and B. alternifolia, showed that orthology ratios are 1:1, indicating that these plants shared a WGD before the split of Lamiaceae, Bignoniaceae, Acanthaceae, and Scrophulariaceae. That is, there has been no WGD specific to any of them. In comparison, the orthology ratio between J. mimosifolia and Avicennia marina (an Acanthaceae plant) is 1:2 (Fig. 1, B to E; Supplementary Figs. S5 to S7), which indicates that A. marina was affected by an extra duplication event after their split.

To find the homology ratio between J. mimosifolia and B. alternifolia, illustrated in the homologous gene dot plot (Supplementary Fig. S8), we extracted the best-matched genes from the Blast results. Taking the chromosome Jmi2 as an example, the number of best-matched genes in the chromosome Bal3 is 482, and that in Bal18 is 195, a significant difference (χ² test, P < 2.004 × 10⁻⁵; Supplementary Tables S4 and S5). This shows that Bal3 is likely orthologous to Jmi2 and Bal18 is outparalogous to Jmi2. Here, orthology forms due to speciation and outparalogy due to shared polyploidization. Two genes from different species are orthologous if derived from the same ancestral gene in their latest common ancestor. Two genes from different species are outparalogous if they are derived from 2 ancestral genes produced by an ancient gene duplication event before the split of the 2 species. Notably, 2 genes from the same plant can also be referred to as outparalogous if produced by duplication in the common ancestor of related species, to distinguish them from paralogous genes produced by duplication in the plant's own lineage (Supplementary Fig. S9). Overall, a check of orthology and outparalogy across the 2 genomes supports an orthology ratio of 1:1, and the orthologous collinear genes have an estimated Ks of 0.532 ± 0.040. However, the outparalogy pattern was much more complex; a J. mimosifolia chromosome often has multiple outparalogous correspondences. By checking Ks values of outparalogous gene pairs between J. mimosifolia and B. alternifolia, we found 2 groups of outparalogous genes, with 1 group containing 24,265 gene pairs (313 blocks) and having Ks ∼0.650 ± 0.092 and the other group containing 1,135 gene pairs (118 blocks) and having Ks ∼2.037 ± 0.276 (Supplementary Table S3). 
The older group evidently resulted from the γ WGT shared by major core eudicots, while the younger group shows that there was a polyploidization shared by the 2 plant families, not revealed before. For the younger polyploidization, an outparalogy ratio of 1:1 can be inferred, showing its tetraploidization nature, that is, a WGD. Characterization of interspecific Ks values further supports that the WGD is shared among the 4 studied Lamiales families Acanthaceae, Scrophulariaceae, Bignoniaceae, and Lamiaceae. For convenience, we name it the β WGD.
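The comparison of best-matched gene counts for Jmi2 can be illustrated with a simple goodness-of-fit calculation. This is only a sketch under a stated assumption: the 2 counts are tested against an equal (1:1) expectation, which may differ from the test design behind Supplementary Tables S4 and S5, so the resulting P value need not equal the reported bound. The function name is hypothetical.

```python
import math

def chi_square_equal_split(count_a, count_b):
    """Goodness-of-fit test of two counts against an equal (1:1) expectation.

    Returns the chi-square statistic and the P value for 1 degree of freedom.
    """
    expected = (count_a + count_b) / 2
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square survival function reduces to
    # erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Best-matched gene counts for Jmi2 in Bal3 and Bal18, as reported in the text.
statistic, p_value = chi_square_equal_split(482, 195)
print(f"chi-square = {statistic:.2f}, P = {p_value:.3g}")
```

Under this assumption, the imbalance between Bal3 and Bal18 is far too large to be explained by an equal split, in line with treating Bal3 as the orthologous chromosome.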

Similarly, by checking homology ratios and Ks, besides the γ WGT (Ks ∼1.986 ± 0.267), we confirmed that J. sambac, F. pennsylvanica, and O. europaea shared another WGT (Ks ∼0.801 ± 0.144); then, F. pennsylvanica and O. europaea shared a WGD event, namely the χ WGD mentioned above (Ks ∼0.269 ± 0.027; Fig. 2). The orthologous genes between the 2 Oleaceae plants have Ks ∼0.483 ± 0.067, showing that they share the 2 older WGT events, with the oldest being the γ WGT and the more recent being the ω WGT. Correspondingly, the orthology ratio between V. vinifera and J. sambac is 1:3, the orthology ratio between V. vinifera and F. pennsylvanica is 1:6, and the orthology ratio between O. europaea and F. pennsylvanica is 1:1 (Supplementary Fig. S10). This is consistent with previous reports (Rao et al. 2021; Huff et al. 2022; Xu et al. 2022). In the homologous gene dot plots between J. mimosifolia and A. marina or Strobilanthes cusia, the orthology ratio is 1:2, indicating that the latter 2 species shared an extra WGD event (Supplementary Figs. S11 and S12). A comparison of A. marina and S. cusia revealed 3 polyploidization events, with Ks peaks located at 0.519 ± 0.026, 0.774 ± 0.084, and 2.029 ± 0.386. The oldest one is the γ WGT, shared with the other core eudicots; the intermediate one is the β WGD; and the most recent one, shared by the 2 plants, is here referred to as the α WGD. We also calculated 4DTV values, which reflect transversions at 4-fold degenerate codon sites, and the relevant analysis led to the same inference (Supplementary Fig. S13).
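The orthology ratios reported above follow from multiplying the fold of each polyploidization that is not shared by the 2 compared species. The sketch below assumes no gene loss and assumes that events with the same name are shared; the table of events is transcribed from the text, whereas the variable and function names are hypothetical. The γ WGT is omitted because it is shared by all species considered, including grape.

```python
from math import prod

# Copy-number multiplier of each event type.
FOLD = {"WGD": 2, "WGT": 3}

# Polyploidization events after divergence from the grape lineage, as named
# in the text (the shared γ WGT is left out).
EVENTS = {
    "V. vinifera": [],
    "J. sambac": [("ω", "WGT")],
    "F. pennsylvanica": [("ω", "WGT"), ("χ", "WGD")],
    "O. europaea": [("ω", "WGT"), ("χ", "WGD")],
    "J. mimosifolia": [("β", "WGD")],
    "B. alternifolia": [("β", "WGD")],
    "C. americana": [("β", "WGD")],
    "T. grandis": [("β", "WGD")],
    "A. marina": [("β", "WGD"), ("α", "WGD")],
    "S. cusia": [("β", "WGD"), ("α", "WGD")],
}

def expected_orthology_ratio(species_a, species_b):
    """Expected a:b orthology ratio, counting only events not shared by both.

    Assumes no gene loss and that events with the same name are shared.
    """
    shared = set(EVENTS[species_a]) & set(EVENTS[species_b])
    copies_a = prod(FOLD[kind] for event in EVENTS[species_a]
                    if event not in shared for kind in [event[1]])
    copies_b = prod(FOLD[kind] for event in EVENTS[species_b]
                    if event not in shared for kind in [event[1]])
    return copies_a, copies_b

for pair in [("V. vinifera", "J. sambac"),
             ("V. vinifera", "F. pennsylvanica"),
             ("O. europaea", "F. pennsylvanica"),
             ("J. mimosifolia", "A. marina")]:
    print(pair, expected_orthology_ratio(*pair))
```

Observed ratios in real dot plots are judged from best-matched blocks and can be blurred by gene loss, so this calculation gives only the idealized expectation.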

The 4DTV values between collinear genes can reflect the relative timing of species divergence and whole-genome duplication events during evolution. Therefore, to further validate the polyploidization events revealed above, we analyzed the 4DTV values of collinear gene pairs within and between genomes (Supplementary Fig. S13). In the J. mimosifolia, B. alternifolia, C. americana, and T. grandis genomes, we observed 2 peaks, one near 0.2 and the other near 0.4, indicating a recent and an ancient polyploidization event in all 4 species. The ancient event must have been the γ WGT, shared by core eudicots; the recent event is the β WGD. In the A. marina genome, we observed 3 peaks located near 0.16, 0.24, and 0.43, corresponding to the α WGD, β WGD, and γ WGT, respectively. Similarly, in the F. pennsylvanica and J. sambac genomes, we observed 3 peaks and 2 peaks, respectively, with the 2 peaks with higher 4DTV values corresponding to the γ WGT and ω WGT; the most recent peak in F. pennsylvanica corresponds to the χ WGD. Overall, the results of the 4DTV analysis are consistent with those of the Ks analysis, further supporting our inference of the polyploidization events.
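For readers unfamiliar with 4DTV, the quantity can be sketched as the proportion of transversions among 4-fold degenerate third codon positions of 2 aligned coding sequences. The code below is an illustration only: it assumes gap-free, in-frame alignments and the standard genetic code, and it applies a simple Kimura-type correction for multiple transversions, which is an assumption because the text does not state which correction was used. All function names and the example sequences are hypothetical.

```python
import math

# First two codon bases of the eight fourfold degenerate families in the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}

def fourfold_transversion_rate(cds_a, cds_b):
    """Raw proportion of transversions at shared fourfold degenerate sites.

    Both inputs are assumed to be aligned, in-frame coding sequences
    without gaps. Returns None if no comparable site exists.
    """
    sites = 0
    transversions = 0
    for i in range(0, min(len(cds_a), len(cds_b)) - 2, 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        # The site is comparable only if both codons share a fourfold prefix.
        if codon_a[:2] != codon_b[:2] or codon_a[:2] not in FOURFOLD_PREFIXES:
            continue
        if codon_a[2] not in "ACGT" or codon_b[2] not in "ACGT":
            continue
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (codon_a[2] in PURINES) != (codon_b[2] in PURINES):
            transversions += 1
    return transversions / sites if sites else None

def corrected_4dtv(raw_rate):
    """Assumed correction for multiple transversions: -0.5 * ln(1 - 2p)."""
    if raw_rate is None or raw_rate >= 0.5:
        return None
    return -0.5 * math.log(1 - 2 * raw_rate)

# Hypothetical toy alignment: four comparable sites, one transversion.
raw = fourfold_transversion_rate("CTAGTTGCAGGTATG", "CTGGTAGCAGGTATG")
print(raw, corrected_4dtv(raw))
```

Because only transversions at 4-fold degenerate sites are counted, the measure saturates more slowly than Ks, which is one reason it is commonly used as a cross-check.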

Supposing that the γ WGT occurred 130 to 150 Mya, the other polyploidization and speciation events were dated (Fig. 2). The ω WGT occurred ∼68 to 78 Mya, near the split of Oleaceae from the other Lamiales families, implying its contribution to their divergence, while the β WGD occurred ∼36 to 51 Mya, near the split of Scrophulariaceae from the 3 families Acanthaceae, Bignoniaceae, and Lamiaceae, implying its contribution to their divergence. Having clarified the polyploidization events affecting the Lamiales plants, we constructed the gene tree reflecting the relationship between homologous collinear genes in these plants (Fig. 1, F and G), which would help study the evolution and functional innovation of genes, especially those simultaneously duplicated during each polyploidization.

Inference of ancestral chromosome karyotypes

The reconstruction of ancestral chromosome karyotypes helps reveal the evolution of Lamiales plants, especially how recursive polyploidization and subsequent genome instability contributed to chromosome rearrangement and number reduction. Here, by checking homologous gene dot plots, we determined 11 Lamiales proto-chromosomes, using approaches described previously (Zhuang et al. 2019; Sun et al. 2022).

We reconstructed the proto-karyotype of the common ancestor of B. alternifolia and C. americana. By checking orthology and paralogy ratios shown in the homologous gene dot plots within and between their genomes, we found that their common ancestor had n = 2x = 22 chromosomes after the β WGD, or n = x = 11 chromosomes before the β WGD (Fig. 3). As an example, the ancestral chromosome L1 before the β WGD was inferred as follows: B. alternifolia chromosomes Bal3 and Bal18 are paralogous chromosomes, doubled during the β WGD, and their corresponding C. americana orthologous chromosomes are Cam5 and Cam13, respectively (Fig. 3A). Ignoring certain local DNA inversions or a few small missing DNA regions, they share homology in their full length. Therefore, a parsimonious explanation is that they are derivatives of an ancestral proto-chromosome and retain its main structure, referred to as L1 (Fig. 3A). As another example, for the inference of ancestral chromosomes L6 and L11, Bal10 and Cam10 are orthologous to one another in their full length and are outparalogous to 1 arm of Bal5 and the corresponding region of Cam3. Meanwhile, Bal17, in its full length, is orthologous to the majority of Cam14 and 1 arm of Cam3. This suggests that Bal10 (Cam10) and Bal17 are likely derivatives of 2 pre-β WGD chromosomes, here referred to as L6 and L11, respectively (Fig. 3B). As to the inference of ancestral chromosome L8, Bal19 in its full length is orthologous to the majority of Cam6 and to the full length of Cam4; meanwhile, 1 arm of Bal1 and 1 arm of Bal15 are orthologous to the corresponding regions of Cam4 and Cam6, respectively. Therefore, we infer that they are likely derivatives of an ancestral proto-chromosome and retain its main structure, referred to as L8 (Fig. 3C). 
As to the inference of ancestral chromosome L9, Bal8 and Cam11 are orthologous to one another in their full length and are orthologous to Cam1 over part of its length; at the same time, Bal1 has a corresponding orthologous region in Cam1 and an orthologous region in Cam11. Ignoring likely gene loss on the chromosomes involved, we infer that they likely originated from the same ancestral proto-chromosome, here referred to as L9 (Fig. 3D). As to the inference of ancestral chromosomes L7 and L2, Bal13 in its full length is orthologous to a portion of Cam6, while most of Bal2 is orthologous to regions in Cam8 and Cam9; meanwhile, Bal4 is orthologous to other regions in Cam8 and Cam9, while portions of Bal11 and Bal16 are directly orthologous to a portion of Cam1. This indicates that the Bal2 portion (part of Cam8 and Cam9) and Bal4 are most likely derivatives of 2 pre-β WGD chromosomes, here referred to as L7 and L2, respectively (Fig. 3E). As to the inference of ancestral chromosome L4, Bal6 and the middle part of Bal1 are paralogous regions resulting from the β WGD. Their corresponding C. americana orthologous chromosomes are Cam17 and a part of Cam14. Ignoring certain inversions or a few small missing DNA regions, they share homology in their full length. Therefore, we inferred that they are likely derivatives of an ancestral proto-chromosome and retain its main structure, here referred to as L4 (Fig. 3G). Similarly, the ancestral chromosome L10 was inferred based on the homologous regions corresponding to Bal9, Bal12, Bal4 and Cam1, Cam2 (Fig. 3H). The ancestral chromosome L3 was inferred based on the homologous regions corresponding to Bal2, Bal7 and Cam12, Cam15, Cam16 (Fig. 3F). As to the inference of ancestral chromosome L5, based on the correspondence between Bal9, Bal12, and Cam7, it was uncertain whether Cam7 evolved from a complete ancestral chromosome or was formed by the merging of 2 ancestral chromosomes (Fig. 3F). 
In order to determine the exact number, we used 2 outgroups, F. pennsylvanica and J. sambac, to find the answer. A comparative analysis with 2 outgroups showed that the 2 parts that make up Cam7 are homologous to Fpe5, Fpe8, Fpe17, and Fpe19 at full length and to Jsa3 and Jsa13 in their full length, too, indicating that Cam7 evolved from 1 ancestral chromosome, here referred to as L5 (Fig. 3I). Therefore, the explanation is that an RTA occurred between L5 and L10 to form neo-chromosomes Bal9 and Bal12, as shown in Fig. 3F.

Inference of the ancestral chromosomes of Lamiales. A to H) Inferring ancestral chromosomes based on the homologous gene dot plots between B. alternifolia (Bal) and C. americana (Cam). I) Inferring the proto-chromosome L5 based on comparing C. americana to J. sambac (Jsa) and F. pennsylvanica (Fpe). L1 to L11: distinguishing inferred Lamiales proto-chromosomes using distinct identifiers (Hex codes: f83ba6, be6ab5, 0000ff, 5f52a0, fff100, 00ffff, 00a0e9, b1f005, 8FBC8F, 920783, f29b76).
Figure 3.

It is worth noting that the majority of the inferred n = x = 11 proto-chromosomes, duplicated to n = 2x = 22 during the β WGD, have been well preserved in the J. mimosifolia (Jmi) genome. In the evolution after the β WGD, occurring ∼36 to 51 Mya, without considering small chromosomal changes, only 2 EEJs and 2 NCFs occurred to result in chromosome number reduction from 22 to 18 in extant J. mimosifolia plant (Fig. 4A). This means that 4 Jmi chromosomes are composite chromosomes of ancestral chromosomes, and 14 ancestral chromosomes have been preserved (Fig. 4A). Chromosome Jmi1 is a composite chromosome formed by a fusion of a β WGD copy of L4 into a β WGD copy of L5, and the other β WGD copies of L4 and L5 evolved to form Jmi9 and Jmi10, respectively. Chromosome Jmi6 is a composite chromosome formed by a fusion of a β WGD copy of L3 into a β WGD copy of L7, and the other β WGD copies of L3 and L7 evolved to form Jmi8 and Jmi12, respectively. Chromosome Jmi3 is a composite chromosome formed by an EEJ of a β WGD copy of L6 and a copy of L11, and the other β WGD copies evolved to form Jmi11 and Jmi18, respectively. Chromosome Jmi7 is a composite chromosome formed by an EEJ of a β WGD copy of L9 and a copy of L10, and the other β WGD copies evolved to form Jmi15 and Jmi16, respectively.

Supporting evidence for ancestral chromosome inference. A and B) Evidence supporting the ancestral chromosome inference, based on comparing J. mimosifolia (Jmi) chromosomes and on homology between J. mimosifolia and J. sambac. L1 to L11: distinguishing inferred Lamiales proto-chromosomes using distinct identifiers (Hex codes: f83ba6, be6ab5, 0000ff, 5f52a0, fff100, 00ffff, 00a0e9, b1f005, 8FBC8F, 920783, f29b76).
Figure 4.

In addition, although J. sambac was affected by a WGT and a WGD after its split from plants in Lamiaceae, Bignoniaceae, Acanthaceae, and Scrophulariaceae, and was thus subjected to extensive genome instability and repatterning, 7 of the 11 proto-chromosomes have each been preserved as an extant chromosome in J. sambac (Fig. 4B). The other 4 proto-chromosomes are also represented by considerably large patches in extant chromosomes. Therefore, we inferred these 11 proto-chromosomes (L1 to L11) to be the ancestral chromosomes common to all Lamiales plants.

Notably, the well-preserved proto-chromosomes found in J. mimosifolia and J. sambac support the credibility of the 11 inferred proto-chromosomes, because the inference was based on B. alternifolia and C. americana and was therefore independent of J. mimosifolia and J. sambac.

Reconstruction of chromosome evolutionary trajectories

Then, we inferred karyotype changes from Lamiales ancestors to J. mimosifolia, B. alternifolia, and C. americana (Fig. 5A). As described above, the 22 chromosomes duplicated by the β WGD underwent 2 EEJs and 2 NCFs to form the 18 chromosomes of J. mimosifolia (Fig. 5B). The evolutionary trajectories of chromosomes are depicted to show the major changes during evolution. A similar inference was made for the other genomes. Through 1 EEJ, 2 NCFs, and 3 RTAs, the 19 chromosomes of B. alternifolia formed (Fig. 5C), and through 4 EEJs, 1 NCF, and 2 RTAs, the 17 chromosomes of C. americana formed (Fig. 5D).
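The chromosome numbers stated above can be checked with simple bookkeeping. The sketch below assumes that each EEJ or NCF reduces the chromosome number by 1 (the resulting satellite chromosome being lost) and that an RTA leaves the number unchanged; event counts and chromosome numbers are taken from the text, and the variable and function names are hypothetical.

```python
# Chromosome number right after the β WGD (n = 2x = 22).
POST_BETA_WGD_CHROMOSOMES = 22

# Assumed net change in chromosome number per event type.
NET_CHANGE = {"EEJ": -1, "NCF": -1, "RTA": 0}

# Event counts reported in the text for each species.
EVENT_COUNTS = {
    "J. mimosifolia": {"EEJ": 2, "NCF": 2},
    "B. alternifolia": {"EEJ": 1, "NCF": 2, "RTA": 3},
    "C. americana": {"EEJ": 4, "NCF": 1, "RTA": 2},
}

# Extant chromosome numbers reported in the text.
REPORTED = {"J. mimosifolia": 18, "B. alternifolia": 19, "C. americana": 17}

def predicted_chromosome_number(start, event_counts):
    """Apply the net change of every event to the starting number."""
    return start + sum(NET_CHANGE[kind] * n for kind, n in event_counts.items())

for species, counts in EVENT_COUNTS.items():
    predicted = predicted_chromosome_number(POST_BETA_WGD_CHROMOSOMES, counts)
    print(species, predicted, predicted == REPORTED[species])
```

Under these assumptions, the reported event counts account exactly for the reduction from 22 chromosomes to the extant numbers in all 3 species.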

Examples of ancestral karyotypes and evolutionary trajectories from Lamiales common ancestors. A) Reconstructed ancestral karyotypes. L1 to L11: inferred Lamiales proto-chromosomes (each distinguished by a unique identifier, listed in Hex format: f83ba6, be6ab5, 0000ff, 5f52a0, fff100, 00ffff, 00a0e9, b1f005, 8FBC8F, 920783, f29b76). NCF, nested chromosome fusion; EEJ, end–end joining; RTA, reciprocal exchange of chromosomal arms; WGD, whole-genome duplication. B to D) Reconstructed evolutionary trajectories to form extant chromosomes in J. mimosifolia (Jmi), B. alternifolia (Bal), and C. americana (Cam). COV and X: crossing-over; DF: the inactivation of extra centromeres; lost: the loss of supposed B chromosomes; L1 to L11 and L1′ to L11′: duplicated copies of L1 to L11.
Figure 5.

How did the Lamiales proto-chromosomes form after the split from the other eudicots? To address this, we first reconstructed the 11 Lamiales proto-chromosomes, or proto-genome, using the extant J. mimosifolia chromosomes. Following the correspondence between proto-chromosomes and J. mimosifolia chromosomes (Fig. 4A), each proto-chromosome was reconstructed from the gene content of the corresponding extant chromosome. Where paralogous chromosomes or regions exist in J. mimosifolia, one of the alternatives was used, which does not affect the following analysis. Then, by identifying putative homologous genes between the reconstructed proto-chromosomes and V. vinifera, we checked the homology between the Lamiales proto-genome and the V. vinifera genome (Fig. 6A). V. vinifera has n = 19 chromosomes, the analysis of which previously revealed the γ WGT, the 7 pre-γ WGT proto-chromosomes, and the 21 post-γ WGT chromosomes common to major core eudicots, making V. vinifera a model plant for understanding the structure and evolution of other eudicot genomes. The 21 post-γ WGT chromosomes are referred to as A1 to A7, B1 to B7, and C1 to C7. We found that 9 EEJs and 1 NCF occurred to form the 11 Lamiales proto-chromosomes (Fig. 6B). L9 and L10 were derivatives of B4 and A3. L5 formed through an NCF of B2 into B6. L3 formed through an EEJ between B5 and A2. C4 and C5 formed an intermediate chromosome by an EEJ, which then crossed over with C6 to form L2 and another intermediate chromosome. This intermediate chromosome then crossed over with B1 to form 2 further intermediates, one of which was joined to C1 by an EEJ to form L4. The other Lamiales proto-chromosomes and their evolutionary trajectories were reconstructed similarly.

Homologous gene dot plot between the reconstructed Lamiales chromosomes and core eudicot proto-chromosomes inferred by using V. vinifera, and the formation of Lamiales proto-chromosomes. A) Dot plot. Best-matched regions are shown in shadow; each group of 7 ancestor chromosomes is distinguished by distinct identifiers (Hex codes: 3366FF, 00CCFF, FF00FF, 8FBC8F, 99CC00, FFCC00, C46AB5). B) A1 to A7, B1 to B7, and C1 to C7 represent 21 core eudicot ancestral chromosomes; L1 to L11 represent the Lamiales proto-chromosomes. EEJ, end–end joining; NCF, nested chromosome fusion; COV, crossing-over; DF, the inactivation of extra centromeres; lost, the loss of supposed B chromosomes.
Figure 6.

Finally, we inferred ancestral karyotypes at major evolutionary nodes from the major core eudicot proto-chromosomes and reconstructed the evolutionary trajectories to the extant Lamiales plants under consideration (Fig. 7).

Reconstructing the evolutionary trajectories of chromosomes from the core eudicot common ancestor to the extant Lamiales. Orange circles represent WGD events, and light blue triangles represent WGT events. ω, χ, β, and α denote the polyploidization events that occurred in Lamiales species. Seven ancestral chromosomes are distinguished by distinct identifiers (Hex codes: 3366FF, 00CCFF, FF00FF, 8FBC8F, 99CC00, FFCC00, C46AB5). The figure shows the number of chromosome fusions (EEJs and NCFs) experienced in each lineage.
Figure 7.

Discussion

Polyploidy plays an important role in the evolution and functional innovation of plants (Bowers et al. 2003; Soltis and Soltis 2016). However, due to the complexity of plant genomes, polyploidization events can be missed or misidentified (Wang et al. 2018a, 2018b, 2020). A sophisticated comparative genomic analysis showed that a WGD event contributed to the establishment of the Cucurbitaceae family (Wang et al. 2018a, 2018b). This polyploidization event had been overlooked in several previous reports (Huang et al. 2009; Garcia-Mas et al. 2012; Guo et al. 2013). Additionally, a similar reanalysis of Selaginella moellendorffii revealed multiple ancient polyploidization events during the evolution of Lycophytes (Wang et al. 2020), negating the previous inference that Lycophytes did not undergo ancient polyploidization (Banks et al. 2011). This showed that all vascular plants were affected by paleopolyploidization. Here, we investigated the polyploidization events in Lamiales plants by identifying orthologous and paralogous genes. We found that, in addition to the γ WGT shared by core eudicots, Lamiales plants also experienced specific polyploidization events of their own. Apart from the ω WGT and χ WGD reported in Oleaceae species, we found evidence that the β WGD is shared by Lamiaceae, Bignoniaceae, Acanthaceae, and Scrophulariaceae. Notably, we found that the β WGD occurred just before the split of Scrophulariaceae from the 3 families Acanthaceae, Bignoniaceae, and Lamiaceae, ∼33 to 45 Mya, suggesting that it contributed to the divergence of these families. We also found that the ω WGT occurred ∼68 to 78 Mya, near the split of Oleaceae from the other Lamiales families ∼65 to 67 Mya, implying that it contributed to their divergence. A link between paleopolyploidization and speciation has been proposed previously. The fast divergence and establishment of angiosperms and seed plants were likely related to polyploidization (Jiao et al. 2011). There is genomic evidence that the establishment of the plant families Poaceae (Wang et al. 2015a, 2015b, 2015c), Cucurbitaceae (Wang et al. 2018a, 2018b), Fabaceae (Zhuang et al. 2019), Solanaceae (International Tomato Genome Sequencing Consortium 2012), and Apiaceae (Song et al. 2021) was related to family-specific polyploidization. However, at least in the case of seed plants, whether they shared a common polyploid ancestor is still hotly debated (Leebens-Mack et al. 2019; Liu et al. 2021; Han et al. 2022; Wan et al. 2022).

The occurrence of WGDs and WGTs in Lamiales has been unclear. Comparative genomic analysis of C. americana and T. grandis inferred that C. americana underwent 3 ancient WGD events, including 1 WGD detected in the T. grandis lineage but not shared with C. americana (Hamilton et al. 2020). The scarlet sage (Salvia splendens) genome also revealed 3 WGD events after the γ event, with 2 in its own lineage and the other shared by Lamiaceae plants (Jia et al. 2021). The present study provides solid evidence that C. americana and T. grandis share 2 polyploidization events: the γ WGT shared by major core eudicots and the β WGD shared by the 4 Lamiales families Scrophulariaceae, Acanthaceae, Bignoniaceae, and Lamiaceae. The present reanalysis of S. splendens supports the inference that the β WGD is shared by these 4 Lamiales families and indicates 2 additional WGD events specific to the species. In addition, the present reanalysis of 3 Oleaceae plants, J. sambac, F. pennsylvanica, and O. europaea, revealed that the Oleaceae underwent a WGT event, consistent with previous findings (Rao et al. 2021; Huff et al. 2022; Xu et al. 2022). In conclusion, the present research clarifies the occurrence of polyploidization during the evolution of Lamiales. Polyploidization, as a type of abrupt evolutionary event, may produce a new species overnight (Ruprecht et al. 2017; Van de Peer et al. 2017). Though a new species has a very slim chance of survival, it arises in a niche to which its parental lines have most likely adapted; if it survives, it has numerous opportunities to acquire genetic novelties. In a neo-polyploid, the existence of homoeologous chromosomes and thousands of duplicated genes provides enormous chances for mutation (Wang et al. 2015a, 2015b, 2015c; Barker et al. 2016). Multivalent recombination may occur between homologous and homoeologous chromosomes, likely resulting in chromosome breakages and rearrangements, gene losses and copy number variations, increased evolutionary rates, chromosome number changes (often reduction), etc. (Wendel 2015; Wang et al. 2018a, 2018b; Sun et al. 2022). Although tens of paleopolyploidization events have been inferred since the start of the genomic era, and a plant and its ancestral line may have been affected by only 2 to 6 of them, all plants are supposed to be derived from an ancestral paleopolyploid (Zhuang et al. 2019; Wang et al. 2021a, 2021b; Sun et al. 2022). The findings here in Lamiales provide further evidence that paleopolyploidization contributes to the fast divergence and establishment of plant groups.

Following polyploidization, a duplicated genome is often unstable, and extensive DNA rearrangements may change the chromosome number, often reducing it, resulting in cell karyotype variation in derivative plants. Karyotype refers to the number, size, and morphology of chromosomes in a cell. Currently, 2 methods for inferring ancestral karyotypes have been proposed in comparative genomic research. The first is a purely bioinformatic approach, which uses a cumulative identity percentage (CIP, 60%) and a cumulative alignment length percentage (CALP, 70%) to identify orthologous and paralogous genes, as well as CloseUp software to identify blocks (Salse et al. 2009a, 2009b). Then, contiguous ancestral regions (CARs) are defined by comparing conserved fragments between 2 genomes (orthologous genes/blocks) and within a single genome (paralogous genes/blocks), and the most likely evolutionary scenario is determined based on the CARs. The approach assumes that modern genomes derive from the ancestral genome through duplication or rearrangement events at homologous positions and attempts to explain the evolutionary history from the ancestral genome to the modern karyotypes with the minimum number of rearrangements (including inversion, deletion, fusion, fission, and translocation; Murat et al. 2012, 2014). Based on this approach, ancestral karyotypes of rosid crops, including members of the Caricaceae, Brassicaceae, Malvaceae, Fabaceae, Rosaceae, Salicaceae, and Vitaceae families, were inferred (Murat et al. 2015). More recently, a telomere-centric model was proposed to explain chromosome rearrangement and fusion, initially based on the analysis of Poaceae and Arabidopsis (Wang et al. 2015a, 2015b, 2015c). If a chromosome crosses over with itself near its 2 telomeres, resolution of the crossing-over may remove the telomeres to produce a transitional free-end chromosome, which may then be inserted into another chromosome. Alternatively, crossing-over may occur between 2 different chromosomes near their telomeres, and its resolution may remove the involved telomeres and merge the 2 chromosomes into a larger one. During these processes, a B chromosome may be formed from the removed telomeres, and the loss of the B chromosome results in chromosome number reduction. The telomere-centric model has been applied to infer ancestral karyotypes in Poaceae, Brassicaceae, Malvaceae, Apiaceae, Fabaceae, and Cucurbitales (Wang et al. 2015a, 2015b, 2015c; Zhuang et al. 2019; Song et al. 2021; Wang et al. 2022). In addition, the model facilitates the inference of evolutionary trajectories from very ancient proto-chromosomes to extant chromosomes.

Here, using the telomere-centric model, we inferred that the Lamiales common ancestor had 11 proto-chromosomes and reconstructed the evolutionary trajectories from the major core eudicot common ancestor to the Lamiales common ancestor and then from the Lamiales common ancestor to the extant chromosomes in the Lamiales plants under consideration. The inference of the Lamiales proto-chromosomes was based on the comparative analysis of B. alternifolia and C. americana chromosomes. Notably, further analysis with other Lamiales plants, J. mimosifolia and J. sambac, supports this inference in that most of the inferred proto-chromosomes have well-preserved extant derivatives in these plants, ignoring small-scale DNA rearrangement and loss. The other inferred proto-chromosomes also show very good gene collinearity with extant chromosomes, even along the full length of each proto-chromosome. Overall, these facts support the credibility of the present inference of Lamiales proto-chromosomes and their evolutionary trajectories using the telomere-centric model.

Materials and methods

Plant genome data materials

Genome data for the Lamiales species and grape were downloaded from public databases (Supplementary Table S1). Python scripts were used to convert data formats for subsequent analyses.

Phylogenetic analysis

We used the orthology inference tool OrthoFinder to construct the species phylogenetic tree (Emms and Kelly 2019). Amino acid sequences of the Lamiales species and the outgroup V. vinifera were used in the analysis. OrthoFinder was run to find homologous gene groups, with DIAMOND (Buchfink et al. 2015) as the sequence search method, and FastTree was used to build the phylogenetic trees. MEGA-X (Kumar et al. 2018) was used to display and edit the alignments and to produce and refine the tree figures. Default parameters were used for all software unless otherwise specified.

Multiple sequence alignment and inference of gene collinearity

The first step was sequence alignment. Using the blastp program of the sequence alignment tool BLAST (Altschul et al. 1990), we performed homology searches of the predicted protein sequences within and between the genomes of the selected species. The E-value threshold was set to 1e−5 to accommodate the duplicated genes produced by paleopolyploidization ∼10 Mya, and the output format (-outfmt) was set to 6.
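As an illustration of how such tabular output can be handled downstream, the following minimal sketch reads BLAST tabular records with the 12 default columns of output format 6 and keeps hits at or below the E-value threshold. The function name `filter_blast6`, the option to drop self-hits, and the example records are assumptions made for illustration and do not reproduce the scripts used in this study.

```python
import csv
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_blast6(handle, max_evalue=1e-5, drop_self=True):
    """Return (query, subject, evalue, bitscore) tuples passing the E-value cutoff."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank or comment lines
        rec = dict(zip(BLAST6_FIELDS, row))
        if drop_self and rec["qseqid"] == rec["sseqid"]:
            continue  # a gene always matches itself in within-genome searches
        if float(rec["evalue"]) <= max_evalue:
            hits.append((rec["qseqid"], rec["sseqid"],
                         float(rec["evalue"]), float(rec["bitscore"])))
    return hits


# Made-up records: a self-hit, a strong hit, and a weak hit.
example = io.StringIO(
    "geneA\tgeneA\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t600\n"
    "geneA\tgeneB\t85.0\t290\t40\t2\t1\t290\t5\t294\t1e-120\t420\n"
    "geneA\tgeneC\t30.0\t80\t50\t3\t10\t90\t20\t100\t0.5\t25.1\n"
)
hits = filter_blast6(example)
print(hits)
```

In this example, only the geneA to geneB record is retained; the self-hit and the hit with an E-value of 0.5 are discarded.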

Based on the BLAST result file, combined with the genome annotation file (GFF file) and the chromosome length file (lens file), we used the -d module implemented in WGDI (Sun et al. 2022) to draw homologous gene dot plots. Dots of different colors (red, blue, and gray) in the dot plots represent the level of similarity of the gene pairs. From the dot plots, homologous collinearity was inferred within each genome and between genomes. Next, we used the -icl module to perform collinearity analysis and obtained the collinear regions, described by scores, statistical significance, collinear gene numbers, etc.
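To make the idea of a collinear region concrete, the sketch below finds the longest chain of homologous gene pairs whose gene-order positions increase in both genomes. It is a simplified dynamic-programming illustration and not the algorithm implemented in the WGDI -icl module; the function name `longest_collinear_chain` and the `max_gap` parameter are hypothetical, and inverted (reverse-strand) blocks are not handled.

```python
def longest_collinear_chain(anchors, max_gap=5):
    """Longest chain of anchors that increases in both genomes.

    anchors: iterable of (pos_a, pos_b) gene-order indices of homologous pairs.
    max_gap: hypothetical limit on the gene-order distance between neighbours.
    """
    pts = sorted(set(anchors))
    n = len(pts)
    if n == 0:
        return []
    best_len = [1] * n   # length of the best chain ending at each anchor
    prev = [-1] * n      # back-pointer to the previous anchor in that chain
    for i in range(n):
        for j in range(i):
            da = pts[i][0] - pts[j][0]
            db = pts[i][1] - pts[j][1]
            if 0 < da <= max_gap and 0 < db <= max_gap:
                if best_len[j] + 1 > best_len[i]:
                    best_len[i] = best_len[j] + 1
                    prev[i] = j
    # Trace back from the anchor that ends the longest chain.
    end = max(range(n), key=lambda k: best_len[k])
    chain = []
    while end != -1:
        chain.append(pts[end])
        end = prev[end]
    return chain[::-1]


# Made-up anchors: four collinear pairs plus two scattered hits.
anchors = [(1, 10), (2, 11), (3, 12), (5, 14), (4, 40), (20, 3)]
block = longest_collinear_chain(anchors)
print(block)
```

Real collinearity tools additionally score gaps, handle both orientations, and attach a statistical significance to each block, as described in the text.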

Calculation of synonymous nucleotide substitutions

For the collinear genes obtained above, Ks values were calculated from the CDS and protein (pep) files using the -ks module in WGDI. This module uses MUSCLE (Edgar 2021) to align the protein sequences, uses pal2nal.pl to convert the protein alignment into a codon alignment based on the CDS sequences, and finally calculates Ks using yn00 from PAML (Yang 2007).
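The protein-to-codon conversion step can be illustrated with a short sketch: each aligned residue is replaced by its codon from the unaligned CDS, and each gap by three gap characters. This is a simplified stand-in written for illustration, not the script used in the pipeline; the function name `protein_to_codon_alignment` is hypothetical, and features such as frameshifts or mismatches between protein and CDS are not checked.

```python
def protein_to_codon_alignment(aligned_protein, cds):
    """Thread the codons of `cds` onto a gapped protein sequence.

    aligned_protein: protein sequence with '-' for alignment gaps.
    cds: unaligned coding sequence; a trailing stop codon is tolerated.
    """
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(codons) < n_residues:
        raise ValueError("CDS is shorter than the protein sequence")
    out = []
    k = 0  # index of the next unused codon
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(codons[k])
            k += 1
    return "".join(out)


print(protein_to_codon_alignment("M-K", "ATGAAA"))
```

Applying the function to both members of an aligned gene pair yields two codon-aligned sequences of equal length, which is the kind of input that codon-based substitution estimators require.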

Calculation of 4DTV value

MCScanX was run to obtain collinear gene pairs (Wang et al. 2012), and 4DTV values were calculated from the CDS and protein files. We used the ParaAT program and related scripts (calculate_4DTV_correction.pl and axt2one-line.py) to calculate the 4DTV values (Zhang et al. 2012). Then, we used RStudio to visualize the results.
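For readers unfamiliar with the statistic, the sketch below computes the uncorrected transversion proportion at 4-fold degenerate third-codon positions from 2 codon-aligned sequences, assuming the standard genetic code. It is an illustration only: the function name `fourfold_transversion_rate` is hypothetical, and any correction applied by the scripts named above is not reproduced here.

```python
# First two bases of the codon families that are 4-fold degenerate at the
# third position in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_transversion_rate(codon_aln_1, codon_aln_2):
    """Uncorrected proportion of transversions at shared 4-fold degenerate sites."""
    sites = 0
    transversions = 0
    usable = min(len(codon_aln_1), len(codon_aln_2))
    for i in range(0, usable - 2, 3):
        c1 = codon_aln_1[i:i + 3].upper()
        c2 = codon_aln_2[i:i + 3].upper()
        # Both codons must belong to the same 4-fold degenerate family.
        if c1[:2] != c2[:2] or c1[:2] not in FOURFOLD_PREFIXES:
            continue
        if c1[2] not in "ACGT" or c2[2] not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        # A transversion swaps a purine for a pyrimidine or vice versa.
        if (c1[2] in PURINES) != (c2[2] in PURINES):
            transversions += 1
    return transversions / sites if sites else float("nan")


# Made-up example: CTA/CTC is a transversion, GGT/GGC a transition, ATG is not 4-fold.
rate = fourfold_transversion_rate("CTAGGTATG", "CTCGGCATG")
print(rate)
```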

Construction of ancestral karyotypes and chromosome evolution trajectories

A homologous gene dot plot intuitively displays the positions of genes on each chromosome of a species and can be used to check orthologous and paralogous relationships within a genome and between genomes. It thus helps to explore chromosome fusion, breakage, and rearrangement, to infer the formation of ancestral chromosomes, and to reconstruct the evolutionary history of modern chromosomes. Several manifestations of the previously proposed telomere-centric chromosome reconstruction model in homologous gene dot plots are shown in Supplementary Fig. S14. Supplementary Fig. S14A shows that, with species B as the reference, chromosome B1 of species B has a complete corresponding chromosome A1 in species A, indicating that they originated from the same ancestral chromosome. Supplementary Fig. S14B shows the EEJ model, in which chromosome A1 of species A formed through the EEJ of the 2 chromosomes corresponding to B1 and B2 of species B. Supplementary Fig. S14C shows the NCF model, in which chromosome A1 of species A formed by the nesting of chromosome B2 of species B into B1. Supplementary Fig. S14D shows the RTA model, in which chromosomes A1 and A2 of species A resulted from an RTA between chromosomes B1 and B2 of species B.
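The 4 dot plot patterns described above can be expressed as a toy classifier. In this sketch, each chromosome of species A is summarized as the ordered list of species B chromosomes to which its successive collinear segments correspond. The function name `classify_patterns`, the input format, and the decision rules are illustrative assumptions and greatly simplify the manual inference performed in this study.

```python
from itertools import groupby


def classify_patterns(karyotype):
    """Label each species-A chromosome by its simplest explanation.

    karyotype: dict mapping a species-A chromosome name to the ordered list of
    species-B chromosome labels of its successive collinear segments.
    """
    # Collapse consecutive segments that map to the same reference chromosome.
    runs = {a: [label for label, _ in groupby(segs)] for a, segs in karyotype.items()}
    result = {}
    for a, r in runs.items():
        elsewhere = {lab for other, rr in runs.items() if other != a for lab in rr}
        if len(r) == 1:
            result[a] = "conserved"   # one-to-one correspondence
        elif len(r) == 2:
            # If both partners also contribute to another chromosome, arms were exchanged.
            result[a] = "RTA" if set(r) <= elsewhere else "EEJ"
        elif len(r) == 3 and r[0] == r[2]:
            result[a] = "NCF"         # one chromosome nested inside another
        else:
            result[a] = "complex"
    return result


print(classify_patterns({"A1": ["B1", "B2", "B1"]}))
print(classify_patterns({"A1": ["B1", "B2"], "A2": ["B2", "B1"]}))
```

In practice, lineage-specific rearrangements, gene loss, and recursive polyploidization blur these patterns, so each candidate event still has to be checked against the dot plots.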

Author contributions

X.W. and J.W. designed and conceptualized the work. J.W. and X.W. prepared the draft manuscript. J.W., B.S., M.Y., F.H., H.Q., H.Z., Y.J., Y.L., and Z.W. prepared the figure. X.W. and J.W. revised the manuscript critically for important intellectual content. All authors edited and reviewed the final version.

Supplementary data

The following materials are available in the online version of this article.

Supplementary Fig. S1. Homologous gene dot plot in the genome of J. mimosifolia.

Supplementary Fig. S2. Homologous gene dot plot in the genome of B. alternifolia.

Supplementary Fig. S3. Homologous gene dot plot in the genome of C. americana.

Supplementary Fig. S4. Homologous gene dot plot in the genome of T. grandis.

Supplementary Fig. S5. Homologous gene dot plot between the genomes of J. mimosifolia and C. americana.

Supplementary Fig. S6. Homologous gene dot plot between the genomes of J. mimosifolia and T. grandis.

Supplementary Fig. S7. Homologous gene dot plot between the genomes of J. mimosifolia and A. marina.

Supplementary Fig. S8. Homologous gene dot plot between the genomes of J. mimosifolia and B. alternifolia.

Supplementary Fig. S9. Explanation and differentiation of orthology, paralogy, and outparalogy.

Supplementary Fig. S10. Local homologous gene dot plots.

Supplementary Fig. S11. Homologous gene dot plot in the genome of A. marina.

Supplementary Fig. S12. Homologous gene dot plot between the genomes of J. mimosifolia and S. cusia.

Supplementary Fig. S13. Distribution of 4DTV values among collinear genes within or between the compared genomes.

Supplementary Fig. S14. Supposed gene collinearity shown in homologous gene dot plots and corresponding chromosome rearrangements.

Supplementary Table S1. Species genome data information and download sources.

Supplementary Table S2. Statistics of collinear gene pairs and blocks of collinear genes.

Supplementary Table S3. Kernel function analysis of Ks distribution related to duplication events within each genome and between selected genomes (before evolutionary rate correction).

Supplementary Table S4. Best-matched results for Blast (using Jmi2 as an example, corresponding to chromosome Bal3).

Supplementary Table S5. Best-matched results for Blast (using Jmi2 as an example, corresponding to chromosome Bal18).

Funding

This project was financially supported by the National Natural Science Foundation of China (32070669) and by the Bureau for Human and Social Resources Security of Tangshan Municipal to X.W.

Data availability

The data sets are available from the corresponding author upon request.

References

Ahmed HM. Ethnomedicinal, phytochemical and pharmacological investigations of Perilla frutescens (L.) Britt. Molecules. 2019:24(1):102. https://doi.org/10.3390/molecules24010102

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990:215(3):403–410. https://doi.org/10.1016/S0022-2836(05)80360-2

Banks JA, Nishiyama T, Hasebe M, Bowman JL, Gribskov M, dePamphilis C, Albert VA, Aono N, Aoyama T, Ambrose BA, et al. The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science. 2011:332(6032):960–963. https://doi.org/10.1126/science.1203810

Barker MS, Husband BC, Pires JC. Spreading Winge and flying high: the evolutionary importance of polyploidy after a century of study. Am J Bot. 2016:103(7):1139–1145. https://doi.org/10.3732/ajb.1600272

Bowers JE, Chapman BA, Rong J, Paterson AH. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003:422(6930):433–438. https://doi.org/10.1038/nature01521

Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015:12(1):59–60. https://doi.org/10.1038/nmeth.3176

Cock IE, Baghtchedjian L, Cordon ME, Dumont E. Phytochemistry, medicinal properties, bioactive compounds, and therapeutic potential of the genus Eremophila (Scrophulariaceae). Molecules. 2022:27(22):7734. https://doi.org/10.3390/molecules27227734

Dudai N, Belanger FC. Biotechnology in flavor production. In: Biotechnology in flavor production. New York: Wiley; 2016. p. 32–61. https://doi.org/10.1002/9781118354056.ch2

Edgar RC. MUSCLE v5 enables improved estimates of phylogenetic tree confidence by ensemble bootstrapping. bioRxiv 2021.06.20.449169. https://doi.org/10.1101/2021.06.20.449169, 21 June 2021, preprint: not peer reviewed.

Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019:20(1):238. https://doi.org/10.1186/s13059-019-1832-y

Friis G, Vizueta J, Smith EG, Nelson DR, Khraiwesh B, Qudeimat E, Salehi-Ashtiani K, Ortega A, Marshell A, Duarte CM, et al. A high-quality genome assembly and annotation of the gray mangrove, Avicennia marina. G3 (Bethesda). 2021:11(1):jkaa025. https://doi.org/10.1093/g3journal/jkaa025

Garcia-Mas J, Benjak A, Sanseverino W, Bourgeois M, Mir G, González VM, Hénaff E, Câmara F, Cozzuto L, Lowy E, et al. The genome of melon (Cucumis melo L.). Proc Natl Acad Sci U S A. 2012:109(29):11872–11877. https://doi.org/10.1073/pnas.1205415109

Guo S, Zhang J, Sun H, Salse J, Lucas WJ, Zhang H, Zheng Y, Mao L, Ren Y, Wang Z, et al. The draft genome of watermelon (Citrullus lanatus) and resequencing of 20 diverse accessions. Nat Genet. 2013:45(1):51–58. https://doi.org/10.1038/ng.2470

Hamilton JP, Godden GT, Lanier E, Bhat WW, Kinser TJ, Vaillancourt B, Wang H, Wood JC, Jiang J, Soltis PS, et al. Generation of a chromosome-scale genome assembly of the insect-repellent terpenoid-producing Lamiaceae species, Callicarpa americana. Gigascience. 2020:9(9):giaa093. https://doi.org/10.1093/gigascience/giaa093

Han D, Li W, Hou Z, Lin C, Xie Y, Zhou X, Gao Y, Huang J, Lai J, Wang L, et al. The chromosome-scale assembly of the Salvia rosmarinus genome provides insight into carnosic acid biosynthesis. Plant J. 2023:113(4):819–832. https://doi.org/10.1111/tpj.16087

Han Y, Zhang W, Zhou B, Zeng P, Tian Z, Cai J. Chromosome-level genome assembly of Welwitschia mirabilis, a unique Namib Desert species. Mol Ecol Resour. 2022:22(1):391–403. https://doi.org/10.1111/1755-0998.13475

Huang S, Li R, Zhang Z, Li L, Gu X, Fan W, Lucas WJ, Wang X, Xie B, Ni P, et al. The genome of the cucumber, Cucumis sativus L. Nat Genet. 2009:41(12):1275–1281. https://doi.org/10.1038/ng.475

Huff M, Seaman J, Wu D, Zhebentyayeva T, Kelly LJ, Faridi N, Nelson CD, Cooper E, Best T, Steiner K, et al. A high-quality reference genome for Fraxinus pennsylvanica for ash species restoration and research. Mol Ecol Resour. 2022:22(4):1284–1302. https://doi.org/10.1111/1755-0998.13545

International Tomato Genome Sequencing Consortium. The tomato genome sequence provides insights into fleshy fruit evolution. Nature. 2012:485(7400):635–641. https://doi.org/10.1038/nature11119

Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007:449(7161):463–467. https://doi.org/10.1038/nature06148

Jia KH, Liu H, Zhang RG, Xu J, Zhou SS, Jiao SQ, Yan XM, Tian XC, Shi TL, Luo H, et al. Chromosome-scale assembly and evolution of the tetraploid Salvia splendens (Lamiaceae) genome. Hortic Res. 2021:8(1):177. https://doi.org/10.1038/s41438-021-00614-y

Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, Ralph PE, Tomsho LP, Hu Y, Liang H, Soltis PS, et al. Ancestral polyploidy in seed plants and angiosperms. Nature. 2011:473(7345):97–100. https://doi.org/10.1038/nature09916

Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018:35(6):1547–1549. https://doi.org/10.1093/molbev/msy096

Leebens-Mack JH, Barker MS, Carpenter EJ, Deyholos MK, Gitzendanner MA, Graham SW, Grosse I, Li Z, Melkonian M, Mirarab S, et al. One thousand plant transcriptomes and the phylogenomics of green plants. Nature. 2019:574(7780):679–685. https://doi.org/10.1038/s41586-019-1693-2

Leitch AR, Leitch IJ. Genomic plasticity and the diversity of polyploid plants. Science. 2008:320(5875):481–483. https://doi.org/10.1126/science.1153585

Liu H, Wang X, Wang G, Cui P, Wu S, Ai C, Hu N, Li A, He B, Shao X, et al. The nearly complete genome of Ginkgo biloba illuminates gymnosperm evolution. Nat Plants. 2021:7(6):748–756. https://doi.org/10.1038/s41477-021-00933-x

Ma YP, Wariss HM, Liao RL, Zhang RG, Yun QZ, Olmstead RG, Chau JH, Milne RI, Van de Peer Y, Sun WB. Genome-wide analysis of butterfly bush (Buddleja alternifolia) in three uplands provides insights into biogeography, demography and speciation. New Phytol. 2021:232(3):1463–1476. https://doi.org/10.1111/nph.17637

Moller M, Atkins H, Barber S, Purvis D. The living collection at the Royal Botanic Garden Edinburgh illustrates the floral diversity in Streptocarpus (Gesneriaceae). Sibbaldia Int J Bot Gard Hortic. 2019:(17):155–177. https://doi.org/10.24823/Sibbaldia.2019.272

Murat F, Van de Peer Y, Salse J. Decoding plant and animal genome plasticity from differential paleo-evolutionary patterns and processes. Genome Biol Evol. 2012:4(9):917–928. https://doi.org/10.1093/gbe/evs066

Murat F, Zhang R, Guizard S, Flores R, Armero A, Pont C, Steinbach D, Quesneville H, Cooke R, Salse J. Shared subgenome dominance following polyploidization explains grass genome evolutionary plasticity from a seven protochromosome ancestor with 16 K protogenes. Genome Biol Evol. 2014:6(1):12–33. https://doi.org/10.1093/gbe/evt200

Murat F, Zhang R, Guizard S, Gavranović H, Flores R, Steinbach D, Quesneville H, Tannier E, Salse J. Karyotype and gene order evolution from reconstructed extinct ancestors highlight contrasts in genome plasticity of modern rosid crops. Genome Biol Evol. 2015:7(3):735–749. https://doi.org/10.1093/gbe/evv014

Mutinda ES, Mkala EM, Ren J, Kimutai F, Waswa EN, Odago WO, Nanjala C, Gichua MK, Njire MM, Hu GW. A review on the traditional uses, phytochemistry, and pharmacology of the genus Veronicastrum (Plantaginaceae). J Ethnopharmacol. 2023:300:115695. https://doi.org/10.1016/j.jep.2022.115695

Rao G, Zhang J, Liu X, Lin C, Xin H, Xue L, Wang C. De novo assembly of a new Olea europaea genome accession using nanopore sequencing. Hortic Res. 2021:8(1):64. https://doi.org/10.1038/s41438-021-00498-y

Ruprecht C, Lohaus R, Vanneste K, Mutwil M, Nikoloski Z, Van de Peer Y, Persson S. Revisiting ancestral polyploidy in plants. Sci Adv. 2017:3(7):e1603195. https://doi.org/10.1126/sciadv.1603195

Salse J, Abrouk M, Bolot S, Guilhot N, Courcelle E, Faraut T, Waugh R, Close TJ, Messing J, Feuillet C. Reconstruction of monocotelydoneous proto-chromosomes reveals faster evolution in plants than in animals. Proc Natl Acad Sci U S A. 2009a:106(35):14908–14913. https://doi.org/10.1073/pnas.0902350106

Salse J, Abrouk M, Murat F, Quraishi UM, Feuillet C. Improved criteria and comparative genomics tool provide new insights into grass paleogenomics. Brief Bioinform. 2009b:10(6):619–630. https://doi.org/10.1093/bib/bbp037

Schubert I, Lysak MA. Interpretation of karyotype evolution should consider chromosome structural constraints. Trends Genet. 2011:27(6):207–216. https://doi.org/10.1016/j.tig.2011.03.004

Soltis PS, Soltis DE. Ancient WGD events as drivers of key innovations in angiosperms. Curr Opin Plant Biol. 2016:30:159–165. https://doi.org/10.1016/j.pbi.2016.03.015

Song X, Sun P, Yuan J, Gong K, Li N, Meng F, Zhang Z, Li X, Hu J, Wang J, et al. The celery genome sequence reveals sequential paleo-polyploidizations, karyotype evolution and resistance gene reduction in Apiales. Plant Biotechnol J. 2021:19(4):731–744. https://doi.org/10.1111/pbi.13499

Sun P, Jiao B, Yang Y, Shan L, Li T, Li X, Xi Z, Wang X, Liu J. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol Plant. 2022:15(12):1841–1851. https://doi.org/10.1016/j.molp.2022.10.018

Tang H, Lyons E, Pedersen B, Schnable JC, Paterson AH, Freeling M. Screening synteny blocks in pairwise genome comparisons through integer programming. BMC Bioinformatics. 2011:12(1):102. https://doi.org/10.1186/1471-2105-12-102

THE ANGIOSPERM PHYLOGENY Group. An update of the angiosperm phylogeny group classification for the orders and families of flowering plants: APG IV. Bot J Linn Soc. 2016:181(1):1–20. https://doi.org/10.1111/boj.12385

Van de Peer Y, Mizrachi E, Marchal K. The evolutionary significance of polyploidy. Nat Rev Genet. 2017:18(7):411–424. https://doi.org/10.1038/nrg.2017.26

Vogel JP, Garvin DF, Mockler TC, Schmutz J, Rokhsar D, Bevan MW, Barry K, Lucas S, Harmon-Smith M, Lail K, et al. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature. 2010:463(7282):763–768. https://doi.org/10.1038/nature08747

Wan T, Gong Y, Liu Z, Zhou Y, Dai C, Wang Q. Evolution of complex genome architecture in gymnosperms. Gigascience. 2022:11:giac078. https://doi.org/10.1093/gigascience/giac078

Wang J, Sun P, Li Y, Liu Y, Yang N, Yu J, Ma X, Sun S, Xia R, Liu X, et al. An overlooked paleotetraploidization in Cucurbitaceae. Mol Biol Evol. 2018a:35(1):16–26. https://doi.org/10.1093/molbev/msx242

Wang J, Yu J, Sun P, Li C, Song X, Lei T, Li Y, Yuan J, Sun S, Ding H, et al. Paleo-polyploidization in Lycophytes. Genomics Proteomics Bioinformatics. 2020:18(3):333–340. https://doi.org/10.1016/j.gpb.2020.10.002

Wang J, Yuan M, Feng Y, Zhang Y, Bao S, Hao Y, Ding Y, Gao X, Yu Z, Xu Q, et al. A common whole-genome paleotetraploidization in Cucurbitales. Plant Physiol. 2022:190(4):2430–2448. https://doi.org/10.1093/plphys/kiac410

Wang JP, Yu JG, Li J, Sun PC, Wang L, Yuan JQ, Meng FB, Sun SR, Li YX, Lei TY, et al. Two likely auto-tetraploidization events shaped kiwifruit genome and contributed to establishment of the Actinidiaceae family. iScience. 2018b:7:230–240. https://doi.org/10.1016/j.isci.2018.08.003

Wang M, Liu C, Xing T, Wang Y, Xia G. Asymmetric somatic hybridization induces point mutations and indels in wheat. BMC Genomics. 2015a:16(1):807. https://doi.org/10.1186/s12864-015-1974-6

Wang M, Zhang L, Wang Z. Chromosomal-level reference genome of the neotropical tree Jacaranda mimosifolia D. Don. Genome Biol Evol. 2021a:13(6):evab094. https://doi.org/10.1093/gbe/evab094

Wang S, Xiao Y, Zhou ZW, Yuan J, Guo H, Yang Z, Yang J, Sun P, Sun L, Deng Y, et al. High-quality reference genome sequences of two coconut cultivars provide insights into evolution of monocot chromosomes and differentiation of fiber content and plant height. Genome Biol. 2021b:22(1):304. https://doi.org/10.1186/s13059-021-02522-9

Wang X, Jin D, Wang Z, Guo H, Zhang L, Wang L, Li J, Paterson AH. Telomere-centric genome repatterning determines recurring chromosome number reductions during the evolution of eukaryotes. New Phytol. 2015b:205(1):378–389. https://doi.org/10.1111/nph.12985

Wang X, Wang J, Guo H, Lee T, Liu T, Jin D, Paterson AH. Genome alignment spanning major Poaceae lineages reveals heterogeneous evolutionary rates and alters inferred dates for key evolutionary events. Mol Plant. 2015c:8(6):14. https://doi.org/10.1016/j.molp.2015.04.004

Wang Y, Tang H, Debarry JD, Tan X, Li J, Wang X, Lee TH, Jin H, Marler B, Guo H, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012:40(7):e49. https://doi.org/10.1093/nar/gkr1293

Wang Z, Wang J, Pan Y, Lei T, Ge W, Wang L, Zhang L, Li Y, Zhao K, Liu T, et al. Reconstruction of evolutionary trajectories of chromosomes unraveled independent genomic repatterning between Triticeae and Brachypodium. BMC Genomics. 2019:20(1):180. https://doi.org/10.1186/s12864-019-5566-8

Wendel JF. The wondrous cycles of polyploidy in plants. Am J Bot. 2015:102(11):1753–1756. https://doi.org/10.3732/ajb.1500320

Xu S, Ding Y, Sun J, Zhang Z, Wu Z, Yang T, Shen F, Xue G. A high-quality genome assembly of Jasminum sambac provides insight into floral trait formation and Oleaceae genome evolution. Mol Ecol Resour. 2022:22(2):724–739. https://doi.org/10.1111/1755-0998.13497

Xu W, Zhang L, Cunningham AB, Li S, Zhuang H, Wang Y, Liu A. Blue genome: chromosome-scale genome reveals the evolutionary and molecular basis of indigo biosynthesis in Strobilanthes cusia. Plant J. 2020a:104(4):864–879. https://doi.org/10.1111/tpj.14992

Xu Z, Gao R, Pu X, Xu R, Wang J, Zheng S, Zeng Y, Chen J, He C, Song J. Comparative genome analysis of Scutellaria baicalensis and Scutellaria barbata reveals the evolution of active flavonoid biosynthesis. Genomics Proteomics Bioinformatics. 2020b:18(3):230–240. https://doi.org/10.1016/j.gpb.2020.06.002

Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007:24(8):1586–1591. https://doi.org/10.1093/molbev/msm088

Zhang Z, Xiao J, Wu J, Zhang H, Liu G, Wang X, Dai L. ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun. 2012:419(4):779–781. https://doi.org/10.1016/j.bbrc.2012.02.101

Zhao F, Chen Y-P, Salmaki Y, Drew BT, Wilson TC, Scheen A-C, Celep F, Bräuchler C, Bendiksby M, Wang Q, et al. An updated tribal classification of Lamiaceae based on plastome phylogenomics. BMC Biol. 2021:19(1):2. https://doi.org/10.1186/s12915-020-00931-z

Zhuang W, Chen H, Yang M, Wang J, Pandey MK, Zhang C, Chang WC, Zhang L, Zhang X, Tang R, et al. The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication. Nat Genet. 2019:51(5):865–876. https://doi.org/10.1038/s41588-019-0402-2

Author notes

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (https://dbpia.nl.go.kr/plphys/pages/General-Instructions) is Xiyin Wang.

Conflict of interest statement. None declared.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/pages/standard-publication-reuse-rights).

Supplementary data