Yufeng Wu, Zhengge Zhu, Ligeng Ma, Mingsheng Chen, The Preferential Retention of Starch Synthesis Genes Reveals the Impact of Whole-Genome Duplication on Grass Evolution, Molecular Biology and Evolution, Volume 25, Issue 6, June 2008, Pages 1003–1006, https://doi.org/10.1093/molbev/msn052
Abstract
Gene duplication is a major force in evolution. Here, we analyzed the fate of duplicated genes following the ancient whole-genome duplication (WGD) in rice. Polyploidy-derived duplicated genes were found to be preferentially lost from one of each pair of duplicated chromosomal segments, suggesting that the asymmetric gene loss may result from transcriptome dominance in the ancestral allotetraploid genome. Genes involved in synthesis and catabolism of saccharides were found to be preferentially retained in rice, reflecting different trajectories of polyploidy-derived duplicated genes in rice and Arabidopsis. Further analysis showed that all 3 catalytic steps in the starch biosynthesis pathway have polyploidy-derived duplicated genes and that one copy at each step contributes to a dominant pathway in grain filling–stage rice. The new starch biosynthesis pathway reflects one aspect of the impact of WGD on grass evolution.
Gene duplication is a major force in evolution and can provide the genetic material necessary for the origin of new genes with novel functions (Ohno 1970). Polyploidy, which duplicates all genes in the genome, is an important source of biological innovation (Wendel 2000). In paleopolyploids, gene loss is the main fate of duplicated genes formed by whole-genome duplication (WGD). In Arabidopsis, only about 32% of duplicated genes have been retained in sister duplicated regions derived from polyploidy (Blanc et al. 2003).
The loss and retention of polyploidy-derived duplicated genes are nonrandom and related to function. In Arabidopsis, genes involved in transcriptional regulation and signal transduction have been preferentially retained and genes involved in DNA repair have been preferentially lost (Blanc and Wolfe 2004; Seoighe and Gehring 2004). Paterson et al. (2006) found that some gene families have convergent fates in independent WGD events, such as enrichments of myb-like and protein kinase families in plants. Moreover, genes were removed preferentially from one homeolog after WGD in Arabidopsis (Thomas et al. 2006). The overretention of transcriptional regulation and signal transduction–related genes in polyploidy was predicted by the gene balance hypothesis (Papp et al. 2003; Birchler and Veitia 2007).
Data suggest a WGD took place approximately 70 MYA, prior to the divergence of grasses from within the monocotyledonous lineage, providing evidence that all grasses are paleopolyploids (Paterson et al. 2004; Yu et al. 2005). However, the evolutionary impact of this WGD has yet to be elucidated. Here, we analyzed the fate of duplicated genes following the ancient WGD in rice.
We identified 1,657 polyploidy-derived gene pairs in the Nipponbare genome (Oryza sativa L. ssp. japonica cv. Nipponbare) (Supplement 1, Supplementary Material online). On duplicated blocks, only 15.4% of genes have been retained as duplicates. There are 8 duplicated chromosomal segments encompassing duplicated blocks formed by the WGD, that is, chr1–5, chr2–4, chr2–6, chr3–7, chr3–10, chr3–12, chr4–8, and chr8–9, and one duplicated segment between chromosomes 11 and 12 formed by segmental duplication (fig. 1). Immediately following the WGD, the two members of each pair of duplicated chromosomal segments should have had the same gene number and size, and they should still be similar today if gene loss was random during the diploidization process. Nevertheless, we found that gene loss differed markedly between homeologous chromosomal segments. The homeologous chromosomal segments of chr1–5 located on chromosome 1, of chr2–6 on chromosome 6, of chr2–4 on chromosome 4, of chr3–10 on chromosome 10, of chr3–12 on chromosome 3, and of chr8–9 on chromosome 9 have retained significantly more genes than their counterparts (table 1). Among the larger duplicated blocks, 58% (50 of 86) showed significantly asymmetric gene loss (Supplement 1, Supplementary Material online). For example, on chr1–5, which has 469 polyploidy-derived duplicated pairs, the chromosomal segment on chromosome 1 contains 2,065 genes, 631 more than its homeolog on chromosome 5 (table 1).
Duplicated Segments | No. of Retained Gene Pairs | No. of Single Genes | Covered Regions of Duplicated Segments (Mb) | P value^a |
chr1–5 | 469 | 2,065/1,434 | 18.6/13.6 | 0 |
chr2–4 | 261 | 871/1,142 | 9.0/10.0 | 1.88 × 10^−09 |
chr2–6 | 292 | 1,039/1,546 | 10.0/14.8 | 0 |
chr3–7 | 210 | 823/787 | 7.1/7.0 | 0.19153 |
chr3–10 | 152 | 468/663 | 4.2/6.6 | 6.63 × 10^−09 |
chr3–12 | 42 | 226/181 | 2.4/1.8 | 0.01634 |
chr4–8 | 41 | 200/135 | 1.8/1.4 | 0.00034 |
chr8–9 | 190 | 662/960 | 6.8/7.7 | 2.03 × 10^−13 |
chr11–12 | 250 | 328/398 | 4.1/3.9 | 0.00668 |
^a P value corrected by the Benjamini and Hochberg (1995) method.
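The asymmetry comparison in table 1 can be illustrated with a short calculation. The sketch below is not the authors' procedure (that is described in Supplement 1): it assumes a chi-square goodness-of-fit test against an equal split of single genes between the two homeologous segments, followed by a Benjamini and Hochberg adjustment. The function names are illustrative, and the P values it prints need not match those in table 1.

```python
import math

# Single-gene counts for each pair of duplicated segments, copied from table 1.
SINGLE_GENES = {
    "chr1-5": (2065, 1434),
    "chr2-4": (871, 1142),
    "chr2-6": (1039, 1546),
    "chr3-7": (823, 787),
    "chr3-10": (468, 663),
    "chr3-12": (226, 181),
    "chr4-8": (200, 135),
    "chr8-9": (662, 960),
    "chr11-12": (328, 398),
}


def chi_square_equal_split(a, b):
    """P value (1 df) for H0: genes are retained equally on both segments.

    Assumption: this test is chosen for illustration only.
    """
    expected = (a + b) / 2
    stat = ((a - expected) ** 2 + (b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value to keep the result monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


names = list(SINGLE_GENES)
raw = [chi_square_equal_split(*SINGLE_GENES[n]) for n in names]
for name, p_adj in zip(names, benjamini_hochberg(raw)):
    print(f"{name}\t{p_adj:.3g}")
```

Under this assumed test, a pair such as chr3–7 (823 vs. 787 single genes) gives no evidence of asymmetry, whereas chr1–5 (2,065 vs. 1,434) departs strongly from an equal split, in line with the qualitative pattern reported in table 1.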

Fig. 1. The distribution of polyploidy-derived duplicated genes in the rice genome. The duplicated gene pairs are linked by gray lines.
In Arabidopsis, genes were removed preferentially from one duplicated block after WGD (Thomas et al. 2006). But what is the situation at the whole-chromosome level? The duplication relationship between chromosomes in rice looks more ordered than that in Arabidopsis. In this study, we observed asymmetric gene loss between homeologous chromosomes. One likely explanation of the asymmetric gene loss is that the transcriptome of one subgenome can be dominant over the other in an allopolyploid. In plant and animal hybrids, the rRNA genes from one parental genome are transcribed, whereas many of those inherited from the other parent are silenced (nucleolar dominance) (Pikaard 1999). In allotetraploid Arabidopsis, progenitor-dependent gene regulation occurs on a genome-wide scale (Wang et al. 2006). The expression patterns of genes from Arabidopsis arenosa (one progenitor of the allotetraploid Arabidopsis) are dominant, whereas genes from the other progenitor, Arabidopsis thaliana, are more often recessive (Wang et al. 2006). Transcriptome dominance has been observed in several other species as well, such as Tragopogon miscellus (Tate et al. 2006) and tetraploid cotton (Adams et al. 2004). Therefore, the asymmetric gene loss may result from transcriptome dominance in the ancestral genome.
We further used Gene Ontology (Ashburner et al. 2000) to classify rice genes into 2,746 functional categories. In all, 54 functional categories were significantly overrepresented and 11 functional categories were significantly underrepresented (Supplement 2, Supplementary Material online). In addition to genes involved in expression regulation and signal transduction, genes related to synthesis and catabolism of saccharides were found to be overrepresented, such as enzymes involved in glycolysis, amylopectin biosynthesis, and trehalose biosynthesis (Supplement 2, Supplementary Material online).
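Over- and underrepresentation of a functional category among retained duplicates is often assessed with a hypergeometric test. The sketch below assumes that test for illustration; the method actually used is given in Supplement 2 and may differ. The function names and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_upper_tail(total, in_category, retained, retained_in_category):
    """P(X >= retained_in_category): evidence for overrepresentation.

    total: genes considered; in_category: genes annotated to the category;
    retained: genes retained as duplicates; X: retained genes in the category.
    """
    upper = min(in_category, retained)
    ways = sum(
        comb(in_category, i) * comb(total - in_category, retained - i)
        for i in range(retained_in_category, upper + 1)
    )
    return ways / comb(total, retained)


def hypergeom_lower_tail(total, in_category, retained, retained_in_category):
    """P(X <= retained_in_category): evidence for underrepresentation."""
    ways = sum(
        comb(in_category, i) * comb(total - in_category, retained - i)
        for i in range(0, retained_in_category + 1)
    )
    return ways / comb(total, retained)


# Hypothetical counts, for illustration only.
p_over = hypergeom_upper_tail(2000, 40, 300, 15)
p_under = hypergeom_lower_tail(2000, 40, 300, 1)
print(f"over: {p_over:.3g}, under: {p_under:.3g}")
```

With thousands of categories tested, the resulting P values would normally be corrected for multiple testing before categories are called significant.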
In higher plants, 3 enzymes have been found to be directly involved in starch biosynthesis: ADP-glucose pyrophosphorylase (EC: 2.7.7.27), starch synthase (EC: 2.4.1.21), and starch-branching enzyme (EC: 2.4.1.18) (fig. 2B) (Smith et al. 1997; James et al. 2003). All 3 enzymes have polyploidy-derived gene pairs in rice (fig. 2A). To further understand the function of polyploidy-derived duplicated genes involved in starch synthesis, we used the microarray data from 93-11 (Oryza sativa L. ssp. indica cv. 93-11) to study expression profiles (Ma et al. 2005). Within each of the duplicate pairs OsAGPS1 and OsAGPS2, OsSSIIa and OsSSIIb, and OsBEIIa and OsBEIIb, the expression of the two copies has diverged (fig. 2C). The expression of one copy in each pair (OsAGPS2, OsSSIIa, and OsBEIIb) was upregulated in grain filling–stage rice panicles. Moreover, OsAGPS2, OsSSIIa, and OsBEIIb are highly coexpressed, as evidenced by the Pearson correlation coefficients (OsAGPS2–OsSSIIa: 0.893; OsSSIIa–OsBEIIb: 0.921; and OsAGPS2–OsBEIIb: 0.987). The mutant of OsAGPS2 (sh-2) displayed a reduction in starch synthesis in the endosperm, represented by shrunken grains (Lee et al. 2007). Variants of OsSSIIa (alk) showed significant differences in the gelatinization temperature of the rice grains, which is related to amylopectin structure (Gao et al. 2003). In contrast, the other copy of each duplicated pair is not specifically upregulated in these tissues (fig. 2C). Accordingly, OsAGPS2–OsSSIIa–OsBEIIb forms a dominant pathway of starch biosynthesis in grain filling–stage rice panicles and is composed of one copy of each polyploidy-derived gene pair.
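The coexpression measure used above, the Pearson correlation coefficient between two expression profiles across the 12 tissues, can be computed as in the following minimal sketch. The signal intensities are invented placeholders, not the microarray values of Ma et al. (2005), and the variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(ss_x * ss_y)


# Placeholder signal intensities for 12 tissues (hypothetical values).
profile_a = [120, 150, 4200, 900, 200, 110, 130, 160, 140, 100, 125, 135]
profile_b = [90, 130, 3900, 1100, 180, 95, 120, 150, 110, 105, 115, 100]

print(f"r = {pearson(profile_a, profile_b):.3f}")
```

Because a single tissue with very high signal can dominate the coefficient, a high r is best read together with the profiles themselves, as in fig. 2C.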

Fig. 2. The starch biosynthesis pathway in rice. (A) The Neighbor-Joining trees of the polyploidy-derived duplicated genes involved in starch biosynthesis in rice. (B) The starch biosynthesis pathway in rice. (C) The expression profiles of the polyploidy-derived duplicated genes involved in starch biosynthesis in rice. The numbers on the x axis (1–12) represent rice tissues: roots, shoots, grain filling–stage panicles, heading-stage panicles, seedlings, lodicules, pistils before insemination, pistils 24 h after insemination, lemmas, stamens, glumes, and paleae, respectively. The numbers on the y axis indicate the signal intensity.
We further discovered that the orthologous genes are conserved in maize (fig. 3). ZmAGPS2, ZmSSIIa, and ZmBEIIb (orthologs of OsAGPS2, OsSSIIa, and OsBEIIb) have endosperm-specific expression profiles, and their mutants (brittle-2, mutant of ZmAGPS2; sugary2, mutant of ZmSSIIa; and amylose-extender1, mutant of ZmBEIIb) show significant changes in starch content and properties in maize kernels, indicating that this pathway is vital in endosperm development (Giroux and Hannah 1994; Fisher et al. 1996; Zhang et al. 2004). In addition, brittle-2 and amylose-extender1 were under strong selection during maize domestication and improvement, suggesting that they are important for starch production (Whitt et al. 2002). Together, these observations suggest that the formation of a dominant starch synthesis pathway in endosperm resulted from a WGD, which has contributed the genetic material for the evolution of an important agronomic trait of rice and maize and likely all cereals.

Fig. 3. Phylogenetic trees of starch biosynthesis genes. (A) The Neighbor-Joining tree of ADP-glucose pyrophosphorylases. (B) The Neighbor-Joining tree of starch synthases. (C) The Neighbor-Joining tree of starch-branching enzymes.
Methods
The data set of the Nipponbare proteome included 49,472 gene models (ftp://pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_4.0). We identified polyploidy-derived duplicated genes according to Tian et al. (2005) with some modifications (see Supplement 1, Supplementary Material online). We used the InterProScan database version 11 to annotate gene function in Nipponbare on the whole-genome scale (Zdobnov and Apweiler 2001). The statistical methods for overrepresented and underrepresented functional categories are included in Supplement 2 (Supplementary Material online). Phylogenetic trees were constructed using the Neighbor-Joining method with MEGA version 3.1 (Kumar et al. 2004).
This work was supported by the Chinese Academy of Sciences (grant numbers KSCX2-YW-N-028 and CXTD-S2005-2) and the National Natural Science Foundation of China (grant numbers 30621001 and 30770143). We thank anonymous reviewers for their constructive and critical comments on the manuscript.
Author notes
Franz Lang, Associate Editor