Abstract

Gene duplication is a major force in evolution. Here, we analyzed the fate of duplicated genes following the ancient whole-genome duplication (WGD) in rice. Polyploidy-derived duplicated genes were found to be preferentially lost from one of each pair of duplicated chromosomal segments, suggesting that the asymmetric gene loss may result from transcriptome dominance of the ancestral allotetraploid genome. Genes involved in synthesis and catabolism of saccharides were found to be preferentially retained in rice, reflecting different trajectories of duplicated genes formed by polyploidy between rice and Arabidopsis. Further analysis showed that all 3 catalytic steps in the starch biosynthesis pathway have polyploidy-derived duplicated genes and that one copy at each step forms a dominant pathway in rice at the grain-filling stage. The new starch biosynthesis pathway reflects one aspect of the impact of WGD on grass evolution.

Gene duplication is a major force in evolution and can provide the genetic material necessary for the origin of new genes with novel functions (Ohno 1970). Polyploidy, which duplicates all genes in the genome, is an important source of biological innovation (Wendel 2000). In paleopolyploids, gene loss is the main fate of duplicated genes formed by whole-genome duplication (WGD). In Arabidopsis, only about 32% of duplicated genes have been retained in sister duplicated regions derived from polyploidy (Blanc et al. 2003).

The loss and retention of polyploidy-derived duplicated genes are nonrandom and related to function. In Arabidopsis, genes involved in transcriptional regulation and signal transduction have been preferentially retained and genes involved in DNA repair have been preferentially lost (Blanc and Wolfe 2004; Seoighe and Gehring 2004). Paterson et al. (2006) found that some gene families have convergent fates in independent WGD events, such as enrichments of myb-like and protein kinase families in plants. Moreover, genes were removed preferentially from one homeolog after WGD in Arabidopsis (Thomas et al. 2006). The overretention of transcriptional regulation and signal transduction–related genes in polyploidy was predicted by the gene balance hypothesis (Papp et al. 2003; Birchler and Veitia 2007).

Data suggest a WGD took place approximately 70 MYA, prior to the divergence of grasses from within the monocotyledonous lineage, providing evidence that all grasses are paleopolyploids (Paterson et al. 2004; Yu et al. 2005). However, the evolutionary impact of this WGD has yet to be elucidated. Here, we analyzed the fate of duplicated genes following the ancient WGD in rice.

We identified 1,657 polyploidy-derived gene pairs in the Nipponbare genome (Oryza sativa L. ssp. japonica cv. Nipponbare) (Supplement 1, Supplementary Material online). On duplicated blocks, only 15.4% of genes have been retained as duplicates. There are 8 duplicated chromosomal segments encompassing duplicated blocks formed by the WGD, that is, chr1–5, chr2–4, chr2–6, chr3–7, chr3–10, chr3–12, chr4–8, and chr8–9, and one duplicated segment between chromosomes 11 and 12 formed by segmental duplication (fig. 1). Immediately following the WGD, the two members of each pair of duplicated chromosomal segments should have had the same gene number and size, and they should still be similar today if gene loss was random during the diploidization process. Nevertheless, we found that gene loss differed markedly between homeologous chromosomal segments. The homeologous segments of chr1–5 on chromosome 1, of chr2–6 on chromosome 6, of chr2–4 on chromosome 4, of chr3–10 on chromosome 10, of chr3–12 on chromosome 3, and of chr8–9 on chromosome 9 have retained significantly more genes than their counterparts (table 1). Among the larger duplicated blocks, 58% (50 of 86) showed significant asymmetric gene loss (Supplement 1, Supplementary Material online). For example, for chr1–5, which has 469 polyploidy-derived duplicated pairs, the chromosomal segment on chromosome 1 contains 2,065 genes, 631 more than its homeolog on chromosome 5 (table 1).
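As a rough illustration of the asymmetry reported in table 1, the sketch below compares the single-gene counts of each pair of homeologous segments. The authors' actual test is described in Supplement 1 and is not reproduced here; the two-sided sign test used below (normal approximation to a binomial with p = 0.5) is an assumption made purely for illustration, so its uncorrected P values are not expected to match those in table 1. The function name `asymmetry` is hypothetical.

```python
import math

# Single-gene counts for each pair of homeologous segments, copied from table 1.
# The first count belongs to the first chromosome named in the pair.
SINGLE_GENES = {
    "chr1-5": (2065, 1434),
    "chr2-4": (871, 1142),
    "chr2-6": (1039, 1546),
    "chr3-7": (823, 787),
    "chr3-10": (468, 663),
    "chr3-12": (226, 181),
    "chr4-8": (200, 135),
    "chr8-9": (662, 960),
    "chr11-12": (328, 398),
}


def asymmetry(a, b):
    """Return (difference, z score, two-sided P value) for counts a and b.

    Assumption: under random gene loss, each surviving single gene is equally
    likely to sit on either segment, so a ~ Binomial(a + b, 0.5). A normal
    approximation is used. This is NOT the test from Supplement 1.
    """
    n = a + b
    z = (a - n / 2) / math.sqrt(n * 0.25)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return a - b, z, p


for name, (a, b) in SINGLE_GENES.items():
    diff, z, p = asymmetry(a, b)
    print(f"{name}: difference = {diff:+d}, z = {z:.2f}, uncorrected p = {p:.3g}")
```

For chr1–5 the difference evaluates to 631, the figure quoted in the text.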

Table 1

Asymmetric Gene Loss/Retention of Sister Duplicated Chromosomal Segments

| Duplicated Segments | No. of Retained Gene Pairs | No. of Single Genes | Covered Regions of Duplicated Segments (Mb) | P value (a) |
| --- | --- | --- | --- | --- |
| chr1–5 | 469 | 2,065/1,434 | 18.6/13.6 | 0 |
| chr2–4 | 261 | 871/1,142 | 9.0/10.0 | 1.88 × 10−09 |
| chr2–6 | 292 | 1,039/1,546 | 10.0/14.8 | 0 |
| chr3–7 | 210 | 823/787 | 7.1/7.0 | 0.19153 |
| chr3–10 | 152 | 468/663 | 4.2/6.6 | 6.63 × 10−09 |
| chr3–12 | 42 | 226/181 | 2.4/1.8 | 0.01634 |
| chr4–8 | 41 | 200/135 | 1.8/1.4 | 0.00034 |
| chr8–9 | 190 | 662/960 | 6.8/7.7 | 2.03 × 10−13 |
| chr11–12 | 250 | 328/398 | 4.1/3.9 | 0.00668 |

(a) P value corrected by the Benjamini and Hochberg (1995) method.
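The Benjamini and Hochberg (1995) correction named in the table footnote can be sketched as follows. This is a minimal illustration of the standard step-up adjustment, not the authors' code; the raw P values are hypothetical, because the paper reports only corrected values, and the function name `benjamini_hochberg` is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values never
    # increase as the raw P value decreases (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw P value is multiplied by m/rank, where m is the number of tests and rank is its position in ascending order, and the result is capped by the adjusted value of the next larger P value.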

FIG. 1. The distribution of polyploidy-derived duplicated genes in the rice genome. The duplicated gene pairs were linked by gray lines.

In Arabidopsis, genes were removed preferentially from one duplicated block after WGD (Thomas et al. 2006). But what is the situation at the whole-chromosome level? The duplication relationships between chromosomes look more ordered in rice than in Arabidopsis. In this study, we observed asymmetric gene loss between homeologous chromosomes. One likely explanation is that the transcriptome of one subgenome can be dominant over the other in an allopolyploid. In plant and animal hybrids, the rRNA genes from one parental genome are transcribed, whereas many of those inherited from the other parent are silenced (nucleolar dominance) (Pikaard 1999). In allotetraploid Arabidopsis, progenitor-dependent gene regulation occurs on a genome-wide scale (Wang et al. 2006). The expression patterns of genes from Arabidopsis arenosa (one progenitor of the allotetraploid Arabidopsis) are dominant, whereas genes from the other progenitor, Arabidopsis thaliana, are more often recessive (Wang et al. 2006). Transcriptome dominance has been observed in several other species as well, such as Tragopogon miscellus (Tate et al. 2006) and tetraploid cotton (Adams et al. 2004). Therefore, the asymmetric gene loss may result from transcriptome dominance of the ancestral genome.

We further used Gene Ontology (Ashburner et al. 2000) to classify rice genes into 2,746 functional categories. In all, 54 functional categories were significantly overrepresented and 11 functional categories were significantly underrepresented (Supplement 2, Supplementary Material online). In addition to genes involved in expression regulation and signal transduction, genes related to synthesis and catabolism of saccharides were found to be overrepresented, such as enzymes involved in glycolysis, amylopectin biosynthesis, and trehalose biosynthesis (Supplement 2, Supplementary Material online).
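The statistical procedure for calling a category over- or underrepresented is given in Supplement 2 and is not reproduced here. One common choice for this kind of question is a hypergeometric test, sketched below under that assumption; the counts in the usage example are hypothetical and the function names are illustrative.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n genes are drawn from N genes, K of which are in the category."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_pvalues(k, N, K, n):
    """Return (P(X >= k), P(X <= k)): the over- and underrepresentation tails.

    k: retained duplicates annotated with the category
    N: all annotated genes, K: all genes in the category
    n: all retained duplicates
    """
    lo = max(0, n - (N - K))  # smallest possible overlap
    hi = min(K, n)            # largest possible overlap
    over = sum(hypergeom_pmf(x, N, K, n) for x in range(k, hi + 1))
    under = sum(hypergeom_pmf(x, N, K, n) for x in range(lo, k + 1))
    return over, under


# Hypothetical counts: 1,000 annotated genes, 50 in the category,
# 100 retained duplicates, 12 of them in the category.
over, under = enrichment_pvalues(12, 1000, 50, 100)
print(f"P(overrepresented) = {over:.3g}, P(underrepresented) = {under:.3g}")
```

With 2,746 categories tested, the resulting P values would also need a multiple-testing correction before any category is called significant.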

In higher plants, 3 enzymes have been found to be directly involved in starch biosynthesis: ADP-glucose pyrophosphorylase (EC: 2.7.7.27), starch synthase (EC: 2.4.1.21), and starch-branching enzyme (EC: 2.4.1.18) (fig. 2B) (Smith et al. 1997; James et al. 2003). All 3 enzymes have polyploidy-derived gene pairs in rice (fig. 2A). To further understand the function of polyploidy-derived duplicated genes involved in starch synthesis, we used the microarray data from 93-11 (Oryza sativa L. ssp. indica cv. 93-11) to study expression profiles (Ma et al. 2005). The expression profiles of the duplicates OsAGPS1 and OsAGPS2, OsSSIIa and OsSSIIb, and OsBEIIa and OsBEIIb were divergent from one another (fig. 2C). The expression of one copy in each pair (OsAGPS2, OsSSIIa, and OsBEIIb) was upregulated in grain filling–stage rice panicles. Moreover, OsAGPS2, OsSSIIa, and OsBEIIb are highly coexpressed, as evidenced by the Pearson correlation coefficients (OsAGPS2–OsSSIIa: 0.893; OsSSIIa–OsBEIIb: 0.921; and OsAGPS2–OsBEIIb: 0.987). The mutant of OsAGPS2 (sh-2) displayed a reduction in starch synthesis in the endosperm, manifested as shrunken grains (Lee et al. 2007). Variants of OsSSIIa (alk) showed significant differences in the gelatinization temperature of rice grains, which is related to amylopectin structure (Gao et al. 2003). In contrast, the other copy of each duplicated pair has not been specifically upregulated in these tissues (fig. 2C). Accordingly, OsAGPS2–OsSSIIa–OsBEIIb forms a dominant pathway of starch biosynthesis in grain filling–stage rice panicles and is composed of one copy of each polyploidy-derived gene pair.
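The coexpression measure used above, the Pearson correlation coefficient between two expression profiles, can be sketched as follows. The signal intensities are hypothetical placeholders for the 12 tissues, not values from the microarray data, and the function name `pearson` is illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the summed squared deviations.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical signal intensities across 12 tissues, for illustration only.
gene_a = [120, 150, 2400, 900, 130, 110, 140, 160, 125, 115, 135, 145]
gene_b = [200, 210, 3100, 1300, 190, 205, 220, 230, 215, 195, 225, 240]
print(f"r = {pearson(gene_a, gene_b):.3f}")
```

A coefficient close to 1 indicates that the two genes rise and fall together across tissues, which is the sense in which the three genes are described as coexpressed.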

FIG. 2. The starch biosynthesis pathway in rice. (A) The Neighbor-Joining trees of the polyploidy-derived duplicated genes involved in starch biosynthesis in rice. (B) The starch biosynthesis pathway in rice. (C) The expression profiles of the polyploidy-derived duplicated genes involved in starch biosynthesis in rice. The number on the x axis (1–12) represents rice tissues of roots, shoots, grain filling–stage panicles, heading-stage panicles, seedlings, lodicules, pistils before insemination, pistils 24 h after insemination, lemmas, stamens, glumes, and paleae, respectively. The number on the y axis indicates the signal intensity.

We further discovered that the orthologous genes are conserved in maize (fig. 3). ZmAGPS2–ZmSSIIa–ZmBEIIb (orthologs of OsAGPS2–OsSSIIa–OsBEIIb) have endosperm-specific expression profiles, and their mutants (brittle-2, mutant of ZmAGPS2; sugary2, mutant of ZmSSIIa; and amylose-extender1, mutant of ZmBEIIb) show significant changes in starch content and properties in maize kernels, indicating that this pathway is vital in endosperm development (Giroux and Hannah 1994; Fisher et al. 1996; Zhang et al. 2004). In addition, brittle-2 and amylose-extender1 were under strong selection during maize domestication and improvement, suggesting that they are important for starch production (Whitt et al. 2002). Together, these results suggest that the formation of a dominant starch synthesis pathway in the endosperm resulted from a WGD, which has contributed the genetic material for the evolution of an important agronomic trait of rice and maize and likely all cereals.

FIG. 3. Phylogenetic trees of starch biosynthesis genes. (A) The Neighbor-Joining tree of ADP-glucose pyrophosphorylases. (B) The Neighbor-Joining tree of starch synthases. (C) The Neighbor-Joining tree of starch-branching enzymes.

Methods

The data set of the Nipponbare proteome included 49,472 gene models (ftp://pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_4.0). We identified polyploidy-derived duplicated genes according to Tian et al. (2005) with some modifications (see Supplement 1, Supplementary Material online). We used the InterProScan database version 11 to annotate gene functions in the Nipponbare genome on the whole-genome scale (Zdobnov and Apweiler 2001). The statistical methods for overrepresented and underrepresented functional categories are included in Supplement 2 (Supplementary Material online). Phylogenetic trees were constructed employing the Neighbor-Joining method with MEGA version 3.1 (Kumar et al. 2004).
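For readers unfamiliar with the Neighbor-Joining method, the sketch below shows only its first joining step on a toy distance matrix. It illustrates the general algorithm and is not MEGA's implementation; the taxon names, the distances, and the function name `nj_first_join` are hypothetical.

```python
def nj_first_join(names, d):
    """Return (taxon_i, taxon_j, branch_i, branch_j) for the first NJ join.

    d is a symmetric distance matrix with zeros on the diagonal; at least
    3 taxa are required.
    """
    n = len(names)
    if n < 3:
        raise ValueError("need at least 3 taxa")
    row_sums = [sum(row) for row in d]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: the pair with the smallest Q is joined first.
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from each joined taxon to the new internal node.
    branch_i = d[i][j] / 2 + (row_sums[i] - row_sums[j]) / (2 * (n - 2))
    branch_j = d[i][j] - branch_i
    return names[i], names[j], branch_i, branch_j


# Toy distance matrix for five hypothetical sequences.
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
print(nj_first_join(names, dist))
```

A full tree is obtained by replacing the joined pair with the new node, recomputing distances to it, and repeating until only two nodes remain.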

This work was supported by the Chinese Academy of Sciences (grant numbers KSCX2-YW-N-028 and CXTD-S2005-2) and the National Natural Science Foundation of China (grant numbers 30621001 and 30770143). We thank anonymous reviewers for their constructive and critical comments on the manuscript.

References

Adams KL, Percifield R, Wendel JF. Organ-specific silencing of duplicated genes in a newly synthesized cotton allotetraploid, Genetics, 2004, vol. 168 (pg. 2217-2226)
Ashburner M, Ball CA, Blake JA, et al. (20 co-authors). Gene ontology: tool for the unification of biology, Nat Genet, 2000, vol. 25 (pg. 25-29)
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing, J R Stat Soc Ser B Methodol, 1995, vol. 57 (pg. 289-300)
Birchler JA, Veitia RA. The gene balance hypothesis: from classical genetics to modern genomics, Plant Cell, 2007, vol. 19 (pg. 395-402)
Blanc G, Hokamp K, Wolfe KH. A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome, Genome Res, 2003, vol. 13 (pg. 137-144)
Blanc G, Wolfe KH. Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution, Plant Cell, 2004, vol. 16 (pg. 1679-1691)
Fisher DK, Gao M, Kim KN, Boyer CD, Guiltinan MJ. Allelic analysis of the maize amylose-extender locus suggests that independent genes encode starch-branching enzymes IIa and IIb, Plant Physiol, 1996, vol. 110 (pg. 611-619)
Gao Z, Zeng D, Cui X, Zhou Y, Yan M, Huang D, Li J, Qian Q. Map-based cloning of the ALK gene, which controls the gelatinization temperature of rice, Sci China C Life Sci, 2003, vol. 46 (pg. 661-668)
Giroux MJ, Hannah LC. ADP-glucose pyrophosphorylase in shrunken-2 and brittle-2 mutants of maize, Mol Gen Genet, 1994, vol. 243 (pg. 400-408)
James MG, Denyer K, Myers AM. Starch synthesis in the cereal endosperm, Curr Opin Plant Biol, 2003, vol. 6 (pg. 215-222)
Kumar S, Tamura K, Nei M. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment, Brief Bioinform, 2004, vol. 5 (pg. 150-163)
Lee S-K, Hwang S-K, Han M, et al. (13 co-authors). Identification of the ADP-glucose pyrophosphorylase isoforms essential for starch synthesis in the leaf and seed endosperm of rice (Oryza sativa L.), Plant Mol Biol, 2007, vol. 65 (pg. 531-546)
Ma L, Chen C, Liu X, et al. (20 co-authors). A microarray analysis of the rice transcriptome and its comparison to Arabidopsis, Genome Res, 2005, vol. 15 (pg. 1274-1283)
Ohno S. Evolution by gene duplication, 1970, Berlin, Springer
Papp B, Pal C, Hurst LD. Dosage sensitivity and the evolution of gene families in yeast, Nature, 2003, vol. 424 (pg. 194-197)
Paterson AH, Bowers JE, Chapman BA. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics, Proc Natl Acad Sci USA, 2004, vol. 101 (pg. 9903-9908)
Paterson AH, Chapman BA, Kissinger JC, Bowers JE, Feltus FA, Estill JC. Many gene and domain families have convergent fates following independent whole-genome duplication events in Arabidopsis, Oryza, Saccharomyces and Tetraodon, Trends Genet, 2006, vol. 22 (pg. 597-602)
Pikaard CS. Nucleolar dominance and silencing of transcription, Trends Plant Sci, 1999, vol. 4 (pg. 478-483)
Seoighe C, Gehring C. Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome, Trends Genet, 2004, vol. 20 (pg. 461-464)
Smith AM, Denyer K, Martin C. The synthesis of the starch granule, Annu Rev Plant Physiol Plant Mol Biol, 1997, vol. 48 (pg. 67-87)
Tate JA, Ni Z, Scheen A-C, Koh J, Gilbert CA, Lefkowitz D, Chen ZJ, Soltis PS, Soltis DE. Evolution and expression of homeologous loci in Tragopogon miscellus (Asteraceae), a recent and reciprocally formed allopolyploid, Genetics, 2006, vol. 173 (pg. 1599-1611)
Thomas BC, Pedersen B, Freeling M. Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes, Genome Res, 2006, vol. 16 (pg. 934-946)
Tian C, Xiong Y, Liu T, Sun S, Chen L, Chen M. Evidence for an ancient whole-genome duplication event in rice and other cereals, Acta Genet Sinica, 2005, vol. 32 (pg. 519-527)
Wang J, Tian L, Lee H-S, et al. (11 co-authors). Genomewide nonadditive gene regulation in Arabidopsis allotetraploids, Genetics, 2006, vol. 172 (pg. 507-517)
Wendel JF. Genome evolution in polyploids, Plant Mol Biol, 2000, vol. 42 (pg. 225-249)
Whitt SR, Wilson LM, Tenaillon MI, Gaut BS, Buckler ES. Genetic diversity and selection in the maize starch pathway, Proc Natl Acad Sci USA, 2002, vol. 99 (pg. 12959-12962)
Yu JJ, Wang W, Lin S, et al. (116 co-authors). The genomes of Oryza sativa: a history of duplications, PLoS Biol, 2005, vol. 3 (pg. e38)
Zdobnov EM, Apweiler R. InterProScan – an integration platform for the signature-recognition methods in InterPro, Bioinformatics, 2001, vol. 17 (pg. 847-848)
Zhang X, Colleoni C, Ratushna V, Sirghie-colleoni M, James M, Myers A. Molecular characterization demonstrates that the Zea mays gene sugary2 codes for the starch synthase isoform SSIIa, Plant Mol Biol, 2004, vol. 54 (pg. 865-879)

Author notes

Franz Lang, Associate Editor