Sylvain Forêt, François Seneca, Danielle de Jong, Annette Bieller, Georg Hemmrich, Rene Augustin, David C. Hayward, Eldon E. Ball, Thomas C.G. Bosch, Kiyokazu Agata, Monika Hassel, David J. Miller, Phylogenomics Reveals an Anomalous Distribution of USP Genes in Metazoans, Molecular Biology and Evolution, Volume 28, Issue 1, January 2011, Pages 153–161, https://doi.org/10.1093/molbev/msq183
Abstract
Members of the universal stress protein (USP) family were originally identified in stressed bacteria on the basis of a shared domain, which has since been reported in a phylogenetically diverse range of prokaryotes, fungi, protists, and plants. Although not previously characterized in metazoans, here we report that USP genes are distributed in animal genomes in a unique pattern that reflects frequent independent losses and independent expansions. Multiple USP loci are present in urochordates as well as all Cnidaria and Lophotrochozoa examined, but none were detected in any of the available ecdysozoan or non-urochordate deuterostome genome data. The vast majority of the metazoan USPs are short, single-domain proteins and are phylogenetically distinct from the prokaryotic, plant, protist, and fungal members of the protein family. Whereas most of the metazoan USP genes contain introns, with few exceptions those in the cnidarian Hydra are intronless and cluster together in phylogenetic analyses. Expression patterns were determined for several cnidarian USPs, including two genes belonging to the intronless clade, and these imply diverse functions. The apparent paradox of implied diversity of roles despite high overall levels of sequence (and implied structural) similarity parallels the situation in bacteria. The absence of USP genes in ecdysozoans and most deuterostomes may be a consequence of functional redundancy or specialization in taxon-specific roles.
Introduction
The universal stress protein A (USPA) domain, originally identified in the product of the Escherichia coli uspA gene, is the archetype of what is now a family of prokaryotic and plant proteins defined as COG0589 and Pfam PF00582 (Kvint et al. 2003). Proteins containing the USP domain were originally identified in the context of the bacterial stress response; E. coli contains six such genes (uspA, C–G), which are expressed in response to a wide variety of stress states, including nutrient starvation and exposure to heat, acid, heavy metals, oxidative agents, osmotic stress, antibiotics, and uncouplers of oxidative phosphorylation (Nystrom and Neidhardt 1992, 1993, 1994). In Mycobacterium smegmatis, three USPs are induced in response to oxygen starvation (O'Toole et al. 2003), and in Pseudomonas aeruginosa, USPs are required for anaerobic growth (Schreiber et al. 2006). Although some USPs are clearly involved in bacterial stress responses, mutagenesis studies imply other primary functions as well. In E. coli, UspC and UspE are implicated in cell adhesion as well as the production of flagella; these two proteins decrease adhesion and promote motility, whereas UspF and UspG have the opposite effect (Nachin et al. 2005). In addition to being required for defense against superoxide-generating agents, E. coli UspD functions in intracellular iron homeostasis (Nachin et al. 2005). Both plants and fungi also have proteins containing the USP domain, and some of the plant USP genes are stress induced; for example, tomato ER6 is induced by ethylene (Zegzouti et al. 1999), a plant hormone often associated with stress (Druege 2006).
The USP domain is an alpha-beta-alpha fold (fig. 1), and the USP family belongs to the adenine nucleotide alpha hydrolase superfamily, which also includes the electron transport flavoprotein family, the N-type ATP protein phosphatases and ATP sulfhydrylases. Although the USP fold is associated with ATP binding, not all proteins in the USP family bind ATP. The crystal structures of a number of USPs have been solved, including those from Methanococcus jannaschii (Zarembinski et al. 1998) and Haemophilus influenzae (Sousa and McKay 2001). Although these structural features imply a fundamental distinction between the bacterial UspF/UspG types, which bind ATP, and the UspA type, which does not, the functional significance of this is unclear.

Fig. 1. The USP domain. This alignment includes all the Hydra magnipapillata sequences and the bacterial 1MJH. The two Hydra sequences indicated in red are those lacking introns. The secondary structure is shown above the alignment, and the residues involved in ATP binding are indicated by a “*” under the alignment. The color coding of residues is based on that used in the Jalview implementation of ClustalX (see www.jalview.org/help/help.html) as follows: blue, A, I, L, M, F, W, V, C; red, R, K; green, N, Q, S, T; pink, C; magenta, E, D; orange, G; cyan, H, Y; yellow, P. Thresholds for color coding were optimized to clearly delineate the boundaries of secondary structure features.
Understanding USP function is also complicated by the fact that many of these proteins can form homo- as well as heterodimers (Nachin et al. 2008). Moreover, although most bacterial USPs are small proteins containing only either one (14–15 kD) or two (ca., 30 kD) USP domains, the domain also occurs in multidomain proteins from bacteria, Archaea, and plants (Kvint et al. 2003). In bacteria, the domain occurs in a family of osmosensitive K+ channel histidine kinases, and in both Archaea and bacteria, a family of Na+/K+ antiporters. In plants, the USP domain occurs in a number of serine/threonine kinases.
Although known from the other kingdoms of life, USPs have been thought of as “nonmetazoan” genes and have probably been underreported in expressed sequence tag (EST) data sets as suspected contaminants. However, in the process of characterizing EST data sets for two anthozoan cnidarians, we identified ESTs encoding clear members of the USP family in both the coral Acropora millepora and the sea anemone Nematostella vectensis (Technau et al. 2005). These were the first metazoan USPs to be reported; however, clearly related sequences were also detected in Schistosoma japonicum (Technau et al. 2005). Here, we report the presence of extensive USP gene families in the genomes of a phylogenetically diverse range of animals, including lophotrochozoans, cnidarians, and urochordates. USP genes were detected neither in any of the ecdysozoans for which whole-genome data are available nor in vertebrates or other non-urochordate deuterostomes. The pattern of distribution of USP genes, which implies at least five independent losses and multiple independent expansions during animal evolution, is so far unique. The metazoan USPs grouped together in phylogenetic analyses, consistent with the hypothesis that these represent ancient genes present in the common metazoan ancestor and were not acquired later via lateral gene transfer. Molecular phylogenetics indicates that metazoan USPs have evolved via taxon-specific duplications from a small ancestral repertoire, and expression data for several cnidarian USP genes imply diverse roles. The patchy phylogenetic distribution of USP genes is consistent with the idea of widespread gene loss across the Metazoa and suggests that other molecules may be capable of fulfilling their roles or that these functions are no longer necessary in the species where they are missing. Alternatively, the diversity of these proteins in some clades and their absence in others might indicate that they have evolved to fulfill taxon-specific roles.
Methods
Sequences
Sequences were obtained for a number of eukaryotes, focusing primarily on those with fully sequenced and annotated genomes. Due to the very large number of bacterial USP sequences, only those available from the Protein Data Bank (pdb.org) with a resolved 3D structure were used in our analyses. Scanning for USP domains was carried out using HMMER (Eddy 1998) version 2.3.2 with the USP profiles available on Pfam (version 23.0). The gene models of the predicted proteins containing a USP domain were then inspected to confirm domain structures and intron–exon boundaries. Details of the database versions used and sequence names identified are provided in supplementary table 1, Supplementary Material online. We adhered to the standard nomenclature practice: the name of each USP starts with the first letter of the genus name, followed by a three-letter reduction of the species name, followed by a number. When several protein sequences from the same species were identical, a single representative was used for further analysis, hence the use of some nonsequential identifiers. Genes containing two USP domains were split, with “a” and “b” appended to the first and second domain, respectively (e.g., Sman5a is the first domain of Schistosoma mansoni USP gene number 5).
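The naming convention described above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the function names `usp_name` and `split_domains` are hypothetical, and taking the first three letters of the species name is an assumption that matches identifiers such as Hmag and Nvec but not every identifier used in the paper (e.g., Capi, Cins). The zero-padding width is likewise an assumption, since both padded (Hmag01) and unpadded (Sman5a) numbers appear in the text.

```python
def usp_name(genus: str, species: str, number: int, width: int = 2) -> str:
    """Build an identifier: genus initial + three-letter species reduction + number.

    Assumption: the three-letter reduction is simply the first three letters
    of the species name; the paper does not spell out the exact rule.
    """
    prefix = genus[0].upper() + species[:3].lower()
    return f"{prefix}{number:0{width}d}"


def split_domains(name: str, n_domains: int) -> list[str]:
    """Append 'a', 'b', ... to the gene name, one suffix per USP domain.

    Single-domain genes keep their name unchanged.
    """
    if n_domains <= 1:
        return [name]
    return [name + chr(ord("a") + i) for i in range(n_domains)]


if __name__ == "__main__":
    print(usp_name("Hydra", "magnipapillata", 1))
    print(split_domains(usp_name("Schistosoma", "mansoni", 5, width=1), 2))
```

Under these assumptions, the second call yields the two domain labels for Schistosoma mansoni gene 5, the first of which corresponds to the Sman5a example given in the text.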
Phylogenetics
Sequences were aligned with MAFFT 6.717b (Katoh et al. 2005), using the accurate L-INS-I method. Positions containing over 95% gaps were removed from the alignment. A colored version of the alignment with the intron positions is shown in supplementary figure 7, Supplementary Material online. Maximum likelihood trees were inferred with PhyML 3.0 (Guindon and Gascuel 2003) using the LG amino acid substitution model (Le and Gascuel 2008), with four substitution rate categories approximating a gamma distribution whose rate was estimated and an invariant category. The starting trees were computed using BioNJ, and the topologies were optimized by nearest neighbor interchange and subtree pruning and regrafting. The branch support was estimated using approximate likelihood ratio tests (Shimodaira and Hasegawa 1999) and with the bootstrap procedure, using 100 replicates.
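The gap-filtering step (removal of alignment positions containing over 95% gaps) can be illustrated as follows. This is a minimal sketch, not the tool the authors used, which the text does not name; the function name `drop_gappy_columns` and the choice of `-` and `.` as gap characters are assumptions.

```python
def drop_gappy_columns(alignment: dict[str, str],
                       max_gap_fraction: float = 0.95,
                       gap_chars: str = "-.") -> dict[str, str]:
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction.

    `alignment` maps sequence names to aligned sequences of equal length.
    A column is kept when (gaps / number of sequences) <= max_gap_fraction.
    """
    if not alignment:
        return {}
    names = list(alignment)
    rows = [alignment[name] for name in names]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(rows)
    keep = [
        col for col in range(len(rows[0]))
        if sum(row[col] in gap_chars for row in rows) / n_seqs <= max_gap_fraction
    ]
    return {name: "".join(row[col] for col in keep)
            for name, row in zip(names, rows)}


if __name__ == "__main__":
    # Toy alignment (made up): the third column is all gaps and is dropped.
    toy = {"seq1": "AC-G", "seq2": "A--G", "seq3": "AT-G"}
    print(drop_gappy_columns(toy))
```

With only three sequences, a column exceeds the 95% threshold only when every sequence has a gap there; in a large alignment the same rule also removes columns in which a handful of sequences carry an insertion.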
Phylogenetic trees were also inferred with a Bayesian approach using MrBayes 3.2-cvs (Ronquist and Huelsenbeck 2003) that we modified to incorporate the LG model. The amino acid substitution model was chosen by optimization and converged rapidly to the LG model. The program was run for 100,000,000 generations, sampling every 1,000 generations, using two runs and four chains per run. The heterogeneity in rates was modeled by a gamma distribution with four categories and one invariant category. The criteria for convergence were an average standard deviation of split frequencies lower than 0.05 and potential scale reduction factor for the estimated parameters between 0.995 and 1.005. The first 25% of observations were removed as burn-in. Quartet puzzling was carried out with Tree-Puzzle 5.2 (Schmidt et al. 2002) in likelihood mapping mode.
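The sampling arithmetic and convergence criteria stated above can be made explicit with a short sketch. The function names `retained_samples` and `converged` are hypothetical, the calculation ignores the initial sample at generation 0, and the diagnostic values passed in the usage example are invented for illustration rather than taken from the actual runs.

```python
def retained_samples(generations: int, sample_every: int,
                     burnin_fraction: float) -> tuple[int, int, int]:
    """Return (total samples, burn-in samples, retained samples) for one run."""
    total = generations // sample_every
    burnin = int(total * burnin_fraction)
    return total, burnin, total - burnin


def converged(asdsf: float, psrf_values: list[float],
              asdsf_max: float = 0.05,
              psrf_low: float = 0.995, psrf_high: float = 1.005) -> bool:
    """Apply the two criteria from the text.

    asdsf: average standard deviation of split frequencies between runs.
    psrf_values: potential scale reduction factor for each estimated parameter.
    """
    return asdsf < asdsf_max and all(psrf_low <= p <= psrf_high
                                     for p in psrf_values)


if __name__ == "__main__":
    total, burnin, kept = retained_samples(100_000_000, 1_000, 0.25)
    print(total, burnin, kept)
    # Diagnostic values below are made up for illustration.
    print(converged(asdsf=0.01, psrf_values=[1.000, 1.002]))
```

With the settings reported in the text, each run yields 100,000 samples, of which 25,000 are discarded as burn-in and 75,000 are retained.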
In Situ Hybridization
For assessment of gene expression patterns in Hydra, whole-mount in situ hybridization was carried out as previously described (Augustin et al. 2006).
Embryos, planula larvae, and postmetamorphic specimens of Acropora were fixed as described in Anctil et al. (2007). Prior to the hybridization procedure, the specimens were cleared in xylene for 2 h before being rehydrated to phosphate buffered saline (PBS) containing 0.1% Triton X-100 (PBS-T). Remaining lipids were removed by treating specimens in the RIPA detergent cocktail (Rosen and Beddington 1993), overnight at 4 °C, followed by rinses in PBS-T. Whole-mount hybridization proceeded as described in Kucharski et al. (2000). Hybridization was carried out at 55 °C for 72 h. The templates for runoff transcription of antisense RNA probes were generated from cloned cDNAs by polymerase chain reaction. Control specimens, prehybridized with an excess of unlabeled runoff antisense transcript, failed to show staining, demonstrating the specificity of the observed patterns. Following dehydration and clearing through a graded glycerol series, specimens were mounted in 90% glycerol. Digital images were obtained using a SPOT digital camera mounted on a Wild PhotoMakroskop M400.
Results
The Phylogenetic Distribution of USP Genes
The results of scanning the available whole-genome data using a hidden Markov model for the USP domain are summarized in figure 2, with the numbers of USP genes indicated for representative taxa. USP genes are present in slime molds, fungi, and the choanoflagellate Monosiga, but relatively few loci were detected in each case. In the animal kingdom, the distribution of USPs is patchy—none were detected in the placozoan Trichoplax or any ecdysozoans for which whole-genome sequences are available or in the vast majority of deuterostomes. However, several USPs were detected in urochordates (nine in Ciona intestinalis) as well as in all the cnidarians and lophotrochozoans examined. In these latter cases, the numbers of USP genes were higher than in all the nonmetazoan unikonts (eukaryotes that are either amoeboid or bear a single cilium) examined. Preliminary analyses indicate that sponge genomes also encode USPs, but the publicly available data do not yet permit estimation of the numbers of genes present.

Fig. 2. Phylogenomic distribution of USP genes. For each clade, the name of a representative species is indicated and colored in red if the genome of that species encodes USPs and in black if it does not. The number of USP genes found in each genome is given to the right of the species name. Branches where losses of the entire USP family have occurred are highlighted by red dots. The Cnidaria are highlighted in blue, the Lophotrochozoa in purple, the Ecdysozoa in green, and the Deuterostomia in brown. LUCA, last universal common ancestor; LECA, last eukaryotic common ancestor.
The distribution pattern of USPs within Metazoa requires multiple losses—at least five independent losses have occurred during animal evolution (fig. 2).
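The logic of counting independent losses on a tree can be sketched as follows, assuming a Dollo-style model in which the gene family is present at the root and, once lost, is never regained. This is not the authors' method, which the text does not describe in detail; the function name `count_losses` is hypothetical, and the toy tree with leaves A to F is invented and does not represent the topology of figure 2.

```python
def count_losses(tree, present: set[str]) -> int:
    """Minimum number of losses, assuming presence at the root and no regain.

    `tree` is a nested tuple of leaf names (strings); `present` is the set
    of leaves whose genomes encode the gene family. A whole clade lacking
    the family is explained by a single loss on its stem branch.
    """
    def has_gene(node) -> bool:
        if isinstance(node, str):
            return node in present
        return any(has_gene(child) for child in node)

    def losses(node) -> int:
        if not has_gene(node):
            return 1          # one loss accounts for the entire subtree
        if isinstance(node, str):
            return 0          # leaf that retains the family
        return sum(losses(child) for child in node)

    return losses(tree)


if __name__ == "__main__":
    # Invented toy tree: the family is retained only in leaves A and F.
    toy_tree = ((("A", "B"), "C"), (("D", "E"), "F"))
    print(count_losses(toy_tree, present={"A", "F"}))
```

In the toy example, B and C each require their own loss, whereas the clade (D, E) is explained by a single loss on its stem, which is why absences in many species can correspond to far fewer inferred loss events.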
Use of the OrthoMCL database (Li et al. 2003) allowed the identification of USPs as one of 13 orthologous groups of genes with similar phylogenetic distribution in Metazoa, but most of these are metazoan specific. Adding the further constraint that the domain also be present in the range of nonmetazoan groups in which it is known illustrates the unique nature of the USP distribution pattern; adding the constraint that the cluster be present in a choanoflagellate reduced the number of groups found to five, and adding the requirement for presence in bacteria reduced the number identified to two: USPs and amidohydrolase. Requiring that the group also be present in Archaea (i.e., the real distribution) made the USP cluster unique. This domain therefore has a highly unusual (unique to date) phyletic distribution pattern; it has a very ancient origin and has been lost on many independent occasions in the Metazoa.
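The successive filtering of orthologous groups by phyletic pattern can be illustrated with a small sketch. It does not query OrthoMCL; the function name `filter_groups` is hypothetical, and the presence table is a toy. Only the final steps for USP and amidohydrolase follow the text, whereas `group_X` and `group_Y` are invented placeholders standing in for the other candidate groups.

```python
def filter_groups(profiles: dict[str, set[str]],
                  required: list[str]) -> list[str]:
    """Return (sorted) the groups present in every required taxon."""
    return sorted(group for group, taxa in profiles.items()
                  if all(taxon in taxa for taxon in required))


if __name__ == "__main__":
    metazoan_pattern = {"Cnidaria", "Lophotrochozoa", "Urochordata"}
    # Toy presence table; group_X and group_Y are invented placeholders.
    profiles = {
        "USP": metazoan_pattern | {"Choanoflagellata", "Bacteria", "Archaea"},
        "amidohydrolase": metazoan_pattern | {"Choanoflagellata", "Bacteria"},
        "group_X": metazoan_pattern | {"Choanoflagellata"},
        "group_Y": set(metazoan_pattern),
    }

    # Add one constraint at a time, as described in the text.
    required: list[str] = []
    for taxon in ["Choanoflagellata", "Bacteria", "Archaea"]:
        required.append(taxon)
        print(taxon, filter_groups(profiles, required))
```

Each added constraint can only shrink the candidate list, so the order in which constraints are applied affects the intermediate counts but not the final result.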
General Characteristics of the Metazoan USPs
With very few exceptions, the predicted metazoan USPs are short, single-domain proteins. By contrast, in plants, a substantial proportion (approximately half in Arabidopsis; Kerk et al. 2003) of proteins containing the USP domain also contain a protein kinase domain. The majority of the metazoan USPs are predicted to have the hydrophobic beta 5 region (fig. 1 and supplementary fig. 7, Supplementary Material online) and are thus presumably capable of dimerization. In both animals and fungi, the main exceptions to the single-domain general pattern are genes encoding two (i.e., duplicated) USP domains. Based on the presence of key residues implicated in ATP binding in USPA from Methanocaldococcus jannaschii (1MJH; Zarembinski et al. 1998), it is likely that at least some animal USPs bind ATP (fig. 1 and supplementary fig. 7, Supplementary Material online).
Phylogenetic Analysis
Phylogenetic analysis of the sequences of the full USP complement from a number of species was conducted using maximum likelihood and Bayesian approaches. Figure 3 shows the results of the Bayesian inference of all the animal sequences identified in fully sequenced genomes and in the coral A. millepora; the results of other inference methods are provided as supplementary figures 1–5, Supplementary Material online. Although there are some (mostly minor) disagreements between the trees resulting from the application of different methods and the level of support varies, consistent trends can be identified.

Fig. 3. Phylogenetic analysis of animal USP sequences. The analyses were based on the complete USP complements of the animals with fully sequenced genomes (fig. 1) plus the coral Acropora millepora. The tree shown is the result of Bayesian analysis, with posterior probabilities of the nodes indicated. Keys to identifiers: Amil A. millepora (Cnidaria); Nvec Nematostella vectensis (Cnidaria); Hmag Hydra magnipapillata (Cnidaria); Sman Schistosoma mansoni (Platyhelminthes); Lgig Lottia gigantea (Mollusca); Capi Capitella teleta (Annelida); Hrob Helobdella robusta (Annelida); Cins Ciona intestinalis (Urochordata). The well-resolved clade indicated by the blue background contains all but two of the Hydra USP sequences, and every member of this clade lacks introns. The two Hydra sequences containing introns are indicated as a green branch; these group with two other cnidarian sequences, both of which also contain introns.
The evolution of the eukaryotic USP superfamily has been characterized by many lineage-specific expansions. Most of the land plant sequences grouped together, whereas those from Ostreococcus and Chlamydomonas were found in paraphyletic groups with fungal, slime mold, and ciliate sequences (supplementary figs. 3–Supplementary Data, Supplementary Material online).
With the exception of Nvec08 (see below), the metazoan USPs formed a monophyletic clade regardless of the method of phylogenetic analysis employed. The posterior probability and SH-like likelihood ratio tests (Shimodaira and Hasegawa 1999) strongly support monophyly of the animal USP sequences. Although this node was not well supported in terms of bootstrap support, it was strongly supported by quartet puzzling (supplementary fig. 6, Supplementary Material online) in which 79.1% of 10,000 random quartets favored monophyly of animal USPs.
The N. vectensis sequence Nvec08 is almost certainly a contaminant: it clusters within the prokaryotic clade with high support; its best BLAST hit is a sequence from Flavobacteria bacterium; and the genomic scaffold on which it is located consists of a single short contig containing only one other gene (gi:5496368), which is also intronless and most similar to another F. bacterium sequence (hypothetical protein FBBAL38_06985). Three USP sequences were found in the genome of the choanoflagellate Monosiga brevicollis, but these sequences did not cluster within the animal group.
With the exception of those from S. mansoni, most of the sequences from the lophotrochozoans Lottia gigantea (a gastropod mollusc), Capitella teleta (a polychaete annelid), and Helobdella robusta (a hirudinean annelid) clustered together with moderate support in both maximum likelihood and Bayesian analyses (fig. 3 and supplementary figs. 1–Supplementary Data, Supplementary Material online). Within this clade, two main expansions were consistently identified: one consisting exclusively of sequences from Lottia and the other containing only annelid sequences. Most of the deuterostome (Ciona) sequences also formed a single clade. Many shallow nodes of the tree were well supported by all the methods of analysis. These represent either orthologs between species belonging to the same class or phylum or paralogous expansions. The most striking example of such an expansion is found in Hydra (see below).
Most of the metazoan USP loci are typical eukaryotic genes in that they contain a number of introns, one of which is characteristically at approximately the same position (supplementary fig. 7, Supplementary Material online) in plant and animal genes but is not present in fungal or protist genes. In common with the other cnidarian USP loci, two of the Hydra sequences contain introns (including one at the conserved site), whereas the remaining 22 Hydra USP loci are devoid of them. The fact that these intronless genes form a well-supported monophyletic clade suggests that they are the products of a single retrotransposition event that occurred after the anthozoan/hydrozoan divergence.
Heterogeneity of USP Expression Patterns
Although many of the metazoan USPs are represented in EST data sets, in situ expression data are available in only a few cases. In order to investigate their possible functions, the expression patterns of selected cnidarian USPs were determined by in situ hybridization. Given that the majority of Hydra USP genes are likely to be the result of a retrotransposition event, the expression patterns of representatives of this clade were of particular interest. Each of three Hydra USP genes examined gave a distinct expression pattern in adult polyps (fig. 4A–C); two of the genes for which expression data were obtained (Hmag01 and Hmag10/teba1) are intronless, whereas the third (Hmag05) contains introns. Hmag01 was expressed throughout the body column endoderm (fig. 4A), but no signal was detected in the tentacles. Hmag05, on the other hand, is expressed in a narrow ring of endodermal epithelial cells very close to the basal disk (fig. 4B). In the case of Hmag10/teba1, in situ analyses of whole polyps detected expression of messenger RNA (mRNA) in endodermal cells of the proximal part of tentacles (fig. 4C). This gene, named teba1 (tentacle base 1) because it is the first to be expressed in the tentacle base, is expressed both during bud evagination and during head regeneration.

Fig. 4. Expression patterns of Hydra and Acropora USP genes. (A) Expression of Hmag01 in Hydra magnipapillata. This gene is expressed throughout the trunk endoderm, but transcripts are absent from the tentacle. (B) In H. magnipapillata, Hmag05 is strongly expressed in an endodermal stripe across the base of the polyp. (C) Expression of Hmag10/teba1 in Hydra vulgaris. This gene is strongly expressed in the endoderm at the tentacle base. Note that H. vulgaris and H. magnipapillata are closely related sister taxa (Hemmrich et al. 2007), and the (H. magnipapillata) Hmag10 and (H. vulgaris) teba1 proteins are identical. (D) Expression of the Acropora millepora USP gene Amil10. (D1) This gene is first expressed as the planula is settling in the region that will form the basal plate. (D2) A recently settled polyp viewed from the aboral surface (the side against the substratum) shows a ring of strong expression near the periphery of the base surrounding a zone of weaker expression, which would overlie the forming basal plate. (D3) A slightly older polyp shows the basal expression fading as expression along the protosepta appears (arrows). (D4–D6) Expression along the septa (arrows) in older polyps viewed from the oral (D4, D6) and aboral (D5) sides. Arrowheads in (D6) mark tissue associated with the synapticular ring connecting the septa.
For comparative purposes, the expression pattern of one of the Acropora USP genes was determined (fig. 4D). During the settlement process, Amil10 is expressed in the region that will form the basal plate (fig. 4D1). Following metamorphosis, Amil10 mRNA becomes progressively more restricted in distribution, with expression initially in the calicoblast cells forming the basal plate and later associated with developing mesenteries of the polyp, presumably in the calcifying cells that are producing the septa (fig. 4D2–D6).
Discussion
USP genes are likely to have been present in the genome of Urmetazoa—the common ancestor of all animals—and phylogenetics supports the idea that the pattern of presence and absence of these genes in bilaterians with fully sequenced genomes reflects gene loss rather than lateral gene transfer. The genomes of the urochordates, cnidarians, and lophotrochozoans examined contain 8–26 USP genes, but none are present in the placozoan Trichoplax or in any ecdysozoan or non-urochordate deuterostome. All the predicted metazoan USPs are small, single-domain proteins, whereas many of the plant and some of the bacterial USPs contain additional domains (summarized in Kvint et al. 2003). Many of the USPs in flowering plants also contain a protein kinase (PK) domain; for example, 48 USPs are present in Arabidopsis and 23 of these are the USP/PK type (Kerk et al. 2003). The diverse expression patterns, and implied diversity of roles, of the single-domain Hydra proteins present an apparent paradox in that these proteins are short and very similar throughout their lengths (fig. 1). For example, the highly divergent expression patterns of Hmag01 and Hmag10 (fig. 4A and C) suggest distinct functions, but the proteins have 43% identity and 59% similarity overall. Although these issues were not explored in the corresponding papers, analyses of published microarray data also imply heterogeneous roles for USPs; in corals, different USP genes respond differently both during development (Grasso et al. 2008; Voolstra et al. 2009) and in adults exposed to thermal stress (Desalvo et al. 2008; Seneca F unpublished data). This situation parallels that in bacteria, however, where USPs are implicated in a similarly wide range of processes, including cell adhesion and motility as well as being modulators of stress responses (Nachin et al. 2005, 2008). The USP domain appears to permit a wide range of functions, but these may be redundant. 
We note that despite diverse spatial expression patterns, all three of the Hydra USP genes for which we present expression data are expressed in the endodermal epithelium, a highly potent chemical barrier for protection against intruding microbes (Bosch et al. 2009). Are USPs contributing to this defensive barrier?
Most of the Hydra USP genes lack introns and are likely to be derived from a single retrotransposition event. By contrast with processed pseudogenes, most or all of the intronless Hydra USP genes are transcribed (ESTs have been identified in most cases) and code for proteins; the subset studied are expressed in specific patterns during growth and development. Independent expansions are common in evolution, and there are many known from Hydra—for example, the PPOD family of peroxidases (Thomsen and Bosch 2006) and NLR proteins (Lange C, Hemmrich G, Klostermeier U, Miller DJ, Rahn T, Weiss Y, Bosch T, P. Rosentiel, submitted) have undergone extensive duplication in Hydra. Likely precedents for retrotransposition in Hydra include the HvirAPX1 ascorbate peroxidase (Habetha and Bosch 2005). Although the significance of the expression patterns for two other genes is not clear, one of the Hydra USPs, teba1, is an early marker for tentacle development and its expression precedes any obvious morphological changes (not shown). The tentacle base of hydra is a region in which epithelial cells start to undergo dramatic changes in shape and function (Bode et al. 1986), and it may be that teba1 is involved in this process.
The implication of the data presented here is that Urmetazoa—the common animal ancestor—shared one or a few USP genes with members of the other kingdoms of life. One outstanding question is why USP genes are abundant in the genomes of some animals but poorly represented or absent from others. Although there are many examples of loss of single genes, there are few direct precedents for the kind of distribution reported here, where an ancestral domain is absent from all members of one of the three bilaterian lineages (Ecdysozoa) and the majority of another (Deuterostomia) but with an expanded representation in other animals. The Lophotrochozoa are assumed to have undergone fewer gene losses than have Ecdysozoa (see, e.g., Moroz et al. 2006), so the absence of USP genes from members of the latter superphylum is not surprising. There are examples of domain distribution that follow the expected pattern of greater loss from ecdysozoans. For example, the metazoan RAG1 core and N-terminal domains are both present in at least some lophotrochozoans (Moroz et al. 2006), as well as cnidarians and many deuterostomes, but lacking in the Ecdysozoa (Kapitonov and Jurka 2005). The absence of USP genes from vertebrates and most other deuterostomes is unexpected given that the vertebrate gene complement has undergone relatively few losses during evolution.
Gene loss is ubiquitous, and the evidence suggests that very few genes are indispensable. In a comparison based on insects and vertebrates, more than one third (40%) of ancient orthologous genes were shown to have been lost in at least one of the ten species examined (Wyder et al. 2007), and often genes or whole pathways presumed to be essential are missing: Drosophila manages perfectly well without CpG methylation and Caenorhabditis without either CpG methylation or hedgehog signaling. Conversely, genes initially assumed to be taxon specific often turn out not to be when more whole-genome sequences become available. For example, proteins related to the green fluorescent protein of the jellyfish Aequorea victoria were assumed to be restricted to cnidarians but have recently been identified in copepods (Shagin et al. 2004) and amphioxus (Deheyn et al. 2007), and the perforin domain protein apextrin was first identified as a taxonomically restricted gene specific to echinoderms (Haag et al. 1999) but has subsequently been identified in some (but not all) cnidarians (Miller et al. 2007) and lophotrochozoans (Moroz et al. 2006; Takahashi et al. 2009) as well as deuterostomes. The most likely evolutionary scenario is that one or a few USP genes were present in the urmetazoan genome, and this small ancestral complement has independently undergone expansion in a number of lineages. These expansions potentially enable the genes to acquire a diverse range of functions, many of which may be taxon specific, and the diversity of expression patterns seen in cnidarians is consistent with this. Few expression data are available for other metazoan USPs, however. The Aniseed database (http://aniseed-ibdm.univ-mrs.fr/) includes in situ data for one of the Ciona USPs (gene ID: ci 0100151159), which is expressed at the tadpole stage in the anterior and posterior sensory vesicles, the neck, and the visceral ganglia.
It remains to be seen whether other animal groups display USP expression patterns as diverse as those reported here for Hydra.
Although individual gene losses have occurred in each species examined to date (Foret et al. 2010), cnidarians appear to have maintained much of the ancestral metazoan gene complement (Kortschak et al. 2003; Technau et al. 2005; Putnam et al. 2007) and there are a number of examples of genes and pathways shared by nonmetazoans and cnidarians but present only in a few bilaterians. For example, the enzymes involved in oxylipin biosynthesis, whose products are the jasmonates and volatile compounds resulting in characteristic smells of many fruits and vegetables, are also present in cnidarians, placozoans, and amphioxus but have been lost from all other animals examined to date (Lee et al. 2008). There is also evidence that cnidarians use plant-like signaling molecules such as abscisic acid (Puce et al. 2004).
Comparative analyses such as these are consistent with a genetically complex common ancestor and underscore both the significance of gene loss during evolution and the informative nature of cnidarians in terms of the ancestral gene complement. The emerging picture is that there are no universal rules, only trends to which there are always exceptions. Explanations for the absence of USPs in most deuterostomes and ecdysozoans include that they either may be functionally redundant or may fulfill primarily taxon-specific roles in those organisms that have retained them. Expression data presented here for the coral USP gene Amil10 are consistent with the idea of a taxon-specific role, but comprehensive expression analyses in a range of animals (lophotrochozoans are of particular interest) will be required to address this issue.
Conclusions
One or a few USP genes are likely to have been present in the common metazoan ancestor and, rather than their presence in some lineages being a consequence of lateral gene transfer, their absence from ecdysozoans and most deuterostomes reflects gene loss. Within the animal kingdom, there have been a number of independent lineage-specific expansions of the USP gene family, most clearly in Hydra where 22 of the 24 USP genes originate from a single retrotransposition event. The diversity of observed expression patterns implies a corresponding diversity of roles for the metazoan USPs despite these being short, single-domain proteins with moderate-to-high sequence similarity. This situation parallels that in bacteria. Explanations for the absence of USP genes in ecdysozoans and most deuterostomes include functional redundancy or the genes being recruited primarily to taxon-specific roles in those taxa that have retained them. Alternatively, the losses and expansions in this gene family could simply be the product of a stochastic birth and death process. To test this in a rigorous quantitative framework (De Bie et al. 2006) will require more data on rates of gene loss and gain in Cnidaria and Lophotrochozoa.
The work was supported in Australia by grants from the Australian Research Council (ARC) via the ARC Centre of Excellence for Coral Reef Studies, the ARC Centre for the Molecular Genetics of Development, and the Discovery Grants Program (grant DP1095343), and in Germany by the Deutsche Forschungsgemeinschaft (grant DFG SFB 617-A1) and via the DFG Cluster of Excellence programs “The Future Ocean” and “Inflammation at Interfaces” (to T.C.G.B). D.J.M. also gratefully acknowledges the receipt of a Japan Society for the Promotion of Science short-term visiting fellowship.
Author notes
These authors contributed equally to this work.
Associate editor: Claudia Kappen