Sheng Guo, Junhyong Kim, Molecular Evolution of Drosophila Odorant Receptor Genes, Molecular Biology and Evolution, Volume 24, Issue 5, May 2007, Pages 1198–1207, https://doi.org/10.1093/molbev/msm038
Abstract
A total of 752 odorant receptor (Or) genes, including pseudogenes, were identified in 11 Drosophila species and named after their orthologs in Drosophila melanogaster. The 813 Or genes, including 61 from D. melanogaster, were classified into 59 orthologous groups that are well supported by gene phylogeny. By reconciling with the gene family phylogeny, we estimated the number of gene duplication/loss events and intron gain/loss events in the species phylogeny. We found that these events are particularly frequent in Drosophila grimshawi, Drosophila willistoni, and the obscura group. More than half of the duplicated genes stay as tandem arrays, whose sizes range from 2 to 8. These genes vary in sequence and some likely underwent positive selection, indicating that gene duplication was important for flies to acquire new olfactory functions. We hypothesize that Or genes conferred the basic olfactory repertoire to ancestral flies before the speciation of the Drosophila and Sophophora subgenera about 40 Mya. This repertoire has been largely maintained in the current species, whereas lineage-specific gene duplication seems to have led to additional specialization in some species in response to specific ecological conditions.
Introduction
Animals rely on sensory neurons to detect environmental odors. The external chemical stimuli are transformed into electrical pulses and then relayed and processed by the nervous system. Having evolved in enormously different environments, the olfactory systems diverge greatly in extant species with distinct specificity and sensitivity (Buck 1996; Hildebrand and Shepherd 1997). Because of its anatomical simplicity, the availability of genetic information, and the well-established physiological and behavioral analysis techniques, Drosophila has become an ideal model organism for studying olfaction (de Bruyne et al. 2001). Drosophila has 2 pairs of peripheral olfactory organs. The antennae house about 1,200 olfactory receptor neurons (ORNs), and the maxillary palps host about 120 ORNs (Stocker 1994; Shanbhag 1999, 2000). These ORNs are compartmentalized into sensilla, protruding hair-like structures filled with proteinaceous fluid that bathes the dendrites of ORNs (Xu et al. 2003). Odors are initially received by ORNs, and the signal propagates through ORN axons and reaches glomeruli in the antennal lobe for extensive processing before passing to higher brain regions. Experiments have shown that ORNs possess different response profiles and dynamics (de Bruyne et al. 2001; Elmore et al. 2003) and that such diversity is determined by the odorant receptors residing in the ORNs (Hallem et al. 2004; Kreher et al. 2005).
The odorant receptors in Drosophila melanogaster were identified by Clyne et al. (1999) using a bioinformatics-driven approach (Kim et al. 2000) and independently by others using experimental approaches (Gao and Chess 1999; Vosshall et al. 1999). These receptors have about 400 amino acid residues and belong to the superfamily of G-protein–coupled receptors with 7 transmembrane domains. Recently, it has been suggested that they may adopt an unconventional topology with intracellular N-termini (Benton et al. 2006). Surprisingly, D. melanogaster has a rather small Or gene repertoire with only 61 members discovered so far, of which Or98P is a pseudogene (Robertson et al. 2003), whereas the estimated number of Or genes in vertebrates ranges from about 100 in fish (Ngai et al. 1993) to about 1,000 in mouse and rat (Mombaerts 1999). However, it should be noted that recent evidence suggests the existence of more divergent members that have not been cloned (Carlson J, personal communication). The odorant receptors in D. melanogaster are extremely divergent at the sequence level, with the most divergent pair being about 10% identical in their protein sequences. Nevertheless, the functional role of annotated odorant receptors has been validated using genetic and physiological approaches (Hallem et al. 2004; Kreher et al. 2005). Evidence shows that the odorant receptor determines many properties of ORNs, including spontaneous firing rate, signaling mode, and onset and termination dynamics. Furthermore, each receptor has a characteristic response spectrum to different chemical stimuli.
The original identification of Or genes was enabled by the D. melanogaster genome sequencing project allowing the use of a computational approach to solve a long-standing empirical problem (Kim and Carlson 2002). Subsequent combination of sequence and functional analysis has given us many insights into the fruit fly's olfaction system. The expanded genome projects in the Drosophila genus yielded 12 whole-genome sequences distributed across the genus. In this work, we identified Or genes in 11 Drosophila species and investigated the origin and diversification of Or genes in this genus.
Materials and Methods
Identification of Odorant Receptors
The gene, transcript, and protein sequences of 61 odorant receptors in D. melanogaster were downloaded from FlyBase in August 2005. The pseudogene Or98P has only the gene sequence. The comparative analysis freeze 1 (CAF1) of 12 Drosophila genome assemblies was downloaded from the Lawrence Berkeley National Lab Web site for “Assembly/Alignment/Annotation of 12 related Drosophila species” (referred to as AAA hereafter) in June 2006. Files mapping CAF1 scaffold and chromosome names to GenBank accession versions were downloaded from AAAWiki (http://rana.lbl.gov/drosophila/wiki) in January 2007. All annotations in our data set use GenBank accession numbers. CAF1 includes assembly release 4.3 for D. melanogaster and assemblies for 11 other Drosophila species: Drosophila simulans, Drosophila sechellia, Drosophila yakuba, Drosophila erecta, Drosophila ananassae, Drosophila pseudoobscura, Drosophila persimilis, Drosophila willistoni, Drosophila mojavensis, Drosophila virilis, and Drosophila grimshawi. There were 2 very similar assembly versions for D. pseudoobscura and D. yakuba. We used the nonreconciled versions. Protein sequences of known Or genes were used as TBlastN queries against the genome assemblies to find new Or genes. The exon–intron boundaries of newly identified Or genes were identified based on their alignment to the known ones using the CHAOS + DIALIGN Web server (Brudno et al. 2003). Whenever the alignment was insufficient to recognize splice sites, assistance was sought from the splice site prediction Web server at the Berkeley Drosophila Genome Project. Newly identified receptors served as seeds in the next TBlastN search. An E value cutoff of 0.01 was used in all searches. The iteration was continued until no further Or gene was found. All sequences that were subsequences of other, longer ones were excluded.
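The iterative search described above amounts to a fixed-point loop: every newly found receptor becomes a query until a round yields nothing new. The following is a minimal sketch of that control flow only, not the original pipeline. The `search` argument is a hypothetical callable standing in for one TBlastN run with the E value cutoff already applied, and both function names are illustrative.

```python
from typing import Callable, Iterable, List, Set


def iterative_search(seeds: Iterable[str],
                     search: Callable[[str], Set[str]]) -> Set[str]:
    """Query with every newly found gene until no further gene appears.

    `search` is a hypothetical stand-in for a TBlastN run that returns
    the identifiers of hits passing the E value cutoff.
    """
    found = set(seeds)
    frontier = list(found)
    while frontier:
        query = frontier.pop()
        for hit in search(query):
            if hit not in found:
                found.add(hit)
                frontier.append(hit)  # new hits seed the next round
    return found


def drop_subsequences(seqs: Iterable[str]) -> List[str]:
    """Discard any sequence that is contained in a longer retained one."""
    kept: List[str] = []
    for s in sorted(set(seqs), key=len, reverse=True):
        if not any(s in longer for longer in kept):
            kept.append(s)
    return kept
```

Processing the longest sequences first means each candidate only has to be checked against sequences that are already retained.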
We were unable to get the full-length sequences of some Or genes because either they resided on the scaffold boundaries or parts of them were not sequenced. Such Or genes were tentatively designated as incomplete unless stop codons exist inside their sequences. Or83b in D. simulans initially appeared to be a pseudogene. This is probably caused by a sequencing error because Or83b is known to be indispensable in olfaction and is unlikely to cease function in D. simulans. This gene, though having a stop codon in the 2nd exon, can be aligned perfectly with its orthologs in other species and shows no sign of excessive sequence divergence that is normal for pseudogenes under no selective pressure. Therefore, the sequence was corrected and this gene was treated as a functional gene.
Nomenclature of Odorant Receptors
The 61 Or genes in D. melanogaster were classified into 57 orthologous groups. Four groups each include 2 genes. They are Or19a and Or19b, Or22a and Or22b, Or33a and Or33b, and Or98a and Or98P. Or genes in other genomes were assigned to one of the groups based on sequence similarity and genic structure. The relative genomic/scaffold location was used to assist assignment if ambiguity arose. Two new groups were added because they do not have obvious orthologs in D. melanogaster.
Originally, Or genes in D. melanogaster were named by their positions in the cytological map (DORN Committee 2000). For example, Or22a and Or22b are both in band 22. This scheme was expanded to accommodate the new situation as follows: 1) the original 61 Or genes in D. melanogaster were given the prefix “Dmel”; 2) genes in other species were named after their orthologous gene in D. melanogaster with a 4-letter species prefix. If a D. melanogaster gene has multiple orthologous genes in another species, these copies were distinguished by a hyphen and a number suffix. For example, DsimOr2a is the ortholog to DmelOr2a; DyakOr67a-1 and DyakOr67a-2 are the 2 orthologs to DmelOr67a. 3) The new orthologous groups were named OrN1 and OrN2, preceded by the appropriate species prefix. The annotated sequences can be downloaded at http://kim.bio.upenn.edu/wiki/html/Public/Downloads.htm.
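The naming scheme is regular enough to be built and parsed mechanically. The sketch below is only an illustration of the rules stated above. The function names and the regular expression are assumptions, and the species prefix is assumed to be "D" followed by three lowercase letters, as in the examples.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: species prefix, Or group name, optional "-<copy number>".
NAME_RE = re.compile(r"(D[a-z]{3})(Or[0-9A-Za-z]+)(?:-(\d+))?")


def or_gene_name(species_prefix: str, group: str,
                 copy: Optional[int] = None) -> str:
    """Build a name such as DsimOr2a or DyakOr67a-1."""
    name = f"{species_prefix}{group}"
    return name if copy is None else f"{name}-{copy}"


def parse_or_gene_name(name: str) -> Tuple[str, str, Optional[int]]:
    """Split a name into (species prefix, orthologous group, copy number)."""
    m = NAME_RE.fullmatch(name)
    if m is None:
        raise ValueError(f"not a recognized Or gene name: {name!r}")
    prefix, group, copy = m.groups()
    return prefix, group, int(copy) if copy else None
```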
Phylogenetic and Evolutionary Analysis of Odorant Receptors
The sequences of 730 Or proteins were aligned by MUSCLE (Edgar 2004). Pseudogenes and short incomplete genes were not used. The alignment was filtered so that only columns with less than 10% gaps were retained. Similarly, a sequence was kept only if it had amino acids in more than 90% of the retained columns. As a result, 727 sequences with 338 positions remained after filtration. The filtered alignment was used to compute the pairwise genetic distance by PROTDIST in PHYLIP 3.66 (Felsenstein 2006). The program used the JTT matrix with a gamma distribution of rates among positions. The parameter alpha was set to 1. A Neighbor-Joining (NJ) tree was constructed based on the distance matrix. The phylogeny quality was assessed by 1,000 bootstrap replicates.
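The two filtering rules can be made concrete with a short sketch. It assumes the alignment is held as a dictionary of equal-length strings with "-" as the gap character. The function name and the data layout are illustrative and are not taken from the original analysis.

```python
from typing import Dict


def filter_alignment(aln: Dict[str, str],
                     max_col_gap: float = 0.10,
                     min_seq_cov: float = 0.90,
                     gap: str = "-") -> Dict[str, str]:
    """Apply the column filter first, then the sequence filter."""
    names = list(aln)
    n_seqs = len(names)
    length = len(aln[names[0]])

    # Keep columns whose gap fraction is below the threshold.
    keep = [j for j in range(length)
            if sum(aln[name][j] == gap for name in names) / n_seqs < max_col_gap]

    # Keep sequences with residues in more than min_seq_cov of kept columns.
    filtered = {}
    for name in names:
        row = "".join(aln[name][j] for j in keep)
        if keep and sum(c != gap for c in row) / len(keep) > min_seq_cov:
            filtered[name] = row
    return filtered
```

The default thresholds mirror the 10% and 90% values given in the text. Looser values are useful for trying the function on a toy alignment.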
The nonsynonymous to synonymous substitution ratios (dN/dS) for orthologous groups were computed by “codeml” in PAML (Yang 1997), assuming a homogeneous ratio among lineages. Two pairs of site models in PAML were chosen to test for positive selection using the likelihood ratio test (LRT) and to identify positively selected sites in an orthologous group using both the naive empirical Bayes (NEB) and Bayes empirical Bayes (BEB) estimation methods. The 1st pair of models was M1a (NearlyNeutral) and M2a (PositiveSelection); the 2nd pair was M7 (beta) and M8 (beta&ω). Positively selected sites were mapped to the topologies of the corresponding receptors, which were predicted by the PolyPhobius server using aligned orthologous sequences (Kall et al. 2005).
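As a worked illustration of the LRT, the sketch below compares twice the log-likelihood difference between nested models with a chi-square distribution. It assumes 2 degrees of freedom, the usual convention for both M1a versus M2a and M7 versus M8, for which the chi-square tail probability has the closed form exp(-x/2). The function name is illustrative, and the log-likelihood values in the usage note are made up.

```python
import math
from typing import Tuple


def lrt_df2(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Return (test statistic, p-value) for a nested-model LRT with 2 df.

    For 2 degrees of freedom the chi-square survival function is
    exactly exp(-x / 2), so no statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)
```

For example, hypothetical log-likelihoods of -1000.0 under M1a and -995.0 under M2a give a statistic of 10.0, which would reject the null model at the 1% level.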
Within each orthologous group, gene duplication and loss events were identified using GeneTree (Page and Charleston 1997), assisted by the known species tree and the relative genomic/scaffold positions of the duplicated genes. The alignment of orthologous genes was used to infer intron gains and losses. We simulated Or gene evolution on the species tree using a birth–death model by assuming that 1) the ancestral species had 60 Or genes, 2) these genes evolved independently, and 3) the gene duplication (birth) and loss (death) rates were homogeneous within the species tree. The duplication/loss rates were estimated using the counts of events described above and the estimated molecular tree calibrated by the estimated divergence time since the separation of the 2 subgenera (fig. 1). The simulation was run 1,000,000 times to generate a distribution of the variance of gene numbers in 12 species to assess whether the observed variation in gene numbers in each species is greater or less than expected under a null model of random duplication/loss.
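The null model can be sketched as a linear birth–death process run independently along each branch of the species tree. The code below is a minimal illustration under stated assumptions. The tree encoding `(name, branch_length, children)`, the function names, the use of population variance, and any rates passed in are hypothetical and do not reproduce the estimates used in the study.

```python
import random
from statistics import pvariance
from typing import Dict, List, Optional


def evolve_branch(n: int, birth: float, death: float,
                  duration: float, rng: random.Random) -> int:
    """Gillespie simulation of gene number along one branch."""
    t, per_gene = 0.0, birth + death
    while n > 0 and per_gene > 0:
        t += rng.expovariate(n * per_gene)  # waiting time to next event
        if t > duration:
            break
        n += 1 if rng.random() < birth / per_gene else -1
    return n


def simulate_tree(node, n: int, birth: float, death: float,
                  rng: random.Random,
                  out: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Return gene counts at the leaves of a (name, length, children) tree."""
    out = {} if out is None else out
    name, length, children = node
    n = evolve_branch(n, birth, death, length, rng)
    if not children:
        out[name] = n
    for child in children:
        simulate_tree(child, n, birth, death, rng, out)
    return out


def variance_distribution(tree, n_root: int, birth: float, death: float,
                          replicates: int, seed: int = 0) -> List[float]:
    """Variance of leaf gene counts in each simulated replicate."""
    rng = random.Random(seed)
    return [pvariance(list(simulate_tree(tree, n_root, birth, death, rng).values()))
            for _ in range(replicates)]


def empirical_tail(simulated: List[float], observed: float) -> float:
    """Fraction of simulated variances at least as large as the observed one."""
    return sum(v >= observed for v in simulated) / len(simulated)
```

Comparing the observed variance in gene numbers with this simulated distribution indicates whether the variation is larger or smaller than expected under random duplication and loss.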

Species tree of 12 Drosophila species used in this study. Divergence time is given in Mya and was estimated from a linearized Adh molecular clock (Russo et al. 1995; Powell 1997). Four types of evolutionary events were studied: gene loss (L), gene duplication (D), intron loss (IL), and intron gain (IG). Numbers in brackets count the estimated occurrence.
Results
Identification of Or Genes in Drosophila Genomes and Patterns of Divergence
Based on our search method, we identified 752 Or genes in 11 Drosophila genome assemblies (fig. 1). (Currently, there are 68 D. pseudoobscura Or genes annotated in FlyBase [www.flybase.org]. Our annotations cover all 68 and add 3 more pseudogenes: 1 DpseOr98a and 2 DpseOr65b's.) Together with the 61 known Or genes in D. melanogaster, we have the sequence information of 813 Or genes in the Drosophila genus. Of these genes, 73 are pseudogenes that lost their protein-coding ability because of frameshifts or premature stop codons, as indicated by homologous alignments. Another 19 were tentatively annotated as incomplete because parts of their coding regions were not sequenced and no evidence of coding problems was found in the sequenced portions. AAA recently released gene models for the 12 Drosophila genomes with 523 Or genes, of which 333 genes have annotated protein-coding sequences. These were all included in our data set.
The 813 Or genes were clustered into 59 orthologous groups based on their sequence similarity and their physical location on chromosomes or scaffolds (table 1). An orthologous group may contain multiple inparalogous genes in a species (Sonnhammer and Koonin 2002). For example, group Or22a consists of Or22a and Or22b from D. melanogaster, 2 genes from D. simulans, 2 from D. sechellia, 6 from D. ananassae, and 14 from the other 8 species. Members in the same orthologous group had identical or similar genic structure. The genes usually have the same number of exons, and the lengths of corresponding exons are close to each other. Extra and missing introns were easily recognized from the alignment.
Orthologous group | Dmel | Dsim | Dsec | Dyak | Dere | Dana | Dpse | Dper | Dwil | Dmoj | Dvir | Dgri | MPPI |
Or1a | 1 | 1 | 2(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 61 |
Or2a | 1 | 1(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 31 |
Or7a | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1[1] | 1 | 0 | 0 | 0 | 67 |
Or9a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 42 |
Or10a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 52 |
Or13a | 1 | 1[1] | 1[1] | 1 | 1 | 1 | 1 | 1(1) | 1 | 1 | 1[1] | 1 | 54 |
Or19a | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 31 |
Or22a | 2 | 2 | 2[1] | 1 | 1 | 6[2] | 3[1] | 3[1] | 1 | 1 | 1 | 2 | 43 |
Or22c | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 45 |
Or23a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 3[2] | 37 |
Or24a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 75 |
Or30a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 76 |
Or33a | 2 | 2 | 2 | 2 | 2 | 5 | 3 | 3 | 1 | 0 | 0 | 0 | 47 |
Or33c | 1 | 1[1] | 1[1] | 1 | 1 | 1 | 1 | 2[1] | 1 | 1 | 1 | 1 | 44 |
Or35a | 1 | 1(1) | 1(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 54 |
Or42a | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 2 | 3[1] | 1 | 61 |
Or42b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 11[3] | 68 |
Or43a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 1 | 1 | 1 | 56 |
Or43b | 1 | 2(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 51 |
Or45a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 58 |
Or45b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 65 |
Or46a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 46 |
Or47a | 1 | 2[1] | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 41 |
Or47b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 47 |
Or49a | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 2 | 39 |
Or49b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 66 |
Or56a | 1 | 1(1) | 1[1] | 1 | 1 | 1 | 2 | 2[2] | 1 | 1 | 1 | 2 | 48 |
Or59a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3[1] | 1 | 3 | 4(1)[1] | 48 |
Or59b | 1 | 2(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3[1] | 4(1)[1] | 22 |
Or59c | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 56 |
Or63a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 54 |
Or65a | 1 | 1 | 1[1] | 1 | 1 | 3 | 0 | 0 | 2 | 1 | 1[1] | 0 | 31 |
Or65b | 1 | 1 | 1 | 1 | 1 | 0 | 7[2] | 6[4] | 0 | 0 | 0 | 0 | 79 |
Or65c | 1 | 1 | 1[1] | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 48 |
Or67a | 1 | 3[1] | 2[1] | 2 | 2[1] | 1 | 1 | 0 | 3 | 2 | 2 | 2 | 32 |
Or67b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 53 |
Or67c | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 78 |
Or67d | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 5 | 24 |
Or69a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2[1] | 1 | 1 | 2(1) | 31 |
Or71a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 1 | 1 | 1 | 42 |
Or74a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 49 |
Or82a | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 1 | 1 | 1 | 54 |
Or83a | 1 | 1(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 67 |
Or83b | 1 | 1 | 1 | 1 | 1 | 3(1)[1] | 1 | 1 | 1 | 1 | 1 | 1 | 85 |
Or83c | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 1 | 41 |
Or85a | 1 | 1 | 1 | 1 | 1 | 1[1] | 0 | 0 | 1 | 2 | 0 | 0 | 24 |
Or85b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 0 | 0 | 0 | 1[1] | 53 |
Or85c | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 2 | 1 | 1 | 53 |
Or85d | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 61 |
Or85e | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 53 |
Or85f | 1 | 2(2) | 3(2) | 1 | 1 | 1 | 1 | 1 | 8[2] | 1[1] | 1 | 1 | 39 |
Or88a | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 1 | 1 | 1 | 1 | 1 | 42 |
Or92a | 1 | 1 | 1 | 2(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 36 |
Or94a | 1 | 1(1) | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 51 |
Or94b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1[1] | 1 | 1 | 1 | 1 | 59 |
Or98a | 2[1] | 2[1] | 1 | 2 | 2 | 3 | 4[1] | 4[1] | 8[4] | 3 | 3 | 4(1)[1] | 23 |
Or98b | 1 | 1[1] | 1[1] | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1[1] | 1[1] | 54 |
OrN1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1[1] | 3 | 1 | 1 | 1 | 49 |
OrN2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 6[4] | 2[1] | 48 |
Total | 61[1] | 66(9)[6] | 63(4)[8] | 62(1) | 60[1] | 71(1)[5] | 71[5] | 70(1)[16] | 80[8] | 62[1] | 64[11] | 83(4)[11] |
Note.—Species: Dmel (Drosophila melanogaster), Dsim (Drosophila simulans), Dsec (Drosophila sechellia), Dyak (Drosophila yakuba), Dere (Drosophila erecta), Dana (Drosophila ananassae), Dpse (Drosophila pseudoobscura), Dper (Drosophila persimilis), Dwil (Drosophila willistoni), Dmoj (Drosophila mojavensis), Dvir (Drosophila virilis), and Dgri (Drosophila grimshawi). MPPI: minimal pairwise peptide identity within an orthologous group. Total: the number of pseudogenes is given in the brackets. The number of incompletely sequenced genes is given in the parentheses. Both are included in the total number of genes.
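Each cell in table 1 packs up to three counts into one string, for example `4(1)[1]`. The sketch below shows one way to unpack a cell into total, incomplete, and pseudogene counts following the convention in the note. The function name and the regular expression are illustrative assumptions.

```python
import re
from typing import Tuple

# total, optional "(incomplete)", optional "[pseudogenes]"
CELL_RE = re.compile(r"(\d+)(?:\((\d+)\))?(?:\[(\d+)\])?")


def parse_cell(cell: str) -> Tuple[int, int, int]:
    """Return (total, incomplete, pseudogenes) for one table cell."""
    m = CELL_RE.fullmatch(cell.strip())
    if m is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    total, incomplete, pseudo = (int(g) if g else 0 for g in m.groups())
    return total, incomplete, pseudo


def intact_genes(cell: str) -> int:
    """Genes that are neither pseudogenes nor incompletely sequenced."""
    total, incomplete, pseudo = parse_cell(cell)
    return total - incomplete - pseudo
```

Because both bracketed and parenthesized counts are included in the total, subtracting them gives the number of complete, apparently functional genes in a cell.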
The protein sequences of receptors in a group are also highly similar to each other. The average pairwise peptide identity in an orthologous group ranges from 46% to 91%. The most conserved gene is Or83b. Even DsecOr83b and DgriOr83b, the 2 most divergent members of this group, are 85% identical in their protein sequences. This reflects the vital role of Or83b in olfaction: the receptor forms heterodimers with other receptors and is essential for Drosophila olfaction (Benton et al. 2006). By comparison, the average peptide identity across all odorant receptors is only 15%.
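The identity measures quoted here, and the MPPI column of table 1, can be illustrated with a short sketch. The text does not state how gapped positions were handled, so the code assumes that identity is computed over alignment columns in which both sequences have a residue. The function names are illustrative.

```python
from itertools import combinations
from typing import Sequence


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def min_pairwise_identity(seqs: Sequence[str]) -> float:
    """Minimal pairwise peptide identity (MPPI) within one group."""
    return min(pairwise_identity(a, b) for a, b in combinations(seqs, 2))


def mean_pairwise_identity(seqs: Sequence[str]) -> float:
    """Average pairwise peptide identity within one group."""
    values = [pairwise_identity(a, b) for a, b in combinations(seqs, 2)]
    return sum(values) / len(values)
```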
Previous work reconstructed the phylogeny of Or genes in D. melanogaster with evidence showing that these genes were as ancient as the origin of the arthropods (Robertson et al. 2003). We reconstructed a phylogeny for 727 receptors using NJ method (Saitou and Nei 1987). These receptors include isoforms of Or22a, Or46a, and Or69a created by alternative splicing. The phylogeny is shown in figure 2. The backbone of this phylogeny, that is, the subtree of the most recent common ancestors for orthologous groups, largely resembles the phylogeny of Or genes in D. melanogaster as expected. Local variations exist, but almost all at the branches where bootstrap supports are low. This is most likely caused by 2 factors: our method for reconstructing phylogeny was slightly different from the previous work and our data set is much larger.

The phylogeny of the Or gene family in 12 Drosophila species. Clades with the same color are orthologous genes, identified by the nearby labels. Internal nodes with high bootstrap support (>90%) are marked by solid squares and those with moderate support (>70%) are marked by empty circles. Branches within orthologous groups are not marked because they nearly always have high bootstrap support. The phylogeny is rooted using Or83b. Branch lengths are not proportional to the actual genetic distance.
The phylogeny supports our classification of Or genes into orthologous groups. For 54 orthologous groups, the members cluster exclusively into monophyletic groups. Or65a, Or65b, and Or65c are tandemly duplicated in nearly all species; however, they share higher identity within species than among species. Consequently, they tend to be placed together in the phylogeny. Mixed clades of Or98a and Or85a are also found because these 2 groups share high sequence similarity. When the phylogeny is examined in detail (supplementary document 1, Supplementary Material online), it can be seen that these orthologous groups not only vary in size but also contain many inparalogs, indicating a complicated evolutionary process that includes gene duplication, gene loss, and pseudogenization.
Evolution of Genomic Location for Or Genes
All Or genes in D. melanogaster were anchored and oriented onto chromosomes by Robertson et al. (2003). These genes are broadly dispersed over 5 chromosome arms (also called Muller elements), namely X, 2L, 2R, 3L, and 3R, reflecting the old age of this gene family. In our study, we mapped Or genes onto 3 additional genomes: D. simulans, D. yakuba, and D. pseudoobscura. The scaffolds in the other 8 assemblies had not been placed onto chromosomes, and we were not able to place them. The location and orientation of Or genes in these 3 genomes were compared with those of their orthologs in D. melanogaster (fig. 3). We then inferred 3 types of genomic rearrangements: paracentric inversion (inversion confined to one arm of a chromosome), pericentric inversion (inversion spanning the centromere), and translocation (Powell 1997).

Comparison of genomic location and orientation of Or genes in Drosophila melanogaster and 3 other species. (A) D. melanogaster versus Drosophila simulans. (B) D. melanogaster versus Drosophila yakuba. (C) D. melanogaster versus Drosophila pseudoobscura. Dotted boxes are the 5 Muller elements. Each Muller element is normalized to have unit length. The x-y coordinates of an Or gene are its relative positions in the corresponding Muller element. Orthologous Or genes with different orientations are represented by solid squares, whereas those with the same orientation are represented by open circles. Labeled Or genes outside the boxes are translocations between Muller elements. Translocations and inversions within a Muller element can be identified by the off-diagonal squares and circles. For example, a paracentric inversion is obvious in chromosome 3R in D. simulans. In D. pseudoobscura, there are several inparalogs of Or67b and Or98a; only one of each is represented in the graph.
As expected, we saw more rearrangement events in pairs of more distantly related species. Translocation across the 5 Muller elements was observed in D. yakuba and D. pseudoobscura, using D. melanogaster as reference. We were able to infer the rearrangement events in D. simulans and D. yakuba by assuming the most parsimonious edit path between the compared genomes (table 2). D. pseudoobscura showed a much larger extent of genomic rearrangement, which made it difficult to count and categorize individual events; this species was therefore not used for the analysis.
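The comparison of location and orientation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Locus` class, the `compare` function, and all coordinates and arm lengths are hypothetical, and the sketch reports only the normalized dot-plot coordinates and two flags. It does not attempt the parsimony inference of rearrangement events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    element: str  # Muller element (chromosome arm) label, e.g. "3R"
    start: int    # start coordinate on that element, in bp
    strand: str   # "+" or "-"


def relative_position(locus, element_lengths):
    """Position of a gene on its Muller element, normalized to unit length."""
    return locus.start / element_lengths[locus.element]


def compare(ref, other, ref_lengths, other_lengths):
    """Dot-plot coordinates and flags for one pair of orthologous genes."""
    return {
        "x": relative_position(ref, ref_lengths),
        "y": relative_position(other, other_lengths),
        "same_element": ref.element == other.element,     # False: moved between elements
        "same_orientation": ref.strand == other.strand,   # False: inverted orientation
    }


# Hypothetical coordinates and arm lengths, for illustration only.
point = compare(
    Locus("3R", 2_000_000, "+"),
    Locus("3R", 6_000_000, "-"),
    ref_lengths={"3R": 10_000_000},
    other_lengths={"3R": 8_000_000},
)
print(point)
```

A pair with `same_element` true, `same_orientation` false, and an off-diagonal position is the pattern expected for a gene inside an inversion confined to one arm.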
Genome Shuffling Changed Or Gene Location and Orientation Using D. melanogaster as Reference
Species | Event | Chromosome Arm | Or Genes
D. simulans | Paracentric inversion | 3R | Or92a, Or88a, Or85b, Or85c, Or85d, Or85e
D. yakuba | Paracentric translocation | 2L | Or30a
D. yakuba | Paracentric inversion | 2L | Or35a
D. yakuba | Paracentric inversion | 2R | Or49a, Or49b
D. yakuba | Paracentric inversion | 2R | Or56a
D. yakuba | Paracentric translocation + inversion | 3L | Or63a
D. yakuba | Paracentric inversion | 3L | Or74a
D. yakuba | Paracentric translocation + inversion | 3L | Or67a, Or67b
D. yakuba | Paracentric translocation | 3R | Or92a
D. yakuba | Paracentric inversion | 3R | Or88a
D. yakuba | Paracentric translocation | 3R | Or83c
D. yakuba | Paracentric inversion | 3R | Or98a, Or98b
D. yakuba | Paracentric inversion + translocation | X | Or2a
D. yakuba | Paracentric inversion + translocation | X | Or7a
D. yakuba | Paracentric translocation | X | Or13a
D. yakuba | Paracentric inversion + translocation | X | Or19a
D. yakuba | Paracentric inversion + translocation | X | Or10a
D. yakuba | Pericentric translocation | 2R → 2L | Or43a, Or43b, Or45a, Or45b, Or46a
D. yakuba | Pericentric translocation + inversion | 2R → 2L | Or42b
We observed far more chromosomal reshuffling in D. yakuba than in D. simulans. In D. simulans, the only visible rearrangement is the large inversion of chromosome 3R, which occurred at the breakpoints 84F1 and 93F6–7 (Ashburner and Lemeunier 1975). In D. yakuba, we observed all 3 types of rearrangement events. Most of these events were simple inversions or translocations. Some are more complicated, as illustrated by the location and orientation of 5 Or genes on chromosome X of D. melanogaster and D. yakuba. These 5 genes are Or7a, Or9a, Or10a, Or13a, and Or19a (supplementary fig. 5, Supplementary Material online). The most parsimonious explanation for the arrangement of these 5 genes involves 2 inversions and 2 translocations. In another significant translocation in D. yakuba, a segment of at least 3 Mb moved from chromosome 2R to 2L, relocating 6 Or genes: Or42b, Or43a, Or43b, Or45a, Or45b, and Or46a. Further reshuffling occurred after the relocation, as can be seen from the location of Or35a (supplementary fig. 1, Supplementary Material online).
Gene Duplication, Gene Loss, and Evolution of Genic Structure
Gene duplication is the major way for species to acquire new functions (Ohno 1970) and was very common in the Drosophila Or gene family. By reconciling the species phylogeny with the gene phylogeny and by using the genomic/scaffold location of Or genes, we estimated the gene duplication/loss events and intron loss/gain events that are summarized in figure 1. We note that although duplication events are likely to be estimated more reliably, the estimates of gene loss are subject to the confounding effect of sequence divergence, which can cause orthologs to be missed. The prevalence of pseudogenes and the ambiguity of classifying some Or genes added to the difficulty of reconstructing the gene phylogeny and inferring duplication events.
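One common way to reconcile a gene tree with a species tree is to map every gene-tree node to the smallest species-tree clade containing its species, and to call a node a duplication when it maps to the same clade as one of its children. The Python sketch below illustrates that idea only; the text does not specify which reconciliation procedure was used, and the tree encoding (nested tuples), the function names, and the gene names `OrX` are hypothetical.

```python
def leaf_set(tree):
    """All leaf labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def lca_clade(species_tree, species):
    """Leaf set of the smallest species-tree clade containing `species`."""
    node = species_tree
    while not isinstance(node, str):
        for child in node:
            if species <= leaf_set(child):
                node = child
                break
        else:  # no single child contains all species: this node is the LCA
            return leaf_set(node)
    return leaf_set(node)


def species_of(gene_name):
    """Assumes gene names begin with a 4-letter species code, e.g. 'Dmel'."""
    return gene_name[:4]


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes inferred as duplications by LCA mapping."""
    def visit(node):
        if isinstance(node, str):
            return frozenset([species_of(node)]), 0
        results = [visit(child) for child in node]
        covered = frozenset().union(*(clade for clade, _ in results))
        mapped = lca_clade(species_tree, covered)
        dups = sum(d for _, d in results)
        if any(clade == mapped for clade, _ in results):
            dups += 1  # node maps to the same clade as a child: duplication
        return mapped, dups
    return visit(gene_tree)[1]


species_tree = (("Dmel", "Dsim"), "Dyak")
# Hypothetical gene tree with two D. melanogaster copies grouped together.
gene_tree = ((("DmelOrX-1", "DmelOrX-2"), "DsimOrX"), "DyakOrX")
print(count_duplications(gene_tree, species_tree))
```

In this toy case the only duplication is the node joining the two D. melanogaster copies; losses would be inferred in a second step from clades expected under the mapping but absent from the gene tree.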
We observed that the distribution of receptor gain/loss events varies greatly in the species tree. The Hawaiian fly D. grimshawi seems to have undergone the most dramatic gene duplication and loss events. D. willistoni, D. ananassae, and the 2 flies in the obscura group also had high numbers of gene duplication and loss (fig. 1). In particular, D. persimilis, the sister species of D. pseudoobscura, suffered 13 gene loss events. This is surprising given the relatively recent speciation of these 2 species. In terms of gene copy numbers, the Or gene family is most stable in D. erecta and D. yakuba.
The genomic organization and sequence information of duplicated genes can give us insights into evolutionary dynamics. Of the 240 duplicated genes, about 60% exist in tandem arrays, some of which are rather large. The longest tandem array is in D. willistoni, with a battery of 8 Or85f genes, of which the 3rd and the 8th members are pseudogenes. In D. grimshawi, 11 copies of Or42b are separated into 3 tandem arrays of size 7, 2, and 2, respectively. Each of these 3 tandem clusters contains 1 pseudogene. The 8 putatively functional Or42b genes encode receptors that share 87% protein sequence identity. These receptors are very likely to be functionally divergent (see Discussion). Other evolutionary events, including pseudogenization and alternative splicing, occurred within some duplicated genes. An example is shown in figure 4 for DanaOr22a in D. ananassae. In this tandem array of 6 genes, the 1st and the 5th genes are pseudogenes, the 2nd shows a large change in the length of the 3rd intron, and there is evidence of alternative splicing in the 4th gene.

The tandem array of 6 Or22a genes in Drosophila ananassae. The rectangles represent exons; those in pseudogenes are shown as open rectangles. Arrows indicate the gene orientation. A functional DanaOr22a gene has 4 exons. Alternative splicing was found for DanaOr22a-4, which has 2 forms of the 1st exon, labeled “A” and “B.” The 2 first-exon isoforms both encode 42-amino acid segments that differ at 16 positions.
Or genes also differ in intron number, and it has been speculated that the ancestral Or gene contained 3 introns. Subsequent gains and losses of introns have created Or genes with different genic structures, in which the intron number varies from 1 to 9 (Robertson et al. 2003). We observed that the genic structure is preserved in most orthologous Or genes, and we found only 8 intron gain events and 11 intron loss events affecting 13 orthologous groups. Most orthologous groups incurred either no or a single intron gain/loss event, and only 3 groups were affected by multiple events. Or13a lost 4 introns in the obscura group; Or94a gained 1 intron in D. grimshawi and lost 1 in the melanogaster subgroup. In Or49b, all 3 species in the Drosophila subgenus have 5 introns, whereas the 2nd and 4th introns are absent in the 9 species of the Sophophora subgenus. In addition, DwilOr49b lost the 3rd intron. Therefore, either 2 intron gain events occurred in the Drosophila lineage or 2 intron loss events occurred in the Sophophora lineage. A 3rd intron loss event subsequently occurred in the D. willistoni group. The maximal number of introns in an Or gene is still 9, whereas the minimal number is 0: DwilOr2a is an intronless gene. In contrast to the stability of intron number, the size of corresponding introns in orthologous genes can differ severalfold, and intron size does not seem to correlate with genome size.
Selective Forces on Or Genes
Ors possess unique functional profiles, including different spontaneous firing rates, signal models, and response spectra to odorants (Hallem et al. 2004). We expect these molecules to be under stabilizing selection to maintain their signaling properties and, simultaneously, under diversifying selection to develop response profiles specific to the ecological context of each species. Here, we used sequence-based analysis to investigate purifying and positive selection on Or genes. The results are shown in table 3, which gives the dN/dS ratio and the P value from the likelihood ratio test (LRT) of M7 versus M8, along with the P value corrected for multiple tests by the false discovery rate (FDR) method (Benjamini and Hochberg 1995). The tests contrasting the simpler homogeneous models M1a against M2a resulted in a corrected P value of 1 for nearly all receptors, suggesting a lack of power, and we do not report those results here.
Gene | dN/dS | P Value | Corrected P Value |
Or1a | 0.1636 | 0.02946 | 0.1580 |
Or2a | 0.1263 | 0.48554 | 1.0000 |
Or7a | 0.1198 | 0.00722 | 0.0532 |
Or9a | 0.1122 | 0.00129 | 0.0178* |
Or10a | 0.2243 | 0.00294 | 0.0248* |
Or13a | 0.1380 | 1.00000 | 1.0000 |
Or19a | 0.3455 | 0.00114 | 0.0178* |
Or22a | 0.2137 | 1.00000 | 1.0000 |
Or22c | 0.1946 | 0.00982 | 0.0644 |
Or23a | 0.1871 | 1.00000 | 1.0000 |
Or24a | 0.1013 | 0.12435 | 0.3914 |
Or30a | 0.0912 | 0.11275 | 0.3914 |
Or33a | 0.1749 | 0.90791 | 1.0000 |
Or33c | 0.2441 | 0.63464 | 1.0000 |
Or35a | 0.1154 | 1.00000 | 1.0000 |
Or42a | 0.1262 | 0.12955 | 0.3914 |
Or42b | 0.1177 | 0.33204 | 0.7836 |
Or43a | 0.1942 | 0.00181 | 0.0178* |
Or43b | 0.1232 | 1.00000 | 1.0000 |
Or45a | 0.1537 | 0.46300 | 1.0000 |
Or45b | 0.1232 | 0.93833 | 1.0000 |
Or46a | 0.1688 | 0.12573 | 0.3914 |
Or47a | 0.0615 | 1.00000 | 1.0000 |
Or47b | 0.1945 | 0.20904 | 0.5606 |
Or49a | 0.2324 | 0.20512 | 0.5606 |
Or49b | 0.1171 | 0.94990 | 1.0000 |
Or56a | 0.1628 | 0.00020 | 0.0059* |
Or59a | 0.1412 | 1.00000 | 1.0000 |
Or59b | 0.1041 | 1.00000 | 1.0000 |
Or59c | 0.2031 | 0.05914 | 0.2492 |
Or63a | 0.1120 | 0.03505 | 0.1591 |
Or65a | 0.2306 | 0.74630 | 1.0000 |
Or65b | 0.2192 | 0.52647 | 1.0000 |
Or65c | 0.2774 | 0.40529 | 0.9197 |
Or67a | 0.2444 | 1.00000 | 1.0000 |
Or67b | 0.0878 | 1.00000 | 1.0000 |
Or67c | 0.0672 | 1.00000 | 1.0000 |
Or67d | 0.1290 | 0.99991 | 1.0000 |
Or69a | 0.2283 | 0.01290 | 0.0761 |
Or71a | 0.1862 | 1.00000 | 1.0000 |
Or74a | 0.1516 | 0.03506 | 0.1591 |
Or82a | 0.1481 | 1.00000 | 1.0000 |
Or83a | 0.1054 | 1.00000 | 1.0000 |
Or83b | 0.0349 | 1.00000 | 1.0000 |
Or83c | 0.1832 | 1.00000 | 1.0000 |
Or85a | 0.1083 | 1.00000 | 1.0000 |
Or85b | 0.1487 | 1.00000 | 1.0000 |
Or85c | 0.1552 | 1.00000 | 1.0000 |
Or85d | 0.1455 | 0.22201 | 0.5695 |
Or85e | 0.1624 | 0.58634 | 1.0000 |
Or85f | 0.1959 | 0.24938 | 0.6131 |
Or88a | 0.1682 | 0.93885 | 1.0000 |
Or92a | 0.0582 | 0.13267 | 0.3914 |
Or94a | 0.1330 | 0.07040 | 0.2769 |
Or94b | 0.1190 | 0.70799 | 1.0000 |
Or98a | 0.2456 | 1.00000 | 1.0000 |
Or98b | 0.2269 | 0.97730 | 1.0000 |
OrN1 | 2.3295 | 0.00003 | 0.0018* |
OrN2 | 0.3492 | 0.00165 | 0.0178* |
*Corrected P value below 0.05.
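The corrected P values in table 3 are consistent with the step-up adjustment of Benjamini and Hochberg (1995) applied to 59 tests. The Python sketch below is an illustration rather than the authors' code: the function name `bh_adjust` and its `n_tests` parameter are hypothetical. It takes only the 10 smallest P values from table 3 and sets the total number of tests to 59; the 49 larger P values are omitted, which does not change these 10 adjusted values in this case.

```python
def bh_adjust(pvalues, n_tests=None):
    """Benjamini-Hochberg adjusted P values, returned in the input order.

    n_tests is the total number of tests; it defaults to len(pvalues).
    """
    m = n_tests if n_tests is not None else len(pvalues)
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    adjusted = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# The 10 smallest uncorrected P values from table 3.
table3_smallest = {
    "OrN1": 0.00003, "Or56a": 0.00020, "Or19a": 0.00114, "Or9a": 0.00129,
    "OrN2": 0.00165, "Or43a": 0.00181, "Or10a": 0.00294, "Or7a": 0.00722,
    "Or22c": 0.00982, "Or69a": 0.01290,
}
adjusted = dict(zip(table3_smallest,
                    bh_adjust(list(table3_smallest.values()), n_tests=59)))
significant = sorted(gene for gene, q in adjusted.items() if q < 0.05)
for gene, q in adjusted.items():
    print(f"{gene}\t{q:.4f}")
```

Because the adjustment takes a running minimum from the largest P value downward, Or19a, Or9a, OrN2, and Or43a all receive the same corrected value, as in the table.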
Or83b has the smallest dN/dS ratio of 0.0349, suggesting the effect of very strong purifying selection. Or83b is also the most conserved receptor with an average sequence identity of 92%. This receptor forms heterodimers with other receptors (Benton et al. 2006), and it seems that the unique and indispensable role of this receptor in olfaction exerted the most stringent pressure on its evolution. With the exception of OrN1, a putative Or gene newly identified in this study, all Or genes have low dN/dS values with the highest being 0.3492 for OrN2 and 0.3455 for Or19a.
However, dN/dS ratio could be misleading if only a small proportion of sites are under positive selection. To this end, we used LRT to detect positive selection by employing 2 pairs of site models in PAML, M1a versus M2a, and M7 versus M8. As mentioned, the comparison of M1a to M2a did not reveal any orthologous group under positive selection. The test using M7 and M8, which allows for beta-distributed site-specific dN/dS ratio, detected 7 groups under possible positive selection at 0.05 significance level with corrections for multiple tests using FDR. They are Or9a, Or10a, Or19a, Or43a, Or56a, OrN1, and OrN2. We also used the NEB and BEB estimation methods in model M8 (Zhang et al. 2005) to identify sites under positive selection. We found only 10 such sites at 0.05 significance level. These findings are consistent with the observation that broad purifying selection acts on olfactory processes and the orthologous genes probably have the same or similar functional properties.
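For a single orthologous group, the LRT compares twice the difference in log-likelihood between the nested models with a chi-square distribution. The Python sketch below assumes 2 degrees of freedom for M7 versus M8, because M8 adds 2 parameters; the text does not state the degrees of freedom used. With 2 degrees of freedom the chi-square tail probability has the closed form exp(−x/2), so no statistics library is needed. The function name and the log-likelihood values are hypothetical.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """LRT statistic and P value against a chi-square with 2 degrees of freedom.

    lnl_null: maximized log-likelihood under the null model (e.g., M7)
    lnl_alt:  maximized log-likelihood under the alternative model (e.g., M8)
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against optimizer noise
    p_value = math.exp(-statistic / 2.0)              # chi-square survival function, df = 2
    return statistic, p_value


# Hypothetical log-likelihoods, not values from this study.
statistic, p_value = lrt_pvalue_df2(lnl_null=-5000.0, lnl_alt=-4990.0)
print(f"2*delta(lnL) = {statistic:.2f}, P = {p_value:.2e}")
```

The resulting uncorrected P values, one per orthologous group, are what a multiple-testing correction such as FDR would then take as input.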
In the above analysis, we had to assume a time-homogeneous dN/dS ratio for the orthologous gene phylogeny and lineage-specific evolution may be obscured. To account for this, we examined the pairwise dN/dS ratio within orthologous groups. We found that 94% of pairs have small dN/dS ratio (<0.2), whereas 34 pairs have dN/dS greater than 0.5 of which 30 are inparalog pairs. Thus, we found support for the hypothesis that duplicated receptors within a given genome experienced directional selection for novel function. Three pairs show high dN/dS ratio greater than 1: 3.24 for DgriOr42b-1 versus DgriOr42b-7, 2.05 for DgriOr42b-5 versus DgriOr42b-6, and 1.32 for DgriOr42b-7 versus DgriOr42b-8. It is interesting to note that Or42b and Or67d genes in D. grimshawi generally have relatively high dN/dS ratios. All 28 pairs between the inparalogous DgriOr42b have dN/dS greater than 0.3, and 20 pairs have dN/dS ratio greater than 0.5. In DgriOr67d, the 5 inparalogous genes form 2 tandem arrays. We found 4 pairs of DgriOr67d have dN/dS greater than 0.7. Sporadic high dN/dS ratios were also spotted in Or67a, Or22a, Or33a, Or47b, Or49a, Or59a, and Or59c.
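The tallying step described above can be sketched in Python as follows. The 3 ratios above 1 are taken from the text; the fourth entry is a hypothetical between-species pair added for contrast. The sketch assumes that the species can be read from the 4-letter prefix of each gene name, so that a within-group pair from the same species is an inparalog pair. The function names are hypothetical.

```python
from math import comb


def species_of(gene_name):
    """Assumes gene names begin with a 4-letter species code, e.g. 'Dgri'."""
    return gene_name[:4]


def summarize_pairs(pairwise_omega, threshold):
    """Count pairs with dN/dS above threshold and how many are inparalog pairs."""
    high = {pair: w for pair, w in pairwise_omega.items() if w > threshold}
    n_inparalog = sum(species_of(a) == species_of(b) for a, b in high)
    return len(high), n_inparalog


pairwise_omega = {
    ("DgriOr42b-1", "DgriOr42b-7"): 3.24,
    ("DgriOr42b-5", "DgriOr42b-6"): 2.05,
    ("DgriOr42b-7", "DgriOr42b-8"): 1.32,
    ("DmelOr42b", "DgriOr42b-1"): 0.12,  # hypothetical value for illustration
}
n_high, n_inparalog = summarize_pairs(pairwise_omega, threshold=0.5)

# 8 putatively functional DgriOr42b copies give C(8, 2) pairwise comparisons.
n_possible_pairs = comb(8, 2)
print(n_high, n_inparalog, n_possible_pairs)
```

The count of 28 pairs among the DgriOr42b inparalogs matches the number of unordered pairs that can be formed from 8 genes.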
Discussion
Odorant receptors in flies long eluded investigation until their identification in D. melanogaster. In this study, we extended the effort to 11 additional Drosophila species. We found that the number of Or genes differs moderately among the genomes. As shown in table 1, the number of putatively functional genes in each genome varies from 53 to 72, with a median of 60. We tested whether this variability is more or less than expected under a model of random gene gain and loss over the species phylogeny by estimating the gain/loss rates and simulating the process over the tree. The gain rate was estimated to be 0.370/Myr, and the loss rate was estimated to be 0.284/Myr. Using the variance of gene numbers as our statistic and 1,000,000 repeated simulations as the reference distribution, we found that the variation in Or gene numbers was significantly high (P = 0.04). Thus, the number of Or genes in each genome varies more than expected by chance; in fact, much of the variation can be attributed to inparalogs (genome-specific duplications). This result is consistent with the idea that receptor gain/loss might be functionally related to each species' chemical niche.
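The logic of this simulation test can be illustrated with a heavily simplified Python sketch. It is not the procedure used in this study, and it makes several assumptions that the text does not confirm: the rates are treated as constant per-genome rates, the 12 lineages evolve independently from a common ancestor (a star phylogeny instead of the real species tree), the ancestor carries 60 genes, and each lineage runs for 40 Myr. All function names are hypothetical.

```python
import random
import statistics


def simulate_gene_count(start, gain_rate, loss_rate, duration, rng):
    """Simulate one lineage with constant gain and loss rates (events per Myr)."""
    count, time = start, 0.0
    total_rate = gain_rate + loss_rate
    while True:
        time += rng.expovariate(total_rate)  # waiting time to the next event
        if time > duration:
            return count
        if rng.random() < gain_rate / total_rate:
            count += 1
        elif count > 0:
            count -= 1


def variance_null(n_species, n_reps, seed=0, start=60,
                  gain_rate=0.370, loss_rate=0.284, duration=40.0):
    """Null distribution of the variance in gene number across species.

    Assumption: independent lineages (star phylogeny), not the real tree.
    """
    rng = random.Random(seed)
    null = []
    for _ in range(n_reps):
        counts = [simulate_gene_count(start, gain_rate, loss_rate, duration, rng)
                  for _ in range(n_species)]
        null.append(statistics.pvariance(counts))
    return null


def empirical_p(observed, simulated):
    """Fraction of simulated statistics at least as large as the observed one."""
    return sum(value >= observed for value in simulated) / len(simulated)


null = variance_null(n_species=12, n_reps=200, seed=1)
# `observed_variance` would be the variance of the gene counts in table 1.
```

A call such as `empirical_p(observed_variance, null)` then gives the one-sided Monte Carlo P value; a faithful version would instead simulate along the branches of the species phylogeny so that shared ancestry is taken into account.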
Ors are stimulated by odors, which are small volatile organic compounds. Although odors vary greatly in their chemical structure and physical properties, receptors seem to identify them mainly by a few odor characteristics, including functional groups and backbone chain size (Hallem et al. 2004). Such decomposition of odors can be mirrored in the receptors. A systematic study in D. melanogaster found that although individual receptors possess unique response spectra, these spectra frequently overlap each other (Hallem and Carlson 2006). Through such combinations, a pool of approximately 60 receptors enables the flies to distinguish a much higher number of odors. That is, the odor receptors act as a “basis” set in the multidimensional world of odorants in the same manner as the 3 light receptors in humans establish color vision. The odors could include those that are indispensable to all Drosophila species and those that are idiosyncratic to particular species. In fact, in the larval stage, the pool of active receptors may be much smaller (Kreher et al. 2005). Thus, one hypothesis about the distribution of odor receptors in the Drosophila species is that a core (but variable) set of receptors provides stable basic function, whereas species-specific duplicated genes elaborate the odor space for niche-specific chemicals.
We have 2 additional lines of evidence to support this functional stability hypothesis. First, in all 12 species, nearly all receptor genes have orthologs in some other species. At the coarse level, only 8 Or genes in the Drosophila subgenus do not have detectable orthologs in the Sophophora subgenus. More than 80% of the orthologous groups have members in both subgenera. Furthermore, these orthologs are highly conserved in sequence. In the phylogeny, members of an orthologous group form monophyletic clades and share the most recent common ancestor. Second, in the analysis of positive selection, we found very weak evidence of functional divergence within the orthologous groups, which suggests that in most cases orthologous receptors in different species may have spectra largely resembling each other. Most orthologous groups have more than 10 reasonably divergent sequences that can be well aligned, and the branch lengths in the phylogeny of these sequences are generally not trivially small, providing favorable conditions for our selection tests. The statistical power to detect positive selection was also boosted by allowing the dN/dS ratio to vary among sites. In our test of the pair of models M7 and M8, model M8 assumes a beta distribution of the dN/dS ratio plus one additional site class for positive selection. After correction for multiple testing, only 7 receptors showed significant evidence of positive selection at the 0.05 significance level. With the exception of OrN1, the dN/dS ratio is smaller than 0.35 for all groups. Besides these sequence-based analyses and inferences, there also exists experimental evidence. In an in vivo electrophysiological study of olfaction in 9 species of the melanogaster subgroup, the response profiles of corresponding ORNs were found to be very similar (Stensmyr et al. 2003).
Site-specific estimates of positive selection revealed 9 sites in D. melanogaster with reliable estimates (well-aligned sequences) and significant posterior probabilities. These are amino acid position 51 in DmelOr7a, position 115 in DmelOr9a, positions 15, 128, 183, and 246 in DmelOr10a, position 228 in DmelOr19a, position 258 in DmelOr22c, and position 49 in DmelOr59c. In addition, we found evidence for positive selection at position 54 in the newly annotated OrN1. Of these 10 positions, 7 were mapped to the intracellular domains, 2 to the TM domains, and 1 to the extracellular domain using the structure model of Benton et al. (2006). If we suppose that odorant receptors have the standard structural conformation rather than the inverted conformation suggested by Benton et al. (2006), all of our positively selected intracellular sites would be considered extracellular, and the 2 sites in the TM domains would be in their lumenal halves. If the positive selection is for novel ligand interactions, it would seem more likely to involve extracellular domains than intracellular domains. A study of odorant receptors in rodents also supports this hypothesis, finding that 75% of positively selected sites are either in the extracellular parts or in the lumenal halves of TM domains (Emes et al. 2004). Thus, these results urge caution about the idea that odorant receptors have atypical structures.
Stability of the odorant receptor repertoire does not mean that the Or genes are evolutionarily static. Or genes evolved in response to different ecological contexts. For example, D. sechellia is endemic to islands away from the African continent and oviposits only on Morinda fruit. This fly was found to be particularly sensitive to methyl hexanoate; it is likely that some Or gene has evolved in adaptation to this environment (Dekker et al. 2006). However, the number of detectable functional genes is relatively low at 55, whereas the genome seems to have a rather high proportion of pseudogenes (12% compared with an average of 9% for all others).
At the genome level, we examined the changes of chromosomal location and orientation of Or genes in 4 species. A recent study suggested that Drosophila genomes have undergone some of the most dramatic chromosomal evolution among all eukaryotes (Ranz et al. 2001). Species in this genus may have different karyotypes. Besides a dot chromosome, D. pseudoobscura has 1 metacentric chromosome and 3 telocentric chromosomes, whereas the other 3 species have 1 telocentric and 2 metacentric chromosomes. Nevertheless, these 5 chromosome arms, the so-called Muller elements, can be mapped to each other, and it is hypothesized that genes move largely within the same Muller element (Powell 1997). Our study of Or genes is consistent with the general preservation of the Muller elements, with only a few Or genes showing evidence of movement across different Muller elements.
Gene duplications and losses were observed in about half of the orthologous groups and in most species as well as in almost all branches of the species tree. More than half of the duplicated genes stay as tandem arrays. In one extreme case, we inferred 27 duplication events and 14 loss events in D. grimshawi. Some of these inferences must be interpreted with caution because loss events are affected by our ability to detect the homologs. However, we note that homologs for a given family are less divergent than Or genes as a whole, and the entire gene set is detectable with the recursive TBlastN search employed in this study. It is also the case that although duplication events are likely to be true positives, assessment of species-specific duplication (i.e., inparalogs) is affected by taxon sampling. If we had more species closely related to D. grimshawi, we might find other orthologs. Regardless, with 72 functional genes and 11 pseudogenes, this species has the largest number of Or genes among the 12 species. D. grimshawi belongs to the Hawaiian lineage, which encompasses about 1,000 species and is dispersed across the Hawaiian Islands. Thus, the unique ecological conditions may have contributed to the diversification of Or genes in this species, as illustrated by DgriOr42b, whose 11 copies are split into 3 tandem arrays with very unbalanced numbers of copies.
The Or genes show multimodal evolution patterns with changes in their genic structure (gain and loss of introns), chromosomal positions, duplication and gain of new functions, as well as loss of function. Overall, the Or genes in Drosophila seem to have undergone dynamic evolution through gene duplication and loss while maintaining a core set of functions that help establish a fundamental odorant space for the group.
We thank John Carlson, Fangjun Tang, and Elissa Hallem for discussions and access to preliminary results. We thank 2 anonymous reviewers for suggestions that improved this paper. This work has been supported in part by National Science Foundation grants EF-0334866 and EF-0331654 and National Institutes of Health grant P20-GM-6912-1 to J.K.
Funding to pay the Open Access publication charges for this article was provided by Penn funds.
References
Author notes
Adriana Briscoe, Associate Editor