Abstract

Dinoflagellates in the family Symbiodiniaceae are taxonomically diverse, predominantly symbiotic lineages that are well-known for their association with corals. The ancestor of these taxa is believed to have been free-living. The establishment of symbiosis (i.e. symbiogenesis) is hypothesized to have occurred multiple times during Symbiodiniaceae evolution, but its impact on genome evolution of these taxa is largely unknown. Among Symbiodiniaceae, the genus Effrenium is a free-living lineage that is phylogenetically positioned between two robustly supported groups of genera within which symbiotic taxa have emerged. The apparent lack of symbiogenesis in Effrenium suggests that the ancestral features of Symbiodiniaceae may have been retained in this lineage. Here, we present de novo assembled genomes (1.2–1.9 Gbp in size) and transcriptome data from three isolates of Effrenium voratum and conduct a comparative analysis that includes 16 Symbiodiniaceae taxa and other dinoflagellates. Surprisingly, we find that genome reduction, which is often associated with a symbiotic lifestyle, predates the origin of Symbiodiniaceae. The free-living lifestyle distinguishes Effrenium from symbiotic Symbiodiniaceae vis-à-vis its longer introns, more-extensive mRNA editing, fewer (~30%) lineage-specific gene sets, and lower (~10%) level of pseudogenization. These results demonstrate how genome reduction and the adaptation to distinct lifestyles intersect to drive diversification and genome evolution of Symbiodiniaceae.

Introduction

Dinoflagellates (Dinophyceae, Alveolata) are a diverse group of microbial eukaryotes that are ubiquitous in aquatic environments. Species in the family Symbiodiniaceae [1] comprise photosynthetic taxa that form symbioses with diverse marine organisms. Of particular importance to modern coral reefs, Symbiodiniaceae provide photosynthates (fixed carbon) and essential nutrients to their cnidarian hosts. The Symbiodiniaceae ancestor is believed to have been free-living [1], with members of this group forming symbiotic associations with corals as early as 230 million years ago (MYA) [2]. Symbiogenesis, or the establishment of a symbiotic relationship between two or more taxa [3], can drastically influence lineage evolution, adaptation, and speciation as observed in obligate parasites and diverse symbiotic taxa [4]. This phenomenon is termed the resident genome syndrome and was previously hypothesized to explain the observed patterns of Symbiodiniaceae genome evolution [5].

Based on current divergence time estimates for Symbiodiniaceae, using large subunit rRNA via a clock model calibrated using coral fossil data [1], the split of genera Symbiodinium and Philozoon [1, 6] from other lineages occurred 166 ± 46 MYA; the later-branching symbiotic lineages diversified 109 ± 30 MYA [1]. If the emergence of symbiogenesis coincides with the earliest fossil evidence from 230 MYA [2], Symbiodiniaceae lineages would have arisen and diversified during major global geological events. These include the switch from aragonite to calcite seas (~190 MYA [7]), the breakup of Pangea (150–230 MYA [8]), the diversification or extinction of potential hosts such as rudists 66 MYA [9, 10], and the change in coral morphology from the Triassic (201–252 MYA) to the Cretaceous (66–145 MYA) [11, 12].

Symbiogenesis is expected to impact the genome evolution of symbionts within a broad spectrum of “facultativeness” that reflects the nature of the host association (i.e. with obligate free-living and obligate symbiont at opposing extremes). The symbiotic state underpins evolutionary processes such as genetic drift, expansion/contraction of mobile elements, pseudogenization, gene loss, and variation of mutation rates [5]. Previous studies that have investigated the effects of symbiogenesis on Symbiodiniaceae genomes have focused almost entirely on symbiotic genera, with the polar-dwelling, highly specialized Polarella glacialis [13], a sister of Symbiodiniaceae and within order Suessiales, comprising the sole, free-living outgroup.

Haploid genome sizes of Symbiodiniaceae taxa and P. glacialis are estimated to be <2 Gbp based on sequencing data [13–16], and <5 Gbp based on DNA staining and qPCR analysis of marker sequences [17, 18]. The taxonomically diverse dinoflagellates external to the Symbiodiniaceae are predominantly free-living and, in comparison, have massive genome sizes, e.g. 4.8 Gbp estimated from sequencing data for the bloom-forming Prorocentrum cordatum [19] and 200 Gbp based on DNA staining, for Alexandrium tamarense [20].

Symbiodiniaceae comprises at least 15 clades, with 11 named genera thus far [1, 6, 21, 22], supported by molecular, morphological, and ecological data [1]. Among Symbiodiniaceae taxa, Effrenium is a genus considered to be free-living [1]. The sole species, E. voratum, is globally distributed in temperate oceans with seasonal mean temperatures 15–26°C [1, 23]. Ubiquitous in the water column and on macroalgal surfaces [24], E. voratum can potentially form blooms [23]. Although occasionally found on the surface of marine organisms, E. voratum is not known to colonize any hosts intracellularly [23, 25]. Attempts to establish a symbiotic relationship with the anemone Exaiptasia pallida have been unsuccessful [26, 27]. Current understanding of Symbiodiniaceae evolutionary history suggests that E. voratum diverged 147 ± 40 MYA from the largely symbiotic genera of Symbiodinium and Philozoon during the early evolutionary history of Symbiodiniaceae [1]. Whereas genome data of other free-living species (e.g. Symbiodinium natans) are available, these taxa belong to genera that also include symbiotic species and thus might have experienced a symbiotic lifestyle at some point in their history. Based on these data, we posit that the genus Effrenium has remained unaffected by the influence of symbiogenesis and retains the free-living lifestyle, and therefore the ancestral genome features of Symbiodiniaceae. These genomic features provide a critical reference for understanding how multiple symbiogenesis events have contributed to the evolution of Symbiodiniaceae lineages to become successful symbionts in a wide range of hosts.

Here, we present de novo assembled genome and transcriptome data for three isolates of E. voratum. Incorporating publicly available genome-scale data from 16 Symbiodiniaceae taxa plus four free-living taxa external to the Symbiodiniaceae in a comparative genomic analysis, we examine genomic features in E. voratum, including mobile elements, gene structures, gene families, and pseudogenization, to gain insights into ancestral features of Symbiodiniaceae.

Materials and Methods

Extraction of genomic DNA and total RNA

Cell cultures of E. voratum RCC1521, rt-383, and CCMP421 were provided by the LaJeunesse Lab (Pennsylvania State University). They were maintained using Daigo’s IMK medium (25°C, 14:10 h light–dark cycles).

For RCC1521 and rt-383, cells were pelleted by centrifugation (300 g, 5 min, room temperature [RT]), and resuspended in 100–500 μL pre-warmed (60°C) lysis buffer (100 mM Tris–HCl, 20 mM EDTA, 4% CTAB (w/v), 1.4 M NaCl, 1% PVP (w/v), 2% β-mercaptoethanol). This mixture was ground using a pre-chilled mortar and pestle, and high molecular-weight genomic DNA (gDNA) was extracted (https://dx.doi.org/10.17504/protocols.io.b5qyq5xw). For CCMP421, cells were pelleted and snap frozen in liquid nitrogen and ground (425–600 μm glass beads). DNA was extracted using the 2 × CTAB method [13]. The DNA was precipitated using chilled isopropanol, washed using chilled 70% ethanol, and stored in Tris–HCl (10 mM, pH 8) at −20°C until sequencing (Supplementary Methods, Supplementary Table 1).

To extract total RNA from RCC1521 for Iso-Seq sequencing, cell pellets were lysed (5× freeze–thaw cycles, 425–600 μm glass beads), before QIAGEN RNeasy Plant Mini Kit was used. To increase transcriptome diversity, we extracted more RNA using a second method (Supplementary Methods, Supplementary Table 1).

Transcriptome assembly and processing

For RNA-Seq data, upon removal of adapters and unique molecular identifiers using bcl2fastq v2.20.0.422, the reads were trimmed and filtered using fastp v0.20.0 (-A -L 35 -g -x --cut_front --cut_window_size 4 --cut_mean_quality 15) and assembled using Trinity v2.9.1 [28] in “de novo” (--SS_lib_type RF --trimmomatic) and “genome-guided” modes; for “genome-guided”, RNA-Seq reads were mapped to the assembled genome using HISAT2 [29] before Trinity was run (--SS_lib_type RF --genome_guided_bam --genome_guided_max_intron 70000). Raw Iso-Seq sequences underwent CCS generation and demultiplexing using the standalone modules CCS v4.2.0 and Lima v1.11.0, and high-quality transcripts were identified using the IsoSeq pipeline v3.3.0.

De novo genome assembly

De novo genome assemblies were generated by combining Illumina, PacBio, and Nanopore data using MaSuRCA [30] v4.0.1 for RCC1521, and v3.4.2 for rt-383, with the built-in CABOG as the final assembler; the distinction between these two versions is the 6-fold decrease in run-time in v4.0.1 with negligible impact on the results. For CCMP421, the de novo genome assembly was generated from 10× Genomics linked-read sequencing data using Supernova v2.1.1 [31].

RCC1521 and rt-383 assemblies were scaffolded with L_RNA_scaffolder [32], using IsoSeq transcripts and de novo assembled transcripts from RNA-Seq (above). For CCMP421, linked-read distance information was used to refine the assembly with ARBitR v0.2 (-m 27 k -s 10 k) [33]. Due to the low quality of the publicly available transcriptome data of CCMP421 (i.e. MMETSP1110 [34] with only 54% mapped to the corresponding assembled genome; Supplementary Table 2), we used the de novo assembled transcripts from RCC1521 and rt-383 to scaffold the CCMP421 genome assembly via L_RNA_scaffolder.

We identified and removed putative sequences of bacterial or archaeal sources following a decision tree based on analysis using BlobTools v1.1 [35], yielding the final assembly for each isolate. Organellar genomic sequences were identified following a published method [13] (Supplementary Methods, Supplementary Tables 3–5). Completeness of each assembly was assessed using BUSCO v5.1.2 [36] against the alveolata_odb10 database (genome mode). Pairwise genome-sequence similarity was assessed using nucmer (--mum) implemented in MUMmer 4.0.0beta2 [37] at default settings. To predict protein-coding genes, we used a workflow customized for dinoflagellates [16], incorporating protein and transcriptome evidence and multiple predictors (Supplementary Methods and Supplementary Table 6).

Inference of phylogenetic relationship

To infer species phylogenies using 18S rRNA genes and ITS2 markers, we identified these sequences from the E. voratum genome assemblies using BLASTn. Reference 18S rRNA genes (https://doi.org/10.5061/dryad.1717129 [1]; 79 sequences) and ITS2 sequences (https://symportal.org/ [38]; 8409 “published post-MED sequences”, 10 September 2021) were used for these analyses. For each marker sequence set, a multiple sequence alignment was generated using MAFFT v7.471 [39] (mafft-linsi) and trimmed using trimAl v1.4.rev15 [40] (-automated1), from which a maximum-likelihood tree was inferred using IQ-TREE v2.1.3 [41] (-nm 2000 -bb 2000 -m MFP). Putative orthologous protein sets identified from the 33 dinoflagellate taxa (Supplementary Tables 7 and 8) [19, 34, 42, 43] were used to infer a species tree, whereas whole-genome sequences were used in an alignment-free approach [44] to infer phylogenetic relationships from distinct genomic regions (Supplementary Methods, Supplementary Fig. 1, see online supplementary material for a colour version of this figure).

Analysis of gene evolution of Suessiales taxa

We grouped 21 Suessiales protein sequence datasets into three groups: Ev, S1, and S2, with Po (P. glacialis) as outgroup (Supplementary Table 7). The total 811 661 protein sequences were clustered into homologous sets using OrthoFinder v2.5.4 [45]. Because these protein sets were generated solely based on sequence similarity, the lack of structural information in such an inference of homology may fail to recognize remote (i.e. highly divergent) homologs. However, these protein sets represent a proxy for studying the evolution of gene families, particularly in identifying putative gene gain or loss [45], and lineage-specific protein sets (Supplementary Methods).

Identification of pseudogenes

Following an earlier study [15], pseudogenes were identified based on tBLASTn search using predicted protein sequences as query against the corresponding genome sequences for which gene model sequences were explicitly masked and excluded. Matched regions (≥75% identity) were considered fragments of pseudogenes; those <1 Kb apart and in the same orientation were considered collectively as a single pseudogene.
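The fragment-merging rule described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function name `merge_fragments`, the tuple layout of a hit, and the choice to group hits by scaffold and strand only (without regard to the query protein) are assumptions, and hits are presumed to have already been filtered at ≥75% identity.

```python
from collections import defaultdict

def merge_fragments(hits, max_gap=1000):
    """Merge tBLASTn hit fragments into putative pseudogenes.

    hits: iterable of (scaffold, start, end, strand), with start <= end.
    Fragments on the same scaffold and strand that lie fewer than
    max_gap bases apart are reported as one merged interval.
    (Hypothetical layout; identity filtering is assumed done upstream.)
    """
    by_key = defaultdict(list)
    for scaffold, start, end, strand in hits:
        by_key[(scaffold, strand)].append((start, end))

    merged = []
    for (scaffold, strand), spans in sorted(by_key.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start - cur_end < max_gap:
                # Close enough: extend the current pseudogene.
                cur_end = max(cur_end, end)
            else:
                merged.append((scaffold, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        merged.append((scaffold, cur_start, cur_end, strand))
    return merged
```

For example, two same-strand fragments 500 bases apart would be reported as one interval, whereas a fragment several kilobases away, or one on the opposite strand, would be kept separate.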

We focused on 752 954 protein sequences from 19 Suessiales taxa, specifically excluding S. natans and Symbiodinium pilosum from S1 to avoid signatures of free-living lifestyle in these taxa interfering with potential signatures of symbiogenesis. The protein sequences were clustered into homologous sets using OrthoFinder v2.5.4 [45]. We define the extent of pseudogenization, Ψ, as the ratio of the number of putative pseudogenes to the number of putative functional genes in a homologous set. We determined this value independently for Ev (ΨEv) against that for S1 (ΨS1), S2 (ΨS2), and the combined S1 and S2 (ΨS1 + S2); a protein set with ΨS1 > ΨEv indicates a greater extent of pseudogenization in S1 than in Ev.
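As a minimal sketch of the ratio defined above, Ψ can be computed per group for one homologous set. The function names, the example counts, and the pooling of S1 and S2 counts for the combined value are illustrative assumptions; the statistical test used to call significant differences is not reproduced here.

```python
def psi(n_pseudogenes, n_genes):
    """Extent of pseudogenization for one group in one homologous set:
    number of putative pseudogenes divided by number of putative functional genes."""
    if n_genes == 0:
        return None  # ratio is undefined when the group has no genes in the set
    return n_pseudogenes / n_genes

def psi_by_group(counts):
    """counts maps group name -> (n_pseudogenes, n_genes) for one homologous set."""
    return {group: psi(p, g) for group, (p, g) in counts.items()}

# Hypothetical counts for a single homologous set (not data from the study).
example_set = {"Ev": (2, 8), "S1": (6, 12), "S2": (3, 12)}
values = psi_by_group(example_set)

# Assumption: the combined S1 + S2 value pools the counts of both groups.
combined = psi(example_set["S1"][0] + example_set["S2"][0],
               example_set["S1"][1] + example_set["S2"][1])
s1_higher_than_ev = values["S1"] > values["Ev"]
```

In this toy set, Ψ is 0.25 for Ev, 0.5 for S1, and 0.375 for the pooled S1 + S2, so the set would count toward ΨS1 > ΨEv if the difference were also found to be significant.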

Results

Genomes of E. voratum isolates

We generated de novo genome assemblies for three isolates of E. voratum (assembly sizes 1.1–1.3 Gbp; Supplementary Methods, Supplementary Table 1), with estimated haploid genome sizes of 1.2–1.9 Gbp (Supplementary Table 9, Supplementary Fig. 2, see online supplementary material for a colour version of this figure), completeness (BUSCO recovery 67.2–77.2%), and number of predicted genes (32 108–39 878) (Table 1, Supplementary Tables 7 and 10). The CCMP421 genome assembly, derived from 10× Genomics linked reads, is the most fragmented compared with RCC1521 and rt-383, which were derived from both short and long reads. BUSCO recovery of these assemblies is comparable to other published dinoflagellate genomes (55–70%; Supplementary Table 7).

Table 1

Genome assemblies and gene predictions of E. voratum RCC1521, rt-383, and CCMP421.

Isolate | RCC1521 | rt-383 (=CCMP3420) | CCMP421
Location of isolation | Mediterranean Sea off Blanes, Spain [23] | Eastern North Pacific off Santa Barbara, USA [73] | Cooks Strait, New Zealand [74]
Genome sequencing technologies | Illumina, PacBio, Nanopore | Illumina, PacBio, Nanopore | 10X Linked-reads
Genome assembly size (Gb) | 1.2 | 1.3 | 1.1
Estimated genome size (Gb) | 1.4 | 1.2 | 1.9
GC-content of genome assembly (%) | 50.8 | 50.6 | 50.9
Total read coverage | 446× | 212× | 153×
Number of genome scaffolds | 3881 | 11 607 | 38 022
N50 of genome assembly (Kb) | 720 | 252 | 304
Number of predicted genes | 32 108 | 39 878 | 32 615
% BUSCO recovery (genome) alveolata_odb10; eukaryota_odb10 | 59.1; 29.4 | 62.0; 28.2 | 55.0; 20.0
% BUSCO recovery (protein) alveolata_odb10; eukaryota_odb10 | 76.1; 52.5 | 77.2; 54.1 | 67.2; 45.5

Genome sequences of the three isolates share high similarity (Supplementary Table 11) and exhibit conserved repetitive elements. For an in-depth pairwise genome-sequence comparison, including potential technical issues related to contiguity of these genome assemblies, see [46]. Repetitive regions containing protein-coding genes were highly conserved in E. voratum relative to those of other Suessiales (the Order containing Symbiodiniaceae and the earlier branching sister P. glacialis). We identified 98 344 core k-mers (k = 23, all possible 23-base sequences; Supplementary Methods) that are common in genomes of all Suessiales taxa; 95% of core 23-mers were in repetitive regions of E. voratum (Supplementary Fig. 3, see online supplementary material for a colour version of this figure).
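The notion of core k-mers, i.e. k-mers present in the genomes of all taxa, can be illustrated with a small sketch. This is a toy in-memory version under stated assumptions: the function names are hypothetical, k-mers containing characters other than A, C, G, or T are skipped, and strand (canonical k-mer) handling is ignored; the procedure actually used is described in the Supplementary Methods and would need a dedicated k-mer counter at genome scale.

```python
def kmers(seq, k=23):
    """Return the set of k-mers in seq that contain only A, C, G, or T."""
    seq = seq.upper()
    valid = set("ACGT")
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            found.add(kmer)
    return found

def core_kmers(genomes, k=23):
    """genomes maps taxon name -> list of sequences (e.g. scaffolds).
    Returns the k-mers found in every taxon."""
    per_taxon = []
    for sequences in genomes.values():
        taxon_set = set()
        for seq in sequences:
            taxon_set |= kmers(seq, k)
        per_taxon.append(taxon_set)
    if not per_taxon:
        return set()
    return set.intersection(*per_taxon)
```

With k reduced to 3 for readability, `core_kmers({"a": ["ACGTAC"], "b": ["TTACGT"]}, k=3)` yields the three 3-mers shared by both toy genomes.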

Genome-size reduction pre-dated divergence of Symbiodiniaceae

For comparative genomic analysis, we obtained all available genomic data from 23 dinoflagellate taxa: 19 from Symbiodiniaceae [13, 15, 16, 47–51] (Order Suessiales), two isolates of the sister taxon P. glacialis [13] (Order Suessiales), and two distantly related free-living dinoflagellate taxa, P. cordatum [19] (Order Prorocentrales) and Amphidinium gibbosum [42] (Order Amphidiniales). Datasets from the 21 Suessiales taxa were grouped into: (i) the free-living outgroup (Po: P. glacialis strains CCMP1383 and CCMP2088) sister to Symbiodiniaceae, and among Symbiodiniaceae, (ii) the earlier-branching, largely symbiotic Symbiodinium (S1: Symbiodinium linucheae, S. natans, Symbiodinium necroappetens, S. pilosum, Symbiodinium tridacnidorum strains CCMP2592 and Sh18, and Symbiodinium microadriaticum strains 04-503SCI.03, CassKB8, and CCMP2467), (iii) the three free-living E. voratum isolates (Ev), and (iv) the later-branching symbiotic lineages (S2: Breviolum minutum, Cladocopium proliferum, Cladocopium sp. C15, Cladocopium sp. C92, Durusdinium trenchii strains CCMP2556 and SCF082, and Fugacium kawagutii) (Supplementary Table 7). The phylogenetic positions of these groups relative to other dinoflagellates are shown in Fig. 1A, along with light micrographs of representative species in S1, S2, and Ev (Fig. 1B). Cell size of E. voratum (12.2–13.3 μm [23]) is generally larger than S1 (e.g. S. microadriaticum CassKB8; 8.0–11.0 μm [17]) or S2 cells (e.g. D. trenchii CCMP2556; 7.5–10.0 μm [17]).

Species tree of dinoflagellates, showing estimated nuclear genome sizes; (A) the tree was inferred using 91 115 orthologous protein sets derived from 1 420 328 proteins from 33 dinoflagellate taxa, encompassing symbiotic, free-living, and parasitic lifestyles, and the number shown at each node denotes the number of homologous protein sets that are identified in any lineage descendant from that node, regardless of whether other lineages are also represented. The numbers at the terminal branches indicate protein sets that are recovered only in the specific lineages of the tip labels; Suessiales is shown as four dataset groups, i.e. the free-living Po external to the three Symbiodiniaceae clades of S1 (the early-branching, largely symbiotic Symbiodinium), Ev (the free-living E. voratum), and S2 (the later-branching symbiotic taxa). For S1, S2, Ev, and Po, the mean number of taxon-specific protein sets and the mean ± standard deviation of estimated genome sizes are shown. Ancestral nodes for all 33 taxa (N0), for Dinophyceae (core dinoflagellates; N1), for Suessiales (N2), for Symbiodiniaceae (N3), and for Polarella glacialis (N4) are indicated on the tree; and (B) the light micrographs of representative Symbiodiniaceae taxa in the dataset groups of S1, S2, and Ev are shown; scale bar = 10 μm.
Figure 1

The most striking feature of genome evolution among Suessiales is the marked reduction in genome size that occurred in the common ancestor of this lineage. Whereas free-living dinoflagellates external to the Suessiales have wide-ranging genome-sizes ca. 5–200 Gbp, except the parasitic Amoebophrya ceratii that has a highly reduced genome (0.1 Gbp), all Suessiales genomes fall within a much narrower size range from 0.7 to 2.0 Gbp, estimated using sequencing data (Fig. 1A). In an assessment of 1 603 073 protein sequences from 33 dinoflagellate taxa, we identified 50 753 homologous sets that contain one or more Suessiales taxa (node N2, Fig. 1A), compared with 83 062 sets that contain one or more Dinophyceae taxa (core dinoflagellates) [52] including Suessiales (node N1, Fig. 1A). This observation suggests streamlining of gene inventory and function in the evolution of Suessiales, particularly for Symbiodiniaceae, which may have driven symbiotic associations with cnidarians that offered nutrient-rich and protected habitats within the animal tissues. The facultative lifestyle was likely retained in most Symbiodiniaceae because it offers the benefit of sexual reproduction during the free-living stage [53]. The most substantial recovery from genome streamlining is through whole genome duplication, which has occurred in the Durusdinium lineage [48].

Genomes of E. voratum have higher GC and longer introns than those of symbiotic lineages

Compared with the genomes of symbiotic Symbiodiniaceae (i.e. S1 and S2), E. voratum genomes exhibit higher GC content, a comparable extent of mobile elements, and a greater extent of introner elements (IEs). Overall, the GC content of coding regions varied among the Symbiodiniaceae lineages (Fig. 2A) and was lower in S2 (mean 54.2%; P < .05) relative to Ev (61.0%) and S1 (57.7%); in intronic regions, the mean GC is 44.6% (S2), 50.1% (Ev), and 50.4% (S1), whereas that of whole-genome sequences is 45.8% (S2), 50.7% (Ev), and 50.6% (S1). Variation of GC content in dinoflagellate genomes does not appear to correlate with lifestyle; among the free-living species external to Symbiodiniaceae, the genomes of P. glacialis and A. gibbosum have a mean GC content of 46.4%, similar to S2, whereas the genome of P. cordatum has the highest GC content described thus far for any dinoflagellate, at 59.7% [19]. Intracellular bacteria have a mutational bias toward low genomic GC content, e.g. ~20% [54], but intracellular eukaryotes display both low and high extreme GC content patterns, ranging as low as 24% in the malaria parasite Plasmodium falciparum [55] to 67% in the green alga Chlorella variabilis that is a symbiont in the ciliate Paramecium [56]. The lower GC content in S2 genomes than the S1 counterparts underscores the dynamic nature of genomic GC content evolution in intracellular eukaryotes. Based on the proportions of mobile elements in each group, more long interspersed nuclear elements (LINEs) were found in S1 (3.9%) than in Ev (1.8%) and S2 (1.9%) (Fig. 2B, Supplementary Fig. 4, see online supplementary material for a colour version of this figure, Supplementary Table 12). In addition, more (5%) Ev genes contain IEs than do the S1 (4%) and S2 (3%) genes (Fig. 2C); mobility of these elements is thought to rely on transposases [57]. Our recovery of transposase sequences from most Symbiodiniaceae genomes (Supplementary Table 13, Supplementary Note) suggests a capacity for these IEs to be mobile.
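For readers unfamiliar with the GC statistics compared above, a minimal sketch of a per-sequence GC calculation follows. The function name and the decision to exclude ambiguous bases (e.g. N) from the denominator are assumptions; how regional means were aggregated in the study (for instance, per-sequence versus length-weighted) is not specified in this section.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T) in seq.

    Ambiguous characters such as N are ignored (an assumption made here).
    Returns NaN for sequences with no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        return float("nan")
    return 100.0 * gc / total
```

The same function can be applied separately to whole-genome, intronic, and coding sequences to obtain region-specific values of the kind compared in Fig. 2A.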

Genome features of E. voratum and other dinoflagellates; features of representative genomes in the four Suessiales groups of S2, Ev, S1, and Po, plus P. cordatum (Pc), in the order from the most-recent to most-ancient divergence, shown for (A) mean GC content in whole-genome, intronic, and CDS regions (A. gibbosum (Ag) was added for dinoflagellate-wide comparison), (B) percentage of mobile elements, and (C) percentage of genes containing IEs; gene features of Suessiales showing (D) mean gene length, (E) average proportions of exons and introns, (F) the relative frequency of introns by length (symbiotic lineages in boldface), and (G) number of introns per gene; in all bar charts, *, **, *** represent P < .05, <.01, and <.001, respectively, based on Wilcoxon rank sum test.
Figure 2

Compared with genes in other Symbiodiniaceae, E. voratum genes are longer and contain longer introns. Significantly (P < .05) longer genes were observed in Ev (mean 20 Kb) than S1 (8 Kb) and S2 (11 Kb) (Fig. 2D), primarily driven by longer intron sizes (introns make up, on average, 92% of a gene [Fig. 2E]; sizes peak at 1 Kb [Fig. 2F]) and higher intron density per gene (mean of 18 for Ev, 14 for S1, 14 for S2) (Fig. 2G).

Symbiogenesis shaped the evolution of Symbiodiniaceae genes

To examine the effect of a symbiotic lifestyle on the evolution of Symbiodiniaceae genes, we inferred 53 173 homologous sets from 811 611 protein sequences encoded by the genes in the 21 Suessiales genomes (see Materials and Methods). Dinoflagellate taxa external to Suessiales, for which genome data are limited, were excluded from this analysis. Most protein sets (47 353 of 53 173 [89%]) were shared among the four groups (S1 + S2 + Ev + Po). With respect to functions annotated in all homologous sets, these sets were enriched in functions such as cellular motility, biosynthetic processes for rRNA, antibiotics, and glycosides (Supplementary Table 14). There were more lineage-specific protein sets in S1 (6389) and S2 (4056) than in Ev (1734; Fig. 3A). The 3357 protein sets found only in S1 + S2 that split from each other over 40 million years of evolution may reflect convergent evolution due to the symbiotic lifestyle. These protein sets were enriched in diverse functions that include signalling, apoptosis, protein splicing, photosynthesis, cell adhesion, and various transferase activities (Fig. 3B). The protein sets specific to Ev were enriched in functions such as regulation of mitochondrial mRNA stability, metabolic processing of organic compounds, glutathione oxidoreductase, and the binding of calmodulin, metal ions, and nucleotides. This result highlights the importance of regulating energetic needs, metabolizing organic compounds, and the sequestration of metal ions in E. voratum, as a free-living Symbiodiniaceae. Incidentally, Po shared more protein sets with symbiotic lineages (1290; S1 + S2 + Po) than with Ev (221; Ev + Po) based on the datasets we analyzed here; the S1 + S2 + Po sets were enriched in functions such as autophagy and microtubule organization (Fig. 3B). Genetic duplication was found to drive intraspecific genomic divergence of Symbiodiniaceae [46], in which genes encoding functions related to photosynthesis were tandemly duplicated in genomes of the free-living E. voratum and P. glacialis. Although the number of lineage-specific homologous sets we recovered here is affected by the number and divergence of taxa represented in each group (Supplementary Methods and Supplementary Table 15), our observations reflect a higher divergence of genes within S1 and S2 (both comprising different species and genera) than in Ev (comprising a single species).
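The tallies of shared and lineage-specific protein sets reported above amount to classifying each homologous set by the groups represented in it. A minimal sketch of that bookkeeping is given below; the function names, the taxon labels, and the set identifiers are hypothetical, and the input is assumed to be a mapping from each homologous set to the taxa that contribute at least one protein to it.

```python
from collections import Counter

def group_pattern(taxa_in_set, taxon_to_group):
    """Return a label such as 'S1+S2' naming the groups represented in one set."""
    groups = {taxon_to_group[taxon] for taxon in taxa_in_set}
    return "+".join(sorted(groups))

def tally_patterns(sets, taxon_to_group):
    """Count homologous sets per group pattern.

    sets maps set identifier -> iterable of taxa with at least one member protein.
    """
    return Counter(group_pattern(taxa, taxon_to_group) for taxa in sets.values())

# Hypothetical taxa and sets, for illustration only.
taxon_to_group = {"Ev_iso1": "Ev", "Ev_iso2": "Ev", "S1_sp1": "S1",
                  "S2_sp1": "S2", "Po_iso1": "Po"}
sets = {
    "set1": ["Ev_iso1", "Ev_iso2"],                      # specific to Ev
    "set2": ["S1_sp1", "S2_sp1"],                        # found only in S1 + S2
    "set3": ["Ev_iso1", "S1_sp1", "S2_sp1", "Po_iso1"],  # shared by all four groups
}
counts = tally_patterns(sets, taxon_to_group)
```

Because a group counts as represented when any one of its taxa is present, groups with more (or more divergent) taxa can accumulate more group-specific sets, which is the caveat noted in the text.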

Evolution of Symbiodiniaceae genes; (A) the number of homologous protein sets is shown above each node and branch and represents those that are shared among or specific to S1, Ev, S2, and/or Po; number of protein sets that are exclusive to Ev, to S1, to Ev + Po, to S2 + S1, and to S2 + S1 + Po were highlighted, and (B) enriched gene ontology (GO) terms for genes in the five distinct groups relative to all GO terms in the corresponding taxa, arranged in decreasing order of significance from top to bottom within the categories: Cellular Component, Biological Process, and Molecular Function.
Figure 3

Genomes of E. voratum exhibit less pseudogenization and greater RNA editing than those of symbiotic lineages

To test the hypothesis that facultative Symbiodiniaceae display higher levels of pseudogenization [5], we identified putative pseudogenes in Symbiodiniaceae genomes (Supplementary Table 16) based on shared sequence similarity of non-coding genomic regions to the predicted genes [15] (see Materials and Methods). In E. voratum RCC1521, rt-383, and CCMP421, we identified 42 462, 78 762, and 17 822 pseudogenes (Supplementary Table 16), compared with 32 108, 39 878, and 32 615 protein-coding genes (Table 1). For each taxon group, we assessed the level of pseudogenization as Ψ, the ratio of the number of pseudogenes to the number of genes in a homologous set (see Materials and Methods). We compared Ψ independently for Ev (ΨEv) against that for S1 (ΨS1), S2 (ΨS2), and the combined S1 and S2 (ΨS1 + S2), then identified protein sets that exhibited a significant difference (P < .05) in this value. More protein sets display ΨS1 > ΨEv (336) and ΨS2 > ΨEv (273; Fig. 4A), compared with ΨS1 < ΨEv (300) and ΨS2 < ΨEv (126; Fig. 4B). There were ~9-fold more protein sets exhibiting ΨS1 + S2 > ΨEv (229; Fig. 4A) than vice versa (25; Fig. 4B). These pseudogenes are associated with a wide range of functions, including cell cycle processes and stimuli response (Fig. 4C and D). The protein sets that display significantly higher Ψ in the symbiotic lineages are mostly mutually exclusive from the 3357 sets that putatively experienced convergent evolution (only 16–23 sets are represented in ΨS1, ΨS2, ΨS1 + S2). We found negligible technical biases in the clustering of homologous sequences that may affect our inference of pseudogenes (Supplementary Fig. 5, see online supplementary material for a colour version of this figure). These results suggest that in addition to potential convergent evolution in the symbiotic lineages, these Symbiodiniaceae have experienced a greater extent of pseudogenization than has the free-living Ev. Here, we excluded the free-living S. natans and S. pilosum from S1 to avoid signatures of their free-living lifestyle interfering with potential signatures of symbiogenesis in this group. We recovered fewer pseudogenes in these taxa (49 509 in S. natans and 16 607 in S. pilosum; Supplementary Table 16), when compared with the genomes of seven other symbiotic Symbiodinium taxa (mean 58 502). This result lends further support to the hypothesis of greater pseudogenization in genomes of symbiotic taxa when compared with those of free-living taxa, despite the variable quality of genome assemblies in our dataset (Supplementary Table 7).
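The ratio Ψ and the fold differences quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python, not the authors' pipeline: the names `Counts`, `psi`, `genome_wide_ratio`, and `fold_excess` are hypothetical, and applying the ratio to whole-genome totals is an illustrative assumption (in the text, Ψ is computed per homologous set and per taxon group). The significance test (P < .05) is described in Materials and Methods and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Pseudogene and gene counts for one genome or one homologous set."""
    pseudogenes: int
    genes: int


def psi(counts: Counts) -> float:
    """Return the ratio of pseudogenes to genes (the quantity called Ψ in the text)."""
    if counts.genes == 0:
        raise ValueError("the ratio is undefined when there are no genes")
    return counts.pseudogenes / counts.genes


# Whole-genome totals reported in the text for the three E. voratum isolates.
# Assumption: using genome-wide totals is for illustration only; the paper
# computes the ratio within each homologous set.
EV_ISOLATES = {
    "RCC1521": Counts(pseudogenes=42462, genes=32108),
    "rt-383": Counts(pseudogenes=78762, genes=39878),
    "CCMP421": Counts(pseudogenes=17822, genes=32615),
}
genome_wide_ratio = {name: psi(c) for name, c in EV_ISOLATES.items()}

# Numbers of protein sets with a significantly different ratio (Fig. 4A vs 4B).
higher_in_symbiotic = {"S1": 336, "S2": 273, "S1+S2": 229}
higher_in_ev = {"S1": 300, "S2": 126, "S1+S2": 25}

# Fold excess of sets with a higher ratio in the symbiotic group than in Ev.
fold_excess = {
    group: higher_in_symbiotic[group] / higher_in_ev[group]
    for group in higher_in_symbiotic
}

if __name__ == "__main__":
    for name, value in genome_wide_ratio.items():
        print(f"{name}: pseudogenes per gene = {value:.3f}")
    for group, value in fold_excess.items():
        print(f"{group} vs Ev: {value:.2f}-fold more sets higher in symbiotic group")
```

For the combined S1 + S2 comparison this gives 229/25 ≈ 9.2, matching the approximately 9-fold difference stated in the text; the corresponding values for S1 and S2 alone are noticeably smaller.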

Pseudogenization in Symbiodiniaceae; the number of homologous sequence sets with significantly different levels of pseudogenization, Ψ, shown for (A) those with greater extent in symbiotic lineages (i.e. higher Ψ in S1, S2, or S1 + S2, relative to Ev), and (B) those with greater extent in the free-living Ev (i.e. higher Ψ in Ev relative to S1, S2, or S1 + S2); the black dots on the upset plots indicate taxa groups with higher Ψ than those with grey dots; the associated GO terms are shown for (C) those where ΨS1 + S2 > ΨEv and (D) those where ΨEv > ΨS1 + S2.
Figure 4

We recovered a greater extent of mRNA editing in E. voratum (45 009 sites implicating 28.5% of protein-coding genes) compared with the symbiotic taxa (e.g. 4227 sites implicating 7.6% of genes in D. trenchii; Supplementary Note, Supplementary Table 17, and Supplementary Fig. 6, see online supplementary material for a colour version of this figure). This observation is consistent with data from another free-living dinoflagellate [19], suggesting a more-pronounced role of mRNA editing in generating functional diversity in free-living versus symbiotic dinoflagellate taxa. However, the function and potential regulatory roles of these mRNA edited sites will need to be validated using targeted experiments [58]. Our alignment-free phylogenies (Supplementary Note, Supplementary Figs 7 and 8, see online supplementary material for a colour version of these figures) of the non-coding and repetitive genomic regions placed the divergence of the branch containing Ev earlier than that of S1/S2 with robust node support, in contrast to the phylogenies inferred using the standard molecular markers of 18S rRNA (Supplementary Fig. 7A, see online supplementary material for a colour version of this figure), ITS2 (Supplementary Fig. 8A, see online supplementary material for a colour version of this figure), and multiple homologous protein sets (Fig. 1A). This result suggests differential selective pressure acting on coding versus non-coding regions in Symbiodiniaceae genomes [44], driven by adaptation that involves incomplete lineage sorting, horizontal gene transfer, hybridization, and/or convergent GC-biased gene conversion [59, 60]. It may also reflect the retention of ancestral non-coding regions in the E. voratum genomes and/or the loss of some non-coding regions in symbiotic lineages due to genome streamlining. However, the direct impact of niche specialization of E. voratum on our observations remains to be investigated when more genome-scale data are available.

Discussion

The shared and distinct genomic features we observed between early- and later-branching symbiotic lineages of Symbiodiniaceae suggest an interplay between the geological eras during which they arose, and the corresponding coral morphology and ocean chemistry (Fig. 5). Ancestral Symbiodiniaceae inhabited stony corals presumably as early as 230 MYA in the late Triassic and may have driven the Norian-Rhaetian reef bloom [61]. These early Scleractinian corals (e.g. Retiophyllia) tended to be uniserial (i.e. possessing one corallite per branch) and phaceloid with thick walls [62], and thus were less efficient at harvesting light [63]. The ability of the extant Symbiodinium to thrive under high or variable light [1] may be a trait inherited from their ancestor living in these ancient corals. Because these early Symbiodiniaceae adapted to different hosts, they likely underwent genome streamlining [5] and experienced high pseudogenization and a reduction in mRNA editing and intron sizes (Fig. 5), as observed here and in other studies [15, 64]. Although these trends were also observed in the later-branching symbionts, Symbiodinium uniquely retains ancestral LINE repeats (Fig. 2C), which were lost in later-branching Symbiodiniaceae, including E. voratum.

Timeline of Symbiodiniaceae genome evolution and coral evolution; the estimated divergence timeline of the family Symbiodiniaceae is shown at the top, indicating representative taxa of S1, Ev, and S2; grey bars represent 95% confidence intervals of divergence times, and key genome signatures for each group related to pseudogenization, mRNA editing, and intron size are shown along the branch; the dotted line represents the yet-unknown timeline of Suessiales divergence from the rest of dinoflagellates, and evolutionary timescale along the different eras highlighting key geological events relevant to coral evolution (adapted from Pandolfi and Kiessling [11]), aligning with Symbiodiniaceae divergence, is shown at the bottom; mass of preserved corals and sponges represented by black line (left y-axis), low latitude (30°S–30°N) ocean temperature in red smoothed line (right y-axis), ocean chemistry of Mg/Ca ratio showing aragonite (dark blue) vs calcite sea (light blue); data were sourced from earlier studies [1, 2, 10, 12, 75, 76].
Figure 5

Effrenium voratum is estimated to have diverged 147 ± 40 MYA in the late Jurassic [1], during the early diversification of Symbiodiniaceae. We did not find evidence for genome streamlining in E. voratum, but instead found genomic hallmarks associated with earlier-branching free-living lineages external to the family Symbiodiniaceae, e.g. more IE-containing genes and larger intron size than in S1 and S2. Given that Effrenium is also the sole exclusively free-living genus within Symbiodiniaceae, we postulate that this lineage has not been impacted by symbiogenesis, in contrast to S1 or S2. Since the lineage diverged from S1, global events such as the breakup of Pangea and the Cretaceous-Paleogene mass extinction have occurred, along with changes in coral reef biomass (Fig. 5). Without additional evidence from the fossil record or ancient DNA analysis (e.g. the oldest evidence of Suessiales is P. glacialis from just 9000 years ago [65]), we cannot explain why Effrenium retained a free-living lifestyle. Effrenium and most other Symbiodiniaceae lineages (including free-living species from Symbiodinium) can form endolithic relationships with bacteria, e.g. as calcified biofilms [66], but why Effrenium apparently cannot form an endosymbiotic relationship with a host is unknown.

At the estimated time when S2 lineages diversified (109 ± 30 MYA) [1], shallow-water corals were multiserial, flatter, and more efficient at harvesting light [63]. This is coincident with a rise in ocean temperature (i.e. “greenhouse Earth,” when no continental glaciers existed) and the switch in ocean chemistry from an aragonite sea to a calcite sea, which would have made it difficult for corals to build their aragonite skeletons [11] (Fig. 5). In contrast, bivalve rudists that could build aragonite or calcite shells [67] radiated and flourished [10, 12]. These taxa likely harboured photosymbionts [9], presumably ancestral Symbiodiniaceae, given that extant Symbiodiniaceae (e.g. S. tridacnidorum) can inhabit modern bivalves [47]. Although the genomes of S2 have a lower GC content than those of S1, our results indicate that both S1 and S2 underwent genome streamlining, which may have led to convergent evolution of genes associated with functions relevant to forming a symbiotic association, such as cell signalling, apoptosis, and photosynthesis (Fig. 3). Our observation of longer introns in Ev than in S1/S2 could be explained by two possible evolutionary scenarios: (i) intron expansion in Ev or (ii) intron contraction in S1/S2. In the first scenario, TE-mediated insertions drove intron expansion and were biased toward the 5′-end of genes to prevent disruption of functional elements [68], yielding larger intron sizes in these regions. We did not observe this trend in Ev (Supplementary Fig. 9, see online supplementary material for a colour version of this figure). In the second scenario, which has been observed in endosymbiotic/parasitic organisms, the reduction of intron size and density occurs as a result of genome reduction and/or streamlining induced by spatial confinement in the host organism or cell [69]. 
Considering the evolutionary history of Symbiodiniaceae, intron contraction in S1/S2 taxa due to their symbiotic lifestyle is a more-plausible explanation for the data than intron expansion in Ev. Large introns observed in Po (Fig. 2F), albeit at a lower intron density per gene (Fig. 2G), lend further support to this notion. Therefore, we posit that symbiogenesis drove genome evolution in Symbiodiniaceae and elicited common features such as pseudogenization, lowered mRNA editing and intron contraction, but some features (e.g. LINE retention and GC content) were affected differently in earlier versus later-branching symbiotic lineages.

Based on the taxa we studied here, our results provide strong evidence for a phase of genome reduction that occurred in the Suessiales ancestor. This pattern is reminiscent of cyanobacterial lineages that have undergone gene loss, namely Prochlorococcus species when compared with their sister group, Synechococcus [70]. Both lineages inhabit oligotrophic, open oceans and do not exhibit drastic phenotypic differences, despite the significant changes that have occurred in genome content and organization. A more-extreme scenario is provided by the red algae (Rhodophyta), whose common ancestor underwent massive genome reduction, precipitating the loss of canonical eukaryotic features such as flagellum-based motility, phytochromes, and autophagy [71]. These gene losses predated the split of the two monophyletic lineages that comprise this phylum: the extremophilic Cyanidiophytina that specialized to life in hot spring environments and the species-rich mesophilic lineages (e.g. red seaweeds) that inhabit a variety of aquatic environments. Most red algae therefore have smaller genomes when compared with the green lineage and have adapted to diverse habitats through gene family evolution and horizontal gene transfer.

In an analogous fashion, it appears that the common ancestor of Suessiales underwent significant genome reduction. An alternate explanation for our data is that genome size reduction in the psychrophilic P. glacialis may have resulted from independent genome streamlining in this branch of evolution (Fig. 1) due to its highly specialized lifestyle and is unrelated to symbiosis or the ancestral state in Symbiodiniaceae. Under this scenario, two independent phases of genome reduction occurred, once in the Symbiodiniaceae ancestor (N3) and once in the Polarella lineage (N4; Fig. 1A). We favour (but cannot prove) the more parsimonious hypothesis of a single major genome reduction event (N2), followed by a more minor event that occurred in the Polarella branch (N4), with the latter driven by adaptation to a psychrophilic lifestyle (Fig. 1A). This type of scenario also played out in the Rhodophyta, in which there are two cases of genome reduction, a massive one in the red algal ancestor and a smaller one in the Cyanidiophyceae that reflects its transition to an extremophilic lifestyle [71]. Therefore, we hypothesize that the Symbiodiniaceae have smaller genome sizes than most free-living dinoflagellates, not because of the coral symbiosis, but due to more ancient selective constraints.

Our results are consistent with (but do not prove) the appealing idea that symbiosis offered an “escape” from reduced functional capacity due to genome reduction during the early stages of Symbiodiniaceae evolution; see [72] for a perspective. Regardless, the mixture of obligate free-living to facultative lifestyles among extant Symbiodiniaceae has resulted in divergent paths of genome evolution [53]. These results demonstrate the retention of ancestral Symbiodiniaceae genome features in E. voratum (in contrast to symbiotic lineages) despite multiple emergences of symbiogenesis over the past 200 million years. These observations support the notion that evolution favoured a free-living lifestyle for E. voratum (and by extension the genus Effrenium), likely due to local selective pressures. Therefore, Effrenium presents a useful free-living outgroup for studying the structural and functional genome features of symbiotic Symbiodiniaceae, and the implications of these features on ecology and evolution, including but not limited to host specificity and the facultativeness of symbiotic associations.

Acknowledgements

We are grateful to Todd LaJeunesse and Hannah Reich who generously supplied the cell cultures of E. voratum strains used in this study. This project is supported by high-performance computing facilities at the National Computational Infrastructure (NCI) National Facility systems through the NCI Merit Allocation Scheme (Project d85) awarded to C.X.C., the University of Queensland Research Computing Centre, and computing facility at the Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences at The University of Queensland.

Author contributions

Conceptualization, S.S., K.E.D., D.B., and C.X.C.; methodology, S.S., K.E.D., Y.C., S.K.R., A.J.B., V.M., and C.X.C.; formal analysis, S.S., K.E.D., Y.C., R.L., G.L., M.D.A.F., and V.M.; investigation, S.S. and K.E.D.; resources, S.S., S.K.R., A.J.B., M.R.L., and C.X.C.; writing – original draft preparation, S.S.; writing – review and editing, S.S., K.E.D., D.B., and C.X.C.; visualization, S.S.; supervision, K.E.D., D.B., and C.X.C.; funding acquisition, M.R.L., D.B., and C.X.C. All authors have read and agreed to the published version of the manuscript.

Conflicts of interest

The authors declare that they have no competing interests.

Funding

This work was supported by the Australian Research Council grant DP190102474 awarded to C.X.C. and D.B., The University of Queensland Genome Innovation Hub Collaborative Research grant, and the Australian Academy of Science Thomas Davies Research Grant for Marine, Soil and Plant Biology awarded to C.X.C. S.S. and Y.C. were supported by The University of Queensland Research Training Program scholarship. M.R.L. was supported by the National Science Foundation IOS CAREER 1453519. D.B. was supported by grants from the National Science Foundation (1756616, 2128073), the National Aeronautics and Space Administration (80NSSC19K0462), and the USDA National Institute of Food and Agriculture Hatch Formula project (NJ01180).

Data availability

All sequencing data generated from this study are available on NCBI GenBank via BioProject accession PRJEB61191. The assembled and annotated genomes for the three E. voratum isolates are available on GenBank (accessions GCA_963377175, GCA_963377275, and GCA_963377065). The assembled genomes, predicted gene models and proteins, identified organellar genome sequences, functional annotation of gene models, and the scripts associated with key analyses are available at https://doi.org/10.5281/zenodo.10894296. The scripts for the complete genome annotation workflow of each E. voratum genome are available at https://doi.org/10.5281/zenodo.10896466 (RCC1521), https://doi.org/10.5281/zenodo.10896474 (rt-383), and https://doi.org/10.5281/zenodo.10896494 (CCMP421).

References

1.

LaJeunesse
TC
,
Parkinson
JE
,
Gabrielson
PW
et al.
Systematic revision of Symbiodiniaceae highlights the antiquity and diversity of coral endosymbionts
.
Curr Biol
2018
;
28
:
2570
2580.e6
. https://doi.org/10.1016/j.cub.2018.07.008

2.

Frankowiak
K
,
Roniewicz
E
,
Stolarski
J
.
Photosymbiosis in Late Triassic scleractinian corals from the Italian Dolomites
.
PeerJ
2021
;
9
:
e11062
. https://doi.org/10.7717/peerj.11062

3.

Guerrero
R
,
Margulis
L
,
Berlanga
M
.
Symbiogenesis: the holobiont as a unit of evolution
.
Int Microbiol
2013
;
16
:
133
43
. https://doi.org/10.2436/20.1501.01.188

4.

McCutcheon
JP
,
Moran
NA
.
Extreme genome reduction in symbiotic bacteria
.
Nat Rev Microbiol
2012
;
10
:
13
26
. https://doi.org/10.1038/nrmicro2670

5.

González-Pech
RA
,
Bhattacharya
D
,
Ragan
MA
et al.
Genome evolution of coral reef symbionts as intracellular residents
.
Trends Ecol Evol
2019
;
34
:
799
806
. https://doi.org/10.1016/j.tree.2019.04.010

6.

LaJeunesse
TC
,
Wiedenmann
J
,
Casado-Amezúa
P
et al.
Revival of Philozoon Geddes for host-specialized dinoflagellates, ‘zooxanthellae’, in animals from coastal temperate zones of northern and southern hemispheres
.
Eur J Phycol
2022
;
57
:
166
80
. https://doi.org/10.1080/09670262.2021.1914863

7.

Vulpius
S
,
Kiessling
W
.
New constraints on the last aragonite–calcite sea transition from early Jurassic ooids
.
Facies
2017
;
64
:
3
. https://doi.org/10.1007/s10347-017-0516-x

8.

Schettino
A
,
Turco
E
.
Breakup of Pangaea and plate kinematics of the Central Atlantic and Atlas regions
.
Geophys J Int
2009
;
178
:
1078
97
. https://doi.org/10.1111/j.1365-246X.2009.04186.x

9.

de Winter
NJ
,
Goderis
S
,
Van Malderen
SJM
et al.
Subdaily-scale chemical variability in a Torreites sanchezi rudist shell: implications for rudist paleobiology and the cretaceous day-night cycle
.
Paleoceanogr Paleoclimatol
2020
;
35
:
e2019PA003723
. https://doi.org/10.1029/2019PA003723

10.

Stanley
GD
,
van de Schootbrugge
B
; The evolution of the coral–algal symbiosis and coral bleaching in the geologic past. In:
van
Oppen
MJH
,
Lough
JMS
(eds).,
Coral Bleaching: Patterns, Processes, Causes and Consequences
.
Cham
:
Springer International Publishing
,
2018
,
9
26
. https://doi.org/10.1007/978-3-319-75393-5_2

11.

Pandolfi
JM
,
Kiessling
W
.
Gaining insights from past reefs to inform understanding of coral reef response to global climate change
.
Curr Opin Environ Sustain
2014
;
7
:
52
8
. https://doi.org/10.1016/j.cosust.2013.11.020

12.

Veron
JEN
. Scleractinia, evolution and taxonomy. In:
Hopley
D.S.
(ed.),
Encyclopedia of Modern Coral Reefs: Structure, Form and Process
.
Dordrecht
:
Springer Netherlands
,
2011
,
947
57
.

13.

Stephens
TG
,
González-Pech
RA
,
Cheng
Y
et al.
Genomes of the dinoflagellate Polarella glacialis encode tandemly repeated single-exon genes with adaptive functions
.
BMC Biol
2020
;
18
:
56
. https://doi.org/10.1186/s12915-020-00782-8

14.

Li
T
,
Yu
L
,
Song
B
et al.
Genome improvement and core gene set refinement of Fugacium kawagutii
.
Microorganisms
2020
;
8
:
102
. https://doi.org/10.3390/microorganisms8010102

15.

González-Pech
RA
,
Stephens
TG
,
Chen
Y
et al.
Comparison of 15 dinoflagellate genomes reveals extensive sequence and structural divergence in family Symbiodiniaceae and genus Symbiodinium
.
BMC Biol
2021
;
19
:
73
. https://doi.org/10.1186/s12915-021-00994-6

16.

Chen
Y
,
Shah
S
,
Dougan
KE
et al.
Improved Cladocopium goreaui genome assembly reveals features of a facultative coral symbiont and the complex evolutionary history of dinoflagellate genes
.
Microorganisms
2022
;
10
:
1662
. https://doi.org/10.3390/microorganisms10081662

17.

LaJeunesse
TC
,
Lambert
G
,
Andersen
RA
et al.
Symbiodinium (Pyrrhophyta) genome sizes (DNA content) are smallest among dinoflagellates
.
J Phycol
2005
;
41
:
880
6
. https://doi.org/10.1111/j.0022-3646.2005.04231.x

18.

Saad
OS
,
Lin
X
,
Ng
TY
et al.
Genome size, rDNA copy, and qPCR assays for Symbiodiniaceae
.
Front Microbiol
2020
;
11
:
847
. https://doi.org/10.3389/fmicb.2020.00847

19.

Dougan
KE
,
Deng
Z-L
,
Wöhlbrand
L
et al.
Multi-omics analysis reveals the molecular response to heat stress in a “red tide” dinoflagellate
.
Genome Biol
2023
;
24
:
265
. https://doi.org/10.1186/s13059-023-03107-4

20.

Hou
Y
,
Lin
S
.
Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes
.
PLoS One
2009
;
4
:
e6978
. https://doi.org/10.1371/journal.pone.0006978

21.

Nitschke
MR
,
Craveiro
SC
,
Brandao
C
et al.
Description of Freudenthalidium gen. nov. and Halluxium gen. nov. to formally recognize Clades Fr3 and H as genera in the family Symbiodiniaceae (Dinophyceae)
.
J Phycol
2020
;
56
:
923
40
. https://doi.org/10.1111/jpy.12999

22.

Pochon
X
,
LaJeunesse
TC
.
Miliolidium n. gen, a new symbiodiniacean genus whose members associate with soritid foraminifera or are free-living
.
J Eukaryot Microbiol
2021
;
68
:
e12856
. https://doi.org/10.1111/jeu.12856

23.

Jeong
HJ
,
Lee
SY
,
Kang
N
et al.
Genetics and morphology characterize the dinoflagellate Symbiodinium voratum, n. sp., (Dinophyceae) as the sole representative of Symbiodinium clade E
.
J Eukaryot Microbiol
2014
;
61
:
75
94
. https://doi.org/10.1111/jeu.12088

24.

Lee
MJ
,
Jeong
HJ
,
Jang
SH
et al.
Most low-abundance "background" Symbiodinium spp. are transitory and have minimal functional significance for symbiotic corals
.
Microb Ecol
2016
;
71
:
771
83
. https://doi.org/10.1007/s00248-015-0724-2

25.

Yang
F
,
Wei
Z
,
Long
L
.
Response mechanisms to ocean warming exposure in Effrenium voratum (Symbiodiniaceae)
.
Mar Pollut Bull
2022
;
182
:
114032
. https://doi.org/10.1016/j.marpolbul.2022.114032

26.

Xiang
T
,
Hambleton
EA
,
DeNofrio
JC
et al.
Isolation of clonal axenic strains of the symbiotic dinoflagellate Symbiodinium and their growth and host specificity
.
J Phycol
2013
;
49
:
447
58
. https://doi.org/10.1111/jpy.12055

27.

Gabay
Y
,
Weis
VM
,
Davy
SK
.
Symbiont identity influences patterns of symbiosis establishment, host growth, and asexual reproduction in a model cnidarian-dinoflagellate symbiosis
.
Biol Bull
2018
;
234
:
1
10
. https://doi.org/10.1086/696365

28.

Grabherr
MG
,
Haas
BJ
,
Yassour
M
et al.
Full-length transcriptome assembly from RNA-Seq data without a reference genome
.
Nat Biotechnol
2011
;
29
:
644
52
. https://doi.org/10.1038/nbt.1883

29.

Kim
D
,
Paggi
JM
,
Park
C
et al.
Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype
.
Nat Biotechnol
2019
;
37
:
907
15
. https://doi.org/10.1038/s41587-019-0201-4

30.

Zimin
AV
,
Marçais
G
,
Puiu
D
et al.
The MaSuRCA genome assembler
.
Bioinformatics
2013
;
29
:
2669
77
. https://doi.org/10.1093/bioinformatics/btt476

31.

Weisenfeld
NI
,
Kumar
V
,
Shah
P
et al.
Direct determination of diploid genome sequences
.
Genome Res
2017
;
27
:
757
67
. https://doi.org/10.1101/gr.214874.116

32.

Xue
W
,
Li
J-T
,
Zhu
Y-P
et al.
L_RNA_scaffolder: scaffolding genomes with transcripts
.
BMC Genomics
2013
;
14
:
604
. https://doi.org/10.1186/1471-2164-14-604

33.

Hiltunen
M
,
Ryberg
M
,
Johannesson
H
.
ARBitR: an overlap-aware genome assembly scaffolder for linked reads
.
Bioinformatics
2020
;
37
:
2203
5
. https://doi.org/10.1093/bioinformatics/btaa975

34.

Johnson
LK
,
Alexander
H
,
Brown
CT
.
Re-assembly, quality evaluation, and annotation of 678 microbial eukaryotic reference transcriptomes
.
GigaScience
2019
;
8
:
giy158
. https://doi.org/10.1093/gigascience/giy158

35.

Laetsch
D
,
Blaxter
M
.
BlobTools: interrogation of genome assemblies [version 1; peer review: 2 approved with reservations]
.
F1000Res
2017
;
6
:1287. https://doi.org/10.12688/f1000research.12232.1

36.

Simão
FA
,
Waterhouse
RM
,
Ioannidis
P
et al.
BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs
.
Bioinformatics
2015
;
31
:
3210
2
. https://doi.org/10.1093/bioinformatics/btv351

37.

Marçais
G
,
Delcher
AL
,
Phillippy
AM
et al.
MUMmer4: a fast and versatile genome alignment system
.
PLoS Comput Biol
2018
;
14
:
e1005944
. https://doi.org/10.1371/journal.pcbi.1005944

38.

Hume
BCC
,
Smith
EG
,
Ziegler
M
et al.
SymPortal: a novel analytical framework and platform for coral algal symbiont next-generation sequencing ITS2 profiling
.
Mol Ecol Resour
2019
;
19
:
1063
80
. https://doi.org/10.1111/1755-0998.13004

39.

Katoh
K
,
Standley
DM
.
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
.
Mol Biol Evol
2013
;
30
:
772
80
. https://doi.org/10.1093/molbev/mst010

40.

Capella-Gutiérrez
S
,
Silla-Martínez
JM
,
Gabaldón
T
.
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
.
Bioinformatics
2009
;
25
:
1972
3
. https://doi.org/10.1093/bioinformatics/btp348

41.

Nguyen
L-T
,
Schmidt
HA
,
von Haeseler
A
et al.
IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
.
Mol Biol Evol
2015
;
32
:
268
74
. https://doi.org/10.1093/molbev/msu300

42.

Beedessee
G
,
Kubota
T
,
Arimoto
A
et al.
Integrated omics unveil the secondary metabolic landscape of a basal dinoflagellate
.
BMC Biol
2020
;
18
:
139
. https://doi.org/10.1186/s12915-020-00873-6

43.

John
U
,
Lu
Y
,
Wohlrab
S
et al.
An aerobic eukaryotic parasite with functional mitochondria that likely lacks a mitochondrial genome
.
Sci Adv
2019
;
5
:
eaav1110
. https://doi.org/10.1126/sciadv.aav1110

44.

Lo
R
,
Dougan
KE
,
Chen
Y
et al.
Alignment-free analysis of whole-genome sequences from Symbiodiniaceae reveals different phylogenetic signals in distinct regions
.
Front Plant Sci
2022
;
13
:
815714
. https://doi.org/10.3389/fpls.2022.815714

45.

Emms
DM
,
Kelly
S
.
OrthoFinder: phylogenetic orthology inference for comparative genomics
.
Genome Biol
2019
;
20
:
238
. https://doi.org/10.1186/s13059-019-1832-y

46.

Shah
S
,
Dougan
KE
,
Chen
Y
et al.
Gene duplication is the primary driver of intraspecific genomic divergence in coral algal symbionts
.
Open Biol
2023
;
13
:
230182
. https://doi.org/10.1098/rsob.230182

47.

Shoguchi
E
,
Beedessee
G
,
Tada
I
et al.
Two divergent Symbiodinium genomes reveal conservation of a gene cluster for sunscreen biosynthesis and recently lost genes
.
BMC Genomics
2018
;
19
:
458
. https://doi.org/10.1186/s12864-018-4857-9

48.

Dougan
KE
,
Bellantuono
AJ
,
Kahlke
T
et al.
Whole-genome duplication in an algal symbiont serendipitously confers thermal tolerance to corals
.
bioRxiv
2022
;
2022.04.10.487810
. https://doi.org/10.1101/2022.04.10.487810

49.

Robbins
SJ
,
Singleton
CM
,
Chan
CX
et al.
A genomic view of the reef-building coral Porites lutea and its microbial symbionts
.
Nat Microbiol
2019
;
4
:
2090
100
. https://doi.org/10.1038/s41564-019-0532-4

50.

Shoguchi
E
,
Shinzato
C
,
Kawashima
T
et al.
Draft assembly of the Symbiodinium minutum nuclear genome reveals dinoflagellate gene structure
.
Curr Biol
2013
;
23
:
1399
408
. https://doi.org/10.1016/j.cub.2013.05.062

51.

Nand
A
,
Zhan
Y
,
Salazar
OR
et al.
Genetic and spatial organization of the unusual chromosomes of the dinoflagellate Symbiodinium microadriaticum
.
Nat Genet
2021
;
53
:
618
29
. https://doi.org/10.1038/s41588-021-00841-y

52.

Adl
SM
,
Bass
D
,
Lane
CE
et al.
Revisions to the classification, nomenclature, and diversity of eukaryotes
.
J Eukaryot Microbiol
2019
;
66
:
4
119
. https://doi.org/10.1111/jeu.12691

53.

Bhattacharya
D
,
Stephens
TG
,
Chille
EE
et al.
Facultative lifestyle drives diversity of coral algal symbionts
.
Trends Ecol Evol
2024
;
39
:
239
47
. https://doi.org/10.1016/j.tree.2023.10.005

54.

Hershberg
R
,
Petrov
DA
.
Evidence that mutation is universally biased towards AT in bacteria
.
PLoS Genet
2010
;
6
:
e1001115
. https://doi.org/10.1371/journal.pgen.1001115

55.

Nikbakht
H
,
Xia
X
,
Hickey
DA
.
The evolution of genomic GC content undergoes a rapid reversal within the genus Plasmodium
.
Genome
2014
;
57
:
507
11
. https://doi.org/10.1139/gen-2014-0158

56.

Blanc
G
,
Duncan
G
,
Agarkova
I
et al.
The Chlorella variabilis NC64A genome reveals adaptation to photosymbiosis, coevolution with viruses, and cryptic sex
.
Plant Cell
2010
;
22
:
2943
55
. https://doi.org/10.1105/tpc.110.076406

57.

Farhat
S
,
Le
P
,
Kayal
E
et al.
Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp
.
BMC Biol
2021
;
19
:
1
. https://doi.org/10.1186/s12915-020-00927-9

58.

Zhang
H
,
Lin
S
.
Mitochondrial cytochrome b mRNA editing in dinoflagellates: possible ecological and evolutionary associations?
J Eukaryot Microbiol
2005
;
52
:
538
45
. https://doi.org/10.1111/j.1550-7408.2005.00060.x

59.

Chan
KO
,
Hutter
CR
,
Wood
PL
et al.
Larger, unfiltered datasets are more effective at resolving phylogenetic conflict: introns, exons, and UCEs resolve ambiguities in golden-backed frogs (Anura: Ranidae; genus Hylarana)
.
Mol Phylogen Evol
2020
;
151
:
106899
. https://doi.org/10.1016/j.ympev.2020.106899

60.

Pessia
E
,
Popa
A
,
Mousset
S
et al.
Evidence for widespread GC-biased gene conversion in eukaryotes
.
Genome Biol Evol
2012
;
4
:
675
82
. https://doi.org/10.1093/gbe/evs052

61.

Kiessling
W
,
Roniewicz
E
,
Villier
L
et al.
An early Hettangian coral reef in southern France implications for the end-Triassic reef crisis
.
PALAIOS
2009
;
24
:
657
71
. https://doi.org/10.2110/palo.2009.p09-030r

62.

Stanley
GD
.
The evolution of modern corals and their early history
.
Earth-Sci Rev
2003
;
60
:
195
225
. https://doi.org/10.1016/S0012-8252(02)00104-6

63.

Enríquez
S
,
Méndez
ER
,
Hoegh-Guldberg
O
et al.
Key functional role of the optical properties of coral skeletons in coral ecology and evolution
.
Proc R Soc B
2017
;
284
:
20161667
. https://doi.org/10.1098/rspb.2016.1667

64.

Liew
YJ
,
Li
Y
,
Baumgarten
S
et al.
Condition-specific RNA editing in the coral symbiont Symbiodinium microadriaticum
.
PLoS Genet
2017
;
13
:
e1006619
. https://doi.org/10.1371/journal.pgen.1006619

65.

De Schepper
S
,
Ray
JL
,
Skaar
KS
et al.
The potential of sedimentary ancient DNA for reconstructing past sea ice evolution
.
ISME J
2019
;
13
:
2566
77
. https://doi.org/10.1038/s41396-019-0457-1

66.

Nitschke
MR
,
Fidalgo
C
,
Simões
J
et al.
Symbiolite formation: a powerful in vitro model to untangle the role of bacterial communities in the photosynthesis-induced formation of microbialites
.
ISME J
2020
;
14
:
1533
46
. https://doi.org/10.1038/s41396-020-0629-z

67. Steuber T, Löser H. Species richness and abundance patterns of Tethyan Cretaceous rudist bivalves (Mollusca: Hippuritacea) in the central-eastern Mediterranean and Middle East, analysed from a palaeontological database. Palaeogeogr Palaeoclimatol Palaeoecol 2000;162:75–104. https://doi.org/10.1016/S0031-0182(00)00106-1

68. McCoy MJ, Fire AZ. Intron and gene size expansion during nervous system evolution. BMC Genomics 2020;21:360. https://doi.org/10.1186/s12864-020-6760-4

69. Cai L, Arnold BJ, Xi Z et al. Deeply altered genome architecture in the endoparasitic flowering plant Sapria himalayana Griff. (Rafflesiaceae). Curr Biol 2021;31:1002–1011.e9. https://doi.org/10.1016/j.cub.2020.12.045

70. Luo H, Friedman R, Tang J et al. Genome reduction by deletion of paralogs in the marine cyanobacterium Prochlorococcus. Mol Biol Evol 2011;28:2751–60. https://doi.org/10.1093/molbev/msr081

71. Bhattacharya D, Qiu H, Lee J et al. When less is more: red algae as models for studying gene loss and genome evolution in eukaryotes. Crit Rev Plant Sci 2018;37:81–99. https://doi.org/10.1080/07352689.2018.1482364

72. Ishida H, John U, Murray SA et al. Developing model systems for dinoflagellates in the post-genomic era. J Phycol 2023;59:799–808. https://doi.org/10.1111/jpy.13386

73. Polne-Fuller M. A novel technique for preparation of axenic cultures of Symbiodinium (Pyrrophyta) through selective digestion by amoebae. J Phycol 1991;27:552–4. https://doi.org/10.1111/j.0022-3646.1991.00552.x

74. Chang FH. Winter phytoplankton and microzooplankton populations off the coast of Westland, New Zealand, 1979. N Z J Mar Freshwat Res 1983;17:279–304. https://doi.org/10.1080/00288330.1983.9516003

75. Quattrini AM, Rodríguez E, Faircloth BC et al. Palaeoclimate ocean conditions shaped the evolution of corals and their skeletons through deep time. Nat Ecol Evol 2020;4:1531–8. https://doi.org/10.1038/s41559-020-01291-1

76. Grossman EL, Joachimski MM. Ocean temperatures through the Phanerozoic reassessed. Sci Rep 2022;12:8938. https://doi.org/10.1038/s41598-022-11493-1

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.