Reference genome of the Monkeyface Prickleback, Cebidichthys violaceus

Table 1.

Assembly pipeline and software used.

Assembly	Software and options^a	Version
Filtering PacBio HiFi adapters	HiFiAdapterFilt	Commit 64d1c7b
K-mer counting	Meryl	1
Estimation of genome size and heterozygosity	GenomeScope	2
De novo assembly (contiging)	HiFiasm (Hi-C mode, --primary, p_ctg and a_ctg output)	0.16.1-r375
Remove low-coverage, duplicated contigs	purge_dups	1.2.6
Scaffolding
Omni-C scaffolding	SALSA (-DNASE, -i 20, -p yes)	2
Gap closing	YAGCloser (-mins 2 -f 20 -mcc 2 -prt 0.25 -eft 0.2 -pld 0.2)	Commit 20e2769
Omni-C contact map generation
Short-read alignment	BWA-MEM (-5SP)	0.7.17-r1188
SAM/BAM processing	samtools	1.11
SAM/BAM filtering	pairtools	0.3.0
Pairs indexing	pairix	0.3.7
Matrix generation	cooler	0.8.10
Matrix balancing	HiCExplorer (hicCorrectMatrix correct --filterThreshold -2 4)	3.6
Contact map visualization	HiGlass	2.1.11
	PretextMap	0.1.4
	PretextView	0.1.5
	PretextSnapshot	0.03
Organelle assembly
Mitogenome assembly	MitoHiFi (-r, -p 50, -o 1)	Commit c06ed3e
Genome quality assessment
Basic assembly metrics	QUAST (--est-ref-size)	5.0.2
Assembly completeness	BUSCO (-m geno, -l actinopterygii)	5.0.0
	Merqury	2022-01-29
	Repeat Masker (-s, “actinopterygii”)	4.1.2-p1
Contamination screening
General contamination screening	BlobToolKit	2.3.3
Local sequence alignment	BLAST+	2.1

Software citations are listed in the text.

^aOptions detailed for nondefault parameters.

Next, we identified sequences corresponding to haplotypic duplications and contig overlaps on the primary assembly with purge_dups (Guan et al. 2020), transferred them to the alternate assembly, and scaffolded both assemblies using the Omni-C data with SALSA (Ghurye et al. 2019).

The primary assembly was manually curated by generating and analyzing Omni-C contact maps and breaking the assembly where major misassemblies were found. No further joins were made after this step. To generate the contact maps, we aligned the Omni-C data against the corresponding reference with BWA-MEM (Li 2013), identified ligation junctions, and generated Omni-C pairs using pairtools (Goloborodko et al. 2018). We generated a multi-resolution Omni-C matrix with Cooler (Abdennur and Mirny 2020) and balanced it with hicExplorer (Ramírez et al. 2018). We used HiGlass (Kerpedjiev et al. 2018) and the PretextSuite (https://github.com/wtsi-hpag/PretextView; https://github.com/wtsi-hpag/PretextMap; https://github.com/wtsi-hpag/PretextSnapshot) to visualize the contact maps.

We closed the remaining gaps generated during scaffolding with the PacBio HiFi reads and YAGCloser (https://github.com/merlyescalona/yagcloser). We then checked for contamination using the BlobToolKit Framework (Challis et al. 2020). Finally, we trimmed remnants of sequence adaptors and mitochondrial contamination based on NCBI contamination screening.

Mitochondrial genome assembly

We assembled the mitochondrial genome of the Monkeyface Prickleback from the PacBio HiFi reads using the reference-guided pipeline MitoHiFi (https://github.com/marcelauliano/MitoHiFi; Allio et al. 2020). The mitochondrial sequence of Dictyosoma burgeri (family Stichaeidae; NCBI:NC_053709.1) was used as the starting reference sequence. After completion of the nuclear genome, we searched for matches of the resulting mitochondrial assembly sequence in the nuclear genome assembly using BLAST+ (Camacho et al. 2009) and filtered out contigs and scaffolds from the nuclear genome with a percentage of sequence identity >99% and size smaller than the mitochondrial assembly sequence.
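The filtering rule described above (sequence identity >99% and length shorter than the mitochondrial assembly) can be sketched in a few lines of Python. This is a minimal illustration, not the script used for this assembly: the `Hit` record, the function name, and the example hits are hypothetical, and it assumes the BLAST+ results have already been parsed into subject name, percent identity, and subject length (for instance from a tabular output that includes the `sseqid`, `pident`, and `slen` fields).

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One BLAST hit of the mitochondrial query against a nuclear sequence."""
    subject_id: str          # nuclear contig or scaffold name
    percent_identity: float  # percent identity of the alignment
    subject_length: int      # full length of the nuclear sequence, in bp


def mito_like_sequences(hits: Iterable[Hit], mito_length: int,
                        min_identity: float = 99.0) -> List[str]:
    """Return nuclear sequence names that match the removal criteria.

    A sequence is flagged when a hit exceeds min_identity and the sequence
    itself is shorter than the mitochondrial assembly.
    """
    flagged = {
        hit.subject_id
        for hit in hits
        if hit.percent_identity > min_identity
        and hit.subject_length < mito_length
    }
    return sorted(flagged)


# Hypothetical hits, for illustration only.
example_hits = [
    Hit("contig_A", 99.8, 12_000),    # high identity, shorter than mitogenome
    Hit("contig_B", 99.9, 250_000),   # high identity, but far too long
    Hit("contig_C", 97.5, 9_000),     # short, but identity too low
]

# 16,511 bp is the final mitochondrial genome size reported in the Results.
print(mito_like_sequences(example_hits, mito_length=16_511))
```

Only `contig_A` satisfies both conditions in this toy input; the length condition keeps long nuclear scaffolds that merely contain mitochondrial-like insertions from being discarded.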

Genome size estimation and quality assessment

We generated k-mer counts from the PacBio HiFi reads using meryl (https://github.com/marbl/meryl). The generated k-mer database was then used in GenomeScope2.0 (Ranallo-Benavidez et al. 2020) to estimate genome features including genome size, heterozygosity, and repeat content. To obtain general contiguity metrics, we ran QUAST (Gurevich et al. 2013). To evaluate genome quality and completeness, we used BUSCO (Manni et al. 2021) with the Actinopterygii ortholog database (actinopterygii_odb10), which contains 3,640 genes. Assessment of base-level accuracy (QV) and k-mer completeness was performed using the previously generated meryl database and merqury (Rhie et al. 2020a). We further estimated genome assembly accuracy via BUSCO gene set frameshift analysis using a pipeline previously described in Korlach et al. (2017). Following the data availability and quality metrics established in Rhie et al. (2020b), we use the derived genome quality notation x·y·Q·C, where x = log10[contig NG50]; y = log10[scaffold NG50]; Q = Phred base accuracy QV (quality value); C = % genome represented by the first “n” scaffolds, following a known karyotype of 2n = 48 inferred from ancestral taxa. Quality metrics for the notation were calculated on the primary assembly.
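As a worked illustration of the x·y·Q·C notation, the sketch below builds the quality code from the four underlying values. It assumes that each component is truncated to an integer (floor), which is an interpretation rather than something stated in the text; the function name is hypothetical. The inputs are the primary-assembly values listed in Table 2, with C taken directly from the reported code because the per-scaffold lengths needed to recompute it are not given here.

```python
import math


def quality_code(contig_ng50: int, scaffold_ng50: int,
                 base_qv: float, pct_in_first_n: float) -> str:
    """Format the x.y.Q.C quality code.

    Assumption: every component is floored to an integer.
    """
    x = math.floor(math.log10(contig_ng50))    # order of magnitude of contig NG50
    y = math.floor(math.log10(scaffold_ng50))  # order of magnitude of scaffold NG50
    q = math.floor(base_qv)                    # Phred base accuracy
    c = math.floor(pct_in_first_n)             # % of genome in the first n scaffolds
    return f"{x}.{y}.Q{q}.C{c}"


# Primary-assembly values from Table 2; C = 68 is taken from the reported code.
print(quality_code(1_215_027, 16_819_117, 35.7735, 68.0))
```

Under the flooring assumption these inputs give 6.7.Q35.C68, which agrees with the code listed for fCebVio1.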

Finally, using Repeat Masker (Smit, Hubley, and Green) we tabulated the repeat content of the assembled sequence by running a slow search and comparing our assembly to the library of known repeats from Actinopterygii (ray-finned fishes).

Results

Mitochondrial assembly

We assembled a mitochondrial genome with MitoHiFi. Final mitochondrial genome size was 16,511 bp. The base composition of the final assembly version is A = 26.62%, C = 27.68%, G = 17.85%, T = 27.85%, and consists of 22 unique transfer RNAs and 13 protein coding genes.
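For readers who want to reproduce a base composition summary of this kind from a FASTA sequence, a minimal sketch follows. The function name and the toy sequence are hypothetical; percentages are computed over A, C, G, and T only, which is an assumption about how ambiguous bases would be handled.

```python
from collections import Counter
from typing import Dict


def base_composition(sequence: str) -> Dict[str, float]:
    """Return the percentage of A, C, G and T, rounded to two decimals.

    Assumption: characters other than A/C/G/T are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: round(100 * counts[base] / total, 2) for base in "ACGT"}


# Toy sequence, for illustration only.
print(base_composition("AACCGT"))

# The four percentages reported above should account for the whole sequence.
reported = {"A": 26.62, "C": 27.68, "G": 17.85, "T": 27.85}
print(round(sum(reported.values()), 2))
```

The second check simply confirms that the reported percentages add up to 100%.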

Nuclear assembly

We generated a de novo nuclear genome assembly of the Monkeyface Prickleback using 67.3 million read pairs of Omni-C data and 1.5 million PacBio HiFi reads. The latter yielded ~43.95-fold coverage (N50 read length 15,459 bp; minimum read length 43 bp; mean read length 15,332 bp; maximum read length 49,720 bp) based on the GenomeScope2.0 genome size estimation of 494.2 Mb. The k-mer spectrum output shows a distribution with a major peak at ~14 (Fig. 2A). Based on PacBio HiFi reads, we estimated a sequencing error rate of 0.234% and a nucleotide heterozygosity rate of 0.933%.
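The fold coverage quoted above is the total number of sequenced bases divided by the estimated genome size. The sketch below shows the arithmetic using the rounded HiFi yield listed in Table 2 (21.7 Gb); because that figure is rounded, the result is close to, but not exactly, the reported 43.95×. The function name is hypothetical.

```python
def fold_coverage(total_bases: float, genome_size_bp: float) -> float:
    """Average sequencing depth: sequenced bases per base of genome."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


hifi_bases = 21.7e9      # rounded HiFi yield from Table 2
genome_size = 494.2e6    # GenomeScope2.0 estimate used in the text

print(f"{fold_coverage(hifi_bases, genome_size):.1f}x")
```

The same estimated genome size is the denominator for the NGx statistics, so any revision of that estimate would change both coverage and NG50 values.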

Fig. 2.

Visual overview of genome assembly metrics. A) K-mer spectra output generated from PacBio HiFi data without adapters using GenomeScope2.0. B) BlobToolKit Snail plot showing a graphical representation of the quality metrics presented in Table 2 for the Cebidichthys violaceus primary assembly. The plot circle represents the full size of the assembly. From the inside-out, the central plot covers length-related metrics. The red line represents the size of the longest scaffold; all other scaffolds are arranged in size-order moving clockwise around the plot and drawn in gray starting from the outside of the central plot. Dark and light orange arcs show the scaffold N50 and scaffold N90 values. The central light gray spiral shows the cumulative scaffold count with a white line at each order of magnitude. White regions in this area reflect the proportion of Ns in the assembly. The dark versus light blue area around it shows mean, maximum and minimum GC versus AT content at 0.1% intervals (Challis et al. 2020). Omni-C contact maps for the primary (C) and alternate (D) genome assembly generated with PretextSnapshot. Hi-C contact maps translate proximity of genomic regions in 3D space to contiguous linear organization. Each cell in the contact map corresponds to sequencing data supporting the linkage (or join) between 2 of such regions.

The final assembly (fCebVio1) consists of 2 pseudo haplotypes, primary and alternate. Both genome sizes are close but not identical to the estimated value from GenomeScope2.0 (Fig. 2A, Pflug et al. 2020). The primary assembly consists of 725 scaffolds (1,661 contigs) spanning 575.6 Mb with contig N50 of 1 Mb, scaffold N50 of 16.3 Mb, longest contig of 8.6 Mb, and largest scaffold of 25.3 Mb. The alternate assembly consists of 486 scaffolds (1,413 contigs) spanning 606.1 Mb with contig N50 of 1.11 Mb, scaffold N50 of 12.9 Mb, longest contig of 10.4 Mb, and largest scaffold of 27.3 Mb. Assembly statistics are reported in tabular form in Table 2, and a graphical representation for the primary assembly is shown in Fig. 2B.
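The N50 and NG50 values reported here differ only in the total against which the halfway point is measured: N50 uses the assembly length, whereas NG50 uses the estimated genome size. A minimal sketch of both, with hypothetical function names and toy lengths, is given below; it is not the QUAST implementation.

```python
from typing import Iterable, Optional


def n50(lengths: Iterable[int], total: Optional[int] = None) -> Optional[int]:
    """Length L such that sequences of length >= L cover half of `total`.

    With total=None the assembly length is used (N50). Passing an estimated
    genome size as `total` gives NG50. Returns None if the sequences never
    reach half of `total`.
    """
    ordered = sorted(lengths, reverse=True)
    target = (sum(ordered) if total is None else total) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    return None


# Toy scaffold lengths (arbitrary units), for illustration only.
toy = [80, 70, 50, 40, 30, 20]      # assembly length = 290
print(n50(toy))                     # halfway point is 145
print(n50(toy, total=200))          # genome smaller than assembly
print(n50(toy, total=400))          # genome larger than assembly
```

When the assembly is longer than the estimated genome, as it is for both pseudo haplotypes here (575.6 Mb and 606.1 Mb versus 494.2 Mb), the halfway point is reached sooner and NG50 is at least as large as N50, which matches the pattern in Table 2.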

We identified a total of 17 misassemblies, 10 on the primary assembly and 7 on the alternate, and broke the corresponding joins made by SALSA2 on both assemblies. We were able to close a total of 18 gaps, 9 per assembly. We further filtered out 5 contigs corresponding to arthropod contaminants (3 contigs from the primary assembly and 2 from the alternate). Finally, we filtered out a single contig from the alternate assembly corresponding to mitochondrial contamination. No further contigs were removed. The primary assembly has a BUSCO completeness score of 93.2% using the actinopterygii gene set, a per base quality (QV) of 35.77, a k-mer completeness of 94.11 and a frameshift indel QV of 46.54. The alternate assembly has a BUSCO completeness score of 97.4% (96.5% single-copy) using the actinopterygii gene set, a per base quality (QV) of 35.6, a k-mer completeness of 98.44 and a frameshift indel QV of 46.25. The Omni-C contact maps show that both assemblies are highly contiguous with some chromosome-length scaffolds (Fig. 2C and D). We have deposited scaffolds corresponding to both primary and alternate haplotype (see Table 2 and Data availability for details).
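The QV figures above are on the Phred scale, where QV = −10·log10(error probability). The short sketch below converts a QV into an error rate and into the expected spacing between errors; the function name is hypothetical, and the QV passed in is the primary-assembly base pair QV from Table 2.

```python
def phred_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value into a per-base error probability."""
    return 10 ** (-qv / 10)


primary_qv = 35.7735  # base pair QV of the primary assembly (Table 2)
error_rate = phred_to_error_rate(primary_qv)

print(f"error rate per base: {error_rate:.2e}")
print(f"roughly one error every {1 / error_rate:,.0f} bases")
```

On this scale a QV of 30 corresponds to one expected error per 1,000 bases and a QV of 40 to one per 10,000, so a QV near 35.8 falls between the two, at roughly one error every few thousand bases.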

Table 2.

Sequencing and assembly statistics, and accession numbers.

BioProjects and Vouchers
CCGP NCBI BioProject	PRJNA720569
Genera NCBI BioProject	PRJNA766285
Species NCBI BioProject	PRJNA777152
NCBI BioSample	SAMN25872352
Specimen identification	CVI_PGR_0920_01
NCBI Genome accessions	Primary	Alternate
Assembly accession	JAKSXS000000000	JAKSXT000000000
Genome sequences	GCA_023349555.1	GCA_023349535.1

Genome Sequence
PacBio HiFi reads	Run	1 PACBIO_SMRT (Sequel II) run: 1.2M spots, 21.7G bases, 15.6 Gb
	Accession	SRX15703629
Omni-C Illumina reads	Run	1 ILLUMINA (Illumina NovaSeq 6000) run: 48.8M spots, 14.7G bases, 4.7 Gb
	Accession	SRX15703630

Genome Assembly Quality Metrics
Assembly identifier (quality code^a)	fCebVio1(6.7.Q35.C68)
HiFi read coverage^b	43.95×
	Primary	Alternate
Number of contigs	1,661	1,413
Contig N50 (bp)	1,006,396	1,119,041
Contig NG50 (bp)^b	1,215,027	1,642,841
Longest contigs	8,638,030	10,494,032
Number of scaffolds	725	486
Scaffold N50 (bp)	16,359,613	12,913,723
Scaffold NG50 (bp)^b	16,819,117	14,679,334
Largest scaffold	25,343,235	27,304,300
Size of final assembly (bp)	575,660,146	606,177,218
Gaps per Gbp (#Gaps)	1,625 (936)	1,529 (927)
Indel QV (frameshift)	46.5463	46.1586
Base pair QV	35.7735	35.6002	Full assembly = 35.6837
K-mer completeness	94.1174	98.4403	Full assembly = 99.5766
BUSCO completeness (actinopterygii), n = 3640	C	S	D	F	M
P^c	93.20%	92.50%	0.70%	0.80%	6.00%
A^c	97.40%	96.50%	0.90%	0.70%	1.90%
Organelles	1 partial mitochondrial sequence	JAKSXS010000725.1

^aAssembly quality code x·y·Q·C derived notation, from Rhie et al. (2020b). x = log10[contig NG50]; y = log10[scaffold NG50]; Q = Phred base accuracy QV (quality value); C = % genome represented by the first “n” scaffolds, following a known karyotype of 2n. In this case, 2n = 48 inferred from ancestral taxa. Quality code for all the assembly denoted by primary assembly (fCebVio1.0.p). BUSCO scores: (C)omplete, (S)ingle-copy, (D)uplicated, (F)ragmented, and (M)issing genes; n, number of BUSCO genes in the set.

^bRead coverage and NGx statistics have been calculated based on the estimated genome size of 494.2 Mb.

^cP(rimary) and (A)lternate assembly values.


In total, RepeatMasker identified 53,134,428 bp of repeat sequence (8.37% of the genome). Retroelements were estimated to make up 1.51% of the genome and DNA transposons were estimated to make up 2.12%. Simple repeats were the largest repeat group, making up 4.05% of the genome, while low complexity regions, satellites, and small RNA (rRNA, snRNA, tRNA) accounted for 0.45%, 0.04%, and 0.03%, respectively.

Discussion

As a recreationally important species and a candidate species for aquaculture, the Monkeyface Prickleback represents an important species for inclusion in CCGP. Despite the recreational and commercial value of the Monkeyface Prickleback, its stock size, annual take, and threat status are currently unknown/unevaluated (Froese and Pauly 2022). This, coupled with its slow growth and relatively long generation time (up to 7 yr) makes C. violaceus a potential species of conservation concern.

The majority of scientific research published to date on C. violaceus has been focused on digestion and ontogeny of the gut (German and Horn 2006; German et al. 2015; Heras et al. 2020). There has been little genetic work published on the Monkeyface Prickleback (Hinegardner and Rosen 1972; Kim et al. 2014; Heras et al. 2020) and we are unaware of any publications that employ molecular techniques to address distribution dynamics, dispersal potential, and/or adaptive variation across the species’ range.

In this study, we found that the genome size of C. violaceus is 575.6 Mb, which is smaller than the 792 Mb estimated by Hinegardner and Rosen (1972) and the 657 Mb published in the genome assembly by Heras et al. (2020) but consistent with the genome size of other shallow-water marine fishes included as part of CCGP (e.g. Clinocottus analis, 538 Mb). Presently there are no known estimates of the karyotype of the Monkeyface Prickleback, though 2n = 48 is typical for perciform fishes (Hinegardner and Rosen 1972). When the scaffolds are arranged from largest to smallest, their lengths decrease evenly, with no clear break that would separate chromosome-length scaffolds from the remainder. The karyotype of the Monkeyface Prickleback therefore remains unknown, and additional research to establish it is warranted.

The high quality of the genome we are presenting here (contig N50 = 1 Mb, BUSCO completeness = 93.2%) will allow us to use it as a reference for the medium-coverage whole genome resequencing project for C. violaceus that comprises the next phase of the CCGP data collection pipeline (Shaffer et al. 2022). Our long-term goal is to use resequencing data from this and other species to help draw defensible, data-supported boundaries between genetically distinct marine ecoregions in California, as well as determine the degree of local adaptation among regions, and to use these data to delineate relevant protected areas that are grounded in strong genetic data. This genome is the first step in an important endeavor that will ultimately result in a sound protection plan for California’s natural marine resources.

Funding

This work was supported by the California Conservation Genomics Project, with funding provided to the University of California by the State of California, State Budget Act of 2019 [UC Award ID RSI-19-690224].

Acknowledgments

We would like to thank Pauline Blaimont (UCSC) for help in the field during collection of the sample. PacBio Sequel II library prep and sequencing were carried out at the DNA Technologies and Expression Analysis Cores at the UC Davis Genome Center, supported by NIH Shared Instrumentation Grant 1S10OD010786-01. Deep sequencing of Omni-C libraries used the NovaSeq S4 sequencing platforms at the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 OD018174 Instrumentation Grant. We thank the staff at the UC Davis DNA Technologies and Expression Analysis Cores and the UC Santa Cruz Paleogenomics Laboratory for their diligence and dedication to generating high-quality sequence data.

Data availability

Data generated for this study are available under NCBI BioProject PRJNA777152. Raw sequencing data for sample CVI_PGR_0920_01 (NCBI BioSample SAMN25872352) are deposited in the NCBI Short Read Archive (SRA) under SRX15703629 for PacBio HiFi sequencing data, and SRX15703630 for the Omni-C Illumina sequencing data. GenBank accessions for the primary and alternate assemblies are GCA_023349555.1 and GCA_023349535.1, respectively; the corresponding genome sequences are JAKSXS000000000 and JAKSXT000000000. The GenBank organelle genome assembly for the mitochondrial genome is CM041028.1. Assembly scripts and other data for the analyses presented can be found at the following GitHub repository: www.github.com/ccgproject/ccgp_assembly.

References

Abdennur, Mirny LA. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics. 2020:311–316.

Allio, Schomaker-Bastos, Romiguier, Prosdocimi, Nabholz, Delsuc. MitoFinder: efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics. Mol Ecol Resour. 2020:892–905. doi:10.1111/1755-0998.13160.

Camacho, Coulouris, Avagyan, Papadopoulos, Bealer, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009.

Challis, Richards, Rajan, Cochrane, Blaxter. BlobToolKit – interactive quality assessment of genome assemblies. 2020:1361–1374.

Cheng, Jarvis, Fedrigo, Koepfli K-P, Urban, Gemmell. Robust haplotype-resolved assembly of diploid individuals without parental data. 2021. [accessed 2022 Jul 1]. http://arxiv.org/abs/2109.04785

Froese, Pauly, editors. FishBase. World Wide Web electronic publication; 2022. [accessed 2022 Feb 07]. www.fishbase.org, version (06/2022).

German, Gawlicka, Horn MH. Evolution of ontogenetic dietary shifts and associated gut features in prickleback fishes (Teleostei: Stichaeidae). Comp Biochem Physiol B Biochem Mol Biol. 2014;168:168–.

German, Horn MH. Gut length and mass in herbivorous and carnivorous prickleback fishes (Teleostei: Stichaeidae): ontogenetic, dietary, and phylogenetic effects. Mar Biol. 2006;148:1123–1134.

German, Sung, Jhaveri, Agnihotri. More than one way to be an herbivore: convergent evolution of herbivory using different digestive strategies in prickleback fishes (Stichaeidae). Zoology. 2015;118:161–170.

Ghurye, Rhie, Walenz, Schmitt, Selvaraj, Pop, Phillippy, Koren. Integrating Hi-C links with assembly graphs for chromosome-scale assembly. PLoS Comput Biol. 2019:e1007273.

Goloborodko, Abdennur, Venev, Brandao, Fudenberg. mirnylab/pairtools: v0.2.0. 2018. doi:10.5281/zenodo.1490831

Guan, McCarthy, Wood, Howe, Wang, Durbin. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 2020:2896–2898.

Gurevich, Saveliev, Vyahhi, Tesler. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013:1072–1075.

Heras, Chakraborty, Emerson, German DP. Genomic and biochemical evidence of dietary adaptation in a marine herbivorous fish. Proc R Soc B Biol Sci. 2020;287(1921):20192327.

Hickerson, Cunningham CW. Contrasting quaternary histories in an ecologically divergent sister pair of low-dispersing intertidal fish (Xiphister) revealed by multilocus DNA analysis. Evolution. 2005:344–360.

Hinegardner, Rosen DE. Cellular DNA content and the evolution of teleostean fishes. Am Nat. 1972;106(951):621–644.

Johnson, Freiwald, Bernardi. Genetic diversity affects the strength of population regulation in a marine fish. Ecology. 2016:627–639.

Kerpedjiev, Abdennur, Lekschas, McCallum, Dinkla, Strobelt, Luber, Ouellette, Azhir, Kumar, et al. HiGlass: web-based visual exploration and analysis of genome interaction maps. Genome Biol. 2018.

Kim, Horn, Sosa, German DP. Sequence and expression of an α-amylase gene in four related species of prickleback fishes (Teleostei: Stichaeidae): ontogenetic, dietary, and species-level effects. J Comp Physiol B. 2014;184:221–234.

Korlach, Gedman, Kingan, Chin, Howard, Audet, Cantin, Jarvis ED. De novo PacBio long-read and phased avian genome assemblies correct and add to reference genes generated with intermediate and short reads. GigaScience. 2017.

Leet WS. California’s living marine resources: a status report. 4th ed. Sacramento: California Dept. of Fish and Game; 2001.

Leis. The pelagic stage of reef fishes. In: Sales, editor. The ecology of fishes on coral reefs. San Diego (CA): Academic Press Inc.; 1991. p. 182–229.

Li. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. 2013. doi:10.48550/arXiv.1303.3997

Love. Certainly more than you want to know about the fishes of the Pacific coast. Santa Barbara: Really Big Press; 2011.

Manni, Berkeley, Seppey, Simao, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. 2021. doi:10.1093/molbev/msab199

Pflug, Holmes, Burrus, Johnston, Maddison DR. Measuring genome sizes using read-depth, k-mers, and flow cytometry: methodological comparisons in beetles (Coleoptera). 2020:3047–3060.

Ramírez, Bhardwaj, Arrigoni, Lam, Grüning, Villaveces, Habermann, Akhtar, Manke. High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat Commun. 2018:189. doi:10.1038/s41467-017-02525-w

Ranallo-Benavidez, Jaron, Schatz MC. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020:1432. doi:10.1038/s41467-020-14998-3

Rhie, Walenz, Koren, Phillippy AM. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020a.

Rhie, McCarthy, Fedrigo, Damas, Formenti, Koren, Uliano-Silva, Chow, Fungtammasan, Gedman, et al. Towards complete and error-free genome assemblies of all vertebrate species. Genomics. 2020b. [accessed 2020 Jun 18]. http://biorxiv.org/lookup/doi/10.1101/2020.05.22.110833.

Sahaka, Amara, Wattanakul, Gedi, Aldai, Parsiegla, Lecomte, Christeller, Gray, Gontero, et al. The digestion of galactolipids and its ubiquitous function in Nature for the uptake of the essential α-linolenic acid. Food Funct. 2020:6710–6744.

Setran, Behrens DW. Transitional ecological requirements for early juveniles of two sympatric stichaeid fishes, Cebidichthys violaceus and Xiphister mucosus. Environ Biol Fishes. 1993:381–395.